User Story Location Mapping: Bus Journey #1

Summary: overlapping journeys, with distinct points for boarding & alighting.

A is infected, B is uninfected.

Scenario:

  • A gets on a bus at stop X at 10:00.

  • The bus travels at 10 m/s between stops, and stops for 1 minute every 2-3 mins (flat random distribution).

  • B gets on the same bus at a later stop Y at 10:10, 10 minutes after A, and sits next to A.

  • At 10:20 A gets off the bus.

  • A’s location pings every 4-6 mins (flat random distribution), with a starting offset of K seconds past 10:00 (K is drawn from a flat random distribution, 0 to 300).

  • B’s location pings every 4-6 mins (flat random distribution), with a starting offset of M seconds past 10:00 (M is drawn from a flat random distribution, 0 to 300).
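
The two ping bullets above can be sketched as a small sampler. This is only an illustration of how I read the scenario, not the app's actual ping scheduler: the function name `sample_ping_times` and its parameters are made up for this note, and the 30-minute horizon is an assumption (the scenario does not say when B gets off).

```python
import random

def sample_ping_times(rng, horizon_s, max_offset_s=300.0,
                      min_gap_s=240.0, max_gap_s=360.0):
    """Return ping timestamps in seconds past 10:00, up to horizon_s."""
    t = rng.uniform(0.0, max_offset_s)  # starting offset (K for A, M for B)
    times = []
    while t <= horizon_s:
        times.append(t)
        # Each gap is drawn afresh from the flat 4-6 minute range.
        t += rng.uniform(min_gap_s, max_gap_s)
    return times

rng = random.Random(1)  # fixed seed only so the sketch is repeatable
pings_a = sample_ping_times(rng, horizon_s=1800.0)  # A, offset K
pings_b = sample_ping_times(rng, horizon_s=1800.0)  # B, offset M
```

Because K and M are drawn independently, the two ping trains have no fixed phase relationship, which is what makes the overlap question non-trivial.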

 

Assume the algorithm triggers a match if a ping from A and a ping from B land within 20 m of each other (don’t worry about time, as we currently use a 4-hour window there).
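
A minimal sketch of that match rule, assuming each ping has already been converted to an (x, y) position in metres on a flat local grid. Real pings are latitude/longitude, so a geodesic distance would be needed in practice; `is_match` is a name invented for this note, not a function in the app.

```python
import math

def is_match(pings_a, pings_b, radius_m=20.0):
    """True if any ping of A lies within radius_m of any ping of B.

    Each ping is an (x, y) position in metres. Timestamps are ignored,
    mirroring the "don't worry about time" simplification above.
    """
    return any(
        math.hypot(ax - bx, ay - by) <= radius_m
        for ax, ay in pings_a
        for bx, by in pings_b
    )

# A pinged at 0 m and 3000 m along the road; B pinged at 3015 m.
print(is_match([(0.0, 0.0), (3000.0, 0.0)], [(3015.0, 0.0)]))  # True
print(is_match([(0.0, 0.0)], [(25.0, 0.0)]))                   # False
```

At 10 m/s the bus covers 20 m in 2 seconds, so while the bus is moving, two pings only match if they fall at almost the same point of the route.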

 

Question: in what percentage of scenarios does B get a positive match against A?
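
One way to get a first estimate is a Monte Carlo simulation. The sketch below is a rough starting point under several assumptions that are NOT in the scenario: the route is treated as a straight line (position = metres travelled), only pings taken while on the bus are counted, B is assumed to get off at 10:30, "every 2-3 mins" is read as 2-3 minutes of driving between stops, times are rounded down to whole seconds, and the random bus timeline is not forced to be at a stop at exactly 10:10 and 10:20. All names are hypothetical.

```python
import random

SPEED = 10      # m/s while moving (from the scenario)
DWELL = 60      # seconds stationary at each stop (from the scenario)
A_OFF = 1200    # A alights at 10:20, in seconds past 10:00
B_ON = 600      # B boards at 10:10
B_OFF = 1800    # ASSUMPTION: B alights at 10:30 (not given)
RADIUS = 20.0   # match radius in metres

def bus_track(rng, horizon):
    """Bus position (metres along route) at each second 0..horizon."""
    pos, track = 0.0, [0.0]
    while len(track) <= horizon:
        # ASSUMPTION: 2-3 minutes of driving, in whole seconds.
        for _ in range(rng.randint(120, 180)):
            pos += SPEED
            track.append(pos)
        track.extend([pos] * DWELL)  # 1 minute stopped
    return track[: horizon + 1]

def ping_times(rng, first, last, gap):
    """Whole-second ping times within [first, last]."""
    t = rng.uniform(0, 300)  # starting offset K or M
    out = []
    while t <= last:
        if t >= first:
            out.append(int(t))
        t += rng.uniform(gap[0], gap[1])
    return out

def one_trial(rng, gap):
    """Simulate one journey; True if any A ping matches any B ping."""
    track = bus_track(rng, B_OFF)
    a = [track[t] for t in ping_times(rng, 0, A_OFF, gap)]
    b = [track[t] for t in ping_times(rng, B_ON, B_OFF, gap)]
    return any(abs(x - y) <= RADIUS for x in a for y in b)

def match_rate(trials=5000, seed=0, gap=(240, 360)):
    """Fraction of simulated journeys in which B matches A."""
    rng = random.Random(seed)
    return sum(one_trial(rng, gap) for _ in range(trials)) / trials
```

Calling `match_rate()` returns the estimated fraction of journeys with a match under these assumptions; multiply by 100 for a percentage. The `gap` parameter is there so that other ping intervals can be tried, e.g. `match_rate(gap=(180, 420))` for 3-7 mins.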

 

 

NOTE1: Looking for someone who can build a statistical model for a scenario like the above. Ideally this would be flexible and applicable to a range of different scenarios involving fast & slow movements, turning phones on & off, etc.

 

NOTE2: One of the things that makes modelling complicated is that location pings do not occur rigidly at particular consistent times (e.g. 10:00, 10:05, 10:10).

See CONCERN: Relative rather than absolute time risks missing overlaps · Issue #516 · Path-Check/safeplaces-dct-app

If we were to make that change in the product, then a lot of this modelling might prove unnecessary.

However, at the moment, I only have one pathological case (as per #516). I’d like to assess whether in fact a wide range of common scenarios will lead to misses. This may lead to a model with absolute pings, which might greatly simplify all the modelling needed.

 

NOTE3: However, we also have an ambition to work with location histories from 3rd parties, e.g. Google. Those will be triggered at relative time intervals, not absolute ones, so arguably we have to accommodate relative-time data sets, which might inevitably be in anti-phase.

 

Also to explore: how does increasing the variability of the intervals between pings help, e.g. 3-7 mins rather than 4-6 mins?