
  • We should move away from a "redacted trail" model, to a "zones of concern" model. Just as effective & much more privacy protective.

  • "zones of concern" do not need to be composed of location/time pairs from the patient's original trail. They could be made up of newly synthesized location/time pairs that better suit the HA's matching needs for a particular environment.

  • Location/time pairs can & should have a "criticality" associated with them.

  • Negative criticality may be a useful concept (e.g. in the theater/cinema case, anyone who had their phone on at this location in the middle of the show was probably not watching the show).

  • It would be nice to have a much richer model for describing "zones of concern" (e.g. vector-based rather than point-based, and factoring in speed/bearing) to help with cases like bus journeys. But I can't see any way to do that in a manner that would enable encryption, and I am doubtful we are going to be able to invent such techniques quickly.


Plan for MVP1

If we agree on all of the above as our correct overall direction, what’s the bare minimum we need to do for MVP1?

  • Ensure that “Redaction” guidelines are up to date, so that all data that is likely to be ineffective for exposure detection (e.g. walking on the street outside) is redacted.

  • Consider renaming “Redaction” to shift emphasis from privacy towards efficacy. This will include changes to the privacy language used towards users.

  • All data points are stored as one-way hashes (see Hashing Details below) of geohashes at either +/-19m or +/-76m accuracy (TBC). Whether or not to include a salt in MVP1 is also TBC.

  • Update Safe Paths App to match based on hashed geohashes of:

  1. The recorded GPS point

  2. Points 20m to the N, NE, E, SE, S, SW, W & NW (if these generate different Geohashes)

  • Update Safe Paths App to log a minimum number of points of concern before generating a notification (value TBD; 6 points = 30 mins?).

  • Reduce default exposure time for a point of concern from (0 mins to 4 hours) to (-5 mins to +5 mins).
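The neighbour-point matching described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names (geohash_encode, candidate_geohashes), the 8-digit precision, and the flat-earth approximation of about 111,320 m per degree of latitude are not part of the plan. The encoder is the standard geohash algorithm written out so the sketch has no external dependencies.

```python
import math

_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet
METERS_PER_DEGREE_LAT = 111_320.0  # assumption: rough spherical-earth value


def geohash_encode(lat, lon, precision=8):
    """Encode a latitude/longitude pair as a geohash of `precision` digits."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, bit_count = 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 digit
            chars.append(_BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def candidate_geohashes(lat, lon, offset_m=20.0, precision=8):
    """Distinct geohashes of the point and of points offset_m away to the
    N, NE, E, SE, S, SW, W and NW. Not valid at the poles (cos(lat) = 0)."""
    dlat = offset_m / METERS_PER_DEGREE_LAT
    dlon = offset_m / (METERS_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    d = math.sqrt(0.5)  # keeps the diagonal points offset_m from the centre
    steps = [(0, 0), (1, 0), (d, d), (0, 1), (-d, d),
             (-1, 0), (-d, -d), (0, -1), (d, -d)]  # (north, east) multipliers
    return {geohash_encode(lat + n * dlat, lon + e * dlon, precision)
            for n, e in steps}
```

Returning a set means neighbouring points that fall in the same geohash cell are only hashed and compared once, which matches the "if these generate different Geohashes" condition.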

Hashing Details

Published data points should be geohashes (less precise than specific GPS points), stored as a SHA-256 hash of (geohash, time-bin), where time-bin is a 5-minute interval in UTC, rounded down to its start.
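A minimal sketch of that hash, assuming the time-bin is represented as epoch seconds and the pair is serialized as "geohash|time-bin" before hashing. Both choices, and the function names, are assumptions made for illustration; the actual serialization would need to be agreed between Safe Places and Safe Paths.

```python
import hashlib
from datetime import datetime, timezone

TIME_BIN_SECONDS = 5 * 60  # 5-minute bins


def time_bin(ts: datetime) -> int:
    """Start of the 5-minute UTC bin containing ts, as epoch seconds.
    ts should be timezone-aware; naive datetimes are treated as local time."""
    epoch = int(ts.astimezone(timezone.utc).timestamp())
    return epoch - (epoch % TIME_BIN_SECONDS)


def hash_point(geohash: str, ts: datetime) -> str:
    """SHA-256 hex digest of (geohash, time-bin)."""
    payload = f"{geohash}|{time_bin(ts)}"  # assumed serialization
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Two observations in the same geohash cell and the same 5-minute bin produce the same digest, so matching reduces to comparing hash strings.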

Geohash accuracy (at the equator; slightly more accurate further from the equator):

Number of digits    Accuracy (m)
6                   +/-610
7                   +/-76
8                   +/-19
9                   +/-2.4

https://gis.stackexchange.com/questions/115280/what-is-the-precision-of-a-geohash
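These figures can be reproduced from the way geohash splits its bits: each digit carries 5 bits, alternating between longitude and latitude, with longitude taking the first bit. The sketch below assumes roughly 111,320 m per degree at the equator and reports the larger of the two half-cell sizes. The function name is hypothetical.

```python
METERS_PER_DEGREE = 111_320.0  # assumption: approximate value at the equator


def geohash_error_m(digits):
    """Return (lat_error_m, lon_error_m): half the cell size at the equator."""
    total_bits = 5 * digits
    lon_bits = (total_bits + 1) // 2  # longitude gets the extra bit when odd
    lat_bits = total_bits // 2
    lat_err = 90.0 / 2 ** lat_bits * METERS_PER_DEGREE
    lon_err = 180.0 / 2 ** lon_bits * METERS_PER_DEGREE
    return lat_err, lon_err


for digits in (6, 7, 8, 9):
    lat_err, lon_err = geohash_error_m(digits)
    # Prints values close to the table: about 610, 76, 19 and 2.4 metres
    print(digits, round(max(lat_err, lon_err), 1))
```

Note that for an even number of digits the cell is about twice as wide as it is tall, so the quoted accuracy is the worst-case (east-west) figure.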

For additional security, a salt can be added to the hash. Ideally the salt:

  • Is specific to a single HA

  • Changes daily

  • Can be published by the HA alongside the points of concern
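One way the salted variant could look is sketched below. Everything here is an assumption for illustration: the salt length, prefixing the salt to the hashed payload, and the shape of the published feed (ha_id, date, salt, points_of_concern) are hypothetical and not a defined Safe Places format.

```python
import hashlib
import secrets
from datetime import date


def new_daily_salt() -> str:
    """Fresh random salt that an HA would generate once per day."""
    return secrets.token_hex(16)


def salted_hash(geohash: str, time_bin: int, salt: str) -> str:
    """SHA-256 hex digest of (salt, geohash, time-bin)."""
    payload = f"{salt}|{geohash}|{time_bin}"  # assumed serialization
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def publish(ha_id: str, day: date, salt: str, points):
    """Hypothetical feed: the salt is published next to the hashed points.
    `points` is an iterable of (geohash, time_bin) pairs."""
    return {
        "ha_id": ha_id,
        "date": day.isoformat(),
        "salt": salt,
        "points_of_concern": sorted(salted_hash(g, t, salt) for g, t in points),
    }
```

Because the salt is published, the app can re-hash its own local points with it and compare; the salt mainly prevents hashes from one HA or one day being reused or precomputed against another.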

Future Phases

If we deliver MVP1 as above, what would future phases look like? (We can also consider whether any of these is so important it should be in MVP1.)

  • Variable geohash blurring depending on geography (urban vs. rural) and number of points of concern.

  • Add a “criticality” value to a point of concern, to allow the contribution a given point of concern makes towards hitting the threshold for notification to be different from the default value.

  • Add a “time-window” value to a point of concern: to allow the time-window that counts for an overlap to be different from the default value.

  • Add basic tools to Safe Places to allow “criticality” and “time window” to be set on individual data points.

  • Add targeted tools to Safe Paths to replace user-provided data points with synthetic data points that are optimal for generating user matches, for example:

  • (e.g.) A “Bus” tool, which traces a bus route with a much finer set of data points, each with a very low “time window”

  • (e.g.) A “Cinema/Theater” tool, which sets high-criticality points of concern at the start and end times of a given show, and sets negative-criticality points of concern during the middle of the show.
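To make the “criticality” and “time-window” ideas concrete, here is a rough sketch of how they might feed a notification threshold. It works on unhashed (geohash, time) pairs for readability. The class and function names, the default criticality of 1.0, the default window of 300 seconds and the threshold of 6 are all assumptions, not agreed values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PointOfConcern:
    geohash: str
    time_s: int                # epoch seconds
    criticality: float = 1.0   # assumed default; may be negative
    window_s: int = 300        # assumed default: -5 mins to +5 mins


def exposure_score(user_points, concerns):
    """Sum the criticality of every point of concern the user overlaps.
    user_points is an iterable of (geohash, time_s) pairs."""
    user_points = list(user_points)
    score = 0.0
    for poc in concerns:
        overlaps = any(
            g == poc.geohash and abs(t - poc.time_s) <= poc.window_s
            for g, t in user_points
        )
        if overlaps:
            score += poc.criticality
    return score


def should_notify(user_points, concerns, threshold=6.0):
    """Notify only once the accumulated score reaches the threshold."""
    return exposure_score(user_points, concerns) >= threshold


# Cinema example: high criticality at the start and end of a show,
# negative criticality in the middle.
show = [
    PointOfConcern("u4pruydq", 0, criticality=3.0),
    PointOfConcern("u4pruydq", 3600, criticality=-4.0),
    PointOfConcern("u4pruydq", 7200, criticality=3.0),
]
```

With these illustrative numbers, a user seen only at the start and end of the show reaches the threshold, while a user whose phone also logged a point mid-show is pulled back below it.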