...
Moving from a “redacted trail” model to a “zones of concern” model dramatically reduces the amount of personal data we share, so the data is far less personal in nature and the privacy issues shrink accordingly.
But I don’t think the privacy risks are eliminated - we need to think not only about the privacy of the infected patient, but also the privacy of businesses, and other individuals known to frequent affected locations. So I don’t think the encryption requirement goes away.
Potentially we need to serve up a much more semantically rich set of descriptions of “zones of concern” - not just space/time boxes like the restaurant example, but more sophisticated examples like the cinema/theatre and bus example.
Our current encryption proposals assume that the data served by the HA is a homogeneous set of place/time data points. A different approach may be needed for these more sophisticated examples.
It’s not clear to me how a user’s space-time points can be assessed against a rich description of a “zone of concern” without either the space-time point being disclosed to a server (server-side comparison) or the description of the “zone of concern” being disclosed to the App (client-side comparison). The problem is that a cryptographic hash function does not preserve any of the topology of the space-time region being hashed: two points that are adjacent in space and time hash to unrelated values, so containment or proximity cannot be tested on the hashes themselves.
A possible solution would be for the “zone of concern” to be resolved into a discrete set of space-time points, which could be served to the client in hashed form for the client to compare against hashes of their own location data. These hashed space-time points could retain a “criticality” value without any obvious loss of privacy. Matching based on speed/bearing as well gets complicated, though!
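A minimal sketch of that idea follows, to make the moving parts concrete. Everything specific in it is an assumption for illustration rather than part of any agreed design: the grid size (`CELL_DEG`), the time bucket (`BUCKET_SECONDS`), the use of plain SHA-256, the box-shaped zone, and the function names `quantise`, `hash_cell`, `resolve_zone` and `match`.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple

# Hypothetical resolution: 0.0001 degree cells and 5-minute time buckets.
CELL_DEG = 0.0001
BUCKET_SECONDS = 300

Cell = Tuple[int, int, int]


def quantise(lat: float, lon: float, unix_time: float) -> Cell:
    """Snap a space-time point to integer grid indices."""
    return (
        round(lat / CELL_DEG),
        round(lon / CELL_DEG),
        int(unix_time // BUCKET_SECONDS),
    )


def hash_cell(cell: Cell) -> str:
    """Hash one grid cell. Plain SHA-256 is an assumption, not a recommendation."""
    payload = "{},{},{}".format(*cell).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def resolve_zone(lat_min: float, lat_max: float,
                 lon_min: float, lon_max: float,
                 t_start: float, t_end: float,
                 criticality: int) -> Dict[str, int]:
    """Expand a box-shaped zone into hashed cells, each tagged with a criticality."""
    lo = quantise(lat_min, lon_min, t_start)
    hi = quantise(lat_max, lon_max, t_end)
    published = {}
    for i in range(lo[0], hi[0] + 1):
        for j in range(lo[1], hi[1] + 1):
            for k in range(lo[2], hi[2] + 1):
                published[hash_cell((i, j, k))] = criticality
    return published


def match(user_points: Iterable[Tuple[float, float, float]],
          published: Dict[str, int]) -> List[int]:
    """Client-side check: criticality of each user point whose hash was published."""
    hits = []
    for lat, lon, t in user_points:
        digest = hash_cell(quantise(lat, lon, t))
        if digest in published:
            hits.append(published[digest])
    return hits


# HA side: a small zone covering ten minutes, criticality 3.
zone = resolve_zone(51.5000, 51.5002, -0.1202, -0.1200,
                    1_600_000_000, 1_600_000_600, criticality=3)

# Client side: one point inside the zone, one well outside it.
inside = match([(51.5001, -0.1201, 1_600_000_300)], zone)
outside = match([(51.6000, -0.1200, 1_600_000_300)], zone)
```

Two things stand out even in a sketch this small. The number of published hashes grows with the volume of the zone, so large or long-lived zones get expensive. And because the grid of possible cells is small and guessable, an unsalted hash like this could be reversed by simply enumerating cells, so it should be read as a placeholder for whatever scheme the encryption proposals settle on.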
Summing Up
The key points I want to pull out from the above are:
We should move away from a "redacted trail" model, to a "zones of concern" model. Just as effective & much more privacy protective.
"zones of concern" do not need to be comprised of location/time pairs from the patient's original trail. They could be made up of newly synthesized location/time pairs to better match the matching needs of the HA for a particular environment.
Location/time pairs can & should have a "criticality" associated with them.
Negative criticality may be a useful concept (e.g. in the theatre/cinema case, anyone who had their phone on in the middle of the show at this location was probably not watching the show).
It would be nice to have a much richer model for describing "zones of concern" (e.g. vector-based rather than point-based, and factoring in speed/bearing) to help with cases like bus journeys. But I can't see any way to do that in a manner that would enable encryption, and I am doubtful we are going to be able to invent such techniques quickly.
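To illustrate how criticality, including negative criticality, might be used on the client, here is a small sketch. The scoring rule (sum the criticalities of matched cells and clamp at zero), the cell labels and the numeric values are all assumptions made up for the example; nothing above fixes how criticalities should be combined.

```python
from typing import Dict, Set


def exposure_score(user_cells: Set[str], zone_criticality: Dict[str, int]) -> int:
    """Sum the criticality of every matched cell, never going below zero.

    Summing and clamping is a hypothetical rule chosen for illustration.
    """
    total = sum(zone_criticality.get(cell, 0) for cell in user_cells)
    return max(total, 0)


# Hypothetical theatre zone. In practice the keys would be hashed
# space-time cells; readable labels are used here for clarity.
theatre = {
    "foyer_before_show": 2,
    "interval": 2,
    "mid_show": -5,  # a data point mid-show suggests the user was not in the audience
}

audience_member = exposure_score({"foyer_before_show", "interval"}, theatre)
passer_by = exposure_score({"foyer_before_show", "interval", "mid_show"}, theatre)
unrelated = exposure_score({"somewhere_else"}, theatre)
```

Under these made-up numbers the audience member scores 4, while the user who also logged a point mid-show scores 0, because the negative cell outweighs the two positive ones.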