Update on concerns - 18 May 2020

One month on, a review of the concerns - what’s been addressed, what’s still outstanding.

The vast majority of these still need to be addressed in public documentation. A minority of items still need substantial additional thought & analysis. Note that the additional set of unanswered concerns raised on Keith Klain’s podcast has been folded in with the other sets of concerns.

Headlines

At a high level, what do we still need to do to adequately address these concerns?

  • Build an Ethical Model that identifies key risks. Determine how to mitigate these risks, and document that.

  • Develop & publish a Security Threat Model

  • Develop & publish Security by Design Guidelines

  • Publish a register of organizations contributing to the project, and their interests.

  • Define the tests we use to determine whether a given organization qualifies as a “Health Authority” for the purposes of deploying our technology, and how we will maintain this standard over time.

  • Publish a detailed position on Privacy, including Privacy Principles & implementation details.

  • Review the above to ensure they cover all points of concern raised below, and fill in any other gaps that may exist.

  • Once everything is published, explicitly drive external review from relevant interested experts, and act on their feedback.


Some questions that I don’t think we have good enough answers to yet…

  • What are we doing to ensure that the behaviour of our organization in future matches the standards that we have today?

  • How do we constrain 3rd parties from using our Open Source software in ways that may not align with our principles?

(for everything else, I think we have reasonable answers, though further flaws may be identified in the process of working through & documenting all the details).

Ethics

As well as Privacy concerns, there are a wide range of other Ethical concerns. If the project is Privacy-first, where do Ethics come in?

Impossibility of constraining future behaviour of involved parties

  • Whatever an organization’s intentions now, they can always change in future. How do we protect against that? Can we answer this?

  • Concern that eventually the technology will end up used by law enforcement / security services, or for some other currently unintended purpose, which we cannot predict or control. Can we answer this?

  • People cannot give informed consent, because there can be no guarantees about what will happen to their data once they have handed it over. To be addressed as part of a published position on privacy, consent & how proper use of data is regulated.

  • Concerns about how long we continue to collect data for. When does this end? (the alternative to it ending is that it becomes the new normal) To be addressed as part of a published position on privacy.

Transparency & Communications

  • More project information should be in the public domain so that it is open to scrutiny by anyone with an interest, not just people who have signed up to support the project. See other points. Publishing on all points raised will go a long way here.

  • If we are already following particular guidelines, e.g. Privacy By Design guidelines, this should be clearly stated & evidenced in public. Publish PBD & SBD guidelines used on the project.

  • White paper is vague & high level - not enough technical details to review properly. Publish detailed technical information about our implementation & how it aligns with Privacy & Security principles.

  • In a video, Ramesh Raskar talks about what the Health Authority can do with the data, without explicitly highlighting the fact that this must all be done only with the user’s participation & consent. Ensure we publish clear statements on these topics.

  • Precise language is very important, e.g. “data creation” vs. “data collection”; “government agencies” vs. “your town’s website” (plus see the previous point). Ensure published information is very clear & unambiguous in its use of terminology.

Trust of Health Authorities

  • The idea that what Health Authorities do with the data they receive will be constrained by what we write in our Requirements docs is naive. Publish information about HA obligations under HIPAA, as they relate to Safe Paths.

  • It appears we have no mechanism to oblige the HA to act with the consent of the user.  Health Authorities in the US may be tightly regulated, but this may not be true in other jurisdictions. Explain how we are engaging with Health Authorities across the world to ensure they deploy in line with our principles on privacy & consent.

  • What protections do we have against coercion of users to provide data against their will? Address this in published text.

  • How do we define what is a Health Authority?  How can we be sure that we won’t change that definition over time & include other agencies (e.g. security)? Present our definition of a Health Authority, and our safeguards against this definition changing.

  • Concerns about how long data is kept. Explain how retention limits are enforced.

  • Concern about the Open Source model. Anyone can come in & modify the project away from the original vision. The technology could also be forked and re-used for some other purpose beyond the original design intent. We need to present this issue & what our plans are to overcome it.

Location Data, Redaction & Anonymization

  • There is no good reason for unredacted data to be shared with the Health Authority.  Maybe it makes the implementation simpler, but if we are Privacy-first, Privacy should come first. I believe we think there are good reasons why the HA should have access to the unredacted data. Let’s explain why.

  • Couldn't the data be anonymous even prior to the contact trace interview, in the sense that neither the health official nor the systems have any idea of the user’s identity? I believe the contact tracer will make an outgoing call to the patient & therefore will know who they are - but this is an HA implementation decision, and not down to us. We should document this as part of the privacy documentation.

  • Certain locations such as “home” could be configured on the device and geo-fenced such that data is not even recorded on the device in these locations. That would improve privacy. I believe this has been considered, but there are arguments against? Let’s articulate them clearly & publicly. (See the sketch after this list for how this could work.)

  • Location services might be better implemented as a variable precision parameter, rather than a binary on/off. (one for Apple / Google primarily; though we could pioneer this approach in this app?) We could consider this as a privacy option - geohashes could work well to offer different resolutions, as shown in the sketch after this list.

  • There are high risks of false positives or false negatives - in all cases, but specifically in dense urban situations, with mixed-use and single-use multi-storey buildings. We should talk about how this will work in high-rise buildings.
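To make the geo-fencing and variable-precision bullets concrete, here is a rough sketch of how the two ideas could fit together. This is purely illustrative - the names (PrivateZone, recordLocation), the circular-radius geofence, and the default precision of 8 are my assumptions for discussion, not anything we have agreed or implemented.

const BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

// Standard geohash encoding. Fewer characters = a larger cell = coarser
// (more private) data: 8 chars is roughly a 38m x 19m cell,
// 6 chars roughly 1.2km x 0.6km, 5 chars roughly 5km x 5km.
function encodeGeohash(lat: number, lon: number, precision: number): string {
  const latRange = [-90, 90];
  const lonRange = [-180, 180];
  let isLon = true; // geohash interleaves bits, starting with longitude
  let bits = 0;
  let ch = 0;
  let hash = "";
  while (hash.length < precision) {
    const range = isLon ? lonRange : latRange;
    const value = isLon ? lon : lat;
    const mid = (range[0] + range[1]) / 2;
    if (value >= mid) {
      ch = (ch << 1) | 1;
      range[0] = mid;
    } else {
      ch = ch << 1;
      range[1] = mid;
    }
    isLon = !isLon;
    if (++bits === 5) {
      hash += BASE32[ch]; // emit one base32 character per 5 bits
      bits = 0;
      ch = 0;
    }
  }
  return hash;
}

// A user-configured zone (e.g. “home”) inside which nothing is recorded.
interface PrivateZone {
  lat: number;
  lon: number;
  radiusMeters: number;
}

// Haversine distance in meters between two lat/lon points.
function distanceMeters(lat1: number, lon1: number, lat2: number, lon2: number): number {
  const R = 6371000; // mean Earth radius in meters
  const toRad = (d: number) => (d * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * R * Math.asin(Math.sqrt(a));
}

// Returns a geohash at the user’s chosen precision, or null (record nothing)
// if the point falls inside any private zone.
function recordLocation(
  lat: number,
  lon: number,
  zones: PrivateZone[],
  precision: number = 8,
): string | null {
  for (const z of zones) {
    if (distanceMeters(lat, lon, z.lat, z.lon) <= z.radiusMeters) {
      return null; // inside a geo-fenced private zone: drop the point entirely
    }
  }
  return encodeGeohash(lat, lon, precision);
}

A settings slider could then drive the precision parameter directly, giving users a graduated privacy choice rather than a binary on/off.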

Social Inequality

  • By focussing on users with access to smartphones, doesn’t this project simply augment existing inequality, and cause other social issues? Publicly articulate our strategy for users without smartphones.

  • Concern that data heat maps could lead to “kettling” of deprived communities to contain the spread. Explicitly cover this in our list of risks & our mitigations for them. This is a delicate one though; ultimately I think we have to argue that getting good data to Public Health Departments is a good thing, and we have to trust them to take the correct actions.

  • If the real issue is social inequality, it could be argued that Digital Contact Tracing cannot improve things, and it certainly risks making things worse for certain communities. We can try to speak to this. I think the argument is that COVID itself is doing disproportionate harm to deprived communities, and by slowing the COVID pandemic we are therefore helping to reduce this aspect of social inequality.

  • COVID-19 is disproportionately hitting ethnic minority communities in the US. Poor people will use this app (and therefore be at risk of any harms it may cause), while rich people never will. Not sure about this one: it sits oddly alongside the complaint that the app is not available to users who don’t have smartphones. However it may be a valid concern, and we should try to address it.

People & Processes

  • Threat modelling, in particular the STRIDE threat model. We are stepping up our Security analysis & this absolutely needs to be a part of that. Develop and publish our Security Threat Model.

  • Security & Privacy need to be fundamental concerns of everyone on the project, not just specialists. Solution here is probably to have PBD & SBD guidelines that everyone in the project reads & understands. These should also be public.

  • Being inside the project creates a cognitive bias - outside perspectives are valuable because they do not have this bias. Solve this by publishing key resources, and actively soliciting external input / criticism.

  • If we do get good testers on the project, they will ask awkward questions. Does testing have an important enough seat at the table on this project? Is it listened to when decisions are taken? I’d rather not try to address this directly, but rather make it evident through publishing good answers to all the other questions addressed.

  • Concern about diversity on the project. Open Source in general has diversity issues. Poorer people can’t make time around their job & other commitments to volunteer on a project like this. What can we do here? Can we get funding for sponsorship to enable people who could not otherwise contribute, to do so?

Adequately Addressed (IMO)

But we still need to publish documentation that talks to these points.

  • Access to location data can be used to gain all sorts of important intelligence about another person.  Hence location privacy matters a lot. We are addressing this by hashing the public data.

  • Data must be redacted, since e.g. home address makes you easily identifiable. We are redacting data, and giving user final consent before publishing.

  • Bluetooth-only apps are flawed in a number of ways. We are fully aware of this, and our solution works alongside human contact tracers.

  • There is a specific issue with 3rd party location-tracking apps, which may leak data, which could then be correlated to public data published by a Health Authority, de-anonymizing the user. It may not be safe to deploy Safe Paths alongside 3rd party apps that have location data enabled. Addressed by geohashes & hashing of public data (see the sketch at the end of this list).

  • “Anonymized location data” is a myth. It is quite possible to extract individual trails from an anonymized pool of data points. Addressed as best we can by hashing of the data.

  • There are many ways a curious individual or agency could explore & pick apart the public data - especially in e.g. a small island community. Ditto any security agency. Addressed as best we can by hashing of the data.
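To illustrate the hashing approach referred to above, here is a minimal sketch of how hashed publication & on-device matching could look, assuming each recorded point is reduced to a (geohash cell, time window) pair before hashing. The 5-minute window, the use of plain SHA-256, and the function names are illustrative assumptions only - the confirmed scheme belongs in the published technical docs.

import { createHash } from "crypto";

const WINDOW_MS = 5 * 60 * 1000; // assumed 5-minute time windows

// Digest of one (geohash cell, time window) pair. The Health Authority
// publishes only these digests, never raw locations.
function hashPoint(geohash: string, timestampMs: number): string {
  const window = Math.floor(timestampMs / WINDOW_MS);
  return createHash("sha256").update(`${geohash}|${window}`).digest("hex");
}

// On-device exposure check: hash the local trail the same way and
// intersect it with the published set. Raw points never leave the device.
function hasExposure(
  localPoints: { geohash: string; timestampMs: number }[],
  published: Set<string>,
): boolean {
  return localPoints.some((p) => published.has(hashPoint(p.geohash, p.timestampMs)));
}

One detail the published threat model must cover: (geohash, time window) pairs are low-entropy, so plain unsalted hashes can be brute-forced by anyone willing to enumerate an area’s cells; the real scheme may need salting, keying, or a deliberately slow hash.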