User Stories: What info about diagnosed cases & their trails do users really want from an App?

 

TL;DR:

  • Terminology matters: we are filtering (not matching) the infection data set when we present it to the user.

  • User stories need to focus on what the user actually wants, not what we already specified before we really thought about the user.

  • Geography matters. GPS lets us leverage Geographical information. Let’s leverage that as much as we can.

 

Full length version….

I’ve seen two different mechanics for sharing info about diagnosed users with an undiagnosed user.

  • First is a heat map, as offered in Privacy Kit

  • Second is a set of contact points, where we effectively filter the data set, and only show those points where there was actually a risk of transmission.

  • (or try to: in fact, the 5 min granularity of the GPS pings, which may be in anti-phase, makes this a non-trivial problem to solve; see GitHub #516).
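
A minimal sketch of that second mechanic, to make the filtering concrete. It uses the "20 meters & 4 hours" rule quoted later in this post; the `Ping` type, the function names and the symmetric time window are my own assumptions for illustration, not the app's actual code, and it ignores the anti-phase problem entirely.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Ping:
    """One GPS ping (hypothetical type, for illustration only)."""
    lat: float  # degrees
    lon: float  # degrees
    t: float    # seconds since epoch


def distance_m(a: Ping, b: Ping) -> float:
    """Great-circle (haversine) distance between two pings, in meters."""
    earth_radius_m = 6_371_000
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = sin(dlat / 2) ** 2 + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_m * asin(sqrt(h))


def contact_points(user_trail, infection_points, max_m=20.0, max_s=4 * 3600):
    """Keep only the infection points that lie within max_m meters and
    max_s seconds of at least one ping in the user's own trail.
    The same thresholds apply everywhere: a geographically uniform filter."""
    return [
        p for p in infection_points
        if any(distance_m(p, u) <= max_m and abs(p.t - u.t) <= max_s
               for u in user_trail)
    ]
```

Note that the function filters the infection data using what we know about the user; it does not hand back "matches" as such.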

But there are many ways we could filter & expose this data set to match the user’s interest….

… why just these two?

… what do users really want?

(I notice that I have previously talked about matching data points between the user & the infected data set. That’s a bad way to talk about this, because it already presumes a particular approach. Better to talk about filtering the infection data, based on information that we have about the user of the app. That’s what’s really going on, and matching data points is just one of many possible ways of filtering the data.)


In the Quality Map, I wrote this:

Accuracy (False Positives)

I don’t want to be warned about encounters where there was no plausible risk of infection

Classic error: I read the spec, and assumed that it articulated the user’s actual need. It recalls this classic post: “As a user… I do not want to Register”

 

Do we have evidence that this is not what the user actually wants? Try this one and see how you feel about it…

 

If an infected person came up to your front door, would you like the app to tell you:

  1. only if you were at home at the time, or returned home within 4 hours.

  2. in all cases

Maybe I’m unusual, but if my house was visited by someone with COVID-19, I would want to be notified even if I only returned home a day later.

Why? Because it’s my home! Because I’m going to be touching the door handle etc. a lot of times: the same things my visitor may have touched. Maybe he left a note in my mailbox, which I then opened and picked up? The fact that someone visited my home while I was out, even if I returned more than 4 hours later, is for me a significant factor in terms of my COVID-19 risk.


So what do users actually want?

“As a user I want to avoid catching COVID-19”. Sure - that’s what the heatmap view sort of promises. But given the lag in the data, it’s actually misleading to suggest that apps can help you avoid catching COVID-19 - and we are not snake oil salesmen.

“As a user I want to be able to assess the risk that I have already caught COVID-19, so that I can take appropriate actions to protect my friends, family, colleagues and community.” - that’s about what the user story is…

Things are getting clearer. The user does not want to know “a list of all points where they have been within 20 meters & 4 hours of someone who had COVID-19”.

That’s an algorithm someone made up to try to meet the true user story. But it doesn’t mean it’s a great solution to the actual user story we are trying to meet.

Why not? Because geography is not uniform.

 

That geography is not uniform is the whole reason we are using GPS rather than Bluetooth as our primary technology. Bluetooth is location-agnostic, and so has to assume that geography is uniform. That is its key weakness.

The key strength of GPS is that we can match the locations where infected patients have been, to real physical locations - and by adding that information, we can dramatically improve the risk assessment we can do.

Nobody catches COVID-19 in the middle of a freeway (unless they are on a Greyhound maybe….)

So we know in our solution that geography is not uniform, yet we deploy a geographically uniform algorithm to decide what infection data to share with a user. And we only employ geographically non-uniform reasoning when it comes to the user’s reasoning about the data we choose to share.

But maybe we already filtered out some of the valuable data.

 

So what would a geographically non-uniform filtering algorithm look like?

  • It could be based on explicit user input of key locations: their home address, their work address - places they spend a lot of time. We could share all infection data in those locations, regardless of how close in time they are to our undiagnosed user.

  • The same sort of thing could be done dynamically, based on analysis of where the user spends most of their time - the more pings we get in a particular area, the greater a time window we open in terms of which points of concern we share with the user.

  • Is the converse also valid? If I register a single ping (i.e. ~10 mins max, but probably just a few seconds at a location), do I really care about infection points 2-4 hours previously? What if I add velocity data? I know we don’t want to publish velocity data publicly, but we might be able to make good use of it recorded on the local phone, to decide how far back in time to bother searching for infection points.
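
A rough sketch of how the first three ideas might combine, under stated assumptions. Points are plain `(lat, lon, t)` tuples; `time_window_s`, `should_share`, the 30 minutes of window per ping and the 14 day cap are all hypothetical names and placeholder numbers I made up for illustration, not tuned or agreed values. Velocity data is left out.

```python
from math import asin, cos, inf, radians, sin, sqrt


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in meters."""
    earth_radius_m = 6_371_000
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    h = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_m * asin(sqrt(h))


def time_window_s(point, user_trail, key_locations,
                  radius_m=20.0, per_ping_s=1800.0, cap_s=14 * 86400.0):
    """How far apart in time an infection point may be and still be shared.

    - Near a user-declared key location (home, work): no time limit.
    - Otherwise the window grows with the number of user pings nearby,
      so a single passing ping opens only a short window.
    """
    lat, lon, _ = point
    if any(distance_m(lat, lon, klat, klon) <= radius_m
           for klat, klon in key_locations):
        return inf
    nearby = sum(1 for ulat, ulon, _ in user_trail
                 if distance_m(lat, lon, ulat, ulon) <= radius_m)
    return min(cap_s, nearby * per_ping_s)


def should_share(point, user_trail, key_locations, radius_m=20.0):
    """Decide whether one infection point is shown to this user."""
    window = time_window_s(point, user_trail, key_locations, radius_m)
    if window == inf:
        return True  # key location: share regardless of timing
    lat, lon, t = point
    return any(
        distance_m(lat, lon, ulat, ulon) <= radius_m and abs(t - ut) <= window
        for ulat, ulon, ut in user_trail
    )
```

With these placeholder numbers, eight pings at one spot (roughly 40 minutes at 5 min granularity) reproduce the existing 4 hour window, a single ping narrows it to 30 minutes, and the front door case is shared however late I come home.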


What other geography is relevant to the risk of infection?

  • Highways (mentioned above). Maybe we just ignore all infection points on highways (slight care needed for public transit such as a Greyhound)

  • Stores, cafes etc. seem like they might be relatively high risk points for infection. People stop, they handle stuff. Risk of infection from fomites in a store is probably much higher than risk of infection from fomites on the street.

  • But the location of certain bits of street furniture might be significant - cross-walks? If a user crosses a street near the cross-walk, the risk associated with any nearby infection points increases.

  • .. I’m not constructing a long list, but it does seem like there is a lot we could do with detailed geographical data (both personal & public), not only in helping the user to interpret the risk associated with the infection points that we share, but also in deciding which subset of infection points to share in the first place.
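
To show the shape this could take, here is a small sketch that assumes each infection point has already been labeled with a place type by some map lookup (not shown, and not something we have today). The labels and weights in `PLACE_WEIGHT` are invented placeholders that only encode the ordering suggested above (highway ignored, store above street); they are not epidemiological values. The Greyhound caveat is ignored here.

```python
# Hypothetical place types and relative weights, for illustration only.
PLACE_WEIGHT = {
    "highway": 0.0,    # ignore: nobody catches it in the middle of a freeway
    "street": 1.0,
    "crosswalk": 2.0,  # street furniture that people stop at and touch
    "store": 3.0,      # people stop and handle stuff
}


def filter_by_place(points, min_weight=0.5, default_weight=1.0):
    """points: iterable of (point_id, place_type) pairs.

    Drops points whose place weight is below min_weight and returns the
    rest as (point_id, weight) pairs, highest weight first. Unknown place
    types fall back to default_weight rather than being dropped.
    """
    weighted = [(pid, PLACE_WEIGHT.get(place, default_weight))
                for pid, place in points]
    kept = [(pid, w) for pid, w in weighted if w >= min_weight]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

The same weights could feed both decisions: which points to share at all, and how prominently to show the ones we do share.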