SafePaths <-> SafePlaces Data Flow Overview

Document Overview

The goal of this document is to lay out the flow of data from the SafePaths mobile client to the SafePlaces web application for redaction and aggregation, and back to the SafePaths mobile application for intersection analysis.

Flow Overview

Contact Tracer (CT) begins CT interview over phone
CT prompts interviewee to upload data from SafePaths app to SafePlaces
CT works with interviewee to redact, update, and add points of concern
CT concludes interview and transitions case data from “In Progress” state to “Staged for Publishing”
Admin user of SafePlaces/HA (Health Authority) selects cases that are in the “Staged for Publishing” state
Admin user performs further redaction on aggregated case data
After further redaction, the admin user transitions the cases from the “Staged for Publishing” state to the “Published” state.
Transitioning a set of cases to the “Published” state results in the creation of a publishing record and of a JSON file containing anonymized, aggregated case data (see The File that is Downloaded).

After being generated, the file is programmatically transferred to the location from which the HA will host it (specified in the organization’s settings). The HA is responsible for implementing this functionality within their own API or by extending the existing example application. How an HA serves the file could differ from one HA to the next (AWS S3, Google Cloud, a standard file server, etc.).
The SafePaths mobile application requests the published data from the endpoint the HA has configured and performs the intersection logic
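
The case states named in the steps above can be sketched as a small state machine. This is only an illustration: the names CaseState, ALLOWED, and transition are hypothetical, the final state is called “Published” as in the step that creates the JSON file, and the rule that cases only move forward is an assumption, since this document does not say whether a case can be moved back.

```python
from enum import Enum


class CaseState(Enum):
    IN_PROGRESS = "In Progress"
    STAGED = "Staged for Publishing"
    PUBLISHED = "Published"


# Allowed moves, read off the flow steps. Assumption: no backward moves.
ALLOWED = {
    CaseState.IN_PROGRESS: {CaseState.STAGED},   # CT concludes the interview
    CaseState.STAGED: {CaseState.PUBLISHED},     # admin publishes after redaction
    CaseState.PUBLISHED: set(),
}


def transition(current: CaseState, target: CaseState) -> CaseState:
    """Return the new state, or raise if the flow does not allow the move."""
    if target not in ALLOWED[current]:
        raise ValueError(
            f"cannot move a case from {current.value!r} to {target.value!r}"
        )
    return target
```

With this sketch, a case cannot jump from “In Progress” straight to “Published” without first being staged.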

The File that is Downloaded

With the newly proposed chunking of the data for scalability, the file to be downloaded is a zip file. Inside the zip file are a cursor.json file and the separately chunked data files. All of these files should be uploaded to the HA’s api_endpoint, which is defined in their settings.
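
As a rough sketch of how such a zip file might be assembled, the function below writes one JSON file per chunk plus a cursor.json that lists them, following the cursor layout shown in the example in this section. The function name build_publish_zip, its parameters, and the way chunk payloads are passed in are assumptions for illustration, not the actual SafePlaces implementation.

```python
import io
import json
import zipfile
from typing import Dict, Tuple


def build_publish_zip(chunks: Dict[Tuple[int, int], dict], base_url: str) -> bytes:
    """Build the zip in memory.

    chunks maps (startTimestamp, endTimestamp) to the chunk's JSON payload.
    base_url is where the HA will host the files (hypothetical parameter).
    """
    files = []
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # Sort by time window so the cursor lists chunks oldest first.
        for (start, end), payload in sorted(chunks.items()):
            chunk_id = f"{start}_{end}"
            name = f"{chunk_id}.json"
            archive.writestr(name, json.dumps(payload))
            files.append({
                "id": chunk_id,
                "startTimestamp": start,
                "endTimestamp": end,
                "filename": f"{base_url.rstrip('/')}/{name}",
            })
        cursor = {"version": 1.0, "files": files}
        archive.writestr("cursor.json", json.dumps(cursor, indent=2))
    return buffer.getvalue()
```

The returned bytes can be saved as the download; unzipping it yields cursor.json next to the chunk files, ready to be uploaded together.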

Devices should pull down the cursor.json file first, as it gives the device the list of files it needs to download.

Example of the Cursor File:

{
  "version": 1.0,
  "files": [
    {
      "id": "1590395806_1590399405",
      "startTimestamp": 1590395806,
      "endTimestamp": 1590399405,
      "filename": "https://api.wowza.com/safe_paths/1590395806_1590399405.json"
    },
    {
      "id": "1590399406_1590403005",
      "startTimestamp": 1590399406,
      "endTimestamp": 1590403005,
      "filename": "https://api.wowza.com/safe_paths/1590399406_1590403005.json"
    },
    {
      "id": "1590403006_1590406605",
      "startTimestamp": 1590403006,
      "endTimestamp": 1590406605,
      "filename": "https://api.wowza.com/safe_paths/1590403006_1590406605.json"
    },
    {
      "id": "1590406606_1590410205",
      "startTimestamp": 1590406606,
      "endTimestamp": 1590410205,
      "filename": "https://api.wowza.com/safe_paths/1590406606_1590410205.json"
    },
    {
      "id": "1590410206_1590413805",
      "startTimestamp": 1590410206,
      "endTimestamp": 1590413805,
      "filename": "https://api.wowza.com/safe_paths/1590410206_1590413805.json"
    }
  ]
}

As the device iterates through this list, it downloads the file at each filename URL, which contains the points-of-concern data. Each of those files contains the following:

{
  "version": 2.0,
  "authority_name": "My Example Organization",
  "publish_date_utc": 1590268155,
  "info_website": "http://sample.com",
  "notification_threshold_percent": 66,
  "notification_threshold_count": 6,
  "concern_point_hashes": ["416aa7c7caef6032", "201c55540de7155c"],
  "pages_name": "https://api.wowza.com/safe_paths/1590395806_1590399405.json"
}

Because each entry in the cursor file has an id, the device can keep track of the id it is currently on, and if it loses its connection, it knows where it left off.
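
A minimal sketch of that resume behaviour, combined with a simple hash comparison, might look like the following. Several things here are assumptions: the function name sync_chunks, the injected fetch_json callable (standing in for whatever HTTP client the device uses), and the idea that the device already holds its own location points as hashes in the same format as concern_point_hashes. The notification thresholds are not applied, because this document does not define how they are used.

```python
from typing import Callable, Optional, Set, Tuple


def sync_chunks(
    cursor: dict,
    fetch_json: Callable[[str], dict],
    local_hashes: Set[str],
    last_done_id: Optional[str] = None,
) -> Tuple[Set[str], Optional[str]]:
    """Download chunks listed after last_done_id and intersect their hashes.

    Returns the matching hashes found in this run and the id of the last
    chunk that was fully processed, so a later call can resume from there.
    """
    files = sorted(cursor["files"], key=lambda f: f["startTimestamp"])
    ids = [f["id"] for f in files]
    # Skip everything up to and including the chunk that finished earlier.
    start = ids.index(last_done_id) + 1 if last_done_id in ids else 0
    matches: Set[str] = set()
    for entry in files[start:]:
        try:
            chunk = fetch_json(entry["filename"])
        except OSError:
            break  # connection lost: stop here and resume on the next call
        matches |= local_hashes & set(chunk["concern_point_hashes"])
        last_done_id = entry["id"]
    return matches, last_done_id
```

The device would persist the returned id and pass it back in on the next attempt, so chunks that were already processed are not downloaded again.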

I suggest adding a button to the frontend web app that an HA can use to verify that they uploaded the files correctly. It would reach out to the API endpoint they have put into the settings and check for the cursor file. It could even go through the cursor and check the other files.
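
The check behind such a button could be sketched roughly as follows. The name check_upload and the injected fetch_json callable are hypothetical, and the sketch assumes the cursor file is served at the api_endpoint with the path cursor.json appended, which this document does not actually specify.

```python
from typing import Callable, List


def check_upload(api_endpoint: str, fetch_json: Callable[[str], dict]) -> List[str]:
    """Return a list of problems; an empty list means the upload looks complete."""
    # Assumption: the cursor lives directly under the configured endpoint.
    cursor_url = f"{api_endpoint.rstrip('/')}/cursor.json"
    try:
        cursor = fetch_json(cursor_url)
    except (OSError, ValueError) as err:
        return [f"cursor file not readable at {cursor_url}: {err}"]

    problems: List[str] = []
    for entry in cursor.get("files", []):
        url = entry["filename"]
        try:
            chunk = fetch_json(url)
        except (OSError, ValueError) as err:
            problems.append(f"chunk {entry['id']} not readable at {url}: {err}")
            continue
        if "concern_point_hashes" not in chunk:
            problems.append(f"chunk {entry['id']} has no concern_point_hashes")
    return problems
```

In practice fetch_json could wrap urllib.request.urlopen and json.load; network failures raise OSError subclasses and malformed JSON raises a ValueError subclass, so both end up reported as problems rather than crashing the check.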

Open Questions

How often will the admin of the HA release data for publishing? This seems like more of a question about the HA’s workflow, but it affects how often the SafePaths mobile app pulls data to perform intersection analysis.
What does the data in the file the admin downloads look like? There seems to be some confusion about what data the JSON file that the admin user downloads actually contains. We know that the file should contain aggregated, anonymized points of concern. What is the (date) range of data reflected in the file? There has been talk that the file should contain one of the following, but it is unclear where we have landed:
1. The file that the admin user downloads contains all points of concern that have been published since they started using the SafePlaces workflow tool (a “master” file). After the admin selects the cases to publish, they are simply appended to this file.
2. The file that the admin user downloads only contains points data about the cases they are currently publishing. The file would be representative of the points of concern added since the last time they published a file.