This document outlines the roadmap for the Freshness project.
Q4 2018
1. Dev Team: CKAN API change to allow rapid touching of datasets
Since it was proposed that freshness directly alter the HDX CKAN instance data, it needs a way to touch resources. Currently, touching resources using resource_patch is very slow, so either a new method or a fix to resource_patch is needed. Jira: HDX-5579
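For reference, a single touch through the standard CKAN action API might look like the sketch below. It only builds the request. The base URL, the API key, and the choice of last_modified as the field to update are assumptions for illustration, not the project's confirmed approach.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_touch_request(base_url, api_key, resource_id, now=None):
    """Build (but do not send) a resource_patch request that updates one field."""
    now = now or datetime.now(timezone.utc)
    payload = {
        "id": resource_id,
        # Assumption: updating last_modified is what "touching" means here.
        "last_modified": now.strftime("%Y-%m-%dT%H:%M:%S"),
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/3/action/resource_patch",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Sending the request with urllib.request.urlopen costs one HTTP round trip per resource, which is part of why a faster path is being asked for.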
2. CJ, Dev Team: Establish Mixpanel measures for freshness
Create events and a funnel in Mixpanel for tracking use of the freshness workflow.
...
3. Design, Dev Team: Expose per-dataset freshness info and tools to users via interface
The design and implementation of an indication of freshness in the HDX interface (and API). Key decisions:
- Terminology. We have to be careful what we mean by fresh or up to date. Do we refer simply to the data being the latest available, or does it matter if the data is the latest available but from a long time ago? Currently we say that something that never updates is always fresh, but would fresh also mean recent to users?
...
has been done but further improvements could be made.
- Freshness in the API
- Sorting and/or filtering by freshness?
- Add a "Make Fresh" button for data contributors? If we do this, we may want to implement it as an API call so that contributors could click a link in the overdue email to have the same effect.
- Related to this is an old Jira issue about making revision_last_updated visible in the UI: HDX-4894 (humanitarian.atlassian.net).
5. Data Partnerships Team: Define freshness workflow for handling delinquent datasets and ones with broken URLs
Design workflows for dealing with delinquent datasets and datasets with broken URLs, looking at our tools such as Zoho and JIRA and at any freshness backend tools that would simplify or automate those workflows.
For example:
- a freshness dashboard listing delinquent/broken datasets and tracking contacts
- one or more overall freshness metrics so we can monitor freshness as an OKR (% fresh, % overdue, % delinquent)
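As a rough illustration of the metric idea, the percentages could be computed from a list of per-dataset statuses as sketched below. The status labels come from the list above; the function name and the input format are hypothetical.

```python
from collections import Counter

# Status labels taken from the OKR example; any other status still counts
# towards the total but is not reported separately.
STATUSES = ("fresh", "overdue", "delinquent")


def freshness_percentages(statuses):
    """Return the percentage of datasets in each reported status."""
    total = len(statuses)
    counts = Counter(statuses)
    if total == 0:
        return {status: 0.0 for status in STATUSES}
    return {status: round(100.0 * counts[status] / total, 1) for status in STATUSES}
```

Running this daily and storing the result would give a time series to track against the OKR.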
DP have a draft of their process here: https://docs.google.com/document/d/1b77sksL5UiDF1jrMAU9bxN8BTVrM5XXgj1gOS6ZgN-E/edit. From their feedback, the first task is to create Tasks (issues) in Zoho. The assignee is obtained by reading the DP weekly duty roster (https://docs.google.com/document/d/1uFe6uWS9iMSHsiFyvlYYmP1b_C2xc8F0wTowm2LuajE/edit) to see who is on duty.
Plan:
- Examine the Zoho API and try to create a Task programmatically
- Read the assignee from the DP duty roster
- Create a freshness database query and generate a Task from its results
- Create Docker image and deploy
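The middle steps of the plan can be sketched as below, under several assumptions. The roster is taken to be already parsed into (start, end, person) rows. The datasets are taken to be dictionaries with a name and a status. The task fields are placeholders, not the real Zoho Task schema. The actual Zoho API call is deliberately left out.

```python
from datetime import date


def assignee_on_duty(roster, today: date):
    """Return whoever is on duty on `today`, or None if the roster has a gap."""
    for start, end, person in roster:
        if start <= today <= end:
            return person
    return None


def build_tasks(datasets, roster, today: date):
    """Turn delinquent or broken datasets into task dictionaries.

    The keys used here (subject, assignee, due_date) are hypothetical and
    would need mapping onto whatever the Zoho API expects.
    """
    person = assignee_on_duty(roster, today)
    return [
        {
            "subject": f"Freshness: {ds['name']} is {ds['status']}",
            "assignee": person,
            "due_date": today.isoformat(),
        }
        for ds in datasets
        if ds["status"] in ("delinquent", "broken")
    ]
```

Keeping task construction separate from the API call lets the query and roster logic be tested without Zoho credentials.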
6. Mike: Implement emailer for broken URLs
Construct SQL queries and write Python code for an emailer that reports broken URLs.
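A minimal sketch of what such an emailer might look like follows. The table and column names (resources, maintainer_email, dataset_name, url, broken) are invented for illustration and do not reflect the real freshness database schema. The sketch builds the messages but does not send them.

```python
import sqlite3
from collections import defaultdict
from email.message import EmailMessage

# Hypothetical schema: one row per resource, with a flag for broken URLs.
BROKEN_URLS_SQL = """
SELECT maintainer_email, dataset_name, url
FROM resources
WHERE broken = 1
ORDER BY maintainer_email, dataset_name, url
"""


def build_broken_url_emails(conn, sender):
    """Build one email per maintainer listing that maintainer's broken URLs."""
    by_recipient = defaultdict(list)
    for recipient, dataset, url in conn.execute(BROKEN_URLS_SQL):
        by_recipient[recipient].append(f"{dataset}: {url}")

    messages = []
    for recipient, lines in by_recipient.items():
        msg = EmailMessage()
        msg["From"] = sender
        msg["To"] = recipient
        msg["Subject"] = "Broken URLs in your HDX datasets"
        msg.set_content(
            "The following resources have broken URLs:\n\n" + "\n".join(lines)
        )
        messages.append(msg)
    return messages
```

The resulting messages could then be handed to smtplib.SMTP.send_message. Grouping by maintainer means each contact receives a single email rather than one per resource.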
...
4. Serban/Mike: Implement backup of freshness database
The freshness database needs to be backed up to another server in case of failure.
T1 2019
5. Test and adjust freshness process
See Test and Adjust Freshness Process
6. Expose high level freshness metrics to HDX users
Consider exposing:
- Overall HDX Freshness. "HDX is 73% fresh today"
- Per-org Freshness. We could show an overall metric for an org to all users or just to org members.
...