This page is intended as a place to document ideas that should be considered for inclusion in our Data Systems road map or OKRs. For ideas that need a broader HDX-wide discussion, please use the brainstorming Trello. When adding an idea here, please indicate who put it there in case of questions.
Plan for Improving Freshness in HDX (Mike/CJ)
Test and Adjust Freshness Process
Archiving and Superseding/Versioning/Fixed Data URLs etc. (Mike/CJ)
There are a lot of ideas bundled into this - see Fixed Data URLs Idea
Archiving/Versioning best practices (Mike/CJ)
Increasing our automation of visualizations and other tools is limited by the unpredictable ways in which many datasets are updated (new datasets, new resources, replacing resources). We should both document the ideal best practice and promote that to our users - this requires solving the versioning problem in the first place.
Not sure if the ones below are still relevant:
...
Better handling of resource formats and preview button (CJ)
The resource format determines if the "preview" button is available. For a zipped CSV, this means that to avoid having the preview button show (which won't work) the format has to be set to zip, which is not ideal for search purposes. We should make the "preview" button functionality work based on the actual file, not the file format set by the user, and we should encourage users to set the file format to something other than zip.
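A minimal sketch of the idea above, assuming the file content is available as bytes. The function name `preview_available` and the rule that only plain CSV is previewable are hypothetical, chosen for illustration; they are not the current HDX behaviour.

```python
import io
import zipfile


def preview_available(declared_format: str, content: bytes) -> bool:
    """Decide whether to show the preview button (hypothetical rule).

    The declared format is only a first filter; the actual bytes decide
    whether the file is really a plain CSV or a zip archive.
    """
    # Assumption for this sketch: only resources declared as CSV are previewable.
    if declared_format.strip().lower() != "csv":
        return False
    # A zipped CSV declared as "csv" is detected from its content.
    return not zipfile.is_zipfile(io.BytesIO(content))


def make_zipped_csv(csv_text: str) -> bytes:
    """Build an in-memory zip holding one CSV file (helper for the example)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("data.csv", csv_text)
    return buffer.getvalue()
```

With a check like this, a contributor could leave the format of a zipped CSV set to CSV for search purposes without a broken preview button appearing.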
Create a comms channel for updating users of the API when changes happen (CJ)
We broke the UNOSAT updates for a couple of months when we made a change to the CKAN API. We could also use this as a channel for promoting new capabilities or other automation-related tools/resources that we deploy.
End-to-end video for KoBo to Quick Charts (David M)
(mentioned on update call with UNHCR 13-mar-2018)
Show a complete data flow: form design (with HXL tags) → KoBo deployment → offline data collection → KoBo Collect → download as Excel & offline cleaning → upload to HDX → Quick Charts (we could also show an alternate flow for private data and HDX Connect)
Delete dead indicator datasets on HDX
Consider deleting all datasets in this list that have a quality checked flag and an updated date of 2015 or earlier. https://data.humdata.org/group/rou?groups=bol&q=&ext_page_size=100
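The selection rule above could be sketched as follows. The dictionary keys `quality_checked` and `last_updated` are hypothetical stand-ins for whatever fields actually hold the flag and the updated date; this only lists candidates and deletes nothing.

```python
from datetime import date


def is_deletion_candidate(dataset: dict, cutoff_year: int = 2015) -> bool:
    """True if the dataset has the quality checked flag and was last
    updated in cutoff_year or earlier (field names are assumptions)."""
    return bool(dataset.get("quality_checked")) and (
        dataset["last_updated"].year <= cutoff_year
    )


def deletion_candidates(datasets: list) -> list:
    """Return the names of datasets that match the rule, for manual review."""
    return [d["name"] for d in datasets if is_deletion_candidate(d)]
```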
Add date filtering and download capability to the organization stats page (and maybe the resource downloads page)
JIPS has asked for this a couple of times to support their reporting (see email from Corina Demottz on 19-feb-2018)
Develop a channel for comms around HXL Proxy (CJ)
There seems to be a steady stream of improvements. Should we be benefitting from those via comms like a Twitter channel just for HXL stuff?
Metric for monitoring occurrence and handling of delinquent datasets (CJ)
Metric for closing dataset messages sent by users to contributors (CJ)
We get a small but steady stream of these messages. I wonder if we still follow up to make sure users get a reply. Should we have a metric to monitor this?
Improve embedding of tableau visualizations (CJ/Dan)
Our current approach doesn't give us much control over how the visualizations appear. However, using the full Tableau embed code comes with JavaScript risks. Let's consider how our relationship (and our users' relationship) with Tableau is likely to develop and decide if/how to invest in improving the embeds on HDX.
Integrate Freshness into HDX (Mike/CJ)
There are several tasks to be done to finalize freshness and make the information there available to users. This will also be essential for v1 of the data grid. Here is the Data Freshness Roadmap.
Archiving/Versioning best practices (Mike/CJ)
Increasing our automation of visualizations and other tools is limited by the unpredictable ways in which many datasets are updated (new datasets, new resources, replacing resources). We should both document the ideal best practice and promote that to our users.
Archiving/Versioning HDX tooling (Mike/CJ)
We should consider adding some kind of archiving function that would make it easier for users to keep a history of previous versions of their resources, removing the need to create new resources for updates.
Resource View functionality in HDX Python API (Mike)
In order to set up QuickCharts for multiple similar datasets, HDX Python needs functionality to handle a new type - "resource view". A "resource" can have one or more "resource views", of which HXL Preview aka QuickCharts is one. By writing a resource view, it is possible to set up a default view of QuickCharts (Dan's suggestion, now proven in prototype by Mike).
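To make "writing a resource view" concrete, here is a rough sketch that only builds the payload for one view. The keys follow the parameters of CKAN's `resource_view_create` action as we understand them (`resource_id`, `title`, `view_type`, `config`); the default view type `hdx_hxl_preview`, the helper name and the shape of `config` are assumptions, and this is not the HDX Python API.

```python
import json


def build_resource_view(resource_id, title, view_type="hdx_hxl_preview", config=None):
    """Build a payload describing one resource view (hypothetical helper).

    The view_type default is an assumed identifier for the QuickCharts
    (HXL Preview) view; check the real value before using it.
    """
    payload = {
        "resource_id": resource_id,
        "title": title,
        "view_type": view_type,
    }
    if config is not None:
        # The view configuration is passed as a JSON string.
        payload["config"] = json.dumps(config)
    return payload


def build_views_for_resources(resource_ids, title, config):
    """Apply the same default view to several similar resources."""
    return [build_resource_view(rid, title, config=config) for rid in resource_ids]
```

Building the payloads separately from sending them would make it easy to review the default view for a batch of similar datasets before anything is written to HDX.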
HDX Python Country - extend to use new feed (Mike)
https://en.wikipedia.org/wiki/List_of_sovereign_states_and_dependent_territories_by_continent_(data_file) - HDX Python Country could be extended to read from this Wikipedia page which does not have the political limitations of M49 (but uses M49 as a starting point).
Is it time to resurrect the idea of a curated data service? (Mike/CJ)
With a general increasing trend in the number of APIs and a general systematizing going on in the humanitarian community, could HDX serve as an aggregator/standardizer/hxlator for other humanitarian data sources? Will HXL need to become a bit more restrictive in order to take advantage of the systematizing that is happening?
IATI collaboration (David)
Work on the official collaboration among the Centre, Development Initiatives, and FTS, including helping to hire and direct a new team member. Also work with the HXL TAG and DFID on ideas about surfacing general IATI data on HDX, and converting it to a HXL-tagged spreadsheet format. May even include adding native IATI support to the Proxy, so that Quick Charts (etc) can just use it.
HXL standard (David, supported by CJ, Simon, and Mike)
Get the announcements for HXL 1.1 beta out by the end of Jan, and manage the process to finalising 1.1. Start work on the next HXL standard release, including discussions about focus and possible governance changes.
HXL partner outreach (David, supported by Data Partnerships and Data Systems teams)
Reach out directly to potential HXL standards partners. Try to have more orgs committing to HXL by end of quarter.
HXL infrastructure (David, supported by Simon and Mike)
Continue to harden and improve the HXL software infrastructure underlying front-end implementations like Quick Charts. Also evaluate and possibly integrate alternative tools like Frictionless Data.
HXL communications (David, supported by Becky and Elizabeth)
HXL blog posts, tutorials, and other communications materials. Also support anyone presenting HXL at an event or teaching it in a workshop.
...