HDX data check (quality control, sensitivity control, and data freshness checks)
All data shared publicly or privately through HDX is reviewed by a Centre for Humanitarian Data team member to ensure it follows the Terms of Service. This initial check is always performed by a member of the Centre's Data Partnerships Team (DPT). A dataset will be deleted from the HDX platform immediately if it is found to contain personally identifiable information or other data that the DPT member considers likely to be sensitive.
...
- In any case, the Data Policy Officer (Jos) is notified. If the Data Policy Officer is not available, the notification is forwarded to the Data Partnerships Team lead (Javier) and to the Centre Lead (Sarah). If the potentially sensitive data is microdata (row-level survey data), Mike is also notified.
These team members, together with the Data Partnerships Team member, review and document the case. After internal discussion, the contributor of the dataset is notified and invited to jointly discuss the issue and determine next steps.
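The notification rule above can be written as a small lookup. This is only an illustrative sketch, not an existing HDX tool: the `SensitivityCase` type, its field names, and the `notification_recipients` function are hypothetical, and the recipient labels are copied from the text above.

```python
from dataclasses import dataclass


@dataclass
class SensitivityCase:
    """Facts about a flagged dataset that determine who is notified (hypothetical type)."""
    is_microdata: bool              # row-level survey data
    policy_officer_available: bool  # assumed to be set by the reviewing DPT member


def notification_recipients(case: SensitivityCase) -> list[str]:
    # The Data Policy Officer is notified in every case.
    recipients = ["Data Policy Officer (Jos)"]
    # If the officer is unavailable, the notification is also forwarded.
    if not case.policy_officer_available:
        recipients += ["Data Partnerships Team lead (Javier)", "Centre Lead (Sarah)"]
    # Microdata cases additionally involve Mike.
    if case.is_microdata:
        recipients.append("Mike")
    return recipients
```

For example, `notification_recipients(SensitivityCase(is_microdata=True, policy_officer_available=True))` lists the Data Policy Officer and Mike.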
In addition, a Centre for Humanitarian Data team member will make a dataset private and notify the contributor immediately if the Data Partnerships Team member finds that:
● The dataset contains gaps that render its quality insufficient for use by a humanitarian audience; or
● The dataset has no resources (files or links), or its links are broken; or
● The dataset contains data that is not relevant to a humanitarian audience; or
● The data was not shared in one of the suggested formats.
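The four criteria above could be recorded as a simple checklist so that the reason for making a dataset private is captured consistently. The sketch below is an assumption about how such a record might look. The `QualityReview` fields and the `reasons_to_make_private` function are hypothetical names. Judgements such as relevance or insufficient quality remain manual decisions by the DPT member and are only entered here as flags.

```python
from dataclasses import dataclass


@dataclass
class QualityReview:
    """Outcome of a manual DPT review (hypothetical structure)."""
    has_disqualifying_gaps: bool
    resource_count: int
    broken_link_count: int
    relevant_to_humanitarian_audience: bool
    in_suggested_format: bool


def reasons_to_make_private(review: QualityReview) -> list[str]:
    """Return every criterion that applies; an empty list means the dataset stays public."""
    reasons = []
    if review.has_disqualifying_gaps:
        reasons.append("gaps render quality insufficient")
    if review.resource_count == 0 or review.broken_link_count > 0:
        reasons.append("no resources or broken links")
    if not review.relevant_to_humanitarian_audience:
        reasons.append("not relevant to a humanitarian audience")
    if not review.in_suggested_format:
        reasons.append("not in a suggested format")
    return reasons
```

Any non-empty result corresponds to making the dataset private and notifying the contributor.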
...
- If the potentially sensitive data is microdata, the following steps will be taken:
- The DPT member on duty makes the dataset private and creates a ticket in the SDC for Microdata Log (this sheet)
- Nafi obtains a copy of the microdata
- Nafi notifies the organization using this email template, explains the SDC process, and offers the organization the opportunity to download its data before it is deleted from the platform
- Once Nafi and the organization have each downloaded a copy, or after 24 hours have passed, the data is deleted from HDX, unless it can reasonably be expected that the data does not pose any significant risk given the contents and context of the dataset
- Nafi applies the risk assessment tool, which is part of the sdcMicro package developed by the World Bank
- Nafi shares the risk assessment with DPT and Policy
- The organization focal point and Nafi reach out to the organization to inform them about the established risk level and suggested next steps
- If needed, Nafi applies the SDC tool to the data concerned, conducts another risk assessment, and estimates information loss; the results are shared with the organization focal point and the policy email address
- Next steps are decided in a case review and recommendation call with the organization, the focal point, Nafi, and ideally Data Policy
- If the data is re-published on HDX after SDC has been applied, Nafi suggests to the contributor how to flag this in the metadata
- Nafi records lessons learned in the SDC for Microdata Log (available here)
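To give a sense of what a disclosure risk assessment measures, the sketch below counts records that violate k-anonymity, one common risk measure for microdata. It is an illustration only. sdcMicro is an R package and this Python code does not reproduce its API. The text above does not say which risk measures or thresholds are used in practice. The function names, the column names, and the threshold `k` are all assumptions.

```python
from collections import Counter


def k_anonymity_violations(rows, quasi_identifiers, k=3):
    """Return rows whose combination of quasi-identifier values occurs fewer than k times."""
    keys = [tuple(row[q] for q in quasi_identifiers) for row in rows]
    counts = Counter(keys)
    return [row for row, key in zip(rows, keys) if counts[key] < k]


def share_at_risk(rows, quasi_identifiers, k=3):
    """Fraction of rows that violate k-anonymity (0.0 for an empty dataset)."""
    if not rows:
        return 0.0
    return len(k_anonymity_violations(rows, quasi_identifiers, k)) / len(rows)


# Hypothetical survey rows; the column names are illustrative only.
survey = [
    {"district": "A", "age_group": "18-29", "sex": "F"},
    {"district": "A", "age_group": "18-29", "sex": "F"},
    {"district": "A", "age_group": "18-29", "sex": "F"},
    {"district": "B", "age_group": "60+", "sex": "M"},
]
risky = k_anonymity_violations(survey, ["district", "age_group", "sex"], k=2)
```

Here the single respondent in district B is the only record whose combination of values is unique, so it is the one flagged. A real assessment would also weigh the context of the dataset, as noted in the steps above.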