This document is a collection of procedures and resources to guide the curation of Data Completeness instances (henceforth, Data Grids), which a sysadmin can activate for any location page on HDX. This document, and others linked from it, should evolve to capture best practices and any other useful information learned as the data grid curators do their work.

...

Process Overview

The basic curation process is outlined below:

...

There may be more data grids on the feature server for testing purposes, but the ones listed below should be the only active ones on the production server.

Country                          | Production Data Grid | Feature Server Data Grid | Curator(s) | Last check date
---------------------------------|----------------------|--------------------------|------------|----------------
Yemen                            | Production: yem      | Feature: yem             | Amadu      | 26 April 2019
Sudan                            | Production: sdn      | Feature: sdn             | Meti       |
Indonesia                        | Production: idn      | Feature: idn             | Faizal     | 26 April 2019
Somalia                          | Production: som      | Feature: som             | Meti       | 26 April 2019
Colombia                         | Production: col      | Feature: col             | Amadu      |
Philippines                      | Production: phl      | Feature: phl             | Amadu      | 26 April 2019
Afghanistan                      | Production: afg      | Feature: afg             | Meti       |
Bangladesh                       | Production: bgd      | Feature: bgd             | Faizal     |
Chad                             | Production: tcd      | Feature: tcd             | Nafi       |
Mozambique                       | Production: moz      | Feature: moz             | Obadah     | 26 April 2019
Venezuela                        | Production: ven      | Feature: ven             | Joseph     |
Democratic Republic of the Congo | Production: cod      | Feature: cod             | Joseph     |
Central African Republic         | Production: caf      | Feature: caf             | Nafi       |
Myanmar                          | Production: mmr      | Feature: mmr             | Obadah     |

Quality Checks Process

Each dataset that is a candidate for a data grid has to be evaluated to determine whether it fully meets the requirements for inclusion, partially meets them, or does not meet them at all.  The outcome determines what actions have to be taken in the YAML file to include or exclude the dataset, and what comments should be recorded so that users understand where the dataset falls short.  Below the process diagram, you will find more details on each quality check.

...

  • Dataset does not appear to cover all admin X units and is therefore assumed to be incomplete.
  • Dataset covers a limited area.
  • Some "no data" values occur, but the meaning of these values is not defined in the metadata.
  • It is not clear from metadata if this dataset attempts comprehensive coverage and is therefore assumed to be incomplete. 
  • Dataset is not considered complete by its contributor.
  • OpenStreetMap data relies on user contributions and may not be comprehensive for all areas.  Dataset does not always contain data about practicability of a road.
  • Dataset is limited to capitals of administrative divisions.

Are location references explicit in the resource or joinable to an available location reference that also appears in the data grid?

...

  1. A top level Data Grid element which has
    1. One or more Category Elements (like "Administration" or "Population and Socio-Economic") which have
      1. One or more Subcategory Elements (like "Administrative Divisions" or "Populated Places") which have
        1. A set of rules for including and excluding datasets based on tags, dataset names, or any other query supported by Solr.
          1. One or more include rules which specify one or more queries, the results of which will be added to the data grid.  These rules are in the format of an fq query and can therefore be tested using the HDX site.  For example, to test the include rule (tags:"populated places" AND subnational:1) for Indonesia, use https://data.humdata.org/search?fq=(tags:"populated places" AND subnational:1 AND groups:idn)
          2. Zero or more exclude rules which specify datasets that should not be allowed in the data grid
        2. A set of metadata overrides which refer to datasets already included by the rules and define how they are displayed and comments that are displayed along with them.
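
The hierarchy above, and the way an include rule can be turned into a test search on the HDX site, can be sketched in Python as follows. This is only an illustration: the class and field names (DataGrid, Category, Subcategory, include, exclude) and the helper build_test_url are hypothetical and do not reflect the actual YAML schema or any HDX code. The only details taken from this page are the search URL, the fq parameter, and the groups:<location code> filter.

```python
from dataclasses import dataclass, field
from urllib.parse import urlencode

# Search page used on this wiki page to test fq queries.
HDX_SEARCH_URL = "https://data.humdata.org/search"


@dataclass
class Subcategory:
    """Hypothetical model of a subcategory, e.g. "Populated Places"."""
    name: str
    include: list[str]                                # one or more fq include rules
    exclude: list[str] = field(default_factory=list)  # zero or more exclude rules


@dataclass
class Category:
    """Hypothetical model of a category, e.g. "Administration"."""
    name: str
    subcategories: list[Subcategory]


@dataclass
class DataGrid:
    """Hypothetical model of the top-level data grid element for one location."""
    location: str  # HDX location code, e.g. "idn"
    categories: list[Category]


def build_test_url(rule: str, location: str) -> str:
    """Return an HDX search URL that runs `rule` restricted to one location."""
    # Wrap the rule and add the location filter, as in the manual example.
    fq = f"({rule} AND groups:{location})"
    # urlencode percent-encodes quotes, colons and parentheses; spaces become "+".
    return f"{HDX_SEARCH_URL}?{urlencode({'fq': fq})}"


def include_rule_urls(grid: DataGrid) -> list[tuple[str, str, str]]:
    """List (category, subcategory, test URL) for every include rule in the grid."""
    return [
        (cat.name, sub.name, build_test_url(rule, grid.location))
        for cat in grid.categories
        for sub in cat.subcategories
        for rule in sub.include
    ]


grid = DataGrid(
    location="idn",
    categories=[
        Category(
            name="Administration",
            subcategories=[
                Subcategory(
                    name="Populated Places",
                    include=['(tags:"populated places" AND subnational:1)'],
                )
            ],
        )
    ],
)

for category, subcategory, url in include_rule_urls(grid):
    print(category, "/", subcategory, "->", url)
```

Opening a printed URL in a browser should show the datasets the rule would pull into that subcategory, which is a quick way to check a rule before editing the YAML file.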

...