Number of Files Locally and Externally Hosted
Number of Resources Percentage of Resources | |||||
Type | 09/2016 | 01/2017 | 04/2017 | 07/2017 | 10/2017 |
---|---|---|---|---|---|
File Store | 2102 22% | 2472 24% | 2771 26% | 3036 17% | 3651 9% |
CPS | 2459 26% | 2449 24% | 2449 23% | 2449 14% | 2449 6% |
HXL Proxy | 2584 27% | 2584 25% | 2584 25% | 2584 15% | 2584 6% |
ScraperWiki | 162 2% | 158 2% | 160 2% | 160 1% | 160 0% |
Others | 2261 24% | 2544 25% | 2537 24% | 9203 53% | 32578 79% |
Total | 9568 100% | 10207 100% | 10501 100% | 17432 100% | 41422 100% |
The SQL queries are:
select count(*) from dbresources where run_number = xxx;
select count(*) from dbresources where run_number = xxx and url like '%data.humdata.org%';
select count(*) from dbresources where run_number = xxx and url like '%manage.hdx.rwlabs.org%';
select count(*) from dbresources where run_number = xxx and url like '%proxy.hxlstandard.org%';
select count(*) from dbresources where run_number = xxx and url like '%scraperwiki%';
Number of Dataset Updates before and after introduction of Overdue email
There is a 46% increase in updates happening after the overdue email was introduced, with many related to problems with automated systems eg. in HOTOSM.
The SQL queries are:
60 days of runs before overdue emails sent:
SELECT DISTINCT a.id, c.title FROM dbdatasets a, dbdatasets b, dbinfodatasets c WHERE a.id = b.id AND a.run_number > b.run_number AND b.run_number > 89 AND b.run_number <= 149 AND a.fresh = 0 AND b.fresh = 2 AND a.id = c.id;
60 days of runs after overdue emails sent:
SELECT DISTINCT a.id, c.title FROM dbdatasets a, dbdatasets b, dbinfodatasets c WHERE a.id = b.id AND a.run_number > b.run_number AND b.run_number > 149 AND b.run_number <= 209 AND a.fresh = 0 AND b.fresh = 2 AND a.id = c.id;
60 days of runs after overdue emails sent
Before | After | Reason | |
---|---|---|---|
HOTOSM | 12 | 36 | Likely failed export |
FTS | 51 | Likely scraper failures during initial creation | |
WFP | 10 | Wrong update frequency | |
IDMC | 47 | Wrong update frequency | |
InterAction | 36 | Wrong update frequency | |
Other | 21 | 18 | |
Total | 94 | 137 |
Baseline Crisis Data and new/updated Data on Crisis Onset
Crisis | Day0 | Country | ISO3 | Baseline | Updates | Creates |
---|---|---|---|---|---|---|
Rohingya | Myanmar | MMR | 54 | 7 | 11 | |
Rohingya | Bangladesh | BGD | 76 | 11 | 18 | |
Irma/Maria | Anguilla | AIA | 8 | 1 | 11 | |
Irma/Maria | Antigua and Barbuda | ATG | 26 | 7 | 14 | |
Irma/Maria | Bahamas | BHS | 43 | 7 | 7 | |
Irma/Maria | Barbados | BRB | 35 | 5 | 6 | |
Irma/Maria | Bonaire, Saint Eustatius and Saba (The Netherlands) | BES | 0 | 0 | 0 | |
Irma/Maria | British Virgin Islands | VGB | 11 | 5 | 9 | |
Irma/Maria | Cuba | CUB | 55 | 7 | 7 | |
Irma/Maria | Dominica | DMA | 17 | 3 | 21 | |
Irma/Maria | Dominican Republic | DOM | 47 | 8 | 14 | |
Irma/Maria | Guadeloupe (France) | GLP | ||||
Irma/Maria | Haiti | HTI | ||||
Irma/Maria | Martinique (France) | MTQ | ||||
Irma/Maria | Montserrat | MSR | ||||
Irma/Maria | Puerto Rico (The United States of America) | PRI | ||||
Irma/Maria | Saint Barthélemy (France) | BLM | ||||
Irma/Maria | Saint Kitts and Nevis | KNA | ||||
Irma/Maria | Saint Lucia | LCA | ||||
Irma/Maria | Saint Martin (France) | MAF | ||||
Irma/Maria | Saint Vincent and the Grenadines | VCT | ||||
Irma/Maria | Sint Maarten (The Netherlands) | SXM | ||||
Irma/Maria | Turks and Caicos Islands | TCA | ||||
Irma/Maria | United States Virgin Islands | VIR |
The SQL queries for the Rohingya Crisis MMR and BGD (in the order Baseline, Updates, Creates) are:
select b.name, a.last_modified, b.location from dbdatasets a, dbinfodatasets b, dbruns c where a.id = b.id and a.run_number = c.run_number and b.location like '%mmr%' and a.update_frequency != -1 and date(c.run_date) = '2017-08-24' and a.last_modified > '2016-02-24' order by a.last_modified;
with baseline as
(select a.id, b.name from dbdatasets a, dbinfodatasets b, dbruns c where a.id = b.id and a.run_number = c.run_number and b.location like '%mmr%' and a.update_frequency != -1 and date(c.run_date) = '2017-08-24' and a.last_modified > '2016-02-24')
select e.name, max(d.last_modified) as last_modified from dbdatasets d, baseline e where d.id=e.id and d.last_modified > '2017-08-24' and d.last_modified < '2017-10-25' group by e.name;
select e.name, max(d.last_modified) as last_modified from dbdatasets d, dbinfodatasets e where e.location like '%mmr%' and d.update_frequency != -1 and d.id not in (select a.id from dbdatasets a, dbinfodatasets b, dbruns c where a.id = b.id and
a.run_number = c.run_number and b.location like '%mmr%' and c.run_date < '2017-08-25') and d.id=e.id and d.last_modified > '2017-08-24' and d.last_modified < '2017-10-25' group by e.name;
The SQL queries for the the Hurricane Irma and Maria Crises combined with affected countries taken from https://reliefweb.int/disaster/tc-2017-000125-dom and https://reliefweb.int/disaster/tc-2017-000136-atg (in the order Baseline, Updates, Creates):
select b.name, a.last_modified, b.location from dbdatasets a, dbinfodatasets b, dbruns c where a.id = b.id and a.run_number = c.run_number and b.location like '%aia%' and a.update_frequency != -1 and date(c.run_date) = '2017-09-02' and a.last_modified > '2016-03-02' order by a.last_modified;
with baseline as
(select a.id, b.name from dbdatasets a, dbinfodatasets b, dbruns c where a.id = b.id and a.run_number = c.run_number and b.location like '%aia%' and a.update_frequency != -1 and date(c.run_date) = '2017-09-02' and a.last_modified > '2016-03-02')
select e.name, max(d.last_modified) as last_modified from dbdatasets d, baseline e where d.id=e.id and d.last_modified > '2017-09-02' and d.last_modified < '2017-11-03' group by e.name;
select e.name, max(d.last_modified) as last_modified from dbdatasets d, dbinfodatasets e where e.location like '%aia%' and d.update_frequency != -1 and d.id not in (select a.id from dbdatasets a, dbinfodatasets b, dbruns c where a.id = b.id and
a.run_number = c.run_number and b.location like '%aia%' and c.run_date < '2017-09-03') and d.id=e.id and d.last_modified > '2017-09-02' and d.last_modified < '2017-11-03' group by e.name;