...
Category | Description |
(1) Comprehensive: GEO_COOR (regex), HXL_TAGS (regex), PROTECTION_GROUP, RELIGIOUS_GROUP, SEXUALITY, SPOKEN_LANGUAGE | Static. No updates are needed unless errors or omissions are found. Example: SPOKEN_LANGUAGE will not need updating unless rare or dying languages turn out to be missing. |
(2) Comprehensive in context: DISABILITY_GROUP, EDUCATION_LEVEL, MARITAL_STATUS | Detection depends on key terms appearing in the correct context. Example: “single” is not exclusively a marital status, just as “primary” is not always an education level. |
(3) Not comprehensive: OCCUPATION, HH_ATTRIBUTES, HDX_HEADERS | It is difficult to capture all possibilities upfront, so these may need updates as more datasets are scanned. Example: “child_headed”, “families headed by children”, and “hohh child” all express the same household attribute; different data contributors may use their own versions. |
Over time, we will refine our use of DLP based on its performance. This process will involve adding, updating, or removing custom infoTypes across these three categories to improve the detection of different forms of sensitive data.
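To make the difference between categories (2) and (3) concrete, here is a minimal sketch in plain Python. It does not call DLP and does not reproduce the actual infoType definitions: the word lists, the context pattern, and the helper names `is_marital_status` and `normalize_hh_attribute` are all assumptions chosen for illustration.

```python
import re
from typing import Optional

# Hypothetical word list and context cues (assumptions, not the real
# MARITAL_STATUS infoType definition).
MARITAL_TERMS = {"single", "married", "divorced", "widowed"}
MARITAL_CONTEXT = re.compile(r"marital|civil[ _]status|spouse", re.IGNORECASE)

# Hypothetical variants of one household attribute, mapped to a single
# canonical label. New variants would be added as more datasets are scanned.
HH_ATTRIBUTE_VARIANTS = {
    "child_headed": "child_headed",
    "families headed by children": "child_headed",
    "hohh child": "child_headed",
}


def is_marital_status(header: str, value: str) -> bool:
    """Category (2): accept a term only when the column header gives context."""
    term_matches = value.strip().lower() in MARITAL_TERMS
    context_matches = bool(MARITAL_CONTEXT.search(header))
    return term_matches and context_matches


def normalize_hh_attribute(text: str) -> Optional[str]:
    """Category (3): map a known variant to its canonical label, else None."""
    return HH_ATTRIBUTE_VARIANTS.get(text.strip().lower())
```

With these assumed lists, “single” under a header such as `marital_status` would be flagged, while the same word under `room_type` would not. An unseen household-attribute variant returns `None`, which is the kind of gap that would prompt an update to a category (3) infoType.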
...