Overview
...
Spatial data require additional processing compared with tabular data. This page describes which data problems to look for and how to fix them. For specific information about CODs, please see: Common Operational Dataset Processing page, COD-PS Standards and Process, COD-AB Standards and Process
...
The six themes outlined below should be considered before using and disseminating geographic data. If the data do not meet the criteria defined by these themes, and/or the data cannot be cleaned to meet these criteria, the sources for these data should be reviewed. If there is no other option to correct the problems, these issues should be documented in the metadata.
- They have a known source: data should not be used if the source is unknown, because there is no guarantee that the data have been verified or that there is appropriate permission to use them.
- They are complete in geographic scope: data need to span the entire country(s)/region(s) of interest. See the example of an incomplete dataset in Figure 3. In this case, more research may be needed to see whether data are available that span the entire country.
- They have complete and accurate attribute information: if data do not have information about each geographic feature, there is an increased risk that the data will be used incorrectly. See more specific details on how much attribute information is needed under the Data Cleaning topic.
- They have a known projection: unknown or incorrect coordinate reference systems (including datum and projection) can prevent the data from being overlaid properly with other sources of geographic information and can lead to incorrect spatial analysis. If the data’s coordinate reference system is unknown, refer to the source of the data to see if the original coordinate reference system can be determined.
- They are up-to-date and relevant to the current situation: the information associated with the data must be up-to-date or useful to the situation for analysis. See example below of administrative boundaries not reflecting the current situation. If updated data are not available, out-of-date data are better than none, but the problem should be documented in the metadata record.
- They have correct topology: the spatial properties of the data must be accurate for the data to be used correctly. See the example of topological errors in a polygon file in Figure 3. Topology is checked differently for polygon, arc and point files.
Point Files: all points are generally in the correct location. Two examples of files that do not pass the topology check are 1) a file where a typing error was made in the latitude and/or longitude field(s) and the point is therefore not in the correct location, or 2) a file where the location for a populated place is obviously incorrect (e.g. located in the ocean or in the wrong administrative unit).
Polygon and Arc Files: there are no gaps or overlaps between the lines that make up the arcs or polygons.
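The point-file checks described above can be partly automated. The following is a minimal sketch, not a definitive implementation: it assumes points are held as (name, latitude, longitude) tuples in decimal degrees and that a rough bounding box for the area of interest is known. The function name, the bounding box values and the place names are hypothetical and used only for illustration.

```python
def find_suspect_points(points, bbox):
    """Return (name, reason) pairs for points that fail basic location checks.

    points: iterable of (name, lat, lon) tuples in decimal degrees.
    bbox:   (min_lat, min_lon, max_lat, max_lon) for the area of interest.
    """
    min_lat, min_lon, max_lat, max_lon = bbox
    suspects = []
    for name, lat, lon in points:
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            # Typing errors such as a misplaced decimal often land here.
            suspects.append((name, "coordinates outside valid range"))
        elif not (min_lat <= lat <= max_lat and min_lon <= lon <= max_lon):
            # If swapping the two values puts the point inside the box,
            # the latitude and longitude columns were probably exchanged.
            if min_lat <= lon <= max_lat and min_lon <= lat <= max_lon:
                suspects.append((name, "latitude and longitude may be swapped"))
            else:
                suspects.append((name, "outside area of interest"))
    return suspects


# Hypothetical sample data (illustrative bounding box and place names).
bbox = (-5.0, 33.0, 5.0, 42.0)
points = [
    ("Town A", 1.5, 36.8),   # plausible location
    ("Town B", 36.8, 1.5),   # latitude and longitude swapped
    ("Town C", 15.0, 36.8),  # typing error in latitude
    ("Town D", 1.5, 368.0),  # misplaced decimal in longitude
]
suspects = find_suspect_points(points, bbox)
for name, reason in suspects:
    print(name, "-", reason)
```

A bounding box only catches gross errors. It will not detect a populated place that falls in the ocean or in the wrong administrative unit but still inside the box; those cases need a comparison against coastline or boundary polygons in GIS software.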
Data Cleaning
...
The following are the most common types of processing that need to be done. For more information about spatial data in the COD material, see the resource section.
...
- Select Apply and OK
- You will see the outline highlighted. Press Delete on your keyboard to delete the outline
At any administrative level, boundaries that coincide with boundaries at a higher level in the hierarchy should be removed. The outer boundary is removed so that it does not conflict with the boundary of the administrative unit at a higher level. For example, first administrative level boundaries will be encompassed by international boundaries, and international boundaries will be encompassed by coastlines.
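The rule above can be illustrated with a small sketch. It assumes, for simplicity, that boundaries are stored as straight line segments (pairs of coordinate tuples) and that coincident segments share exactly the same vertices; real datasets rarely match this exactly and usually need a snapping tolerance in GIS software. The function names and the sample coordinates are hypothetical.

```python
def normalise(segment):
    """Return a segment with its endpoints in a fixed order, so that the
    direction in which a line was digitised does not affect comparison."""
    start, end = segment
    return (start, end) if start <= end else (end, start)


def drop_coincident_segments(lower_level, higher_level):
    """Keep only the lower-level segments that do not coincide with a
    segment of the higher-level boundary."""
    higher = {normalise(seg) for seg in higher_level}
    return [seg for seg in lower_level if normalise(seg) not in higher]


# Hypothetical example: a square country split into two first-level units.
international = [
    ((0, 0), (1, 0)), ((1, 0), (2, 0)), ((2, 0), (2, 2)),
    ((2, 2), (1, 2)), ((1, 2), (0, 2)), ((0, 2), (0, 0)),
]
admin1 = [
    ((1, 0), (0, 0)),  # outer edge, digitised in the opposite direction
    ((1, 0), (2, 0)), ((2, 2), (2, 0)), ((2, 2), (1, 2)),
    ((1, 2), (0, 2)), ((0, 2), (0, 0)),
    ((1, 0), (1, 2)),  # internal divider between the two units
]
internal = drop_coincident_segments(admin1, international)
print(internal)
```

In this example only the internal divider remains in the first-level boundary file, so it no longer conflicts with the international boundary.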
Formats
...
Consider the way in which spatial data are shared. The format in which they are shared may affect who can use them (e.g. non-GIS users can work with tabular data). For details on how to change spatial data formats, see: Steps For Data Format Conversion
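As one illustration of making spatial data usable by non-GIS users, the sketch below converts point features from GeoJSON to a CSV table using only the Python standard library. It is not the procedure from the Steps For Data Format Conversion page. It assumes a FeatureCollection of Point features, and that no attribute is already named "longitude" or "latitude". The function name and the sample attributes ("name", "pcode") are hypothetical.

```python
import csv
import io
import json


def geojson_points_to_csv(geojson_text):
    """Convert a GeoJSON FeatureCollection of Point features to CSV text."""
    collection = json.loads(geojson_text)
    features = [
        f for f in collection["features"]
        if f.get("geometry") and f["geometry"]["type"] == "Point"
    ]

    # Collect attribute names in the order they are first seen.
    fieldnames = []
    for feature in features:
        for key in feature.get("properties") or {}:
            if key not in fieldnames:
                fieldnames.append(key)

    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=fieldnames + ["longitude", "latitude"],
        lineterminator="\n",
    )
    writer.writeheader()
    for feature in features:
        # GeoJSON stores coordinates as [longitude, latitude].
        lon, lat = feature["geometry"]["coordinates"][:2]
        row = dict(feature.get("properties") or {})
        row["longitude"] = lon
        row["latitude"] = lat
        writer.writerow(row)
    return out.getvalue()


# Hypothetical sample input with one populated place.
sample = json.dumps({
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {"name": "Town A", "pcode": "XX001"},
        "geometry": {"type": "Point", "coordinates": [36.8, 1.5]},
    }],
})
csv_text = geojson_points_to_csv(sample)
print(csv_text)
```

The resulting table can be opened in a spreadsheet. Line and polygon geometries do not fit into two coordinate columns and are skipped by this sketch.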
...