Educating Data Sages

This short support document explores how to weave data understanding into your students' project work. How can we ensure that our students are data savvy? Why do we even care?

Working effectively with GIS requires knowing what our data shows and what its basic characteristics are. This "data about the spatial data" is known as Metadata. While this term may have a rather ethereal-technical sound, it is really just a fancy name for the details about a spatial data set, such as name, creator, date created, location, projection, scale, accuracy, feature details, source, etc.

Having accurate Metadata can be a matter of life or death. For example, consider the 1999 NATO bombing of the Chinese embassy in Serbia; this political and human tragedy appears to be a direct result of missing or misused Metadata. The date field for the map should have alerted the targeting specialists to the fact that the data may have been out of date. They also should have known the path to (location of!) the updated data that was available. Other mistakes of this type are caused by scale problems (e.g., reading a small scale map as if it were an accurate large scale map), missing projection or datum information, or poor source documentation.

Data is often used for purposes for which it wasn't designed. Are you cutting your wedding cake with power hedge clippers? (Or is this appropriate for a garden wedding?) How do you know if you have the right digital geospatial data for the task? Look at the Metadata!
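As a classroom illustration of "look at the Metadata," here is a minimal Python sketch of a fitness-for-use check. Everything in it is an assumption made for illustration: the function name, the dictionary keys, the thresholds, and the sample values are all hypothetical and do not come from any standard or real dataset.

```python
from datetime import date

# Hypothetical field names; a real metadata record will use its own.
REQUIRED_FIELDS = ("publication_date", "scale_denominator",
                   "projection", "datum", "source")

def fitness_warnings(metadata, today, max_age_years, needed_scale_denominator):
    """Return plain-language warnings about a dataset's metadata."""
    warnings = []

    # A missing field is itself a warning sign.
    for name in REQUIRED_FIELDS:
        if metadata.get(name) is None:
            warnings.append(f"missing {name}")

    # Currency: is the data older than the project can tolerate?
    published = metadata.get("publication_date")
    if published is not None:
        age_years = (today - published).days / 365.25
        if age_years > max_age_years:
            warnings.append(f"data is about {age_years:.0f} years old")

    # Scale: a larger denominator means a smaller scale (coarser) map.
    denominator = metadata.get("scale_denominator")
    if denominator is not None and denominator > needed_scale_denominator:
        warnings.append(
            f"source scale 1:{denominator} is coarser than "
            f"the 1:{needed_scale_denominator} needed"
        )
    return warnings

# Made-up sample record for a student project.
sample_metadata = {
    "publication_date": date(1992, 1, 1),
    "scale_denominator": 250000,
    "projection": None,
    "datum": "NAD27",
    "source": "hypothetical paper map series",
}

for warning in fitness_warnings(sample_metadata, date(1999, 5, 1), 3, 24000):
    print(warning)
```

Students can change the thresholds to match their own project and discuss which warnings would actually matter for their task.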

Metadata is like the information about a new car that you find on the sticker in the window. It includes items that let you know the model type, model year, mpg, included options, extras, price, and other basics about the vehicle. It helps you understand what you get in the package. In the old days, maps often carried many metadata elements in their margins. This was easy, since the medium was paper. As we move more and more to digital data, we need new strategies for storing the metadata, since digital data is often reduced to the spatial and attribute information on a floppy or other digital medium. Often, unless you open the data and look at it, the best clue you have as to the contents of the data is the hieroglyphic file name (goodstf.shp). Again, access to accurate and comprehensive metadata can help save the day.

Federal Metadata Standard

While there may be many different opinions on what is the essential metadata needed for any particular type of dataset, there is a federal standard that has been developed by the Federal Geographic Data Committee (see http://www.fgdc.gov) to provide a framework for metadata records.

Some of the broad metadata categories in the FGDC standard include:

Identification Information includes the name of the creator, publication date, title, abstract, purpose, status, update frequency, bounding coordinates, keywords, constraints on use, person/organization to contact for the data, and period of time covered by the data.

Data Quality Information includes accuracy of attributes and geography, the heritage (lineage) of the data, currentness.

Spatial Data Organization includes data structure type (raster or vector), types of raster or vector features, and number of features or cells.

Spatial Reference Information includes projection, horizontal coordinate system (like Lat/Long or State Plane Coordinates), vertical control system, and horizontal datum (ellipsoid).

Entity and Attribute Information includes entity type and attributes for as many feature layers as you have, overview description of feature and attribute content.

Distribution Information includes items such as who distributes the data, whether the distributor assumes any liability, the digital format for data transfer, and the address to contact the distributor.

(Abstracted from the Content Standard for Digital Geospatial Metadata Workbook, FGDC, March 24, 1995.)
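To help students see how the six broad categories listed above fit together in one record, here is a minimal Python sketch. It is only an illustration: the class name, attribute names, and sample values are hypothetical and are not the official CSDGM element names or file format.

```python
from dataclasses import dataclass, field, fields

@dataclass
class MetadataRecord:
    """One record with a free-form dictionary per broad category."""
    identification: dict = field(default_factory=dict)
    data_quality: dict = field(default_factory=dict)
    spatial_data_organization: dict = field(default_factory=dict)
    spatial_reference: dict = field(default_factory=dict)
    entity_and_attribute: dict = field(default_factory=dict)
    distribution: dict = field(default_factory=dict)

    def empty_sections(self):
        """List the categories that nobody has filled in yet."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

# Made-up example: a class project that has documented only two categories.
record = MetadataRecord(
    identification={
        "title": "Campus trees",
        "creator": "Period 3 geography class",
        "publication_date": "1999-04-15",
    },
    spatial_reference={
        "projection": "UTM zone 11N",
        "horizontal_datum": "NAD83",
    },
)

print(record.empty_sections())
```

The list of empty sections works as a to-do list: students can see which parts of their documentation are still missing before they share the data.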

At the FGDC web site, you can find detailed information on the standard used to format the metadata. The Federal Geographic Data Committee calls this standard the Content Standard for Digital Geospatial Metadata (CSDGM). This is the current national standard for metadata in the US that is being promoted by the FGDC. It allows data to be cataloged in a common format, which makes searching for data and sharing data much easier.

Over 40 software tools have been developed to help data developers record their metadata. Some of these are commercial products; others are shareware or freeware. (See USGS metadata tools page or Enabling Technologies' Metadata creation and ArcView metadata viewer extension.)

Currently there are many National Spatial Data Infrastructure (NSDI) Clearinghouse nodes in place or being organized. These nodes provide access to the metadata on various collections of data. In the future, perhaps, finding data for a project (based anywhere in the world) will almost always be as easy as a simple search on the web followed by a quick download.

When you discuss the concept of geospatial data in a GIS, you might want to present it as made up of three components:

the geographic coordinate based information - spatial data,

the details about the features on the map - the attribute data, and

the context information about the dataset - the metadata.
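One way to show students these three components side by side is a small Python sketch like the one below. The layer name, keys, coordinates, and attribute values are all made up for illustration; real GIS formats store these parts in their own ways.

```python
# A hypothetical point layer with its three components kept together.
tree_layer = {
    # Spatial data: (longitude, latitude) pairs, made-up values.
    "spatial": [(-119.2290, 34.2780), (-119.2285, 34.2783)],
    # Attribute data: one row of details per feature, in the same order.
    "attributes": [
        {"species": "coast live oak", "height_m": 9.5},
        {"species": "sycamore", "height_m": 14.0},
    ],
    # Metadata: the context information about the dataset as a whole.
    "metadata": {
        "title": "Campus trees",
        "horizontal_datum": "NAD83",
        "publication_date": "1999-04-15",
    },
}

def describe(layer):
    """Summarize a layer, checking that every feature has an attribute row."""
    n_features = len(layer["spatial"])
    n_rows = len(layer["attributes"])
    if n_features != n_rows:
        raise ValueError(f"{n_features} features but {n_rows} attribute rows")
    title = layer["metadata"].get("title", "untitled")
    datum = layer["metadata"].get("horizontal_datum", "unknown")
    return f"{title}: {n_features} features, horizontal datum {datum}"

print(describe(tree_layer))
```

Deleting the "metadata" entries and running the summary again makes the point of this document concrete: the coordinates and attributes are still there, but the context needed to use them is gone.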

CSDGM - "The Standard". This is the Content Standard for Digital Geospatial Metadata.

FGDC - Federal Geographic Data Committee (Actually, this is more of a governmental agency workgroup. Some employees work on it full-time, and others are seconded from their own jobs at agencies such as USGS.)

NSDI - National Spatial Data Infrastructure

Information provided by Steve Palladino, former Co-Principal Investigator on the National Science Foundation project, GIS ACCESS Project, and Geography/GIS faculty member at Ventura College.

