Geography 26 Project Papers: Spring 1999
Metadata: issues in format and retrieval for CDFA Noxious Weed Mapbooks
Rosie Yacoub

Introduction
The CDFA has created or helped to create several data tools to help in the control of noxious weeds.  They helped to create the CalWeed Database:
http://endeavor.des.ucdavis.edu/weeds/
This database allows different agencies to post their weed control efforts around the state on a single internet site for networking purposes.  They also have the Noxious Weeds Database, an internal database created by the Integrated Pest Control (IPC) Branch that stores data for weed projects.  IPC has also created Weed Mapbooks showing the locations of weeds in each county, which will allow the state biologists to find and track weed occurrences more easily.  Until now, IPC has not created metadata for either the Noxious Weeds Database or the Weed Mapbooks.  The focus of this study was to identify a data product that is ready for documentation, determine a format and location for metadata appropriate to that product, and collect the metadata.

Background
 "The spread of non-indigenous plants threatens biological diversity and the functioning of natural ecosystems..."(Jacono and Boydstun, 1998).  For this reason, and also to protect agricultural interests,  weeds are the focus of control efforts all over the country.  Weeds change their distribution over time, have the ability to jump  jurisdictional boundaries, and have similarities in their response to control methods (both within and between some weed species).  Because of these qualities,  databases, especially spatial databases, are excellent tools for tracking weed distribution and coordinating weed control efforts both within and between organizations.  These weed databases are being created by many agencies throughout the country (Jacono and Bodstun, 1998).
 The California Department of Food and Agriculture (CDFA) monitors and makes efforts to control many weed species. Certain noxious weeds that are known to be of economic importance are under the jurisdiction of the Integrated Pest Control Branch, whose charge is to detect and eradicate or control these species (Akers, 1998).  Although the County Agricultural Commissioners' Offices take the lead role on most of these weed projects, a group of CDFA biologists works closely with these offices, providing technical support and sometimes taking a more direct role in weed control efforts.  The state is divided into districts (each containing several counties), and a biologist coordinates efforts for each district.
 Each county and biologist has a copy of the Noxious Weeds Database (an Access database) created by IPC.  The database allows them to enter data about species, landowners, acreage, herbicides and/or equipment used, time spent, and location, using both township/range fields and GPS data.  Each field copy shares its data with a central database via FTP or internet exchanges (Jacono and Boydstun, 1998).  The locational data are also used to make the Noxious Weed Mapbooks used by both the state biologists and the County Ag. Commissioners’ offices.  Each mapbook pertains to a specific county, with each page containing a USGS quad (from Teale Data Center) overlaid with weed locations taken by a Trimble GPS unit.
Metadata for these projects should answer the following questions:  What does the dataset describe?  Who produced the dataset?  Why was the dataset created? How was the dataset created?  How reliable are the data; what problems remain in the dataset? How can someone get a copy of the dataset?
Schweitzer (1998) cites the ability to share data through entities like the National Geospatial Data Clearinghouse as the major reason for, and benefit of, metadata.  But he also points out factors for organizations to consider when weighing the value of metadata: (1) the value of the data; (2) the value to an organization of sharing its data; (3) the transience of the workforce; and (4) the quality of the metadata.
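
The six questions listed above can double as a completeness checklist for a metadata record. The following is only a minimal sketch of that idea in Python: the function name `unanswered` and the dictionary layout are illustrative assumptions, not part of any CDFA or USGS tool, and the two sample answers are paraphrased from this paper.

```python
# The six plain-language questions a metadata record should answer.
QUESTIONS = [
    "What does the dataset describe?",
    "Who produced the dataset?",
    "Why was the dataset created?",
    "How was the dataset created?",
    "How reliable are the data; what problems remain in the dataset?",
    "How can someone get a copy of the dataset?",
]


def unanswered(record):
    """Return the questions that have no (or only a blank) answer in record.

    `record` is a hypothetical dict mapping question text to an answer string.
    """
    return [q for q in QUESTIONS if not record.get(q, "").strip()]


# Illustrative, partially filled record for the Weed Mapbooks.
mapbook_record = {
    "What does the dataset describe?":
        "Locations of noxious weeds, mapped by county on 7.5 minute quads",
    "Who produced the dataset?":
        "CDFA, Integrated Pest Control Branch",
}

for question in unanswered(mapbook_record):
    print("Still missing:", question)
```

A check like this says nothing about the quality of the answers; it only flags questions that nobody has addressed yet.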

Methods
Internet Search:  The internet was surveyed for formats and locations for storing metadata.  Criteria for formats included: readability, control of fields, data user (and potential data user) access.
Interviews with Staff: Staff at the Integrated Pest Control Branch of the CDFA were interviewed informally about metadata needs and the location of metadata for current projects.
Source Documentation:  Using information about the data collection, internet metadata, and information about Trimble GPS --- metadata for the Weed Mapbooks were collected.

Results
format 1: Excel spreadsheet (available through CDFA, Integrated Pest Control Branch or American River College, Ethan Way Center H: Geog 26/Yacoub//weedmeta)
 
 
Layer Name: Hi-Res Counties | Page Grids | 7.5 Minute Quads | SacEast Weeds | All Weeds
File/Location: GISLab333\333E\Layers\Counties_HiRes\Countyname | GISLab333\333E\Maps\Mapbooks\PageGrids\Countyname | GISLab333\333E\Layers\Quads\tQuads | GISLab333\333E\Layers\WdDistricts\SacE\SacEWdsGPS | GISLab333\333E\Layers\WdDistricts\AllWdDistsHand
Source: ftp://lorax.biogeog.ucsb.edu/pub/data/gap_analysis/calif | http://www.gislab.teale.ca.gov | http://www.gislab.teale.ca.gov | CDFA | CDFA
Date Obtained/Created: thru 5/15/99 | 10/98-5/99 | 10/98-2/99
Data Currency: 051999 | 051999
N Bound. (DD): 42 | 42.125
S Bound. (DD): 32.53 | 32.500
W Bound. (DD): -124.40 | -124.375
E Bound. (DD): -114.13 | -113.875
Source Projection: Albers | Teale Albers | Teale Albers
Preparation: converted to lat/lon (NAD 27) | converted to lat/lon (NAD 27), added to grid to obtain buffer needed in OR, NV, AZ, MX, and ocean | converted to lat/lon (NAD 27); used World Reg tool in MapInfo to register raster data | from Trimble GPS with Post Differential Correction | some data from Trimble GPS, some hand drawn, mouse-digitized polygons
Scale Denominator: 24,000 | 24,000
Resolution: 400 dpi
Accuracy: 1.5 m | 1 meter | 1 meter/unknown
Source Central Meridian: -120 | -120 | -120
Source 1st Stand. Parallel: 34 | 34 | 34
Source 2nd Stand. Parallel: 40.5 | 40.5 | 40.5
Datum/Spheroid: NAD27/Clarke1866 | NAD27/Clarke1866 | NAD27/Clarke1866 | NAD27/Clarke1866 | NAD27/Clarke1866
Comment: used for title page; will be used in Weed Query | serves as page clipping layer | serves as background layer for weed data | Weed data | Weed data
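
The page-grid bounds recorded above (42.125 to 32.500 north-south, -124.375 to -113.875 west-east) are whole multiples of 0.125 degrees, which is the size of a 7.5 minute quad. The Python sketch below works through that arithmetic. It assumes, without confirmation from the Mapbook program itself, that the page grid is a regular lattice of 7.5 minute cells; the function names and the sample point are hypothetical.

```python
QUAD_DEG = 7.5 / 60.0  # one 7.5 minute quad is 0.125 decimal degrees


def quad_grid_shape(west, east, south, north, cell=QUAD_DEG):
    """Return (rows, cols) of a regular grid of cells covering the bounds."""
    cols = (east - west) / cell
    rows = (north - south) / cell
    if not (cols.is_integer() and rows.is_integer()):
        raise ValueError("bounds are not a whole number of cells")
    return int(rows), int(cols)


def quad_cell(lon, lat, west, north, cell=QUAD_DEG):
    """Return the 0-based (row, col) of the cell holding a point.

    Rows are counted south from the north edge, columns east from the
    west edge.
    """
    col = int((lon - west) // cell)
    row = int((north - lat) // cell)
    return row, col


# Page-grid bounds from the spreadsheet.
print(quad_grid_shape(-124.375, -113.875, 32.500, 42.125))  # (77, 84)

# Hypothetical GPS point, used only to illustrate the lookup.
print(quad_cell(-121.5, 38.58, west=-124.375, north=42.125))  # (28, 23)
```

Under that assumption the grid is 77 rows by 84 columns, although only the cells that contain weed occurrences become Mapbook pages.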

format 2: using tkme metadata editor from USGS
Identification_Information:
  Citation:
    Citation_Information:
      Originator: California Department of Food and Agriculture, Integrated Pest Control Branch
      Publication_Date: 199906
      Title: Noxious Weeds Mapbooks
  Description:
    Abstract:
      This is a collection of maps showing the locations of noxious weeds
      targeted for eradication by the CDFA.  The maps are organized by county,
      and each page represents a 7.5 minute quad. The locations of these weed
      occurrences were documented by CDFA biologists using Trimble GPS data
      and mouse-digitized areas. These locations have been overlaid on
      USGS quads provided by TEALE, and delimited by county using GDT
      data.
    Purpose:
      To display the current occurrence of noxious weeds (as determined
      by the CDFA) in California, and to document progress made in eradication,
      expansion, or discovery of these weeds.
    Supplemental_Information:
      The mapbooks are assembled in MapInfo by a program written in MapBasic.
      The program opens a workspace that contains the county, weed, USGS topo,
      and page grid layers; and then generates a page for each infested quad
      area.
  Time_Period_of_Content:
    Time_Period_Information:
      Range_of_Dates/Times:
        Beginning_Time: 1998
        Ending_Time: May 1999
    Currentness_Reference: Data current as above date.  Updates pending
  Status:
    Progress: Data is complete for department information.
    Maintenance_and_Update_Frequency: Will be updated continuously
  Spatial_Domain:
    Bounding_Coordinates:
      West_Bounding_Coordinate: -124.375
      East_Bounding_Coordinate: -113.875
      North_Bounding_Coordinate: 42.125
      South_Bounding_Coordinate: 32.500
  Keywords:
    Theme:
      Theme_Keyword_Thesaurus:
      Theme_Keyword: weeds
      Theme_Keyword: invasive weeds
      Theme_Keyword: weed control
    Place:
      Place_Keyword: California
  Access_Constraints: CDFA and County Ag Commissioners
Metadata_Reference_Information:
  Metadata_Date: 19990509
  Metadata_Contact:
    Contact_Information:
      Contact_Person_Primary:
        Contact_Person: John Gendron
      Contact_Organization_Primary:
        Contact_Organization:
          California Department of Food And Agriculture,
          Integrated Pest Control Branch
      Contact_Address:
        Address_Type: mailing address
        Address: 1220 N St. Rm. A-357
        City: Sacramento
        State_or_Province: CA
        Postal_Code: 95814
      Contact_Voice_Telephone: (916) 654-0768
      Contact_Electronic_Mail_Address: jgendron@cdfa.ca.gov
Data_Quality_Information:
  Lineage:
    Source_Information:
      Source_Citation:
        Citation_Information:
          Originator: Teale Data Center
          Title: 7.5 x 7.5 minute USGS quads
          Online_Linkage: www.gislab.teale.ca.gov
          Other_Citation_Details:
            Data was converted to MapInfo raster format using the WorldReg
            tool.  The projection was changed from Teale Albers to long/lat
            (NAD27) using the Save Copy As menu item and then selecting the
            desired projection.
      Source_Scale_Denominator: 24,000
      Type_of_Source_Media: Digitized USGS 7.5 minute quads
      Source_Contribution: Locational base map for weed data
    Source_Information:
      Source_Citation:
        Citation_Information:
          Originator:
          Publication_Date: 1998
          Title:
          Other_Citation_Details:
          Online_Linkage: ftp://lorax.biogeog.ucsb.edu/pub/data/gap_analysis/ca
          Larger_Work_Citation:
      Source_Scale_Denominator:
      Type_of_Source_Media:
      Source_Time_Period_of_Content:
      Source_Citation_Abbreviation: Hi-Res Counties
      Source_Contribution:
        County lines for title pages. Will become the clipping
        layer for county weed data in future editions
    Source_Information:
      Source_Citation:
        Citation_Information:
          Originator: California Department of Food and Agriculture, Integrated Pest Control Branch
          Title: A-Rated Noxious Weeds
          Publication_Date: 199906
          Other_Citation_Details:
            Some data is hand-drawn from older maps.  This data will be
            updated using GPS equipment.
      Type_of_Source_Media:
        Trimble GPS Data/ mouse-digitized polygons from
        hand-drawn maps.
  Attribute_Accuracy:
    Attribute_Accuracy_Report:
  Positional_Accuracy:
    Horizontal_Positional_Accuracy:
      Quantitative_Horizontal_Positional_Accuracy_Assessment:
        Horizontal_Positional_Accuracy_Value:
    Vertical_Positional_Accuracy:
      Quantitative_Vertical_Positional_Accuracy_Assessment:
        Vertical_Positional_Accuracy_Value:
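
Because the record above is plain indented text, the Mapbook builders could read it with a few lines of code, for example to find elements that are still blank. The following Python sketch flattens such text into (path, value) pairs. It is an illustration written for this paper's layout only: it is not the USGS parser, it assumes element names contain no spaces and end with a colon, and it treats any other line as a continuation of the previous value.

```python
import re

# An element line looks like "Element_Name: optional value".
ELEMENT = re.compile(r"^([A-Za-z_][\w/]*):\s*(.*)$")


def parse_indented(text):
    """Flatten indented 'Element: value' text into (path, value) pairs.

    Each path is a tuple of element names from the outermost element
    inward. Repeated elements (e.g. several Theme_Keyword lines) are
    kept as separate pairs. Elements with no value are omitted.
    """
    stack = []  # (indent, name) of the elements currently open
    pairs = []  # [path, value] entries in document order
    for raw in text.splitlines():
        if not raw.strip():
            continue
        indent = len(raw) - len(raw.lstrip(" "))
        line = raw.strip()
        match = ELEMENT.match(line)
        if match is None:
            # Wrapped text: append it to the most recent element's value.
            if pairs:
                pairs[-1][1] = (pairs[-1][1] + " " + line).strip()
            continue
        name, value = match.groups()
        # Close every element that is not a parent of this line.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        stack.append((indent, name))
        pairs.append([tuple(n for _, n in stack), value])
    return [(path, value) for path, value in pairs if value]


sample = """\
Identification_Information:
  Citation:
    Citation_Information:
      Title: Noxious Weeds Mapbooks
  Description:
    Purpose:
      To display the current occurrence
      of noxious weeds.
  Keywords:
    Theme:
      Theme_Keyword: weeds
      Theme_Keyword: weed control
"""

for path, value in parse_indented(sample):
    print("/".join(path), "=", value)
```

Keeping the result as a flat list rather than a nested dictionary sidesteps the question of how to store repeated elements such as Theme_Keyword or Source_Information.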

Analysis
 The data collection, storage, and output of the Noxious Weeds Database will become integral to the weed control program.  The locations acquired with Trimble GPS units are supplemented with other data, recorded using a GPS data dictionary and a written report, that are then entered into the Access database.  The database allows the State Biologists and County Ag. Commissioners to track progress in controlling weeds.  It can also generate reports giving district-wide, weed-specific, state-wide, and other broad views of the noxious weed control program, including totals of associated costs.  This allows program managers to check the efficacy of types of control and to budget better, and it makes cost-sharing partnerships easier to track.  The data are therefore of considerable value to the program.
As for sharing the data with groups or agencies outside the control program, there are two considerations: (1) privacy issues for landowners make releasing specific location information questionable, and (2) CalWeed, the internet database maintained at UC Davis, already documents and presents CDFA’s project work on noxious weeds in a format useful for networking purposes.  Because the database was clearly designed as an in-house tool for the IPC Branch and the County Agricultural Commissioners' Offices, a public format like CERES is not necessary.  And because the data are essentially not spatial, a spatially focused format is probably not ideal.
Because it was clearer how the metadata for the Weed Mapbooks might be used, it was easier to determine what they should include; and for this reason, metadata for this project were collected.  As the weed layers in the Weed Mapbooks also relate to the Noxious Weeds Database, the metadata applies to both projects.
At the time of writing, the layers used as backgrounds were in flux.  The county areas from GDT were not lining up with the weed layer and the USGS topos, so other polygon data (called the Hi-Res County layer in the lab) were substituted as the county layer.  Also, the query used to determine which weed occurrences fell in which county was found to be faulty because it relied on the County field from the GPS data.  In some cases the biologists were working close to a county border and crossed it while still recording the area they were surveying as being in the county they started in.  It seemed clear that for this printing of the Mapbooks, the primary concern was to create a communication device: something that presented the locations of the weed occurrences in a way that facilitated finding them in the field.  Clean-up of the data necessary for spatial analysis would be done later.  Metadata for this project should prove quite useful for doing this clean-up, for avoiding or explaining issues of data misalignment, and for maintaining consistency generally.  It would have been useful to have the metadata on hand while producing these Mapbooks.
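
One way to catch the county mismatches described above is to test each GPS point against the county polygons instead of trusting the recorded County field. The Python sketch below uses the standard ray-casting point-in-polygon test. It is not the query used for the Mapbooks (those are built in MapInfo), and the county outlines, record layout, and function names are hypothetical, chosen only to show the check.

```python
def point_in_polygon(lon, lat, ring):
    """Ray-casting test: is (lon, lat) inside the polygon given by ring?

    `ring` is a list of (lon, lat) vertices; the last vertex is joined
    back to the first. Points exactly on an edge may fall either way.
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def locate_county(lon, lat, counties):
    """Return the name of the first county polygon containing the point."""
    for name, ring in counties.items():
        if point_in_polygon(lon, lat, ring):
            return name
    return None


def county_mismatches(records, counties):
    """List (id, recorded, located) where the County field disagrees."""
    mismatches = []
    for rec in records:
        located = locate_county(rec["lon"], rec["lat"], counties)
        if located != rec["county"]:
            mismatches.append((rec["id"], rec["county"], located))
    return mismatches


# Hypothetical square "counties" sharing a border at lon = 1.
counties = {
    "County A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "County B": [(1, 0), (2, 0), (2, 1), (1, 1)],
}
records = [
    {"id": 1, "lon": 0.4, "lat": 0.5, "county": "County A"},
    {"id": 2, "lon": 1.2, "lat": 0.5, "county": "County A"},  # crossed border
]
print(county_mismatches(records, counties))  # [(2, 'County A', 'County B')]
```

A check like this is only as good as the county layer behind it, which is one more reason to record the source and accuracy of that layer in the metadata.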
Of the two formats used to describe the metadata for the Mapbooks, the spreadsheet format is the more useful to the creators and builders of these books, and it is the current and future builders of the Mapbooks who are the audience for the metadata.

Conclusions
In addition to the dataset, one must also pay attention to the user(s) of the data when creating metadata.  Without understanding their desires for data, and their potential use for metadata, a lot of time and energy can be wasted in making a product that isn’t used.
 Metadata should be maintained as part of the process of data acquisition, so that they can be used to answer questions of data incompatibility as they come up while building a project; and so they can inform decisions about what data is desirable for a particular project.  This is not necessarily an easy task, because often metadata are more difficult to find than data.

References
Akers, Pat, 1998. Weed Prevention and Control in the California Department of Food and Agriculture. Noxious Times, Fall issue.

Jacono, C.C., and C.P. Boydstun, 1998. Proceedings of the Workshop on Databases for Nonindigenous Plants, Gainesville, FL, September 24-25, 1997.  U.S. Geological Survey, Biological Resources Division, Gainesville, FL.

Schweitzer, Peter N., 1998. Putting Metadata in Plain Language. GIS World, September.