...

To support the data collection procedure with software tools, the following architecture is used. It follows the general client-server approach of the OGC, in which catalog web services link to distributed web services that provide the data: WCS and WFS for grid and vector data respectively, and WFS/WMS for grid/vector graphics. For graphics we also encourage KML (the format behind Google Earth feeds), which can handle both grid and vector data. In addition to these high-level tailoring services, we propose to offer web services for plain standard data (an RDBMS with a geo-extension, or netCDF-CF served over OPeNDAP). Finally, we propose to use web services for raw data and tools as well. For these we propose a transactional protocol (meaning that users can also add data) that naturally includes version control. Allowing non-standardized data to be shared is an important first step in the 5-star model of Tim Berners-Lee. It is often an essential step, since some data require significant investment to standardize (2nd star) to open standards (3rd star).

[Figure: architecture diagram]
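
As an illustration of how a client talks to one of the distributed services in this architecture, the sketch below builds a WFS 2.0.0 GetFeature request using key-value-pair encoding. The base URL `https://example.org/wfs` and the feature type `demo:stations` are hypothetical placeholders, not services described on this page, and the helper name `build_wfs_getfeature_url` is introduced here only for the example.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_wfs_getfeature_url(base_url, type_name, count=None):
    """Build a WFS 2.0.0 GetFeature URL (key-value-pair encoding)."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,  # WFS 2.0.0 name; WFS 1.x uses typeName
    }
    if count is not None:
        # WFS 2.0.0 limits the number of returned features with "count"
        params["count"] = str(count)
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and feature type, for illustration only.
url = build_wfs_getfeature_url("https://example.org/wfs", "demo:stations", count=10)
query = parse_qs(urlsplit(url).query)
print(url)
```

A catalog service would typically hand the client the base URL and the feature type name, so that only these two values differ between data providers.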

Extraction

Gathering data

...

General information on a data set is entered in a metadata form (see below). The resulting metadata file contains the following information:

  • title
  • description
  • contact information
  • resource identifier (unique location where data is stored)
  • other aspects such as date, lineage, language, authorization, copyright, etc.

This general information will be collected following the INSPIRE directive.
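
A minimal sketch of a completeness check on such a metadata record is given below. The field names (`title`, `description`, `contact`, `resource_identifier`) are hypothetical keys chosen to mirror the list above; they are not the official INSPIRE element names, and the record contents are placeholders.

```python
# Hypothetical field names mirroring the list above (not INSPIRE element names).
REQUIRED_FIELDS = ("title", "description", "contact", "resource_identifier")


def missing_metadata_fields(record):
    """Return the required fields that are absent or empty in a metadata record."""
    return [
        name
        for name in REQUIRED_FIELDS
        if not str(record.get(name, "")).strip()
    ]


# Placeholder record for illustration; the identifier is not a real location.
record = {
    "title": "Example data set",
    "description": "Placeholder description",
    "contact": "",
    "resource_identifier": "https://example.org/data/example",
}

print(missing_metadata_fields(record))
```

Running such a check before a data set is published makes it visible early when, as in this placeholder record, the contact information has not been filled in.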

...

Referencing information on the earth requires the definition of a location. A location is always expressed in a coordinate system. Several types of coordinate systems exist. The most relevant are:

  • Geographical coordinate system. This defines the size and shape of the earth (for example World Geodetic System 1984), the origin (usually the equator and Greenwich) and the units (degrees).
  • Projected coordinate system. This defines a transformation from the original geographic coordinate system into another (usually x,y Cartesian) coordinate space. For example, WGS 84/UTM zone 31N is defined as the Transverse Mercator projection of the WGS 84 spheroid, with coordinates in meters.
  • Engineering coordinate system. This is a local system, often related to a local object, for example in a physical experiment or on a boat.
  • Vertical coordinate system. Several reference levels can be used for vertical coordinate systems. Reference levels can be the spheroid, the geoid, the mean sea level or, for example, the mean low water level.
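
To make the link between a geographic and a projected coordinate system concrete, the sketch below derives the standard UTM zone for a longitude and the matching EPSG code for the WGS 84 / UTM family (326xx for the northern hemisphere, 327xx for the southern). It deliberately ignores the special zone boundaries around Norway and Svalbard, and the function names are introduced here for illustration only.

```python
import math


def utm_zone(longitude_deg):
    """Return the standard UTM zone number (1-60) for a longitude in degrees east."""
    if not -180.0 <= longitude_deg <= 180.0:
        raise ValueError("longitude must be within [-180, 180] degrees")
    # Zones are 6 degrees wide, starting at 180 degrees west.
    # Longitude +180 is folded into zone 60 instead of a non-existent zone 61.
    return min(int(math.floor((longitude_deg + 180.0) / 6.0)) + 1, 60)


def wgs84_utm_epsg(longitude_deg, latitude_deg):
    """Return the EPSG code of the WGS 84 / UTM zone covering a position."""
    base = 32600 if latitude_deg >= 0 else 32700
    return base + utm_zone(longitude_deg)


# A position at 4.9 degrees east, 52.4 degrees north falls in zone 31N.
print(utm_zone(4.9), wgs84_utm_epsg(4.9, 52.4))
```

The actual reprojection of coordinates is better left to an established projection library; this sketch only shows how the zone from the example above (31N) follows from the longitude.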

Example of vertical coordinate system

...

If we refer to a certain date or time there can be confusion about many aspects, possibly resulting in misinterpretation of data:

  • Calendar: Gregorian or other
  • Leap years: every 4th year is a leap year, except century years, which are leap years only when divisible by 400.
  • Time zones
  • Daylight saving time

For time information the Climate and Forecast (CF) conventions will be used.
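
The sketch below illustrates two of the points above: the Gregorian leap year rule, and encoding a moment in time as a number relative to a reference date, which is the style of time coordinate the CF conventions use (a units string of the form "seconds since 1970-01-01 00:00:00"). The choice of reference date and the function names are assumptions made for this example. Python's `datetime` uses the proleptic Gregorian calendar, which is assumed to be adequate for modern dates.

```python
from datetime import datetime, timedelta, timezone


def is_leap_year(year):
    """Gregorian rule: divisible by 4, except century years not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# Reference date chosen for this example; other reference dates are equally valid.
CF_TIME_UNITS = "seconds since 1970-01-01 00:00:00"
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_cf_seconds(moment):
    """Encode a time-zone-aware datetime as seconds since the reference date."""
    if moment.tzinfo is None:
        # Refuse naive datetimes: their time zone is ambiguous.
        raise ValueError("datetime must carry time zone information")
    return (moment - EPOCH).total_seconds()


utc_moment = datetime(2000, 1, 1, tzinfo=timezone.utc)
# The same instant expressed in a zone one hour ahead of UTC.
local_moment = datetime(2000, 1, 1, 1, 0, tzinfo=timezone(timedelta(hours=1)))
print(to_cf_seconds(utc_moment), to_cf_seconds(local_moment))
```

Because both datetimes carry their offset, they encode to the same number, which removes the time zone and daylight saving ambiguity from the stored data.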

...

The following information needs to be stored with the measurements:

  • Units of measurement
  • Measurement method
  • Physical phenomena

For this information the Climate and Forecast (CF) conventions will be used as well.
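
As a sketch of how these three items can travel with a measured variable, the example below groups them in a small record and rejects incomplete descriptions. `standard_name` and `units` are attribute names taken from the CF conventions; `measurement_method` is a hypothetical attribute name used only in this example, and the values are placeholders.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class VariableDescription:
    standard_name: str       # physical phenomenon (CF attribute name)
    units: str               # units of measurement (CF attribute name)
    measurement_method: str  # hypothetical attribute name, not defined by CF

    def __post_init__(self):
        # Reject descriptions in which any of the three items is left empty.
        for name, value in asdict(self).items():
            if not value.strip():
                raise ValueError(f"{name} must not be empty")


# Placeholder values for illustration.
temperature = VariableDescription(
    standard_name="sea_water_temperature",
    units="degree_Celsius",
    measurement_method="CTD probe",
)
print(asdict(temperature))
```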

...

This requires adding extra information to the collected data and the use of existing standards and conventions. Most of these tasks can be accomplished using scripts and command-line utilities.
Examples of transformations are:

  • File format conversions
  • Reprojection
  • Statistical aggregation
  • Standardization
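
As an example of the kind of script meant here, the sketch below performs a simple statistical aggregation: it reduces timestamped samples to daily means. The input layout (pairs of ISO 8601 timestamp and value) and the function name `daily_means` are assumptions made for this example, and the sample values are made up.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_means(samples):
    """Aggregate (ISO 8601 timestamp, value) pairs into a mean value per day."""
    buckets = defaultdict(list)
    for timestamp, value in samples:
        # Group on the calendar date part of the timestamp.
        day = datetime.fromisoformat(timestamp).date().isoformat()
        buckets[day].append(value)
    return {day: mean(values) for day, values in sorted(buckets.items())}


# Made-up sample values for illustration.
samples = [
    ("2020-01-01T00:00:00", 1.0),
    ("2020-01-01T12:00:00", 3.0),
    ("2020-01-02T00:00:00", 5.0),
]
print(daily_means(samples))
```

The same pattern (read, group, reduce, write) applies to other aggregation periods by changing the grouping key.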

Files

The raw data can consist of many different file formats. Often the raw data is only understandable to the person who gathered, collected or created the information. To make the data permanently understandable, extra information is needed. Part of this information is provided in the metadata (title, author, copyright). The exact meaning of the data becomes clear after the automated conversion of the files into a standardized format.

...

Datasets are subject to copyright. This means that to be able to use a dataset you have to obtain a license. Important aspects to check in the license are restrictions on:

  • usage
  • redistribution
  • derived products.
    If you make a derived product (curry a dataset), it is not always clear who owns the copyright and can therefore provide licenses for usage. Please consult your legal department for details.

The private environment

To work with non-public data, groups or organizations can set up a duplicate environment for private data. Instead of being publicly available, these services are then offered through an intranet or other private network.

...