2009-10-19: Focus on WCS
Title: Register and use any WCS by implementing proper pre/postprocessing for each request
Date: 2009/10/19
For User: Hoijarvi, Rhusar
Status: InProgress
About Project or Tool: Catalog, Data Access
Objective: Make WCS a central part of data access related issues.
Make WCS a more central part of data access related issues.
WCS Issues
- Operate in a truly distributed manner, using WCS services as they are, without forcing any relation to our datasets.
- Catalog WCS services.
- Allow registering preprocessing and postprocessing adapters to services, separating the data access from service adapter issues.
- Integrate with new faceted catalog view.
- Monitor the health of the services with regular samples.
- Enable access to the monitoring data.
WCS Registration and implementation
- Each WCS and Coverage will be in our catalog.
- Each WCS request (GetCapabilities, DescribeCoverage, GetCoverage) will have optional preprocessing/postprocessing instructions.
- Capabilities documents can be fixed, and Keywords for our facets can be added.
- CoverageDescription can be fixed and enhanced to improve dimension reporting.
- GetCoverage preprocessing and postprocessing will at least homogenize the -180..180 and 0..360 coordinate system mismatch and produce proper CF-1.0 NetCDF cubes.
- Our internal system will only work with kosher WCS, no exceptions.
- People who want to fix their WCS servers can use our postprocessed results as an example of what to change.
Same goes with WMS, except that some work has already been done at AQ_uFIND to make any wild WMS browseable with our browser, even with varying time dimensions.
Related to separating adapters from data access: for example, TOMS_AI has three different locations, one for the lifetime of each satellite. This can be packaged as a more general data access class.
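The -180..180 versus 0..360 longitude mismatch mentioned in the list above can be illustrated with a small sketch. The function names are hypothetical and are not part of any existing Datafed module; a real GetCoverage postprocessor would apply the same arithmetic to the longitude axis of a NetCDF cube rather than to plain lists.

```python
def to_minus180_180(lon):
    """Map a longitude in degrees to the half-open range [-180, 180)."""
    return (lon + 180.0) % 360.0 - 180.0


def to_0_360(lon):
    """Map a longitude in degrees to the half-open range [0, 360)."""
    return lon % 360.0


def reorder_row(lons_0_360, values):
    """Re-sort one grid row from 0..360 ordering to -180..180 ordering.

    Returns the converted longitudes and the values in matching order.
    """
    pairs = sorted(zip((to_minus180_180(x) for x in lons_0_360), values))
    return [p[0] for p in pairs], [p[1] for p in pairs]


if __name__ == "__main__":
    print(reorder_row([0, 90, 180, 270], ["a", "b", "c", "d"]))
```

Converting only the axis values is not enough: the data columns have to be reordered together with the longitudes, which is why the sketch sorts longitude/value pairs.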
Metadata Harvesting from WCS Service
This section lists the metadata that is read using the GetCapabilities and DescribeCoverage requests. The metadata is then written into a local database for fast access.
The partial version of the cache DB. Facets are omitted.
Metadata read from Capabilities Document
View samples: Capabilities 1.0.0, Capabilities 1.1.0. These are almost identical.
- Service Name:
- A name the service provider assigns to the server.
- source path: //WCS_Capabilities/Service/name
- DB field: WCS_services.service_name
- Service Label:
- A human-readable label for this server, for use in menus and other displays.
- source path: //WCS_Capabilities/Service/label
- DB field: WCS_services.service_label
- Service Description:
- A description of the service.
- source path: //WCS_Capabilities/Service/description
- DB field: WCS_services.service_description
- Service URI:
- The URI of the service for HTTP GET queries.
- source path: //WCS_Capabilities/Capability/Request/GetCapabilities/DCPType/HTTP/Get/OnlineResource
- DB field: WCS_services.service_uri
- Keywords:
- The keywords are read either as a plain tag, like "NASA", or as a key-value pair, like "Originator:NASA". Key-value keywords can then be used in the faceted search. Currently, plain keywords are not stored.
- source path: //WCS_Capabilities/Service/keywords/keyword
- DB field: Written to the facet tables.
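The Capabilities harvesting described above can be sketched with the standard library XML parser. This is a minimal illustration, not the actual harvester: the sample document, the `harvest_capabilities` and `split_keywords` names, and the returned dictionary layout are assumptions. Only the source paths and DB field names come from the list above.

```python
import xml.etree.ElementTree as ET

WCS_NS = {"wcs": "http://www.opengis.net/wcs"}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hypothetical WCS 1.0.0 Capabilities document, reduced to the harvested nodes.
SAMPLE = """<WCS_Capabilities xmlns="http://www.opengis.net/wcs"
    xmlns:xlink="http://www.w3.org/1999/xlink" version="1.0.0">
  <Service>
    <description>Hypothetical sample service</description>
    <name>sample_wcs</name>
    <label>Sample WCS</label>
    <keywords>
      <keyword>NASA</keyword>
      <keyword>Originator:NASA</keyword>
    </keywords>
  </Service>
  <Capability><Request><GetCapabilities><DCPType><HTTP><Get>
    <OnlineResource xlink:href="http://example.org/wcs?"/>
  </Get></HTTP></DCPType></GetCapabilities></Request></Capability>
</WCS_Capabilities>"""


def split_keywords(keywords):
    """Separate 'Key:Value' facet keywords from plain tags."""
    facets, plain = [], []
    for kw in keywords:
        key, sep, value = kw.partition(":")
        if sep and key.strip() and value.strip():
            facets.append((key.strip(), value.strip()))
        else:
            plain.append(kw.strip())
    return facets, plain


def harvest_capabilities(xml_text):
    """Return a dict keyed by the WCS_services column names, plus facets."""
    root = ET.fromstring(xml_text)

    def text(path):
        node = root.find(path, WCS_NS)
        return node.text.strip() if node is not None and node.text else None

    resource = root.find(
        "wcs:Capability/wcs:Request/wcs:GetCapabilities/"
        "wcs:DCPType/wcs:HTTP/wcs:Get/wcs:OnlineResource", WCS_NS)
    keywords = [k.text for k in
                root.findall("wcs:Service/wcs:keywords/wcs:keyword", WCS_NS)
                if k.text]
    facets, _plain = split_keywords(keywords)  # plain tags are not stored
    return {
        "service_name": text("wcs:Service/wcs:name"),
        "service_label": text("wcs:Service/wcs:label"),
        "service_description": text("wcs:Service/wcs:description"),
        "service_uri": resource.get(XLINK_HREF) if resource is not None else None,
        "facets": facets,
    }


if __name__ == "__main__":
    print(harvest_capabilities(SAMPLE))
```

The sketch splits on the first colon only, so a facet value may itself contain colons.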
Metadata read from CoverageDescription Document
View samples: DescribeCoverage 1.0.0, DescribeCoverage 1.1.0. These are quite different, so the node paths differ.
Both versions 1.0.0 and 1.1.0 share these fields:
- Coverage Name:
- Unique name that identifies the coverage within this service; duplicates are prohibited.
- source path 1.0.0: //CoverageDescription/CoverageOffering/name
- source path 1.1.0: //CoverageDescriptions/CoverageDescription/Identifier
- DB field: WCS_coverages.coverage_name
- Coverage Label:
- A human-readable description for presentation in client forms or menus.
- source path 1.0.0: //CoverageDescription/CoverageOffering/label
- source path 1.1.0: //CoverageDescriptions/CoverageDescription/Title
- DB field: WCS_coverages.coverage_label
- Coverage Description:
- A narrative description of the coverage.
- source path 1.0.0: //CoverageDescription/CoverageOffering/description
- source path 1.1.0: //CoverageDescriptions/CoverageDescription/Abstract
- DB field: WCS_coverages.coverage_description
- Keywords:
- Coverage keywords are used in the same way as service keywords. Each coverage inherits the keywords of its service, but can add new ones.
- source path 1.0.0: //CoverageDescription/CoverageOffering/keywords/keyword
- source path 1.1.0: //CoverageDescriptions/CoverageDescription/Keywords/Keyword
- DB field: Written to the facet tables.
- Default Coordinate Reference System:
- The coordinate system that requests should be made in. Currently only linear projections are supported.
- source path 1.0.0: //CoverageDescription/CoverageOffering/supportedCRSs/responseCRSs or requestResponseCRSs
- source path 1.1.0: //CoverageDescriptions/CoverageDescription/SupportedCRS
- DB field: WCS_coverages.default_crs
- Spatial Dimensions:
- Description of the X, Y and optionally Z dimensions. Datafed can use linear Longitude, Latitude and optionally height or depth dimensions.
- source path 1.0.0: //CoverageDescription/CoverageOffering/domainSet/spatialDomain/RectifiedGrid
- source path 1.1.0: //CoverageDescriptions/CoverageDescription/Domain/SpatialDomain/BoundingBox
- DB field: dimensions.[all fields]
- Notice: 1.1.0 does not describe the grid size, and neither version describes irregular enumerated dimensions; these must be harvested from the data.
- Time Dimension:
- Description of the time dimension. This can be a regular range: min, max, step; or enumerated time instances.
- source path 1.0.0: //CoverageDescription/CoverageOffering/domainSet/temporalDomain/timePeriod for ranges or temporalDomain/timePosition for individual instances.
- source path 1.1.0: //CoverageDescriptions/CoverageDescription/Domain/TemporalDomain/TimePeriod for ranges or TemporalDomain/timePosition for individual instances.
- DB field: dimensions.[all fields]
- Other Dimensions:
- typically wavelength: min/max/res or enumerated
- source path 1.0.0: //CoverageDescription/CoverageOffering/rangeSet/RangeSet/axisDescription/AxisDescription/value/interval for ranges and value/singleValue for enumerated instances.
- source path 1.1.0: //CoverageDescriptions/CoverageDescription/Range/Field/Axis
- DB field: dimensions.[all fields]
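Because the node paths differ between versions, one way to keep the harvester simple is a per-version path table. The sketch below covers only the coverage name, label and description. The sample documents and their namespace URIs are illustrative, and stripping namespaces before the lookup is a simplification chosen for this example, not necessarily what the actual harvester does.

```python
import xml.etree.ElementTree as ET

# Paths are relative to the document root and taken from the list above.
COVERAGE_PATHS = {
    "1.0.0": {
        "coverage_name": "CoverageOffering/name",
        "coverage_label": "CoverageOffering/label",
        "coverage_description": "CoverageOffering/description",
    },
    "1.1.0": {
        "coverage_name": "CoverageDescription/Identifier",
        "coverage_label": "CoverageDescription/Title",
        "coverage_description": "CoverageDescription/Abstract",
    },
}

# Hypothetical minimal documents; namespace URIs are illustrative only.
V100 = ('<CoverageDescription xmlns="http://www.opengis.net/wcs" version="1.0.0">'
        '<CoverageOffering><description>Aerosol index</description>'
        '<name>AI</name><label>Aerosol Index</label>'
        '</CoverageOffering></CoverageDescription>')
V110 = ('<CoverageDescriptions xmlns="http://www.opengis.net/wcs/1.1" '
        'xmlns:ows="http://www.opengis.net/ows">'
        '<CoverageDescription><ows:Title>Aerosol Index</ows:Title>'
        '<ows:Abstract>Aerosol index</ows:Abstract>'
        '<Identifier>AI</Identifier>'
        '</CoverageDescription></CoverageDescriptions>')


def strip_namespaces(root):
    """Drop the '{uri}' prefix from every tag so plain paths can be used."""
    for el in root.iter():
        if isinstance(el.tag, str) and "}" in el.tag:
            el.tag = el.tag.split("}", 1)[1]
    return root


def harvest_coverage(xml_text, version):
    """Return WCS_coverages column values for the first coverage found."""
    root = strip_namespaces(ET.fromstring(xml_text))
    return {field: root.findtext(path)
            for field, path in COVERAGE_PATHS[version].items()}


if __name__ == "__main__":
    print(harvest_coverage(V100, "1.0.0"))
    print(harvest_coverage(V110, "1.1.0"))
```

With this layout, supporting another field means adding one entry per version to the table rather than writing new parsing code.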
Fields: WCS 1.1.0 allows each coverage to have multiple fields at each grid location. Each field shares the dimensionality of the coverage. WCS 1.0.0 has only one implicit field, the coverage itself.
- Field Identifier
- Unique field name within this coverage; duplicates are prohibited.
- source path 1.0.0: //CoverageDescription/CoverageOffering/name
- source path 1.1.0: //CoverageDescriptions/CoverageDescription/Range/Field/Identifier
- DB field: WCS_fields.field_name
- Field Title
- A human-readable description for presentation in client forms or menus.
- source path 1.0.0: //CoverageDescription/CoverageOffering/label
- source path 1.1.0: //CoverageDescriptions/CoverageDescription/Range/Field/Title
- DB field: WCS_fields.field_title
- Field Description
- A narrative description of the field.
- source path 1.0.0: //CoverageDescription/CoverageOffering/description
- source path 1.1.0: //CoverageDescriptions/CoverageDescription/Range/Field/Abstract
- DB field: WCS_fields.field_description
Plan to harvest metadata from the data itself:
Most of the information is readily available with the DescribeCoverage request.
However, valuable information may:
- be out of date or incorrect in the first place.
- simply be missing, for example often the elevation dimension values.
- be missing because there's no standard place to put it. For example data statistics.
The WCS catalog can be improved by sampling the data, and collecting:
- typical statistics: min, max, avg, std dev, 90th percentile, null counts etc.
- units: the resulting NetCDF files typically contain units.
- free-form attributes: the NetCDF files may carry important information in them.
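The statistics listed above could be collected from a data sample roughly as follows. This is a sketch under stated assumptions: samples arrive as a flat Python list, nulls are represented by `None`, NaN, or a known missing value, the standard deviation is the population form, and the percentile uses the nearest-rank definition. The function names are hypothetical.

```python
import math


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(pct / 100.0 * len(sorted_values)))
    return sorted_values[rank - 1]


def summarize(samples, missing_value=None):
    """Compute catalog statistics for one sampled field."""
    valid = sorted(
        v for v in samples
        if v is not None and v != missing_value and not math.isnan(v)
    )
    null_count = len(samples) - len(valid)
    if not valid:
        return {"count": 0, "null_count": null_count}
    mean = sum(valid) / len(valid)
    variance = sum((v - mean) ** 2 for v in valid) / len(valid)
    return {
        "count": len(valid),
        "null_count": null_count,
        "min": valid[0],
        "max": valid[-1],
        "avg": mean,
        "std_dev": math.sqrt(variance),
        "p90": percentile(valid, 90),
    }


if __name__ == "__main__":
    print(summarize([1.0, 2.0, 3.0, 4.0, -999.0, None], missing_value=-999.0))
```

Counting nulls separately matters because a missing value such as -999 would otherwise dominate the minimum and the average.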
Pre and Post processing scripts
Each WCS call is made to a virtual WCS, which accepts standard queries and produces XML and CF-1 files.
To enable the use of slightly nonstandard WCS services, the virtual WCS passes the input query and the output data through optional pre/post filters.
Typical filter calls are in Python, calling predefined modules. The implicit parameters are 'query', the key-value pairs of the HTTP GET query, and 'result', the data returned from the actual service.
Here are example one-liners that call a library to tweak the result or the query.
Postprocessing Capabilities:
# Document in wrong namespace:
XmlTweak.RenameNamespace(result, "http://www.opengis.net/ows", "http://www.opengis.net/wcs")
# Add a facet keyword:
WcsTweak.EnsureCapabilitiesKeyword(result, 'Platform:Model')
Postprocessing DescribeCoverage:
# Illegal SRS in envelope, change to standard
WcsTweak.ChangeSrsName(result, "lonLatEnvelope", "http://www.opengis.net/wcs", "WGS84(DD)", "urn:ogc:def:crs:OGC:1.3:CRS84")
Preprocessing GetCoverage:
# server does not understand elev_min,elev_max in the bbox
WcsTweak.CutElevFromBBox(query)
Postprocessing GetCoverage:
# netcdf has incorrect "missing_value"
missing = NetCDFTweak.ReadAttr(result, "MissingValue")
NetCDFTweak.DeleteAttr(result, "MissingValue")
NetCDFTweak.WriteAttr(result, "missing_value", missing)
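The flow through the virtual WCS can be sketched as a small pipeline: parse the query, run the optional prefilter, call the real service, run the optional postfilter. Everything here is a hypothetical stand-in. `cut_elev_from_bbox` and `rename_attr` only imitate what WcsTweak.CutElevFromBBox and the NetCDFTweak calls above appear to do, the result is modelled as a plain dict of attributes instead of a NetCDF file, and the bbox is assumed to be ordered minx,miny,maxx,maxy,minz,maxz under a lowercase `bbox` key.

```python
from urllib.parse import parse_qsl, urlencode


def cut_elev_from_bbox(query):
    """Hypothetical prefilter: drop the trailing elevation pair from bbox."""
    parts = query.get("bbox", "").split(",")
    if len(parts) == 6:  # assumed order: minx,miny,maxx,maxy,minz,maxz
        query["bbox"] = ",".join(parts[:4])
    return query


def rename_attr(attrs, old, new):
    """Hypothetical postfilter: rename one attribute, keeping its value."""
    if old in attrs:
        attrs[new] = attrs.pop(old)
    return attrs


def call_virtual_wcs(query_string, fetch, pre=None, post=None):
    """Run one request through optional pre and post filters.

    `fetch` stands for the call to the actual service: it takes the
    encoded query string and returns the raw result.
    """
    query = dict(parse_qsl(query_string))
    if pre is not None:
        query = pre(query)
    result = fetch(urlencode(query))
    if post is not None:
        result = post(result)
    return result


if __name__ == "__main__":
    def fake_fetch(encoded_query):
        # Stand-in for the remote server; echoes the query it received.
        print("server saw:", dict(parse_qsl(encoded_query)))
        return {"MissingValue": -999.0}

    print(call_virtual_wcs(
        "request=GetCoverage&bbox=-180,-90,180,90,0,1000",
        fake_fetch,
        pre=cut_elev_from_bbox,
        post=lambda r: rename_attr(r, "MissingValue", "missing_value"),
    ))
```

Keeping the filters as plain callables means a service that needs no correction is registered with both set to `None` and is passed through untouched.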

