2009-10-23:Datafed - Juelich HTAP interoperability

Web Coverage Service interoperability experiment

Datafed and Juelich conducted an interoperability experiment, sharing data via the Web Coverage Service (WCS).

WCS is a standard created by the OGC for querying data through web interfaces.

History of development actions

At Datafed, Kari Hoijarvi created an implementation that uses NetCDF as the data storage and transfer medium and the CF-1.0 conventions as the standard for encoding metadata. The system is written in C++ and Python and uses the NetCDF version 3 libraries from Unidata. Ed Fialkowski ported it from MS Windows to Linux, and internal use at Datafed started in October 2007.

The interoperability project started in June 2009. Michael Decker at Juelich took the code and installed it on their Linux server. Since the system needed some adaptation, he started to contribute to the code.

Sharing Code using Version Control

The version control system used is darcs. It is a very flexible system: every developer maintains their own copy of the codebase, publishes their own changes, and pulls in changes that others have made.

Full history of every change is recorded in the repositories.

The code repository at Datafed is http://webapps.datafed.net/nest/OWS/ and the one at Juelich is http://htap.icg.fz-juelich.de/darcs/OWS. You need darcs to access them; a regular web browser won't show much.

This screenshot shows the history of the ISO time parsing code. The code was created at Datafed, and M. Decker later added, for example, improved error handling.

modifications of iso_time.py

M. Decker also fixed some misinterpretations of the WCS standard and the C++ standard, and improved the handling of non-ASCII characters in NetCDF attributes.

System Documentation

Currently, these four Wiki pages are maintained:

  • Overview of the System
  • Installation on Windows
  • Installation on Linux
  • Programmers Documentation

Installing the System at Juelich

The model data was generated and written into NetCDF files, which are then used by the WCS service. There is no need for configuration, since the NetCDF files themselves contain enough metadata for the service to know what to serve.

WCS defines three requests: two for metadata and one for data. The metadata requests, GetCapabilities and DescribeCoverage, return static XML documents describing the data itself. The WCS server is able to compile these documents automatically by extracting the dimension extents and writing them into the XML files.

Here are links to the Juelich Capabilities document and to one Coverage Description in XML.

The data request, GetCoverage, returns a subset of the original file with trimmed dimensions. The global and variable attributes are copied. Here is an example. The response is an XML envelope, which contains a URL to the actual NetCDF file.
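The three requests can be sketched as key-value URLs. This is a minimal illustration rather than the project's own code: the endpoint `http://example.org/wcs`, the coverage identifier `example_coverage`, the WCS version `1.1.0`, the CRS URN, and the `application/x-netcdf` format string are all assumptions, and the parameter names follow the WCS 1.1 key-value encoding as commonly described, so they should be checked against the specification and the server's Capabilities document.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; the real service URL is not given here.
ENDPOINT = "http://example.org/wcs"
# Assumed protocol version; the text does not state which one is used.
WCS_VERSION = "1.1.0"


def wcs_url(request, **params):
    """Build a key-value-pair WCS request URL for the assumed endpoint."""
    query = {"service": "WCS", "version": WCS_VERSION, "request": request}
    query.update(params)
    return ENDPOINT + "?" + urlencode(query)


# Metadata request 1: what does the server offer?
capabilities_url = wcs_url("GetCapabilities")

# Metadata request 2: dimensions and extents of one coverage.
describe_url = wcs_url("DescribeCoverage", identifiers="example_coverage")

# Data request: a spatial and temporal subset of that coverage.
# Bounding box, CRS, time, and format values are placeholders.
coverage_url = wcs_url(
    "GetCoverage",
    identifier="example_coverage",
    BoundingBox="0,-90,180,90,urn:ogc:def:crs:OGC:2:84",
    TimeSequence="2001-06-01T00:00:00Z",
    format="application/x-netcdf",
    store="true",
)

print(capabilities_url)
print(describe_url)
print(coverage_url)
```

With `store=true`, a server of this kind would be expected to answer with an XML envelope that points to the stored NetCDF file, which matches the behaviour described above.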

Datafed as data Consumer

There are about 400 coverages in the Juelich HTAP model data. At Datafed, we registered them for viewing in our catalog.

A good live example is the CAMCHEM Ozone data. All the data is at Juelich, and the Datafed browser fetches it online, without any local storage.

At Juelich the longitude coordinate system is 0..360, while Datafed uses -180..180. For the map view, Datafed therefore makes two calls and then joins the two grids at the Greenwich meridian.
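The two-call approach can be sketched as follows. This is an illustration under stated assumptions, not Datafed's actual code: the function names `split_longitude_range` and `join_at_greenwich` are hypothetical, grids are plain nested lists with rows running west to east, and the two pieces are assumed not to share a column at the seam.

```python
def split_longitude_range(west, east):
    """Map a -180..180 longitude range onto 0..360 request ranges.

    Returns a list of (lon_min, lon_max) tuples in the 0..360 convention,
    ordered west to east. A range that crosses the Greenwich meridian
    yields two pieces, because 0 and 360 are the ends of the server grid.
    """
    if not -180 <= west < east <= 180:
        raise ValueError("expected -180 <= west < east <= 180")
    if east <= 0:
        # Entirely in the western hemisphere: shift by 360.
        return [(west + 360, east + 360)]
    if west >= 0:
        # Entirely in the eastern hemisphere: same numbers in both systems.
        return [(west, east)]
    # Crosses Greenwich: western piece first, then eastern piece.
    return [(west + 360, 360), (0, east)]


def join_at_greenwich(west_rows, east_rows):
    """Concatenate two grids row by row (each row runs west to east)."""
    if len(west_rows) != len(east_rows):
        raise ValueError("grids must have the same number of latitude rows")
    return [w + e for w, e in zip(west_rows, east_rows)]


pieces = split_longitude_range(-180, 180)
print(pieces)  # [(180, 360), (0, 180)]

# Toy 2x2 grids standing in for the two GetCoverage responses.
west = [[1, 2], [5, 6]]
east = [[3, 4], [7, 8]]
print(join_at_greenwich(west, east))  # [[1, 2, 3, 4], [5, 6, 7, 8]]
```

A real implementation would also have to decide which piece owns the column at 0 degrees, so that it is not duplicated in the joined grid.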

To show the bigger picture, Datafed created a multi-view browser for this model data.

Main Page: HTAP Model Show.

An Example live Console: MOZECH Multi View Console

MOZECH NO2 Console

With a multi-view console you can browse several pages at a time. The navigation controls apply to all the views.

Aggregating HTAP Data Online

Start by browsing MOZECH-v16_SR1_tracerm_2001:

Datafed Browser

If you click "Service Program", you get the view editor, which shows this data flow diagram:

Map View dataflow

The map image is built up in layers. Each rectangle is a Web Service that is executed and passes its data to the next service. On top is the cursor service, which draws the cursor on the map. In the middle is a Web Map Service call, which draws the world borders.

The real data flow happens in three services:

  • GetCoverage: Gets the data from Juelich
  • CubeAggregate: If more than one time is selected, this service performs an aggregation on the grid.
  • Render: Draws the aggregated grid on the map.

To demonstrate the aggregation, the time filter in GetCoverage was set to "Time Range" and the aggregation was changed to the 80th percentile.
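The aggregation step can be illustrated with a small sketch that collapses a time series of grids to one grid, cell by cell. It is not the CubeAggregate implementation: the function names are hypothetical, the cube is a plain nested list indexed as `cube[time][lat][lon]`, and the nearest-rank percentile definition is an assumption, since the text does not say which definition the service uses.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (0 < pct <= 100)."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def aggregate_over_time(cube, pct):
    """Collapse cube[time][lat][lon] to grid[lat][lon], cell by cell."""
    n_lat = len(cube[0])
    n_lon = len(cube[0][0])
    return [
        [percentile([step[i][j] for step in cube], pct) for j in range(n_lon)]
        for i in range(n_lat)
    ]


# Five time steps of a toy grid with one latitude row and two longitudes.
cube = [
    [[1, 10]],
    [[2, 50]],
    [[3, 20]],
    [[4, 40]],
    [[5, 30]],
]
print(aggregate_over_time(cube, 80))  # [[4, 40]]
```

Because a high percentile picks values near the top of each cell's time series, the aggregated grid tends to show larger high-value areas than any single time step.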


Original: One Month, June 2001

June 2001

Aggregate: Over one year, 80th percentile:

June 2001

The areas with high values are clearly larger in the aggregate.

Processing HTAP Data Online

The Datafed system lets you add operators that combine data. In this demo we take the NO2 data from two models and compute the difference.

Here is CEMAQ model of NO2:

CEMAQ model of NO2

Here is MOZECH model of NO2:

MOZECH model of NO2

This is the difference:

Difference between CEMAQ and MOZECH

These images are all from the same page HTAP/NO2_diff.

The data flow is as follows:

Binary Operator data flow

We have the normal flow for both CMAQ and MOZECH layers, and many decorations.

The Comparison layer pulls the data from the two aggregators, computes the difference, and renders the grid.
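The difference operator amounts to a cell-by-cell subtraction of two grids. The sketch below is an illustration only: the function name `grid_difference` is hypothetical, grids are plain nested lists, missing cells are represented by `None`, and both inputs are assumed to be on the same grid already (any regridding is outside this sketch).

```python
def grid_difference(a, b):
    """Cell-wise a - b for two grids of equal shape.

    Cells that are None (missing) in either input stay None in the result.
    """
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("grids must have the same shape")
    return [
        [None if x is None or y is None else x - y for x, y in zip(ra, rb)]
        for ra, rb in zip(a, b)
    ]


# Toy grids standing in for the two aggregated model layers.
model_a = [[3.0, 1.5], [None, 2.0]]
model_b = [[1.0, 2.0], [0.5, 2.0]]
print(grid_difference(model_a, model_b))  # [[2.0, -0.5], [None, 0.0]]
```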

To see the actual data, first make the current layer invisible, then switch to another layer and make that one visible.
