I. Introduction
The implementation of the WOCE Global Centre at Brest effectively began in early 1993. A limited but significant second phase of funding by IFREMER has been dedicated to the development of this project. The equipment (SUN workstation) and the data base management system (ORACLE version 7) had been purchased previously. Most of the software development was subcontracted in December 1992 to a local software house, with consultancy from ORACLE staff. The data dictionary (105 pages) and the software specifications (107 pages) were developed in house by IFREMER programmer Gilbert Maudire and the scientific advisor J.-P. Rebert. These documents are presently in French and subject to small changes. The final versions will be abbreviated and translated. The programmer visited NODC and MEDS in December to clarify some points related to the GTSPP format and the management of the history table.
We briefly describe below the main functions of the planned system, the pending issues and the status of implementation.
II. General considerations
The system was designed to serve the TOGA and WOCE needs described in the implementation plans of these two programmes, taking into account the existing GTSPP data flow and the possible evolution of requests and technology within these programmes. It is CLIVAR oriented.
The experience of the TOGA Centre proved that relational data bases can be a very effective tool to manage merged real-time and delayed mode subsurface data, if correctly designed. We decided therefore to adopt a similar and more modern system, though the amount of data will be much larger than that of the TOGA data base. Moreover the existence of Scientific QC Centres creates new constraints both in terms of data flow and in amount of information to keep. This leads to a very complex scheme of data handling, and the success of such an enterprise will be proved only with use.
At present the first aims are to ensure:
- upward compatibility with the TOGA data management
- full compatibility with the existing GTSPP
- modularity to meet future needs
- effectiveness of the system
The technical solutions adopted were:
Hardware: SUN SPARCstation 2
Operating system: UNIX
Data base: ORACLE 7
User interface: X Windows (OpenWindows)
Graphic library: GKS
Languages: C and FORTRAN
The analysis of the structure of the tables, though completely independent of that performed by MEDS, led to a very similar result. The data base presently comprises 8 tables. For the history table we adopted the MEDS structure and rules.
The major differences from the last GTSPP record format received in Brest arise because the amount of information handled in the type of data base we adopted must remain reasonable in size if we want to maintain the timeliness requested by end users. Otherwise the operations become prohibitively slow, or the data bases must be split, or management is no longer possible given the limited manpower dedicated to the data centre. This led us:
- not to save information that is useless for internal data management or for users, such as the IGOSS bulletin header and time (internal MEDS management and statistics), data availability (always), and the dup flag (duplicates will be physically deleted)
- to reduce the amount of information where it may become too large to be practically handled:
- number of records of the history group. It is worth noting that a history record is about three times larger than a data record. The amount of metadata attached to a station may be (and already has been) larger than the data themselves, which is unreasonable. The number of records and the list of codes of this table will be limited depending on performance tests.
- the profiles will be reduced to a maximum of 300 data points. This will be performed automatically for profiles exceeding this length or when the depth resolution is finer than 2 meters.
- to maintain some current TOGA data base codes. We will give priority to the IOC and GF3 codes. The platform identifier will be the call sign. The institution and country codes will be the IOC and WDC codes, because the GTSPP table is too short for delayed mode data (for example, for the year 1987 there are 74 different institution codes in the TOGA data base). The ocean code will be maintained. All these codes will be converted into GTSPP codes for data exchange using that format (cross reference tables are included in the data base tables).
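The code conversion mentioned in the last item can be illustrated with a minimal sketch. It assumes a cross reference table held as a simple mapping; the table name, the function names and every code value are invented placeholders, not real IOC, WDC or GTSPP codes. Because the exchange table is shorter than the internal one, several internal codes may map to the same exchange code.

```python
# Hypothetical cross reference table: internal institution code -> GTSPP code.
# Every code below is an invented placeholder, not a real IOC, WDC or GTSPP value.
INSTITUTION_TO_GTSPP = {
    "INST_A": "G1",
    "INST_B": "G1",  # two internal codes may share one exchange code
    "INST_C": "G2",
}


def to_gtspp(internal_code, table=INSTITUTION_TO_GTSPP):
    """Return the exchange code for an internal code.

    Raises KeyError for an unmapped code, so that a gap in the
    cross reference table is noticed instead of being exported silently.
    """
    return table[internal_code]


def is_lossy(table=INSTITUTION_TO_GTSPP):
    """True when several internal codes collapse onto one exchange code."""
    return len(set(table.values())) < len(table)
```

When such a conversion is lossy, the internal code cannot be recovered from an exchanged record, which is one reason to keep the richer codes inside the data base.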
IV. Specifications
Formats: 8 formats will be accepted for data submission, including the previous formats and the GTSPP and TSDC formats. 3 to 4 output formats for data shipment are planned, including the GTSPP and current TSDC formats (the GF3 format has the lowest priority).
Processing: The data first pass through an automatic QC programme including:
- thresholds (defined in the data dictionary)
- Levitus seasonal climatology
- physical properties (spikes, inversions)
Preliminary flags are put on the data.
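The automatic checks listed above can be sketched as follows for a temperature profile. This is only an illustration under stated assumptions: the threshold values, the tolerance of three standard deviations around the climatology, the spike limit and the flag values (1 good, 3 doubtful, 4 bad) are all assumed here, the real thresholds being those of the data dictionary. The spike formula is a commonly used one and is not necessarily the one implemented at Brest; the inversion check is omitted.

```python
GOOD, DOUBTFUL, BAD = 1, 3, 4  # assumed flag values, for illustration only


def auto_qc(temps, clim_mean, clim_std,
            t_min=-2.5, t_max=35.0, n_std=3.0, spike_limit=2.0):
    """Return one preliminary flag per level of a temperature profile.

    temps, clim_mean and clim_std are lists of equal length, level by level.
    All numeric defaults are placeholder values.
    """
    flags = []
    last = len(temps) - 1
    for i, t in enumerate(temps):
        flag = GOOD
        if not (t_min <= t <= t_max):
            # Gross threshold check: value outside the admissible range.
            flag = BAD
        else:
            # Climatology check: too far from the seasonal mean.
            if abs(t - clim_mean[i]) > n_std * clim_std[i]:
                flag = DOUBTFUL
            # Spike check: departs from both neighbours by more than
            # the local gradient can explain.
            if 0 < i < last:
                prev, nxt = temps[i - 1], temps[i + 1]
                spike = abs(t - (prev + nxt) / 2.0) - abs(nxt - prev) / 2.0
                if spike > spike_limit:
                    flag = DOUBTFUL
        flags.append(flag)
    return flags
```

The flags remain preliminary: in this sketch nothing is deleted or corrected, which leaves the decision to the later interactive control.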
The data then pass through a duplicate control programme (duplicates within the submitted data set and between the data set and the data base). This automatic process, based on the previous TSDC rules and hopefully improved, is completed by an interactive procedure in which doubtful cases can be screened and the operator makes the decision.
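A minimal sketch of such a two-stage duplicate control is given below. The matching rules are assumptions made for illustration and are not the TSDC rules: two stations are compared on time and position only, the tolerances are placeholders, and the field names are hypothetical. Exact matches with the same call sign are treated as duplicates automatically; near matches are left to the operator.

```python
from datetime import datetime, timedelta


def classify_pair(a, b, max_gap=timedelta(minutes=15), max_offset_deg=0.1):
    """Classify two station headers as 'duplicate', 'doubtful' or 'distinct'.

    a and b are dicts with keys 'call_sign', 'time' (datetime),
    'lat' and 'lon' (decimal degrees). Longitude wrap-around at
    +/-180 degrees is ignored in this sketch.
    """
    gap = abs(a["time"] - b["time"])
    dlat = abs(a["lat"] - b["lat"])
    dlon = abs(a["lon"] - b["lon"])
    if gap > max_gap or dlat > max_offset_deg or dlon > max_offset_deg:
        return "distinct"
    exact = gap == timedelta(0) and dlat == 0 and dlon == 0
    if exact and a["call_sign"] == b["call_sign"]:
        return "duplicate"
    # Close in time and space but not identical (or call signs differ):
    # keep the pair for interactive screening by the operator.
    return "doubtful"
```

A differing call sign does not rule out a match in this sketch, since platform identifiers may differ between the real-time and delayed mode versions of the same station.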
The data set is reduced to its inflexion points if it does not satisfy the conditions described in III.
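The reduction algorithm is not specified here. One possible approach, shown purely as a sketch, keeps the surface and bottom points and then repeatedly adds the level that deviates most from a straight line drawn between its retained neighbours, until the maximum number of points is reached. The function name and the tolerance parameter are assumptions; only the length condition is handled, and depths are assumed to be strictly increasing.

```python
def reduce_profile(depths, values, max_points=300, tolerance=0.0):
    """Return the sorted indices of the levels to keep.

    Profiles with at most max_points levels are returned unchanged.
    Otherwise points are added greedily where linear interpolation
    between the retained neighbours is worst, which tends to retain
    the inflexion points of the profile.
    """
    n = len(depths)
    if n <= max_points:
        return list(range(n))
    keep = [0, n - 1]
    while len(keep) < max_points:
        worst_err, worst_idx = tolerance, None
        for left, right in zip(keep, keep[1:]):
            span = depths[right] - depths[left]
            for i in range(left + 1, right):
                frac = (depths[i] - depths[left]) / span
                interp = values[left] + frac * (values[right] - values[left])
                err = abs(values[i] - interp)
                if err > worst_err:
                    worst_err, worst_idx = err, i
        if worst_idx is None:
            break  # every dropped level is reproduced within tolerance
        keep.append(worst_idx)
        keep.sort()
    return keep
```

For example, with depths 0, 10, 20, 30, 40 and values 20, 20, 20, 10, 10, a limit of four points drops the level at 10 m, which lies exactly on the line between its neighbours.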
The data set is then submitted to interactive quality control by an operator. This control has functions very similar to those performed at MEDS and NODC:
- profile editing versus Levitus seasonal climatology
- speed test
- allows deletion or interpolation of spikes (depending on digitization type)
- allows deletion of erroneous downward "tails"
- update of the history table
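As an illustration of the speed test, the sketch below assumes that it checks the speed implied by two consecutive stations of the same platform, and that a station pair is suspect when this speed exceeds a plausible ship speed. The 25 knot limit, the field names and the function names are assumptions, not values taken from the specifications.

```python
import math
from datetime import datetime

KM_PER_NAUTICAL_MILE = 1.852


def distance_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance between two positions (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def implied_speed_knots(a, b):
    """Speed needed to go from station a to station b, in knots."""
    hours = abs((b["time"] - a["time"]).total_seconds()) / 3600.0
    dist = distance_km(a["lat"], a["lon"], b["lat"], b["lon"])
    if hours == 0:
        # Same time: any displacement at all is impossible.
        return math.inf if dist > 0 else 0.0
    return dist / KM_PER_NAUTICAL_MILE / hours


def passes_speed_test(a, b, max_knots=25.0):
    """True when the implied speed is plausible for a ship (assumed limit)."""
    return implied_speed_knots(a, b) <= max_knots
```

A failed test does not say which of the two stations is wrong, nor whether the position or the time is at fault, so the outcome is best presented to the operator rather than acted upon automatically.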
The data set is loaded into the data base.
V. Status of implementation
Programming began in January. Approximately one fourth of the total planned time has been used. The user interface is finished, has been tested, and is being improved and debugged. An experimental transfer of TOGA data sets is planned this week to assess performance. All the functions described above will be developed in May. Transfer of GTSPP data sets is planned for early June. The deadline for the end of the contract is June 15. By this date the TOGA/WOCE data base should be operational.
The existing TOGA data management system should be abandoned this summer after a (hopefully) short period of overlap.
VI. Conclusion and recommendations
The main conclusion is very simple: we hope that this system will work. Unfortunately its architecture is too complex, and too different from that of the previous TOGA data base, for us to be able to predict its full success. Further inconsistencies will probably appear and will have to be resolved.
The major concern is the complete transfer of the information delivered by the QC Centres. The preliminary condition is simple: this information must be entered in the present GTSPP format. The simplest way to ensure its preservation is to put it appropriately in the history table. The structure of this table allows room for several additional codes, each of which can represent a specific test. It is a little unfortunate that some fields (technically the AUX_ID and ORIG_VAL fields) are numerical instead of character fields. Character fields would allow much more flexibility to describe some types of corrections: for instance, corrections of platform name cannot be properly recorded at present, although this is unfortunately the most frequent correction in merged real time/delayed mode data sets.
At this stage the attributes of these fields can still be modified in our system. We do not know whether this is the case at the other Centres. In any case, it is strongly recommended that the three scientific QC Centres agree on a common list of codes (Act_Code, table 9 of the GTSPP Master Record, NODC June 2 1992) describing specific operations, controls or detected flaws. This list can be updated and distributed as new, more sophisticated tests are implemented.
If participants do not agree on the adequacy of the GTSPP format to save all this information, some slight modifications of the tables can still be made, but a complete recasting of the system cannot be envisaged.