WOCE VOS/DMC UPPER OCEAN THERMAL
DATA ASSEMBLY CENTRE MEETING
Florida State University, Tallahassee, FL, USA
15-16 February 1990
WOCE Report No. 45/90
WOCE International Project Office
The purpose of the Upper Ocean Thermal Data Assembly Centre (UOT/DAC) Meeting was to discuss the responsibilities of each component of the DAC and to begin the process of defining operational details, for example, quality control procedures, exchange formats, collection and dissemination procedures, exchange and processing schedules, and standard product (analyses) generation. The meeting was attended by representatives of all DAC components. The participants were: R. Molinari (Chairman of the meeting) (NOAA/AOML), G. Meyers (CSIRO), W. White (Scripps/JEDA), R. Wilson (MEDS), J-P. Rebert (IFREMER), P. Saisou (IFREMER), G. Withee (NODC), M. Hamilton (NODC) and B. Thompson (WOCE-IPO).
Dr James O'Brien, head of the Mesoscale Air-Sea Interaction Group of Florida State University, opened the meeting by welcoming the participants to the Campus and providing a brief summary of the activities of the Departments of Meteorology and Oceanography. The latter is headed by Dr Phil Hsueh who was away during the period of the meeting.
Bert Thompson presented an overview of the UOT/DAC operation as presently viewed by the WOCE Data Management Committee and outlined in the WOCE Implementation Plan (see below for details). He also provided a summary of global XBT operations and plans. The XBT programmes are summarized in Figure 1 and Table 1. The overall XBT situation is improving, but the Indian Ocean network is still operated at less than 50% of requirement and many of the Pacific high density eddy-resolving line proposals are still under review. The Atlantic coverage should reach 60-70% of requirement by 1992. The WOCE XBT requirement includes the lines shown in Figure 1, plus the TOGA networks and broadcast mode data received through IGOSS.
Greg Withee, Warren White and Gary Meyers explained the JEDA (NOAA-Scripps) and CSIRO experience in operating Regional components for the TOGA XBT data collection and quality control scheme. The WOCE Regional Centres are similar to the JEDA and CSIRO TOGA operations. Through the cooperation of regional centres, NODC and AODC, and the TOGA Subsurface Data Centre in Brest, a high percentage of the Tropical data are in the TOGA XBT data set for the years 1985-1988. The 1989 delayed mode data will be available to the TOGA Center early in 1991. The delayed mode data now represents 60% of the data set. The quality control procedures used by JEDA and CSIRO are not identical. For example, CSIRO does a visual check of each profile while JEDA relies on its statistical test using bi-monthly histograms for 5 x 5 degree squares. Both centres check for consistency using mapping routines.
TABLE 1. WOCE XBT Programme Status
| PX 1 | TRANSPAC - Canada/USSR/USA/Japan |
| PX 2 | TOGA XBT Network |
| PX 5 | USA/Taiwan - Start 91 | 4 HD | DB/XCTD | 12 |
| PX 6 | USA - Start 91 | 4 HD | DB/XCTD | 12 |
| PX 7 | USA - Start 92 | 4 HD | DB/XCTD | 12 |
| PX 8 | USA - Start after 94 |
| PX 11 | USA - Start 91 | 4 HD | DB/XCTD | 12 |
| PX 12 | USA - Start after 94 |
*Brackets indicate line under consideration, HD - High Density (30-40 km spacing)
The following terms are used extensively throughout this report and are defined as:
(i) Radio data (same as real-time data): data received in near real-time via the GTS through coastal stations (or other means), usually an abridged version of the entire temperature profile.
(ii) Delayed mode data: data received after the collecting vessel has reached port, can be in computer disk or hard copy form and includes the entire profile.
(iii) Scientific quality control: an in-depth quality control procedure typically performed by scientists with an interest in the data.
(iv) Flags: indicators which travel with the data giving the level of quality control performed on the data and the quality of the data.
(v) GTSPP: an IOC project to upgrade the collection, quality and distribution of radio and delayed mode temperature and salinity data using technological advances such as satellite data transmissions, CD-ROMs and computer-based information management systems.
The UOT/DAC network will consist of three levels of activity:
(1) National Data Centres (MEDS, NODC, AODC) which will have responsibility for the collection of radio and delayed mode Upper Ocean Thermal (primarily XBT) data and the distribution of that data amongst National, Regional and Global WOCE Centres within prescribed sequences and schedules. The IGOSS/IODE Global Temperature-Salinity Pilot Project (GTSPP) will be used extensively by the National Data Centres for this purpose.
(2) Regional Centres for the Atlantic, Pacific and Indian Ocean which have as their primary function the production of high quality data sets, derived by a procedure to be described shortly, using both radio and delayed mode data. These centres will not accept or reject data, merely provide flags on the quality. Regional Centres will also generate selected products.
(3) A Global Centre that will merge the regional data sets into a comprehensive global data set and distribute full set or sub-sets to WOCE and other users. The Global Centre will provide the full set to the World Data Centre (WDC) system for long-term archival and further distribution.
The data flow amongst the WOCE Centres is presented in the following Schematic:
MEDS Marine Environmental Data Service, Ottawa, Canada - R. Wilson
NODC National Oceanographic Data Center, Washington DC, USA - G. Withee
AODC Australian Oceanographic Data Center, North Sydney, NSW, Australia - B. Searle
AOML Atlantic Oceanographic & Meteorological Laboratory, Miami, Florida, USA - R. Molinari
JEDA Joint Environmental Data Center, La Jolla, California, USA - W. White
CSIRO Marine Laboratory, Hobart, Tasmania, Australia - G. Meyers
BOM Bureau of Meteorology, Melbourne, Australia - N. Smith
IFREMER Brest, France - J-P. Rebert
WDC World Data Centre (of ICSU)
The centres which have volunteered to participate in the UOT/DAC programme are identified in the above schematic. Not all centres have yet confirmed their participation in the programme, but they are, in principle, ready to do so. The WOCE Scientific Steering Group and Data Management Committee have endorsed a programme including these specific centres, and look forward to receiving confirmation from the centres and countries involved.
The following detailed descriptions of the operations of the centres relate, in part, to their ongoing operations. Differences and apparent duplication will therefore be noted, but procedures, for example for quality control, will be standardized for WOCE purposes. Later sections of this report address some steps being taken by the centres involved to institute common procedures. Many differences will be eliminated through the use of the GTSPP.
Marine Environmental Data Service - Canada
The primary function of MEDS will be to prepare the initial data set composed of those observations collected by radio. MEDS will receive input via the Global Telecommunications System (GTS) and cooperating National data centres. MEDS is presently cooperating with national centres in Australia, USA, USSR, France, Federal Republic of Germany and Japan to determine whether Toronto's (Canada's GTS hub) daily collection of XBT data is complete. If Toronto cannot obtain the full data set from the GTS, these centres may be asked to continue their support on a monthly basis. A full GTS circuit will be installed between Toronto and MEDS in 1990. The meeting suggested that WMO and IGOSS be requested to review the BATHY/TESAC routing on the GTS and assist in arranging for these messages to be passed to MEDS via Toronto.
MEDS will merge all radio data into a global set on a daily basis and subject it to quality control tests. Temperature will be checked against climatology, duplicates eliminated, fields verified (latitude, longitude, time, etc.) and a few other checks made. Data that fail a check will not be eliminated, only flagged. The flagged data set will be forwarded to the US/NODC on a daily basis.
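The flag-but-never-delete policy described above can be sketched as follows. This is an illustrative Python sketch, not the MEDS implementation: the field names, the climatology lookup and the 5 degree tolerance are all assumptions.

```python
# Hypothetical sketch of radio-data QC: verify key fields, compare
# temperatures against a climatology, and flag (never delete)
# observations that fail. All names and thresholds are illustrative.

def qc_radio_observation(obs, climatology, tolerance=5.0):
    """Return the observation with QC flags attached; nothing is removed."""
    flags = []
    # Field verification: latitude/longitude must be in range.
    if not (-90.0 <= obs["lat"] <= 90.0):
        flags.append("BAD_LAT")
    if not (-180.0 <= obs["lon"] <= 180.0):
        flags.append("BAD_LON")
    # Climatology check: flag temperatures far from the climatological value.
    for depth, temp in obs["profile"]:
        clim = climatology.get(depth)
        if clim is not None and abs(temp - clim) > tolerance:
            flags.append(f"TEMP_OUTLIER_{depth}m")
    obs["flags"] = flags
    return obs

clim = {0: 20.0, 50: 18.0, 100: 15.0}
ob = {"lat": 25.0, "lon": -70.0, "profile": [(0, 21.0), (50, 30.0), (100, 15.5)]}
qc_radio_observation(ob, clim)
# ob["flags"] == ["TEMP_OUTLIER_50m"]
```

The essential point is that the profile itself is passed on unchanged; downstream centres see the flags and decide for themselves what to use.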
National Oceanographic Data Center - USA
The NODC will have a key role in the collection and distribution of both radio and delayed mode data. It will function as the hub through which all data passes to and from the national centres to the regional and global centres. NODC will merge radio data sets received from MEDS into a monthly set and provide the regional sub-sets required by AOML, CSIRO and JEDA. These sub-sets will be transferred to the regional centres soon after the end of each month. These same sub-sets will be returned to NODC a week later fully quality controlled with flags. NODC will forward the QC'ed radio data to the Global Centre in Brest in six month batches. The quality control routines for radio (near-real time) data presently used by NODC are described in Annex 1. These procedures will be superseded (perhaps incorporated) by the GTSPP quality control routines to be performed by MEDS.
The delayed mode data collected by NODC are quality controlled (see Annex 2) before they are transferred to the regional centres for the scientific QC. Yearly batches will be sent to the regional centres. The regional centres will return the delayed mode data, fully QC'ed and flagged, six months after receipt. These data will then be transferred by NODC to the Global Centre, i.e. 18-24 months after the end of the year being processed (e.g. the 1990 batch transferred in 1992). NODC will also provide the Global Centre with updates and additions every six months. Some of these data will not have been QC'ed by the regional centres and will therefore be for interim use only, eventually to be replaced by the fully QC'ed data.
Indian Ocean Regional Centre - Australia
The Indian Ocean regional component of the UOT/DAC will be a cooperative effort of the CSIRO Marine Laboratory, Australian Oceanographic Data Center and the Australian Bureau of Meteorology. The function is, of course, to provide the scientific QC of the radio and delayed mode data which is provided via MEDS, NODC and National sources. While the functions to be performed are known, which component will undertake them is still being discussed in Australia. Oversight of the entire operation will reside with CSIRO. The receipt of data from NODC and its return in the fully QC'ed and flagged form, both radio and delayed mode, will be within the time frames mentioned above in the discussion of NODC.
The AODC will receive the data sets from NODC, may perform some QC, and will coordinate the distribution within Australia and the return of data sets to NODC for further relay to the Global Centre in Brest, France. CSIRO will be responsible for the scientific QC. Its present QC procedures, which more than likely will be incorporated in WOCE, are described below by R. Bailey and G. Meyers.
Quality control (QC) of expendable bathythermograph (XBT) data at the delayed mode stage is undertaken by research oceanographers participating in the programme. The vertical profiles are checked on a voyage by voyage basis for common malfunctions, regional oceanographic features, drop to drop consistency along the cruise track, and duplicate drops of unusual features. The data are also checked against a climatology based on the data collected by ships participating in the CSIRO XBT Programme. An archive of unusual features observed along the different ship routes is used in the QC process. The features are checked with CTD data as opportunities arise.
An interactive editing routine has been set up on the in-house mainframe (VAX/VMS) computer to edit the data. QC decisions on common malfunctions and real oceanographic features are flagged (Table 2) on the data set. The data is further classed (1-4) by depth according to the type of flag associated with the data. Class 1 data are good data. Class 2 data have unusual features, but they are considered probably to be real. Class 3 data have features considered to be most likely the result of instrument malfunctions and not real features. Class 4 data are obviously erroneous data.
The data are stored in three archives. The first archive contains the unedited, full resolution, raw data as collected from the merchant ships. The second archive consists of the edited, full resolution data (Class 4 removed). The third data archive has the data condensed to a 2 metre format. This third archive is the archive used in further analysis, and for transfer of data to other organizations.
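The condensation from full resolution to the 2 metre format of the third archive can be illustrated with a short sketch. The actual CSIRO condensing algorithm is not specified in this report; simple linear interpolation onto a regular 2 m depth grid is assumed here purely for illustration.

```python
# Illustrative sketch of condensing a full-resolution XBT trace to a
# 2-metre depth grid (assumed linear interpolation; not the actual
# CSIRO algorithm).

def condense_to_2m(depths, temps):
    """Linearly interpolate a (depth, temperature) profile onto a 2 m grid."""
    out = []
    target = 0.0
    i = 0
    while target <= depths[-1]:
        # Advance to the measured interval containing the target depth.
        while i + 1 < len(depths) and depths[i + 1] < target:
            i += 1
        if target <= depths[0]:
            out.append((target, temps[0]))   # above first measurement
        else:
            d0, d1 = depths[i], depths[i + 1]
            t0, t1 = temps[i], temps[i + 1]
            frac = (target - d0) / (d1 - d0)
            out.append((target, t0 + frac * (t1 - t0)))
        target += 2.0
    return out

condense_to_2m([0.0, 1.0, 3.0, 5.0], [20.0, 19.0, 17.0, 15.0])
# → [(0.0, 20.0), (2.0, 18.0), (4.0, 16.0)]
```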
Quality control of the data is considered to start by providing the voluntary observers with continual feedback on why they are collecting the data as well as the results obtained. The two-way communication between observers and researchers will inevitably lead to a more carefully collected and hence generally higher quality data set.
Feedback to observers is an essential part of the QC programme and it is expected that all regional centres will take on this function. It is not by chance that the centre operators are associated with XBT programme managers and that their research interests rely on high quality XBT data. Providing feedback is already part of their normal practice.
The Bureau of Meteorology generates ocean products and can undertake the mapping test within the quality control programme. Mapping for QC purposes (see below) will be performed at all regional centres.
Atlantic and Pacific Regional Centres - USA
The Atlantic Regional Centre will reside at NOAA's Atlantic Oceanographic and Meteorological Laboratory, Miami and the Pacific at the Joint Environmental Data Center at Scripps. They will receive both radio and delayed mode data from NODC for their respective basins within the time frames specified above (NODC and schematic): radio - monthly; delayed mode annually, and perform the standard statistical, subjective and mapping QC procedures discussed below. They will return QC'ed data to NODC for further relay to Brest and will prepare selected analyses. A summary of the present JEDA QC is given in Annex 3. Analyses of interest to other WOCE scientists will be identified by the WOCE Core Project Working Groups and Scientific Panels (e.g. Surface Layer).
Global Upper Ocean Thermal Data Assembly Centre - France
The global component will be operated by IFREMER, Brest, the same group which now operates the TOGA Subsurface Data Centre. Their WOCE function will be to maintain the Global WOCE data set throughout the entire process, distribute the full set or sub-sets to WOCE and other users, and to ensure the fully QC'ed data set is provided to the World Data Centre system for long-term archival and storage. The Global Centre relies on the procedures for exchange and quality control described in the sections above. IFREMER has been cooperating with NODC, JEDA and CSIRO in the collection and QC of the TOGA data since 1985 and is planning to participate in the GTSPP. As can be expected, IFREMER also has its set of quality control procedures. These procedures are documented in TSDC technical report 9001, 3 February 1990, prepared by J-P. Rebert. Portions of the following sections are taken from Rebert's report. IFREMER receives radio XBT data regularly from the French Meteorological Service in order to be able to provide near-real time services to the TOGA community.
3. QUALITY CONTROL
At present, Quality Control checks are performed at several stages in the XBT data collection system and the techniques used, while often compatible, are not exactly alike nor are the inconsistency indicators (flags) routinely passed on or retained. The GTSPP being developed by IGOSS and IODE recognizes this problem and has as one of its objectives the standardization of quality control (including flagging) procedures. The quality control procedures to be utilized within WOCE can be divided into those performed by regional centres and those performed by the collection facilities - MEDS, NODC and AODC. (The Global Centre is discussed below.) In general, the collection facilities would be expected to compile radio and delayed mode data sets which have been subjected to duplication elimination and checks on the internal consistency of the message: are all the groups present, properly formatted and in agreement with recent operations of the ship in question? No temperature data would be eliminated or adjusted at this stage. The regional centres would be responsible for the scientific checks using climatologies, profile inspections and mapping techniques. The regional centres may prefer that some of the scientific QC be performed at a collection facility. If so, that is a regional matter which should be arranged with the cooperating centres.
The plan is to deliver fully QC'ed, flagged, regional centre certified data to the global centre. Taken at face value, that should imply no further QC would be required at the global centre. No centre, however, can be expected to give assurances on the quality of the data it holds without some form of review. This is perfectly acceptable, but any inconsistencies must be worked out with the regional centre, and all others who have been involved informed of how the problem was solved.
Collection Centre QC Procedures
The WOCE collection centre QC procedures will conform with the GTSPP procedures. These procedures, which are not finalized, could be modified to meet WOCE specifications. White, Molinari and Meyers will be provided with the details of the proposed GTSPP QC procedures by Ron Wilson and will inform him of any changes or additions they believe necessary. Documentation on the pre-GTSPP QC procedures of NODC are given in Annexes 1 and 2. The collection centres will utilize the GTSPP QC procedures soon after agreement is reached, even though the GTSPP may not have officially begun.
Regional Centre QC Procedures
The meeting agreed that the following were essential scientific QC steps:
Radio data:
(1) Profile Review (subjective profile by profile inspection).
(2) Statistical Test - histogram on 5 x 5 degree, bi-monthly basis. Levels - SST and every 50 m to 700 m. If the test fails at any depth, the observation will be flagged and not included in mapping/analysis.
(3) Mapping (see Annex 3 for example).
Delayed mode data:
(1) Profile Review (by cruise, if not performed by the operators).
(2) Same as Radio (2).
(3) Same as Radio (3).
The use of similar, if not identical, procedures in steps 2 and 3 (radio and delayed mode) is an objective that will be pursued during 1990 by the regional centres in cooperation with the National and Global Centres. Ron Wilson has taken on the task of collecting information on flags generated by the centres and will propose a common set for review by other participants. He will also circulate the GTSPP format for comment. The optimum interpolation procedure used by JEDA is described in Annex 3. Common features of the mapping should include climatologies (Levitus seems popular, but others will also be considered), time/space scales and statistical procedures; the analyses should also overlap.
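The statistical test in step (2) can be sketched in a few lines. Here the bi-monthly histogram check is approximated as a simple mean plus/minus three standard deviation test within each 5 x 5 degree square; the binning function, threshold and minimum sample size are assumptions for illustration, not the JEDA procedure.

```python
import statistics

def bin_key(lat, lon):
    """The 5 x 5 degree square containing an observation."""
    return (int(lat // 5) * 5, int(lon // 5) * 5)

def histogram_test(value, bin_values, nsigma=3.0):
    """Pass unless the value lies outside mean +/- nsigma standard deviations."""
    if len(bin_values) < 2:
        return True          # too few observations in the square to test
    mu = statistics.mean(bin_values)
    sd = statistics.stdev(bin_values)
    return abs(value - mu) <= nsigma * sd

# Temperatures at one depth level, in one 5 x 5 degree square, for one
# bi-monthly period:
bin_values = [15.0, 15.2, 14.8, 15.1, 14.9]
histogram_test(15.0, bin_values)   # → True
histogram_test(25.0, bin_values)   # → False (flagged, excluded from mapping)
```

A failed observation would be flagged, as in step (2), rather than removed from the data set.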
Step 1 may be more difficult to standardize because of its subjective nature. The procedure used by Meyers is described on page 5. Molinari describes his procedure as follows:
"Profiles from one cruise are put on the screen consecutively and ten at a time to look for obvious outliers. AOML only has small screens but, to save operator eyesight, recommends bigger work station type screens. As an inset on the same screen, AOML also displays the positions of the ten stations. The inset serves two purposes: it enables AOML to identify bad positions and, when XBTs were widely spaced, it enables AOML to distinguish profiles in dramatically different thermal structure areas and not mistake them as outliers. The profiles and positions are colour coded to facilitate identification of bad casts/positions. This is a simple procedure but obviously labour intensive."
Considerable time was spent by the meeting in reviewing the flagging of data. The bottom line was, of course, that the generation and retention of flags is a critical part of the quality control programme. As can be expected, the flags used at present are not standardized but, as mentioned earlier, that is an objective and Wilson is taking the lead. His recommendations will no doubt take into consideration the procedures being considered within the GTSPP and those recommended by the WOCE Regional centres.
TABLE 2. CSIRO Quality Control Codes
| Category | Accept Code | Accept Int | Accept Action | Reject Code | Reject Int | Reject Action |
| Surface Spikes (start-up transients) | CSA | 1 | No change to class of data. Surface data removed to 3.7 m and replaced with 99.99. | - | - | Not applicable. |
| Modulo 10 Spikes | MOA | 2 | No change to class of data. Spikes replaced with linearly interpolated values. | - | - | Not applicable. |
| Wire Break | - | - | Not applicable. | WBR | -3 | Class 4. Data deleted below depth of wire break. |
| Hit Bottom | HBA | 4 | Class 3 below depth of possible bottom hit. | HBR | -4 | Class 4. Data deleted below depth of bottom hit. |
| PET Fault (a type of leakage malfunction which occurs at the bottom of PROTECHNO profiles - thought to be a problem with the recorder-processor) | PFA | 5 | Small amplitude. Class 2 below depth of PET fault. | PFR | -5 | Large amplitude. Class 4. Data deleted below starting depth of anomaly. |
| Inversion | IVA | 6 | No change to class of data. | - | - | Not applicable. |
| Nub (inversion at the base of the mixed layer) | NUA | 7 | No change to class of data. | - | - | Not applicable. |
| Step-like Trace / Fine Structure | STA | 8 | No change to class of data. | - | - | Not applicable. |
| Wire Stretch | WSA | 9 | Class 2 below depth of possible inversion. | WSR | -9 | Class 3 below depth of wire stretch. |
| Leakage / Fine Structure | LEA | 10 | Class 2 below depth of possible leakage. | LER | -10 | Class 3 below depth of leakage. |
| Cusping | CUA | 11 | Small amplitude. Class 2 below start of cusping. | CUR | -11 | Large amplitude. Class 3 below start of cusping. |
| High Frequency Instrument Noise | HFA | 12 | Small amplitude. Class 2 below start of noise - noisy data filtered. | HFR | -12 | Large amplitude. Class 3 below start of noise. |
| Insulation Penetration (isolated noise event with very small depth range) | IPA | 13 | Trace recovers after penetration. Class 2 below spike. Spike replaced with linear interpolation. | IPR | -13 | Trace does not recover. Class 3 below spike. |
| Spike(s) / External Interference | SPA | 14 | Small amplitude. Class 2 below anomaly. Erroneous data removed and linearly interpolated. | SPR | -14 | Large amplitude. Class 3 below anomaly. |
| Bowed Mixed Layer | BOA | 15 | Class 2 from surface. | BOR | -15 | Class 3 from surface. |
| Other Surface Anomaly | SAA | 16 | Class 2 from surface. | SAR | -16 | Class 3 from surface. |
| Temperature Difference / Offset | TOA | 17 | Class 2 from surface. | TOR | -17 | Class 3 from surface. |
| Constant Temperature Trace | CTA | 18 | Class 1 to 10 m, class 3 below. | CTR | -18 | Class 3 from surface. |
| No Trace | - | - | Not applicable. | NTR | -19 | Class 4 from surface. |
| NBG | - | - | Not applicable. | NGR | -20 | Class 3 from surface. |
| Test Probe | - | - | Not applicable. | TPR | -21 | Class 4 from surface. |
| Digitized Data | DDA | 22 | No change to class of data. | - | - | Not applicable. |
| BATHY Message Data | BDA | 23 | No change to class of data. | - | - | Not applicable. |
| Duplicate Drop | DUA | 24 | No change to class of data. Duplicate to be kept. | DUR | -24 | Class 3 from surface. Duplicate to be rejected. |
| Converted to RAW from 2 m | 2MA | 25 | No change to class of data. | - | - | Not applicable. |
| Converted to RAW from 5 m | 5MA | 26 | No change to class of data. | - | - | Not applicable. |
| Sticking Bit Problem | SBA | 27 | Class 2 from surface (19 pt. filter with coefficients of 0.0526). | - | - | Not applicable. |
| Operator Error | - | - | Not applicable. | OPR | -28 | Class 3 from surface. |
Two flagging schemes presently in use give a quick overview of the type of detail WOCE seeks within its XBT quality control programme. The first, in Table 2, is CSIRO's categorization of XBT problems and associated actions.
The classes referred to in the table are:
(1) Good data.
(2) Probably good data (some unusual features).
(3) Probably bad data (doubtful features).
(4) Bad data.
"Classes", sometimes referred to as Quick Look QC indicators, are usually retained in the header portions of the stored data. All systems use this approach.
The second flagging scheme is that used by the TOGA Subsurface Data Center. The remainder of section 3 is an extract from the TSDC report of February 1990.
The header for each observation in the TSDC data set contains a character which indicates if the profile has been checked or not. This is called the "version" of quality control (referring to the Quality Control Manual 1 presented at the GTSPP meeting at Ottawa, July 1989).
- 0 means that TSDC did not yet check the profile (though this profile may have been qualified elsewhere).
- 1 means that the profile has passed the first version of TSDC's quality control.
Version 1 of the controls consists of a screening of the profiles, compared subjectively to the Levitus monthly climatology at the same location. "Subjectively" means that, as there is no standard deviation in this climatology, TSDC uses an arbitrary envelope around the climatological profile, say 2 sigma wide, and some operators' "knowledge" concerning variability in the different layers, possible areas and size of temperature inversions, etc.
As the profiles pass this test, the flag fields are filled.
Version 2, and higher, are reserved for future and higher level quality controls, (consistency, models, etc....). There is presently 1 or 0 in this field.
Once the profile has passed the control, the field "profile quality flag" is filled as well as the flags attached to each of the profiles data. The fields "position flag" and "date flag" are not filled as they have not been controlled during this test.
In fact, TSDC checks the positions and dates during other controls (mainly when the data base is scanned for "hard" duplicates elimination), and with ship speed control. These tests are, unfortunately, not applied routinely along with the data checks, but rather off line on a microcomputer. The consequence is that these flag fields are not as yet filled even if position and date have been controlled. For large but correctable errors in date or position (a hemisphere change, for instance), however, the flag should be 5 according to the IGOSS scale given below.
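The Version 1 screening described above can be sketched as a per-level envelope check. This is an illustrative reading, not TSDC code: since the Levitus climatology carries no standard deviation, the envelope half-widths per layer are operator-chosen assumptions, exactly as the text explains.

```python
# Sketch of a subjective envelope screen against a climatological
# profile. Envelope widths per depth are assumed, illustrative values.

def screen_profile(profile, clim_profile, envelope):
    """Return per-level IGOSS-style flags: 1 good, 3 doubtful, 4 erroneous."""
    flags = []
    for (depth, temp), clim in zip(profile, clim_profile):
        dev = abs(temp - clim)
        if dev <= envelope[depth]:
            flags.append(1)            # within the envelope
        elif dev <= 2 * envelope[depth]:
            flags.append(3)            # doubtful feature
        else:
            flags.append(4)            # erroneous value
    return flags

prof = [(0, 28.0), (100, 22.0), (200, 5.0)]
clim = [27.5, 18.0, 13.0]
env = {0: 2.0, 100: 3.0, 200: 3.0}
screen_profile(prof, clim, env)   # → [1, 3, 4]
```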
Flags signification and rules:
Flags attached to the data
0: not checked
1: correct value
2: inconsistent with other elements
3: doubtful value
4: erroneous value
5: value changed (corrected)
6 to 8: reserved
The above scale is the IGOSS scale. Of course, the distinction between a doubtful (flag 3) and an erroneous (flag 4) feature is sometimes not very clear. TSDC has, however, to take decisions and therefore has adopted the following rules:
- spikes are flagged 4.
- temperature inversions are flagged 3 for solitary profiles or for areas without references or known occurrence of inversions.
- temperature increases at the bottom of the profile are flagged 4.
- when we hesitate, we flag 3. Rationale: draw the attention of the user to a possible problem.
For simplification purposes, TSDC does not use flag 2 (some inconsistencies) which is hard to interpret. TSDC data are considered either good, doubtful or wrong. Flag 2 will be used in upgraded "versions".
Profile quality flag
This flag must summarize the flags attached to the profile data. TSDC adopted the following rules:
- when a profile doesn't even look like a temperature profile it is rejected from the data base.
- a profile flagged 4 is unusable. Major causes: instrumental error or large uncorrectable position error.
- a profile flagged 3 contains data flagged 3 and/or 4 and may be partly usable for some purposes (the user must determine). The most frequent case is that the profile is good to some depth.
- a profile flagged 1 is good and contains only data flagged 1.
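The rules above can be collapsed into a single summarizing function. This is an illustrative reading of the TSDC rules, not their code; the `unusable` argument stands in for the instrumental-error and uncorrectable-position cases, and wholesale rejection from the data base is left outside the sketch.

```python
# Sketch: derive the profile quality flag from the flags attached to
# the individual data levels, following the TSDC rules stated above.

def profile_flag(level_flags, unusable=False):
    """Summarize per-level flags (1 good, 3 doubtful, 4 erroneous)."""
    if unusable:
        return 4          # instrumental error or uncorrectable position
    if all(f == 1 for f in level_flags):
        return 1          # good profile containing only good data
    if any(f in (3, 4) for f in level_flags):
        return 3          # partly usable; the user must judge
    return 1

profile_flag([1, 1, 1])              # → 1
profile_flag([1, 3, 1])              # → 3
profile_flag([1, 4], unusable=True)  # → 4
```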
Transmission of flags for IGOSS data
The GTS data transmitted to the TSDC by the French IGOSS Centre are flagged before their submission, according to the procedures described in the "Guide to Operational Procedures for the Collection and Exchange of IGOSS Data" (Manual and Guides 3, 1988, UNESCO). However, the French Meteorological Office flags only the data, not the profile. When loaded in the TOGA data base, their flags are saved and the profile flag is filled according to the following rules:
- if all the data are flagged 1 the profile flag is 1
- if some data are flagged 3 or 4 the profile flag is 3
- the position and date flags are saved (1 to 5)
- the "version" field is left to 0 (not checked by TSDC)
(Note: when the first depth level is greater than zero, it is flagged 3 by the French Met. Office. These profiles are therefore automatically flagged 3.)
The additional real time or delayed mode data transmitted by other Centres are mostly not provided with flags, or are submitted with different scales. Their flags are lost and the profile and data are flagged 0 in the TOGA data base.
4. RE-ANALYSIS AND STORAGE OF ANALYSES
The question of re-analysis was discussed by the meeting which generally agreed that at some point, perhaps three years after a given year, it would be necessary. No decision was taken on re-analysis as it was a bit premature. This discussion also raised the question of saving analyses and again it was generally agreed that this was a good idea. Brest could assist but would not like to manage the file, just provide copies. The analyses performed solely on the radio data would not be saved, only that performed after incorporating delayed mode data. Further discussion is needed on this subject.
5. PROGRAMME IMPLEMENTATION SCHEDULE
The WOCE aim is to create an Upper Ocean Thermal data set for the period 1990 - 1995/6 and there was general agreement that the regional centres could meet this objective. Funding and standardized procedures must be finalized first, so the 1990 radio data will not be QC'ed by the regional centres: they will rely on delayed mode data for 1990. Regional Centres will come on line at different stages and at different levels of activity because of the funding. The Atlantic programme (NOAA/Molinari) is now funded while in the Indian Ocean, a proposal must be prepared for a 1991 start. Some measures are possible beforehand but probably not the full task. The Pacific programme is funded for 1990 but further funding must be sought.
US/NODC QUALITY CONTROL PROCEDURES
NEAR REAL TIME DATA
Near-real time bathythermograph data are received at the National Oceanographic Data Center (NODC) via the National Meteorological Center (NMC), Ocean Products Center (OPC), Camp Springs, MD, and the unclassified data from the US Navy, Fleet Numerical Oceanographic Center, Monterey, CA, over electronic telecommunications.
The NODC VAX 11/785 receives XBT data on a weekly basis through an RJE via the NAS 9000 from the OPC personnel in Camp Springs, MD. The weekly files are converted to an in-house Universal Bathythermograph Format (UBT). DECNET is utilized to transfer weekly files from the FNOC to NODC. The FNOC personnel do not use the JJXX format; instead they have provided NODC with an in-house format referred to as NODDS. The NODDS format is also converted to the NODC in-house UBT format. Both the JJXX and NODDS formats receive an initial check as they are converted to the UBT format.
Quality indicators (Data Source Code) are placed in character 100 of the UBT format, identifying the level of quality of the data. NMC data are assigned a 2 and FNOC data a 3. These quality indicators are utilized when NODC receives duplicate radio messages from two different sources.
The ELIM_DUPS program sorts the file in the following sequence:
TOGA QUALITY STATUS (DATA SOURCE CODE)
The elimination of exact duplicates is performed and the highest quality data are selected. A subroutine, REJECT, will eliminate any records that have illogical key fields for date, time, position, ship, or country. Fields are checked to be within a numeric range and to contain no alpha characters. Records that fall within the geographic TOGA area are written to the PACIFIC data file, which is used to update the continuously updated TOGAMASTER data base. Near-duplicates are reviewed once the tracking system is updated. The NODC utilizes a relational database, DATAEASE, on an AT 386, which:
- Checks the radio message data in a cruise mode to verify logical speeds.
- Checks for complete replacement of radio message data with delayed mode data. The date, time or position may not match exactly due to transmission problems.
- Checks for near-duplicates received from both NMC and FNOC.
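The source-precedence rule described above can be sketched as follows. This is a hypothetical illustration, not NODC's actual ELIM_DUPS code: the dict key fields are invented, and it assumes (the report does not say) that a lower Data Source Code denotes higher quality, so that an NMC copy (code 2) wins over an FNOC copy (code 3) of the same cast.

```python
def dedupe_by_source(records):
    """Keep one record per (date, time, position, ship) key,
    preferring the record with the better (here: lower) quality code."""
    best = {}
    for rec in records:
        key = (rec["date"], rec["time"], rec["lat"], rec["lon"], rec["ship"])
        if key not in best or rec["source_code"] < best[key]["source_code"]:
            best[key] = rec
    return list(best.values())

msgs = [
    {"date": "900215", "time": "1200", "lat": 25.5, "lon": -80.1,
     "ship": "WXYZ", "source_code": 3},   # FNOC copy of a cast
    {"date": "900215", "time": "1200", "lat": 25.5, "lon": -80.1,
     "ship": "WXYZ", "source_code": 2},   # NMC copy of the same cast
]
survivors = dedupe_by_source(msgs)        # one record survives
```

Near-duplicates, whose keys do not match exactly, would escape this pass; as noted above they are handled separately once the tracking system is updated.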
CONVERSION OF ALL INPUT FORMATS TO NODC UBT FORMAT
Quality Control Program
Convert JJXX & NODDS formats to UBT
1) Table lookup of the Instrument_Type code from the IGOSS Data_type code. These Instrument_type codes include:
5 Thermistor chain
2) Checks the deepest depth-temperature pairs for illegal alpha characters. If any are found, the record is shortened by that number of depths.
3) If the starting 4 bytes of a station, "JJXX", are misaligned, they are moved to the left to correctly position these characters. A station must start with these characters; otherwise, the station is not processed.
4) If the first depth is non-zero, a depth is added to the station, the depth-temperature pairs are shifted down one depth, and the first depth is set to zero.
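Steps 3 and 4 above can be sketched as below. This is an illustrative reconstruction, not NODC's converter; in particular, the report does not say what temperature is assigned to the added zero-depth level, so repeating the shallowest observed temperature is an assumption.

```python
def align_jjxx(line):
    """Shift the station left so it begins with 'JJXX'; return None
    if the marker is absent and the station cannot be processed."""
    pos = line.find("JJXX")
    return line[pos:] if pos >= 0 else None

def force_surface_value(pairs):
    """If the first depth of a (depth, temperature) profile is non-zero,
    prepend a zero-depth level repeating the shallowest temperature
    (an assumed choice) and shift the remaining pairs down one depth."""
    if pairs and pairs[0][0] != 0:
        pairs = [(0, pairs[0][1])] + pairs
    return pairs

station = align_jjxx("  JJXX 15045 88/ 25012 ...")
profile = force_surface_value([(5, 26.1), (50, 24.0)])
```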
ASSEMBLY OF NODC UBT DATA
Once a file is converted to the UBT format, it is placed into a single file holding all of the current month's processing of both radio and delayed data. Exact duplicates are eliminated and the highest-quality records are selected when the RUN_ELIM procedure is invoked.
"Elimdups" Rejection Criteria
If any non-numeric character, other than possibly a blank, appears in the following fields, the record is rejected:
Degrees Latitude, Longitude
Minutes Latitude, Longitude
Tenths Minute Latitude, Longitude
Minutes of Hour
If any of the following fields is totally blank, the record is rejected:
Hemisphere of Latitude, Longitude
Degrees Latitude, Longitude
Minutes Latitude, Longitude
Minutes of Hour
If any of the following fields is outside its given numeric range, the record is rejected:
Degrees Latitude                     00 - 90
Degrees Longitude                    000 - 180
Minutes of Latitude, Longitude       00 - 59
Tenths Minute Latitude, Longitude    0 - 9, blank
Year                                 72 - present
Day                                  01 - allowable maximum (leap year considered)
Hour                                 00 - 24
Minutes of Hour                      00 - 59
If the units digit of any following field is blank, the record is rejected:
Degrees Latitude, Longitude
Minutes Latitude, Longitude
If there is a single blank in any of the following fields, it is replaced by the numeric character zero:
Degrees Latitude, Longitude
Minutes Latitude, Longitude
Minutes of Hour
If the Hemispheres of Latitude and Longitude do not contain either of 2 allowable values, the record is rejected:
Latitude Hemisphere N or S
Longitude Hemisphere E or W
The WMO Quadrant is tested for 1 of 4 allowable values based on the Hemispheres of Latitude and Longitude; otherwise, the record is rejected.
If there is a single blank in either of these 2 fields, the record is rejected:
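Taken together, the rejection criteria above can be condensed into a sketch like the following. Field names are invented, the record is modelled as a dict of character fields, and only a sample of the checks is shown; this is not the actual "Elimdups" code.

```python
import re

def numeric(field):
    """Apply the blank-substitution rule: a single blank in a numeric
    field is replaced by the character zero before conversion."""
    return int(field.replace(" ", "0"))

def reject(rec):
    """Return True if the record fails any of the sampled checks."""
    # Only digits or blanks are allowed, and a field must not be all blank
    for f in ("lat_deg", "lon_deg", "lat_min", "lon_min", "min_of_hour"):
        if not re.fullmatch(r"[0-9 ]+", rec[f]) or rec[f].strip() == "":
            return True
    # Numeric range checks from the table above
    if not 0 <= numeric(rec["lat_deg"]) <= 90:
        return True
    if not 0 <= numeric(rec["lon_deg"]) <= 180:
        return True
    if not (0 <= numeric(rec["lat_min"]) <= 59
            and 0 <= numeric(rec["lon_min"]) <= 59):
        return True
    # Hemisphere codes must be one of the two allowable values
    if rec["lat_hem"] not in ("N", "S") or rec["lon_hem"] not in ("E", "W"):
        return True
    return False
```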
US/NODC QUALITY CONTROL PROCEDURES
DELAYED MODE DATA
One of the primary goals of the National Oceanographic Data Center (NODC) is to provide quality marine environmental data in a timely and cost-effective manner. The NODC has compiled an extensive global database of oceanographic observations. Maintaining the technical correctness of this database requires a continuous flow of both new and revised data which can be evaluated rapidly.
NODC's techniques and processes for quality controlling data have several known deficiencies. In an effort to improve these processes within the data center, NODC's ADP Support Division was tasked with developing a new automated system for data quality assurance (Quality Assurance System). The objective of this system development effort was to provide NODC with an accurate, effective, and efficient procedure for processing marine data.
The Quality Assurance System is presently being implemented by the Data Processing Branch of the Data Base Management Division of the NODC. The system is designed to control the quality of data that have been received by NODC's Data Acquisition and Management Branch and are to be prepared for inclusion in NODC's archives by the Inventory and Archives Branch. It is designed as a discipline-independent processing system, so the software is usable for any data format acceptable to NODC. In order to monitor and direct the progress of data through all stages of quality assurance, a tracking data base is used. A common information library was created to support the quality assurance system.
This paper presents a brief description of the following procedures:
- FVERIFY Field verification checks
- RVERIFY Relationship verification checks
- IVERIFY Interactive verification checks
- MVERIFY Model verification checks
FVERIFY is a batch process which performs a variety of predefined checks on individual data fields to assure that the characteristics of any given field match those defined as acceptable for that field. FVERIFY verifies that:
- Each field contains the expected data type (e.g. integer, text)
- Data values are within acceptable ranges
- Fields containing coded values contain acceptable codes
- Data values are consistent with information stored in the Quality Assurance Tracking System
- Mandatory fields are present
RVERIFY is also a batch process which evaluates the appropriateness of data values in relation to other data values. Data validity is determined either by comparing raw data values to other raw data values or by comparing values computed from raw data values to other raw or computed data values. Data relationship verification is similar to the data field verification process in that non-interactive software will perform defined relationship tests on the data, record any errors that occur in an error file, and, upon completion, generate a summary of the errors encountered.
For BT data, RVERIFY will test to see that:
- Each record contains at least two depth/temperature pairs
- The CONSEC number increases from one station to the next
- The LATITUDE DEGREES and LONGITUDE DEGREES are not both equal to 0
- If LATITUDE DEGREES equals 90, then LATITUDE MINUTES equals 0
- If LONGITUDE DEGREES equals 180, then LONGITUDE MINUTES equals 0
- For any date field, the day value is valid for the given month value
- The location value is in water
- The INTERVAL OF DIGITIZATION field is equal to 31, 32, or 33
- If the BOTTOM DEPTH field is blank, then the BOTTOM FLAG field must be blank
- If the INSTRUMENT field equals 1, then the TREATMENT AND STORAGE OF DIGITIZED DATA field must equal 01, 02, 03, 04, 05, 06, 07, 21, 22, 23, 24, 25, or 26
- If the INSTRUMENT field equals 1 and the PROBE field equals 1 or 2, then the DEPTH value within the last depth/temperature pair for that record must not be greater than 760 meters
- If the INSTRUMENT field equals 1 and the PROBE field equals 3 or 4, then the DEPTH value within the last depth/temperature pair for that record must not be greater than 460 meters
- If the INSTRUMENT field equals 1 and the PROBE field equals 6 or 7, then the DEPTH value within the last depth/temperature pair for that record must not be greater than 200 meters
- If the INSTRUMENT field equals 1 and the PROBE field equals 5, 9, A, or B, then the DEPTH value within the last depth/temperature pair for that record must not be greater than 1830 meters
- If the TIME PRECISION CODE field equals 1 or 4, then the MINUTES field must be greater than or equal to 0 and less than or equal to 59
- If the TIME PRECISION CODE field equals 2, then the resulting quotient upon dividing the MINUTES field by 6 must be greater than or equal to 0 and less than or equal to 9
- If the TIME PRECISION CODE field equals 3, then the MINUTES field must be blank
- If the LATITUDE PRECISION CODE field equals 1 or 4, then the LATITUDE MINUTES field must be greater than or equal to 0 and less than or equal to 59
- If the LATITUDE PRECISION CODE field equals 2, then the resulting quotient upon dividing the LATITUDE MINUTES field by 6 must be greater than or equal to 0 and less than or equal to 9
- If the LATITUDE PRECISION CODE field equals 3, then the LATITUDE MINUTES field must be blank
- If the LONGITUDE PRECISION CODE field equals 1 or 4, then the LONGITUDE MINUTES field must be greater than or equal to 0 and less than or equal to 59
- If the LONGITUDE PRECISION CODE field equals 2, then the resulting quotient upon dividing the LONGITUDE MINUTES field by 6 must be greater than or equal to 0 and less than or equal to 9
- If the LONGITUDE PRECISION CODE field equals 3, then the LONGITUDE MINUTES field must be blank
- The elapsed time between stations must be greater than or equal to 0 minutes and less than or equal to 10 days
- The calculated speed between stations must be less than or equal to 25 knots
- The DEPTH value must increase from one depth/temperature pair to the next within a station
- The DEPTH value of the last depth/temperature pair within a station must be less than or equal to the station's BOTTOM DEPTH value plus 10%
- The DEPTH value of the first depth/temperature pair within each station must equal 0
- No DEPTH value within any depth/temperature pair within a station can be blank
- No TEMPERATURE value within any depth/temperature pair within a station can be blank
- The actual number of depth/temperature pairs must be equal to the number of depth/temperature pairs field
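Two of the RVERIFY relationship checks above, the probe-dependent maximum depth for XBTs (INSTRUMENT = 1) and the inter-station speed limit, can be sketched as follows. Function names are invented; the probe-code groupings follow the list above.

```python
MAX_DEPTH_M = {          # XBT PROBE code -> deepest allowable depth (m)
    "1": 760, "2": 760,
    "3": 460, "4": 460,
    "6": 200, "7": 200,
    "5": 1830, "9": 1830, "A": 1830, "B": 1830,
}

def probe_depth_ok(probe_code, last_depth_m):
    """The deepest depth/temperature pair must not exceed the probe limit."""
    limit = MAX_DEPTH_M.get(probe_code)
    return limit is None or last_depth_m <= limit

def speed_ok(distance_nm, elapsed_hours):
    """Calculated speed between consecutive stations must not exceed
    25 knots, and the elapsed time must be positive and at most 10 days
    (240 hours)."""
    if not 0 < elapsed_hours <= 240:
        return False
    return distance_nm / elapsed_hours <= 25.0
```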
ADDITIONS TO RVERIFY
- SHIP CODE must be the same from station to station
- COUNTRY CODE must be the same from station to station
- NODC CODE (METHOD, INTERVAL & TREATMENT/STORAGE) should not be different from the previous station
- XBT INSTRUMENT TYPE should not be different from the previous record
- GEOGRAPHIC QUADRANT must not conflict with LATITUDE/LONGITUDE
- DATE must be in chronological order
- CALIBRATION TEMPERATURE must be within 16.4 to 17.0 degrees C range, (not checked if INSTRUMENT TYPE = 4)
- CALIBRATION DEPTH must be within -30 to 30 meters or -100 to 100 foot range
IVERIFY is an interactive process which allows the user to request textual and graphic displays of a BT data set as aids in the diagnosis of errors, and to make necessary corrections to the data.
IVERIFY capabilities allow you to:
- Display a trace of a station
- Display a cruise track plot
- Display a waterfall plot
- Display a bottom depth plot
- Correct station header information
- Correct temperature values
- Delete depth/temperature pairs
- Miscellaneous information available: Sea floor, Speed, Consec #, Record #, Total # records, Total # records deleted.
The model comparison procedure (MVERIFY) will determine the acceptability of a data value in comparison to existing values within the environmental models. The NODC models are statistical summaries of oceanographic conditions and are useful in identifying data values which fall outside expected limits. The model checking software will perform defined comparisons of the data to historical data contained in the models. Errors will be noted in an error file, and corrections to the data will be made in the same way as in the data field verification. The model comparison program will be run only if a models file exists for the type of data being processed.
MVERIFY capabilities allow you to perform:
- SURFACE TEMPERATURE TEST
- MIXED LAYER TEST
- THERMOCLINE TEST
- ISOTHERM TEST
QUALITY CONTROL OF SUBSURFACE THERMAL DATA
AT THE SIO JOINT ENVIRONMENTAL DATA ANALYSIS CENTER
Warren B. White and Stephen E. Pazan
The principal task of the Joint Environmental Data Analysis (JEDA) Center at the Scripps Institution of Oceanography (SIO) is scientific quality-control of Pacific thermal data and related information for the TOGA scientific community.
The JEDA Center conducts scientific quality-control of its data set in the context of active oceanographic research and identifies and flags data of questionable quality; correction of erroneous data is the responsibility of the thermal data sources and is not done by the JEDA Center. Quality control is conducted in two phases: first there is an objective screening of the data records, and then there is a final scientific examination of the thermal data in an interpretive context. The initial objective screening examines the quality of the data reporting, not the quality of the thermal data itself. The final scientific examination starts with quality-control procedures which compare thermal data observations against both historical data and neighbouring observations. Finally, a trained scientist interprets and edits thermal data based upon whether these data are consistent with current scientific understanding.
2. OBJECTIVE SCREENING OF REPORT STRUCTURE
Each data report (e.g. the date-time group, latitude-longitude, platform identification and temperature-depth profile taken altogether) sent to the JEDA Center is screened by the NODC for uniqueness, internal format consistency, and completeness. These quality control procedures are not duplicated at the JEDA Center, but two objective screening procedures undertaken at the JEDA Center extend them further. First, as each data report, whether from NODC, the Japanese Hydrographic Center, or any other source, is translated into the JEDA Center's internal binary format, inconsistencies with the format description are exposed as a concomitant of the re-formatting process. Since all subsurface thermal data received by the JEDA Center are rendered into the same binary report form, every data report undergoes this format screening. Second, although exact duplicates are removed by the National Oceanographic Data Center (NODC), many reports remain which are in fact duplicates, having arisen from the same observation but differing in some detail. Frequently, IGOSS radio reports are archived with time information resolved only to hours, while delayed mode reports are accurate to minutes; these discrepancies result in unresolved 'near duplicates' remaining in the thermal data set delivered to the JEDA Center. The JEDA Center has adopted definitions of three levels of near-duplication which differ from exact duplicates as follows:
1. Temperature-depth observations may differ in number and value;
2. The same as near-duplicate Class 1, above, except that the date-time groups may differ by round-off or truncation of the minutes value;
3. The same as near-duplicate Class 2, above, except that the latitudes and longitudes may differ by round-off or truncation of the longitude or latitude.
All but one report in a group of near-duplicates are flagged; the unflagged report is that delayed-mode report with the finest vertical resolution.
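The three near-duplicate classes above can be sketched as a classifier like the one below. This is an illustrative simplification: the field names are invented, and the "round-off or truncation" comparisons are reduced to matching on the hour and on whole degrees.

```python
def near_dup_class(a, b):
    """Return 1, 2, or 3 for the near-duplicate class of a pair of
    reports from the same platform, or None if they match no class."""
    if a["ship"] != b["ship"]:
        return None
    same_pos = (a["lat"], a["lon"]) == (b["lat"], b["lon"])
    same_time = (a["date"], a["hour"], a["minute"]) == \
                (b["date"], b["hour"], b["minute"])
    # Minutes may have been rounded or truncated (e.g. an IGOSS radio
    # report resolved only to the hour)
    hour_match = (a["date"], a["hour"]) == (b["date"], b["hour"])
    # Latitude/longitude may have been rounded or truncated; compare
    # whole degrees only
    deg_match = (int(a["lat"]), int(a["lon"])) == (int(b["lat"]), int(b["lon"]))
    if same_pos and same_time:
        return 1        # only the temperature-depth pairs may differ
    if same_pos and hour_match:
        return 2        # class 1, plus minutes rounded or truncated
    if deg_match and hour_match:
        return 3        # class 2, plus position rounded or truncated
    return None
```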
3. SCIENTIFIC QUALITY CONTROL
Scientific quality control at the JEDA Center proceeds in two serially applied steps: first, the data are examined for outliers using distribution statistics founded upon historical data; second, graphical summaries of the data are examined subjectively. The procedures used in the automatic and subjective quality control steps are as follows.
3.1 Automatic Quality Control
Residual values which fall into the tails of long-term observed distribution histograms of anomalous temperatures are flagged, since erroneous observations of temperature form the bulk of these outliers. The procedure uses observed cumulative distributions where historical data are sufficient to create such a distribution; if historical data are insufficient, temperature data profiles are simply flagged if any temperature is found outside the range 0 degrees C to 38 degrees C. The cumulative distribution histogram is summarized into three categories; the first and last categories consist of outliers which account for 5% of the total historical data. This defines the meaning of "outlier" as used above. IGOSS data received from NODC has at least 5% of its population flagged as suspect by this test.
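The tail-flagging rule above can be sketched as follows. The percentile computation stands in for the Center's long-term histograms, and the minimum sample size of 20 is an assumption; the gross 0-38 degrees C range check is the stated fallback when historical data are insufficient.

```python
def flag_residual(residual, history):
    """Flag a temperature residual if it falls in either 5% tail of the
    historical distribution. Returns True (flagged), False (passed),
    or None (insufficient history; use the gross range check instead)."""
    if len(history) < 20:                     # assumed minimum sample size
        return None
    hist = sorted(history)
    lo = hist[int(0.05 * len(hist))]          # lower 5% boundary
    hi = hist[int(0.95 * len(hist)) - 1]      # upper 5% boundary
    return not (lo <= residual <= hi)

def gross_range_ok(profile_temps_c):
    """Fallback test: every temperature in the profile must lie
    between 0 and 38 degrees C."""
    return all(0.0 <= t <= 38.0 for t in profile_temps_c)
```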
3.2 Subjective Quality Control
Geophysical thermal patterns and thermal patterns arising from erroneous data may be confused. The most common error appears as an unreasonably strong cold or warm core eddy; these structures are considered suspect when they are the consequence of data reports from a single ship. Erroneous data can create thermal structure variability correlated with observational density, such as anomalously high temperatures aligned with a ship track. This is known as data bias. On the other hand, two examples of thermal structure variability which could be mistakenly ascribed to erroneous data are the rapid increase in thermocline depth during the onset of winter storms in the subarctic and the rapid change in equatorial upper ocean thermal structure which occurs during El Niño. The scientist must judge whether the patterns of variability are consistent with known geophysical phenomena or are more likely to be due to erroneous data. A number of maps, graphics and statistical products which assist the scientist in the quality-control task have been developed at the JEDA Center. Each product and the rationale for its use is listed as follows:
(1) Distribution maps of observation locations; one of the first steps in quality control is the assurance that there are no unexpected data voids or missing ship tracks.
(2) Error maps of optimally interpolated temperature. The interpolation error shows the effect of data inhomogeneity upon the mapped temperature.
(3) Horizontal maps of sea surface temperature residuals from a long-term bimonthly mean. Maps of SST are produced from independent satellite and ship weather observations by the Climate Analysis Center (CAC), and seasonal variability of SST is relatively well known and documented in many atlases.
(4) Vertically-averaged temperature Tav (0-400 m) residuals from a long-term bimonthly mean. Tav and heat content are related to each other through the product of the heat capacity of water and the 400 m depth; as such, Tav provides an integrated glimpse of upper ocean thermodynamics and dynamics.
(5) Dynamic height (0/400 db) residuals from a long-term bimonthly mean. Dynamic height (0/400 db) residuals show the interannual response to local and remote ocean dynamics and atmospheric forcing. Comparison of dynamic height (0/400 db) with either Tav (0/400 m) or sea level (Rebert et al., 1985) is an effective quality-control procedure.
(6) Time/longitude matrices of dynamic height (0/400 db) are used to identify and track westward propagating Rossby waves and eastward propagating equatorial Kelvin and forced waves. This introduces an important element of time consistency into the examination of the data from month to month.
Objective Mapping Methods
Map construction has been made possible by the development over the past 15 years of the Volunteer Observing Ship (VOS) XBT program. Studies of historic XBT, MBT, and hydrographic thermal data have provided information on thermal sampling requirements needed to measure change in the thermal structure of the upper ocean. White et al. (1982) computed the space and time decorrelation scales for the tropical and subtropical western North Pacific. Using these scales, the optimum interpolation procedure maps inhomogeneous distributions of temperature residuals onto a uniform spatial and temporal grid at eleven standard depths: 0, 20, 40, 60, 80, 120, 160, 200, 240, 300, and 400 m. The optimum interpolation procedure maps the temperature residuals at each depth separately, accepting thermal data around each grid point in an ellipse with a zonal major axis of no more than 15 degrees of longitude and a meridional minor axis of no more than 3 degrees of latitude. The interpolated temperature residuals are converted to gridded absolute values by adding the climatological norm back onto the residual values of temperature at each grid point.
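The elliptical data-acceptance window described above can be sketched as follows. The report does not say whether the 15 and 3 degree figures are full axes or semi-axes; the sketch assumes full axes (semi-axes of 7.5 and 1.5 degrees), and the function name is invented.

```python
def in_acceptance_ellipse(obs_lon, obs_lat, grid_lon, grid_lat,
                          zonal_axis=15.0, merid_axis=3.0):
    """Accept an observation for a grid point's analysis only if it lies
    within an ellipse with a zonal major axis of 15 degrees of longitude
    and a meridional minor axis of 3 degrees of latitude."""
    dx = obs_lon - grid_lon          # zonal offset, degrees of longitude
    dy = obs_lat - grid_lat          # meridional offset, degrees of latitude
    a = zonal_axis / 2.0             # zonal semi-axis (assumed)
    b = merid_axis / 2.0             # meridional semi-axis (assumed)
    return (dx / a) ** 2 + (dy / b) ** 2 <= 1.0
```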
T/S relations derived from Levitus' mean temperature and salinity profiles are used to compute specific volume, dynamic height, and velocity potentials. Exact procedures for computing dynamic height from temperature/depth information are given in Inoue et al. (1987).
The Kolmogorov-Smirnov Test
The mapping procedure requires data sampled from a Gaussian normal population. The Kolmogorov-Smirnov (K-S) non-parametric statistical test (Press et al., 1986) is a method of assuring that the mapping data fulfil this condition. In regions encompassing oceanic thermal fronts, boundaries of major ocean currents, and abrupt changes of thermal structure in time, observations from two distinctly different populations may be mixed in the optimal interpolation analysis. This would result in an undesirable smoothing of sharp oceanic thermal fronts and suppression of abrupt changes in the maps. The mapping procedure therefore uses the K-S test to require that only data from Gaussian normal populations be used in the optimal interpolation analyses.
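A minimal sketch of this screening step, not the Center's implementation: compare the empirical CDF of the residuals entering a grid-point analysis against a normal CDF with the sample mean and standard deviation, and accept the sample if the K-S statistic D is below the approximate 5% critical value 1.36/sqrt(n) (an approximation given in Press et al., 1986; the Center's actual acceptance threshold is not stated in the report).

```python
import math

def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def ks_statistic(sample):
    """Two-sided K-S statistic D against a normal CDF with the
    sample's own mean and standard deviation."""
    xs = sorted(sample)
    n = len(xs)
    mu = sum(xs) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in xs) / (n - 1))
    d = 0.0
    for i, x in enumerate(xs):
        cdf = normal_cdf(x, mu, sigma)
        d = max(d, abs(cdf - i / n), abs((i + 1) / n - cdf))
    return d

def looks_gaussian(sample, crit=1.36):
    """Accept the sample for mapping if D is below the approximate
    5%-level critical value crit / sqrt(n)."""
    return ks_statistic(sample) <= crit / math.sqrt(len(sample))
```

A strongly bimodal sample, such as observations straddling a sharp front, fails this test and would be excluded from the analysis, which is exactly the behaviour the paragraph above motivates.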
Inoue, M., J.J. O'Brien, W.B. White, and S.E. Pazan, 1987. Interannual variability in the tropical Pacific for the period 1979-1982. J. Geophys. Res., 92, 11671-11679.
Press, W.H., B.P. Flannery, S.A. Teukolsky, and W.T. Vetterling, 1986. Numerical Recipes: The Art of Scientific Computing. Cambridge: Cambridge University Press, 818 pp.
Rebert, J.P., J.R. Donguy, G. Eldin, and K. Wyrtki, 1985. Relations between sea level, thermocline depth, heat content, and dynamic height in the tropical Pacific Ocean. J. Geophys. Res., 90(C6), 11719-11725.
White, W.B., G. Meyers, and K. Hasunuma, 1982. Space/time statistics of short-term climatic variability in the western North Pacific. J. Geophys. Res., 87, 1979-1989.