CDIP netCDF datasets for Datawell Waveriders: NCEI collection       Feb 2021
-------------------------------------------------------------

The primary source for the netCDF datasets in NCEI's SIO-CDIP-netCDF
collection is CDIP's THREDDS server, https://thredds.cdip.ucsd.edu/. The
datasets and metadata are designed to meet the standards and recommendations
presented by the Attribute Convention for Dataset Discovery (ACDD), the
Climate and Forecast (CF) Metadata Conventions, and the National Centers for
Environmental Information (NCEI) NetCDF Templates.

CDIP's netCDF datasets are in the 'classic' netCDF format and can be accessed
with any standard netCDF client.

Extensive documentation on CDIP's processing, QA/QC, and operations can be
found at https://cdip.ucsd.edu/documentation

Specifications, manuals and technical notes on the Datawell buoys used by
CDIP are available at https://datawell.nl


DATASET TYPES
-------------

At the top level, the Waverider datasets available in the NCEI collection are
divided into two groups: ARCHIVE and REALTIME. ARCHIVE datasets cover
completed buoy deployments with all quality control procedures applied, while
REALTIME datasets are for active buoy deployments, subject to a more limited
subset of QC routines. ARCHIVE datasets are largely static, whereas REALTIME
datasets are regularly updated. On CDIP's THREDDS server, most REALTIME
datasets are updated at 30-minute intervals.


ARCHIVE datasets - single-deployment
------------------------------------

There are two distinct types of ARCHIVE datasets available in the NCEI
collection: single-deployment datasets and multi-deployment historic
aggregates.

Comprehensive single-deployment datasets include all of the values
transmitted or stored by the buoys - wave parameters, wave spectra,
continuous displacements, sea surface temperatures, GPS readings, system
diagnostic information, and in some cases surface currents or air
temperatures.
They also include additional spectral wave fields derived by CDIP. Complete
metadata is provided in every deployment file: hull and hatchcover serial
numbers, comments on data or operational issues during the deployment, etc.
One deployment covers a single installation of a hull, hatchcover, and
mooring combination for a station.

The largest portion of these datasets consists of the vertical and horizontal
displacement values recorded by the buoy, which are sampled at 1.28 Hz
(Mk1-3) or 2.56 Hz (Mk4) over the entire buoy deployment.

The comprehensive single-deployment datasets include all records for the
deployment, both those that pass CDIP's quality control routines and those
that fail and are not otherwise publicly released. Full single-record and
time domain QC is performed on each dataset. Information from the
post-deployment calibration checks on the buoy may also inform the setting of
the quality flags.

Single-deployment datasets are named with the 3-digit station ID, the
2-character data stream ID, and the buoy deployment number: 071p1_d03.nc,
181p1_d01.nc, etc. The buoy deployments are numbered starting from 01. For
stations which have had non-directional buoys in addition to Waveriders,
there will be gaps in the deployment numbers. For instance, Station 076,
Diablo Canyon, had a long history of non-directional buoys before the first
Waverider was deployed; the first netCDF dataset available is 076p1_d08.nc.


ARCHIVE datasets - Multi-deployment 'historic' aggregates
---------------------------------------------------------

The comprehensive single-deployment datasets are also aggregated for each
station, creating a historic dataset with all Waverider records for the site.
The aggregate datasets only include records for which the primary QC flag is
'1', i.e. good data. The historic datasets include the wave, SST, GPS, and
system variables, but exclude the displacements, since they are generally too
large to be handled efficiently across deployments.
Use of the multi-deployment aggregates is recommended for all users except
those who are interested in displacement data or in accessing records which
have not passed CDIP's quality control checks.

Historic aggregates are named with the 3-digit station ID, the 2-character
data stream ID, and the label 'historic': 071p1_historic.nc,
181p1_historic.nc.


REALTIME datasets - Comprehensive 'rt'
--------------------------------------

Data from currently active Waveriders are continuously updated in two types
of REALTIME datasets. The comprehensive 'rt' datasets include all of the
values transmitted or stored by the buoys - wave parameters and spectra, sea
surface temperatures, GPS readings, system diagnostic information, etc. -
with the exception of the continuous displacements.

The quality control flags on the 'rt' datasets are set based on a range of QC
tests. However, some tests in the time domain which are used with ARCHIVE
datasets - spike checks and the like - are not applied to the REALTIME data.

Comprehensive realtime datasets are named with the 3-digit station ID, the
2-character data stream ID, and the label 'rt': 071p1_rt.nc, 181p1_rt.nc.


REALTIME datasets - Continuous displacements
--------------------------------------------

Because the displacements from a single deployment can form a very large
dataset, with potentially tens of millions of records, they are stored in a
separate dataset for active buoy deployments. These 'xy' datasets contain
only the displacement values, flags, and source file information.

Note that for some buoys displacement data is not acquired in realtime, or is
only acquired when the wave height meets a certain threshold. So the xy
datasets may have limited records available or may be missing entirely for
some buoys.

As with the single-deployment ARCHIVE datasets, the displacement values in
the 'xy' datasets are not subject to quality control.
They are all recorded directly as received from the buoy, with the QC flag
value set to '2', 'not_evaluated'. Where displacements are not available the
QC flag value will be '9', 'missing'.

REALTIME displacement datasets are named with the 3-digit station ID, the
2-character data stream ID, and the label 'xy': 071p1_xy.nc, 181p1_xy.nc.


QUALITY CONTROL
---------------

When working with CDIP's netCDF datasets, an awareness of the relevant QC
standards and implementation is essential. CDIP's REALTIME 'rt' and ARCHIVE
single-deployment datasets may include records that have not passed QC
checks; proper use of the datasets' flag variables is needed to identify
these records.

The QC status of each sensor observation is defined by a two-level flagging
system based on UNESCO's Ocean Data Standards (IOC Manuals and Guides, 54,
Volume 3 Version 1). Users of the 'rt' and single-deployment datasets need to
load and assess the QC flags when working with the data.

The primary QC flag for all wave, temperature, and current observations is
set to either 1 (good) or 4 (bad). The secondary QC flag will indicate the
reason that any record is flagged bad. For the continuous displacements,
however, no primary QC is performed; the primary QC flag is either 2 (not
evaluated) or 9 (missing).

Note that flag variables are only defined and set for environmental
variables, e.g. for the sensor measurements and their derived values, not for
diagnostic and metadata variables. Any variable for which flags are available
will have the flag variables noted in its 'ancillary_variables' attribute.


ADDITIONAL METADATA AND INFORMATION
-----------------------------------

The two-tier QC flagging system used in the netCDF datasets does not include
all of the information generated by CDIP's quality control routines. More
extensive QC information is saved in a database of processing warnings and
flags. The content of this database can be made available to users upon
request.
Most of the entries in the logs are 'warning' notices, where the automated QC
noted anomalous data but did not change the QC flags for the record in
question. The logs also include a record of manual changes made to the
datasets' QC flags and to metadata attributes.

Lastly, in addition to accessing data via the THREDDS server, users can
browse the contents - or design their own client software - based on ASCII
dataset catalogs provided online. These catalogs list all of the available
datasets and for each give the sensor type, start and end dates, and total
number of wave records available. The catalogs can be accessed at the URLs
below.

https://cdip.ucsd.edu/data_access/metadata/wavecdf_by_stnid.txt
https://cdip.ucsd.edu/data_access/metadata/wavecdf_by_datemod.txt

Any questions or comments on the SIO-CDIP-netCDF collection can be directed
to www@cdip.ucsd.edu
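

EXAMPLE: DATASET NAMES AND QC FLAGS
-----------------------------------

The naming scheme and the primary QC flag described in this document can be
illustrated with a short Python sketch. It is not CDIP code and uses only the
standard library. The function names classify_dataset and keep_good are
hypothetical, the characters allowed in the 2-character data stream ID are a
guess based on 'p1', and the sample values and flags are made up.

```python
import re

# Assumed pattern: 3-digit station ID, 2-character stream ID, then a label.
# The character set allowed in the stream ID is a guess based on 'p1'.
NAME_RE = re.compile(r"^(\d{3})([A-Za-z0-9]{2})_(d\d+|historic|rt|xy)\.nc$")

GOOD = 1  # primary QC flag value for good data


def classify_dataset(filename):
    """Return (station, stream, group, kind) for a dataset filename."""
    match = NAME_RE.match(filename)
    if match is None:
        raise ValueError("unrecognized dataset name: " + filename)
    station, stream, label = match.groups()
    if label in ("rt", "xy"):
        return station, stream, "REALTIME", label
    if label == "historic":
        return station, stream, "ARCHIVE", "historic"
    # Remaining labels are deployment numbers such as 'd03'.
    return station, stream, "ARCHIVE", "deployment " + label[1:]


def keep_good(values, primary_flags):
    """Keep only the values whose primary QC flag is 1 (good)."""
    if len(values) != len(primary_flags):
        raise ValueError("values and flags must have the same length")
    return [v for v, flag in zip(values, primary_flags) if flag == GOOD]


if __name__ == "__main__":
    print(classify_dataset("071p1_d03.nc"))
    print(classify_dataset("181p1_historic.nc"))
    # Made-up wave heights (m) and primary flags, for illustration only.
    print(keep_good([1.2, 9.9, 1.4], [1, 4, 1]))
```

With a netCDF client, the same filter would be applied to a variable and to
the flag variable named in its 'ancillary_variables' attribute. For the 'xy'
displacement datasets the primary flag is 2 or 9 rather than 1, so filtering
on 1 would drop every displacement record.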