Global surface-ocean partial pressure of carbon dioxide (pCO2) estimates from a machine learning ensemble: CSIR-ML6 v2019a (NCEI Accession 0206205)

INVESTIGATORS:
Luke Gregor - Institute of Biogeochemistry and Pollutant Dynamics (IBP)
Alice D. Lebehot - SOCCO, Council for Scientific and Industrial Research, Cape Town, 7700, South Africa; MaRe, Marine Research Institute, University of Cape Town, Cape Town, 7700, South Africa
Schalk Kok - Department of Mechanical and Aeronautical Engineering, University of Pretoria, Pretoria, 0028, South Africa
Pedro M. S. Monteiro - SOCCO, Council for Scientific and Industrial Research, Cape Town, 7700, South Africa

PACKAGE DESCRIPTION: This dataset contains surface-ocean partial pressure of carbon dioxide (pCO2) estimated as the ensemble mean of six two-step clustering-regression machine learning methods. The ensemble combines two clustering approaches with three regression methods. The clustering approaches are K-means clustering (21 clusters) and the open-ocean CO2 biomes defined by Fay and McKinley (2014). Three machine learning regression methods are applied to each of these two clustering approaches: a feed-forward neural network (FFN), support vector regression (SVR) and a gradient boosted machine using decision trees (GBM). The final estimate of surface-ocean pCO2 is the average of the six machine learning estimates, resulting in a monthly, 1° ⨉ 1° resolution product that extends from the start of 1982 to the end of 2016. Sea-air CO2 fluxes (FCO2) calculated from pCO2 are also included in the data. The discrete boundaries of the clustering approaches result in semi-discrete discontinuities in the pCO2 and FCO2 estimates. These are smoothed by applying a 3 ⨉ 3 ⨉ 3 convolution (moving average) to the dataset in time, latitude and longitude.
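The two post-processing steps described above (averaging the six estimates, then applying a 3 ⨉ 3 ⨉ 3 moving average) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names, the use of None for land or missing cells, the truncated window at the time and latitude edges, and the wrap-around in longitude are all assumptions made for illustration; the record does not say how edges or missing values were handled.

```python
from statistics import fmean


def ensemble_mean(members):
    """Cell-by-cell mean of equally shaped [time][lat][lon] grids.

    Assumption: None marks a missing cell and is skipped in the average.
    """
    nt, ny, nx = len(members[0]), len(members[0][0]), len(members[0][0][0])
    out = [[[None] * nx for _ in range(ny)] for _ in range(nt)]
    for t in range(nt):
        for j in range(ny):
            for i in range(nx):
                vals = [m[t][j][i] for m in members if m[t][j][i] is not None]
                out[t][j][i] = fmean(vals) if vals else None
    return out


def smooth_3x3x3(grid):
    """3 x 3 x 3 moving average over (time, lat, lon).

    Assumptions (not stated in the record): the window is truncated at the
    first/last time step and latitude row, longitude wraps around the globe,
    and missing (None) cells are skipped and stay missing in the output.
    """
    nt, ny, nx = len(grid), len(grid[0]), len(grid[0][0])
    out = [[[None] * nx for _ in range(ny)] for _ in range(nt)]
    for t in range(nt):
        for j in range(ny):
            for i in range(nx):
                if grid[t][j][i] is None:
                    continue  # keep land / missing cells missing
                vals = []
                for dt in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        tt, jj = t + dt, j + dj
                        if not (0 <= tt < nt and 0 <= jj < ny):
                            continue  # truncated window at time/lat edges
                        for di in (-1, 0, 1):
                            v = grid[tt][jj][(i + di) % nx]  # wrap longitude
                            if v is not None:
                                vals.append(v)
                out[t][j][i] = fmean(vals)
    return out
```

On the real 1° ⨉ 1° monthly grid an array library would be far faster; nested lists are used here only to keep the sketch dependency-free.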

CITE AS: Gregor, Luke; Lebehot, Alice D.; Kok, Schalk; Monteiro, Pedro M. S. (2019). Global surface-ocean partial pressure of carbon dioxide (pCO2) estimates from a machine learning ensemble: CSIR-ML6 v2019a (NCEI Accession 0206205). [indicate subset used]. NOAA National Centers for Environmental Information. Dataset. https://doi.org/10.25921/z682-mn47. Accessed [date].


IDENTIFICATION INFORMATION FOR THIS DATA PACKAGE:
NCEI ACCESSION: 0206205
NCEI DOI: https://doi.org/10.25921/z682-mn47
EXPOCODE: Various;
CRUISE ID: Various;
SECTION/LEG: Various;

TYPES OF STUDY:
Data synthesis product;Discrete measurement;Profile;

TEMPORAL COVERAGE:
START DATE: 1982-01-01
END DATE: 2016-12-31

SPATIAL COVERAGE:
NORTH: 89.5
WEST: -180
EAST: 180
SOUTH: -89.5

GEOGRAPHIC NAMES:
Pacific Ocean;Atlantic Ocean;Indian Ocean;Southern Ocean;Arctic Ocean;

PLATFORMS:
Various;

RESEARCH PROJECT(S):
none;


VARIABLES / PARAMETERS:

Time
Abbreviation: Time
Unit: seconds since 2000-01-01
Detailed sampling and analyzing information: min = 1982-01-15; max = 2016-12-15; step = month
Latitude
Abbreviation: lat (180)
Unit: degrees_north
Detailed sampling and analyzing information: min = -89.5; max = 89.5; step = 1.0
Longitude
Abbreviation: lon (360)
Unit: degrees_east
Detailed sampling and analyzing information: min = -180; max = 180; step = 1.0
partial pressure of surface ocean CO2 raw
Abbreviation: pCO2sea_raw (time, lat, lon)
Unit: µatm
Detailed sampling and analyzing information: The ensemble mean of six machine learning methods that first cluster data and then apply regression to the clusters
smoothed partial pressure of surface ocean CO2
Abbreviation: pCO2sea_smooth (time, lat, lon)
Unit: µatm
Detailed sampling and analyzing information: The discrete boundaries of the clustering approach result in semi-discrete discontinuities in pCO2 and FCO2 estimates. These are smoothed by applying a 3 ⨉ 3 ⨉ 3 convolution (moving average) to the dataset in time, latitude and longitude.
sea-air CO2 flux
Abbreviation: FCO2_raw (time, lat, lon)
Unit: molC/m2/yr
Detailed sampling and analyzing information: Sea-air CO2 flux calculated after Landschützer et al. (2016) as the product of four terms: (1) the sea-air pCO2 difference (pCO2sea - pCO2air), where positive values indicate outgassing from sea to air; (2) the gas transfer velocity k_w, using the Wanninkhof (1992) parameterisation scaled globally to 16 cm/hr with wind from ERA-Interim (Dee et al., 2011); (3) the CO2 solubility K_0 from Weiss (1974), using the OISSTv2 sea surface temperature product (Reynolds et al., 2007) and EN4 salinity (Good et al., 2013); and (4) the ice-free fraction (1 - ice_frac), as described in Butterworth and Miller (2016).
smoothed sea-air CO2 flux
Abbreviation: FCO2_smooth (time, lat, lon)
Unit: molC/m2/yr
Detailed sampling and analyzing information: The discrete boundaries of the clustering approach result in semi-discrete discontinuities in pCO2 and FCO2 estimates. These are smoothed by applying a 3 ⨉ 3 ⨉ 3 convolution (moving average) to the dataset in time, latitude and longitude.
partial pressure of atmospheric CO2
Abbreviation: pCO2air (time, lat, lon)
Unit: µatm
Detailed sampling and analyzing information: Atmospheric pCO2 from CarboScope v1.7 (Rödenbeck et al., 2014); source = http://www.bgc-jena.mpg.de/CarboScope/
boolean mask where True is ocean and False is land or NULL
Abbreviation: seamask (lat, lon)
Detailed sampling and analyzing information: source - https://www.ncei.noaa.gov/access/ocean-carbon-data-system/oceans/SPCO2_1982_present_ETH_SOM_FFN.html
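
The flux description for FCO2_raw above amounts to a product of four terms. A minimal sketch of that product with explicit unit conversions follows. It is illustrative only: the function name, argument names and input units (k_w in cm/hr, K_0 in mol/L/atm, pCO2 in µatm) and the 365-day year are assumptions, not taken from the dataset or its code. The sketch takes k_w and K_0 as given inputs rather than implementing the Wanninkhof (1992) or Weiss (1974) formulas.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: 365-day year


def sea_air_flux(kw_cm_per_hr, k0_mol_per_l_atm, pco2_sea_uatm,
                 pco2_air_uatm, ice_frac):
    """Sea-air CO2 flux in molC/m2/yr; positive = outgassing (sea to air).

    FCO2 = k_w * K_0 * (pCO2sea - pCO2air) * (1 - ice_frac)
    All argument names and units are hypothetical choices for this sketch.
    """
    kw_m_per_yr = kw_cm_per_hr * 0.01 * HOURS_PER_YEAR   # cm/hr -> m/yr
    k0_mol_per_m3_atm = k0_mol_per_l_atm * 1000.0         # mol/L -> mol/m3
    delta_atm = (pco2_sea_uatm - pco2_air_uatm) * 1e-6    # µatm -> atm
    return kw_m_per_yr * k0_mol_per_m3_atm * delta_atm * (1.0 - ice_frac)


# Example with made-up inputs: 10 µatm supersaturation, ice-free water.
flux = sea_air_flux(16.0, 0.035, 380.0, 370.0, 0.0)
```

With this sign convention, a cell where pCO2sea is lower than pCO2air yields a negative flux (ocean uptake), and a fully ice-covered cell (ice_frac = 1) yields zero.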

PUBLICATIONS DESCRIBING THIS DATASET:
Gregor, L., Lebehot, A. D., Kok, S., and Scheel Monteiro, P. M., 2019: A comparative assessment of the uncertainties of global surface ocean CO2 estimates using a machine-learning ensemble (CSIR-ML6 version 2019a) - have we hit the wall?, Geosci. Model Dev. Discuss., 12, 5113-5136, https://doi.org/10.5194/gmd-2019-46.
ADDITIONAL INFORMATION:
none;
FUNDING AGENCY:
SOCCO, Council for Scientific and Industrial Research
PROJECT TITLE:
PROJECT ID:

SUBMITTED BY: Luke Gregor (luke.gregor@usys.ethz.ch)

SUBMISSION DATE: 2019-10-18