
NDP-101 (2019)

NCEI Accession 0206205 Data and Documentation Files

Global surface-ocean partial pressure of carbon dioxide (pCO2) estimates from a machine learning ensemble: CSIR-ML6 v2019a (NCEI Accession 0206205)

Luke Gregor1,2,3, Alice D. Lebehot1,2, Schalk Kok4, Pedro M. Scheel Monteiro1

Prepared by Alex Kozyr5

1SOCCO, Council for Scientific and Industrial Research, Cape Town, 7700, South Africa
2MaRe, Marine Research Institute, University of Cape Town, Cape Town, 7700, South Africa
3Environmental Physics, Institute of Biogeochemistry and Pollutant Dynamics, ETH Zürich, 8092 Zürich, Switzerland
4Department of Mechanical & Aeronautical Engineering, University of Pretoria, Pretoria, 0028, South Africa
5National Centers for Environmental Information, National Oceanic and Atmospheric Administration, 1315 East-West Highway, Silver Spring, MD 20910-3282


Abstract

This NCEI accession contains estimates of the surface-ocean partial pressure of carbon dioxide (pCO2), computed as the ensemble mean of six two-step clustering-regression machine learning methods. The ensemble combines two clustering approaches with three regression methods. The clustering approaches are K-means clustering (21 clusters) and the open-ocean CO2 biomes defined by Fay and McKinley (2014). Three machine learning regression methods are applied to each of the two clusterings: a feed-forward neural network (FFN), support vector regression (SVR), and a gradient boosted machine using decision trees (GBM). The final estimate of surface-ocean pCO2 is the average of the six machine learning estimates, giving a monthly product at 1° ⨉ 1° resolution that extends from the start of 1982 to the end of 2016. Sea-air CO2 fluxes (FCO2) calculated from pCO2 are also included in the data. The discrete boundaries of the clustering approaches result in semi-discrete discontinuities in the pCO2 and FCO2 estimates. These are smoothed by applying a 3 ⨉ 3 ⨉ 3 convolution (moving average) to the dataset in time, latitude and longitude.
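The two post-processing steps described in the abstract (averaging the six member estimates, then applying a 3 ⨉ 3 ⨉ 3 moving average) can be illustrated with a minimal NumPy sketch. This is not the authors' code. The function names `ensemble_mean` and `smooth_3x3x3` are hypothetical, the input is random placeholder data, and the edge handling (repeating edge values in time and latitude, wrapping in longitude) is an assumption, since the abstract does not say how boundaries or land cells are treated.

```python
import numpy as np


def ensemble_mean(members):
    """Average member estimates stacked as (member, time, lat, lon)."""
    return np.mean(members, axis=0)


def smooth_3x3x3(field):
    """Apply a 3 x 3 x 3 moving average to a (time, lat, lon) array.

    Assumed edge handling (not stated in the source): edge values are
    repeated in time and latitude, and longitude wraps around the globe.
    """
    field = np.asarray(field, dtype=float)
    # Pad one cell on each side of time and latitude with the edge value.
    padded = np.pad(field, ((1, 1), (1, 1), (0, 0)), mode="edge")
    # Pad longitude periodically so the window crosses the date line.
    padded = np.pad(padded, ((0, 0), (0, 0), (1, 1)), mode="wrap")

    nt, ny, nx = field.shape
    total = np.zeros_like(field)
    # Sum the 27 shifted copies of the padded array, then divide by 27.
    for dt in range(3):
        for dy in range(3):
            for dx in range(3):
                total += padded[dt:dt + nt, dy:dy + ny, dx:dx + nx]
    return total / 27.0


# Placeholder data: six members on a tiny (time, lat, lon) grid.
rng = np.random.default_rng(0)
members = rng.normal(loc=380.0, scale=20.0, size=(6, 4, 5, 8))

raw = ensemble_mean(members)    # analogous to pCO2sea_raw
smooth = smooth_3x3x3(raw)      # analogous to pCO2sea_smooth
print(raw.shape, smooth.shape)
```

A moving average of this kind preserves a constant field and spreads an isolated spike evenly over its 27 neighbouring cells, which is why it softens the jumps at cluster boundaries.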

Method citation:

Gregor, L., Lebehot, A. D., Kok, S., and Scheel Monteiro, P. M.: A comparative assessment of the uncertainties of global surface-ocean CO2 estimates using a machine learning ensemble (CSIR-ML6 version 2019a) – have we hit the wall?, Geosci. Model Dev. Discuss., https://doi.org/10.5194/gmd-2019-46, in review, 2019.

Variables:

COORDINATES
time (420)
	units = seconds since 2000-01-01
	min = 1982-01-15
	max = 2016-12-15
	step = month
lat (180)
	long_name = latitude
	units = degrees_north
	min = -89.5
	max = 89.5
	step = 1.0
lon (360)
	long_name = longitude
	units = degrees_east
	min = -180
	max = 180
	step = 1.0 
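
The time coordinate is stored as seconds relative to 2000-01-01, so values before 2000 are negative. The sketch below, using only the Python standard library, converts between those offsets and calendar dates and rebuilds the 420-step monthly axis. The helper names are hypothetical, and placing each time stamp at 00:00 on the 15th of the month is an assumption inferred from the listed min and max values.

```python
from datetime import datetime, timedelta

# Reference epoch taken from the listed units: "seconds since 2000-01-01".
EPOCH = datetime(2000, 1, 1)


def decode_time(seconds):
    """Convert an offset in seconds since the epoch to a datetime."""
    return EPOCH + timedelta(seconds=float(seconds))


def encode_time(when):
    """Convert a datetime to seconds since the epoch (negative before 2000)."""
    return (when - EPOCH).total_seconds()


def monthly_axis(first_year=1982, last_year=2016, day=15):
    """Build one time stamp per month; the day of month is an assumption."""
    return [
        datetime(year, month, day)
        for year in range(first_year, last_year + 1)
        for month in range(1, 13)
    ]


axis = monthly_axis()
print(len(axis), axis[0].date(), axis[-1].date())
print(encode_time(axis[0]), decode_time(encode_time(axis[0])))
```

The axis length of 35 years times 12 months matches the 420 time steps listed above.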
	
VARIABLES
pCO2sea_raw (time, lat, lon)
	long_name = partial pressure of surface ocean CO2
	units = µatm
	description = The ensemble mean of six machine learning methods that first cluster data and then apply regression to the clusters.

pCO2sea_smooth (time, lat, lon)
	long_name = smoothed partial pressure of surface ocean CO2
	units = µatm
	description = The discrete boundaries of the clustering approach result in semi-discrete discontinuities in pCO2 and FCO2 estimates. These are smoothed by applying a 3 ⨉ 3 ⨉ 3 convolution (moving average) to the dataset in time, latitude and longitude.

FCO2_raw (time, lat, lon)
	long_name = sea-air CO2 flux
	units = molC/m2/yr
	description = Sea-air CO2 flux calculated after Landschutzer et al. (2016) as the product of the following: (pCO2sea - pCO2air), where positive values are outgassing from sea to air; k_w, the Wanninkhof (1992) parameterisation scaled globally to 16 cm/hr, with wind from ERA-Interim (Dee et al. 2011); K_0 from Weiss (1974), using the OISSTv2 sea surface temperature product (Reynolds et al. 2007) and EN4 sea salinity (Good et al. 2013); (1 - ice_frac), as described in Butterworth and Miller (2016).

FCO2_smooth (time, lat, lon)
	long_name = smoothed sea-air CO2 flux
	units = molC/m2/yr
	description = The discrete boundaries of the clustering approach result in semi-discrete discontinuities in pCO2 and FCO2 estimates. These are smoothed by applying a 3 ⨉ 3 ⨉ 3 convolution (moving average) to the dataset in time, latitude and longitude.

pCO2air (time, lat, lon)
	long_name = partial pressure of atmospheric CO2
	units = µatm
	description = Atmospheric pCO2 from CarboScope v1.7 (Rodenbeck et al. 2014)
	source = http://www.bgc-jena.mpg.de/CarboScope/

seamask (lat, lon)
	description = boolean mask where True is ocean and False is land or NULL
	source = www.nodc.noaa.gov/ocads/oceans/SPCO2_1982_2015_ETH_SOM_FFN.html
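
The FCO2 description above amounts to the product FCO2 = k_w · K_0 · (pCO2sea − pCO2air) · (1 − ice_frac). The following sketch shows that arithmetic with explicit unit conversions to molC/m2/yr. It is only an illustration under stated assumptions: the function name `sea_air_flux` is hypothetical, k_w is assumed to be supplied in cm/hr and K_0 in mol/(L·atm), a 365-day year is assumed, and the wind-speed and solubility parameterisations themselves (Wanninkhof 1992; Weiss 1974) are not reproduced here.

```python
import numpy as np

HOURS_PER_YEAR = 24.0 * 365.0  # assumption: 365-day year


def sea_air_flux(pco2_sea, pco2_air, k_w, k_0, ice_frac):
    """Sea-air CO2 flux in molC/m2/yr; positive means outgassing.

    Assumed input units (not stated in the source):
      pco2_sea, pco2_air : µatm
      k_w                : gas transfer velocity, cm/hr
      k_0                : CO2 solubility, mol/(L atm)
      ice_frac           : sea-ice fraction between 0 and 1
    """
    # cm/hr -> m/yr
    k_w_m_per_yr = np.asarray(k_w, dtype=float) * 0.01 * HOURS_PER_YEAR
    # mol/(L atm) -> mol/(m3 µatm): 1000 L per m3, 1e-6 atm per µatm
    k_0_si = np.asarray(k_0, dtype=float) * 1000.0 * 1e-6
    delta_pco2 = np.asarray(pco2_sea, dtype=float) - np.asarray(pco2_air, dtype=float)
    open_water = 1.0 - np.asarray(ice_frac, dtype=float)
    return k_w_m_per_yr * k_0_si * delta_pco2 * open_water


# Illustrative values only: a 10 µatm supersaturation in ice-free water.
flux = sea_air_flux(pco2_sea=390.0, pco2_air=380.0, k_w=16.0, k_0=0.03, ice_frac=0.0)
print(float(flux))
```

With these conventions, the sign follows the description: ocean pCO2 above atmospheric pCO2 gives a positive (sea-to-air) flux, and full ice cover gives zero flux.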
Please cite this data set as:
Gregor, Luke; Lebehot, Alice D.; Kok, Schalk; Monteiro, Pedro M. Scheel (2019). Global surface-ocean partial pressure of carbon dioxide (pCO2) estimates from a machine learning ensemble: CSIR-ML6 v2019a (NCEI Accession 0206205). Version 1.1. NOAA National Centers for Environmental Information Dataset.
Last modified: 2021-03-17T18:30:28Z