# 3.2 The Regional Multiple-Parameter Linear Regression for Checking for Offsets in DIC and TAlk

A regional multiple-parameter linear regression (MLR) was applied to corroborate any offsets found in the crossover comparisons of DIC and TAlk data, that is, to determine whether the DIC and TAlk data for particular cruises were consistent in a regional context. For this check, a somewhat subjective assessment was made to determine which cruises to use as controls to create the MLR and which to treat as unknowns. In general, the unknowns had larger offsets in the crossover analyses and/or more scatter in the data. Several iterations were performed with different subsets of cruises used as controls to determine the coefficients. This was done to ensure consistency among the cruise data used to create the regressions. DIC and TAlk data from depths greater than 1500 m were fit with an MLR as a function of common hydrographic parameters. After the regression was calculated, the residuals could be examined for evidence of systematic differences between the various cruises. Details of the procedure can be found in Key (1999, 2000). The data quality of the independent variables was checked as part of the regional MLR technique to minimize potential biases resulting from an artificial offset in one of the independent variables.

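The following regressions were solved for the coefficients $\alpha_x$ and $\beta_x$:

$$\mathrm{DIC} = \alpha_1 + \alpha_2 S + \alpha_3 \Theta + \alpha_4\,\mathrm{AOU} + \alpha_5\,\mathrm{NO_3} + \alpha_6\,\mathrm{SiO_2} \tag{1}$$

$$\mathrm{TAlk} = \beta_1 + \beta_2 S + \beta_3 \Theta + \beta_4\,\mathrm{NO_3} + \beta_5\,\mathrm{SiO_2} \tag{2}$$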

where

- $\alpha_x$ and $\beta_x$ = constants,
- $S$ = salinity,
- $\Theta$ = potential temperature, °C,
- $\mathrm{AOU}$ = apparent oxygen utilization, µmol/kg,
- $\mathrm{NO_3}$ = nitrate, µmol/kg,
- $\mathrm{SiO_2}$ = silicate, µmol/kg.
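As an illustration of how coefficients of this form can be obtained, the sketch below fits the TAlk regression of equation 2 by ordinary least squares using NumPy. The function name `fit_mlr`, the synthetic input data, and the "true" coefficients are all hypothetical and chosen only to demonstrate the mechanics. The source does not state which software or solver was used, so this is one plausible approach and not the authors' implementation.

```python
import numpy as np

def fit_mlr(predictors, response):
    """Ordinary least-squares fit of response = c0 + c1*x1 + ... + ck*xk.

    predictors: list of 1-D arrays (x1..xk); response: 1-D array.
    Returns the coefficient vector [c0, c1, ..., ck].
    """
    response = np.asarray(response, dtype=float)
    # Design matrix: a column of ones for the constant term, then each predictor.
    X = np.column_stack([np.ones(response.size)] + [np.asarray(p, dtype=float) for p in predictors])
    coeffs, *_ = np.linalg.lstsq(X, response, rcond=None)
    return coeffs

# Synthetic deep-water samples (hypothetical values, not cruise data).
rng = np.random.default_rng(0)
n = 200
S = rng.uniform(34.6, 35.0, n)          # salinity
theta = rng.uniform(0.5, 4.0, n)        # potential temperature, deg C
nitrate = rng.uniform(15.0, 35.0, n)    # umol/kg
silicate = rng.uniform(10.0, 120.0, n)  # umol/kg

# Made-up coefficients used only to generate a noise-free response.
true_beta = np.array([100.0, 60.0, -3.0, 2.0, 0.5])
talk = (true_beta[0] + true_beta[1] * S + true_beta[2] * theta
        + true_beta[3] * nitrate + true_beta[4] * silicate)

beta = fit_mlr([S, theta, nitrate, silicate], talk)
```

Because the synthetic response contains no noise, `beta` reproduces `true_beta` to within numerical precision. With real data, the fit would be run on control-cruise samples only, and the DIC regression of equation 1 would add an AOU column.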

The coefficients were determined from regressions of the data from the cruises listed as "control" in Table 6 and Table 7. The independent variables used for the fits ($S$, $\Theta$, $\mathrm{NO_3}$, and $\mathrm{SiO_2}$ for TAlk; and $S$, $\Theta$, $\mathrm{AOU}$, $\mathrm{NO_3}$, and $\mathrm{SiO_2}$ for DIC) are closely related to the response variables (DIC and TAlk). Potential temperature, $\Theta$, and salinity, $S$, for example, represent physical factors that influence DIC and TAlk, while $\mathrm{AOU}$ accounts for biological effects. However, several different parameters can represent the same process. The choice of parameters used to represent the processes that affect DIC and TAlk was based on previous work in the Pacific (Lamb et al. 2001).

The Atlantic Ocean is divided into three regions: (1) north of 15° N, (2) between 15° N and 15° S, and (3) south of 15° S. Water mass characteristics for each of these regions are different, and thus the coefficients of the MLR are different.
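The regional split can be expressed as a small lookup on latitude. In the sketch below, the function name `atlantic_region` is hypothetical, and assigning samples at exactly 15° N or 15° S to region 2 is an assumption, since the text does not say how boundary latitudes were handled.

```python
def atlantic_region(latitude_deg):
    """Return the MLR region (1, 2, or 3) for a latitude in degrees, north positive.

    Assumption: samples at exactly +15 or -15 degrees fall in region 2.
    """
    if latitude_deg > 15.0:
        return 1  # north of 15 N
    if latitude_deg >= -15.0:
        return 2  # between 15 N and 15 S
    return 3      # south of 15 S

regions = [atlantic_region(lat) for lat in (40.0, 0.0, -30.0)]
```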

For DIC, the cruises used as controls in region 1 (north of 15° N) included A01E, A01W, A02, A05R, A16N, A20, A22, and A24 (see Table 6 for details). The cruises used as unknowns were A05 and A01WR. In region 2 (between 15° N and 15° S), the cruises used as controls included A08, A13, A14, A15, A16N, and A20. Those used as unknowns were A06, A07, A16S, A22, and A17. In region 3 (south of 15° S), the controls were A10, A13, A14, A15, and A16S. The unknowns were A09, A12, A12B, A12BR, A17, A21, and IO6. The Indian Ocean cruise IO6 along 30° E was included in the Atlantic synthesis as a boundary cruise between the Indian and Atlantic oceans. Data from IO6 were consistent with other Indian Ocean data based on similar quality checks (Millero et al. 1998, Sabine et al. 1999). The squared correlation coefficient ($R^2$), residual standard error, and coefficients of the best-fit equations obtained from the controls for each region for depths greater than 1500 m are as follows. In equations 3 through 8, $T$ occupies the position of the potential temperature term, $\Theta$, of equations 1 and 2.

- Region 1 (north of 15° N): N = 3130, $R^2$ = 0.95, residual standard error = 3.44 µmol/kg

  $$\mathrm{DIC} = 2519.890 - 11.741\,S + 1.418\,T + 0.369\,\mathrm{AOU} + 0.896\,\mathrm{NO_3} + 0.693\,\mathrm{SiO_2} \tag{3}$$

- Region 2 (between 15° N and 15° S): N = 1751, $R^2$ = 0.93, residual standard error = 5.72 µmol/kg

  $$\mathrm{DIC} = 783.944 + 38.214\,S + 0.369\,T + 0.033\,\mathrm{AOU} + 1.375\,\mathrm{NO_3} + 0.974\,\mathrm{SiO_2} \tag{4}$$

- Region 3 (south of 15° S): N = 1388, $R^2$ = 0.95, residual standard error = 4.68 µmol/kg

  $$\mathrm{DIC} = 4546.784 - 69.213\,S + 5.210\,T + 0.009\,\mathrm{AOU} + 0.385\,\mathrm{NO_3} + 0.887\,\mathrm{SiO_2} \tag{5}$$

Predicted DIC values were computed with these equations and were compared with observed values. The mean difference between observed DIC and the predicted DIC (i.e., observed minus predicted) for each of the cruises is given in Table 6.
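The comparison described above can be sketched as follows for region 1, using the coefficients of equation 3. The function names, the cruise label `"X"`, and the sample values are hypothetical; the observed DIC values are constructed from the prediction plus a known offset so that the arithmetic is easy to follow. The sketch also assumes that `T` in equation 3 is potential temperature and that only samples deeper than 1500 m enter the comparison.

```python
from collections import defaultdict

# Coefficients of equation 3 (region 1, depths greater than 1500 m).
EQ3 = {"const": 2519.890, "S": -11.741, "T": 1.418,
       "AOU": 0.369, "NO3": 0.896, "SiO2": 0.693}

def predict_dic_region1(S, T, AOU, NO3, SiO2):
    """Predicted DIC (umol/kg) from equation 3."""
    return (EQ3["const"] + EQ3["S"] * S + EQ3["T"] * T
            + EQ3["AOU"] * AOU + EQ3["NO3"] * NO3 + EQ3["SiO2"] * SiO2)

def mean_residual_by_cruise(samples, min_depth=1500.0):
    """Mean of (observed - predicted) DIC per cruise for samples below min_depth."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for s in samples:
        if s["depth"] <= min_depth:
            continue  # shallow samples are not part of the deep-water check
        predicted = predict_dic_region1(s["S"], s["T"], s["AOU"], s["NO3"], s["SiO2"])
        sums[s["cruise"]] += s["DIC"] - predicted
        counts[s["cruise"]] += 1
    return {cruise: sums[cruise] / counts[cruise] for cruise in sums}

# Hypothetical samples: observed DIC is the prediction plus a known offset.
base = dict(S=34.9, T=2.5, AOU=80.0, NO3=20.0, SiO2=30.0)
pred = predict_dic_region1(**base)
samples = [
    dict(cruise="X", depth=2000.0, DIC=pred + 3.0, **base),
    dict(cruise="X", depth=3000.0, DIC=pred + 1.0, **base),
    dict(cruise="X", depth=500.0, DIC=pred + 50.0, **base),  # excluded: too shallow
]
offsets = mean_residual_by_cruise(samples)
```

Here the two deep samples have residuals of 3 and 1 µmol/kg, so the mean for cruise `"X"` is about 2 µmol/kg; the shallow sample is ignored.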

For TAlk, the cruises used as controls in region 1 included A05, A05R, A16N, A22, and A24. The unknown cruises were A01E, A01W, A02, and A20. In region 2, the controls were A14, A15, A16N, A16S, and A22. The unknowns were A06, A07, A08, A17, and A20. In region 3, the controls were A10, A14, A15, and A16S. The unknowns were A09, A12, A21, A23, and A17. The best-fit MLR equations obtained from the control cruises for TAlk are as follows.

- Region 1 (north of 15° N): N = 1759, $R^2$ = 0.93, residual standard error = 4.2 µmol/kg

  $$\mathrm{TAlk} = -450.510 + 77.811\,S - 3.706\,T + 2.570\,\mathrm{NO_3} + 0.808\,\mathrm{SiO_2} \tag{6}$$

- Region 2 (between 15° N and 15° S): N = 737, $R^2$ = 0.78, residual standard error = 6.9 µmol/kg

  $$\mathrm{TAlk} = 808.654 + 43.164\,S - 18.971\,T + 3.765\,\mathrm{NO_3} + 0.408\,\mathrm{SiO_2} \tag{7}$$

- Region 3 (south of 15° S): N = 754, $R^2$ = 0.67, residual standard error = 8.4 µmol/kg

  $$\mathrm{TAlk} = -634.340 + 84.305\,S - 3.401\,T + 0.588\,\mathrm{NO_3} + 0.560\,\mathrm{SiO_2} \tag{8}$$

Predicted TAlk values were computed with these equations and were compared with observed values. The mean difference between observed TAlk and the predicted TAlk (i.e., observed minus predicted) for each of the cruises is given in Table 7.
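The fit statistics quoted with each regional equation ($R^2$ and residual standard error) can be computed from observed and predicted values as in the sketch below. The function name `fit_statistics` and the example numbers are hypothetical. The source does not state which degrees-of-freedom convention was used for the residual standard error; dividing the residual sum of squares by N minus the number of fitted coefficients is the conventional choice and is assumed here.

```python
import numpy as np

def fit_statistics(observed, predicted, n_coefficients):
    """Return (R^2, residual standard error) for a fitted regression.

    Assumption: residual standard error = sqrt(SS_res / (N - n_coefficients)).
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residuals = observed - predicted
    ss_res = float(np.sum(residuals ** 2))                        # unexplained variation
    ss_tot = float(np.sum((observed - observed.mean()) ** 2))     # total variation
    r_squared = 1.0 - ss_res / ss_tot
    dof = observed.size - n_coefficients
    residual_se = (ss_res / dof) ** 0.5
    return r_squared, residual_se

# Illustrative numbers only (not cruise data).
obs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
fit = [1.1, 1.9, 3.1, 3.9, 5.1, 5.9]
r2, rse = fit_statistics(obs, fit, n_coefficients=2)
```

For a TAlk fit of the form in equations 6 through 8, `n_coefficients` would be 5 (the constant plus four predictors); for the DIC fits it would be 6.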