Science Outputs

Comparison of four learning-based methods for predicting groundwater redox status

Journal of Hydrology, Vol. 580, 124200

Abstract

Knowing where groundwater denitrification occurs, or by proxy the groundwater redox status (oxic, mixed, or anoxic), is valuable for assessing and managing potential agricultural land-use impacts on freshwater quality. We compare the efficacy of supervised (Linear Discriminant Analysis, LDA; Boosted Regression Trees, BRT; and Random Forest, RF) and unsupervised (Modified Self-Organizing Map, MSOM) learning-based methods to predict groundwater redox status in the agriculturally dominated Tasman, Waikato, and Wellington regions of New Zealand. Thresholds applied to regional groundwater-quality samples provide redox status variables, and learning heuristics constrained by these variables and applied to spatial factors (climate, elevation, geology, hydrology, soils, and well depth) identify optimal sets of regional predictor variables. A split-sample approach is used to train and test the learning methods' ability to predict redox status using the optimal predictor variables. Overall, the supervised methods demonstrate a prediction bias toward oxic conditions and an inability to perform statistically well when using independent regional data; for example, consider kappa statistics for BRT (Tasman: 0.42, Waikato: 0.38, Wellington: 0.17), RF (Tasman: 0.42, Waikato: 0.47, Wellington: 0.17), and LDA (Tasman: 0.46, Waikato: 0.32, Wellington: 0.17). By contrast, the unsupervised method performs statistically well when predicting oxic, mixed, and anoxic conditions and corresponding depths when using independent regional data; for example, consider MSOM kappa statistics for Tasman: 0.78, Waikato: 0.80, Wellington: 0.76.
The unsupervised learning method provides the added benefits of being (1) able to combine predictions into 3D regional anoxic probability plots for interpreting the spatial influence of paleosols and groundwater flowpaths on redox status, and (2) readily extended to map 3D redox status across New Zealand and other countries despite data bias and sparsity.
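
For readers unfamiliar with the kappa statistic quoted in the abstract, the following is a minimal sketch of how such a score can be computed from observed and predicted redox classes. It assumes the unweighted Cohen's kappa, which the abstract does not confirm; the function name and the ten sample labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohen_kappa(observed, predicted):
    """Unweighted Cohen's kappa for two equal-length label sequences.

    Assumption: the abstract does not say which kappa variant was used.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")

    n = len(observed)

    # Observed agreement: fraction of samples where the labels match.
    p_o = sum(o == p for o, p in zip(observed, predicted)) / n

    # Chance agreement: sum over classes of the product of marginal frequencies.
    obs_counts = Counter(observed)
    pred_counts = Counter(predicted)
    p_e = sum(obs_counts[c] * pred_counts[c] for c in obs_counts) / (n * n)

    if p_e == 1.0:
        # Both sequences contain one identical class; kappa is undefined.
        return float("nan")

    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical test-set labels, for illustration only (not data from the paper).
observed = ["oxic", "oxic", "oxic", "oxic", "mixed",
            "mixed", "anoxic", "anoxic", "anoxic", "anoxic"]
predicted = ["oxic", "oxic", "oxic", "oxic", "oxic",
             "mixed", "anoxic", "anoxic", "oxic", "anoxic"]

kappa = cohen_kappa(observed, predicted)
print(f"kappa = {kappa:.2f}")
```

Kappa corrects raw agreement for the agreement expected by chance, so a classifier that mostly predicts the majority class (for example, oxic) can have high accuracy but a low kappa. A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance.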