Mapping Cadmium Contamination Potential in Surface Soil for Civil Engineering Applications: A Comparative Study of Machine Learning and Deep Learning Models in the Gianh River Basin, Vietnam
Abstract
Cadmium (Cd) is a toxic heavy metal that poses significant environmental and human health risks, particularly when it accumulates in surface soils. Its presence reduces soil fertility, disrupts microbial ecosystems, and poses long-term ecological threats. This study explores the application of artificial intelligence (AI) models for mapping the potential distribution of Cd contamination in surface soils within the Gianh River Basin, Quang Binh Province, Vietnam. Four machine learning (ML) models, namely Logistic Regression (LR), Radial Basis Function Network (RBFN), Random Forest (RF), and Support Vector Machine (SVM), and four deep learning (DL) model variants (DNN-Opt1 to DNN-Opt4) were developed and compared. The DNN variants differ in the number of hidden layers and the number of neurons per layer.
A total of 100 topsoil samples were collected and classified using the Geoaccumulation Index (Igeo); the resulting classes served as the target variable for supervised learning. Thirteen conditioning factors were used as input variables: Elevation, Soil Type, Slope, Curvature, proximity to roads, proximity to rivers, and seven Landsat 8 spectral bands. The dataset was divided into training (70%) and testing (30%) subsets. Model performance was evaluated using the area under the ROC curve (AUC), accuracy (ACC), the Kappa coefficient, root mean square error (RMSE), and the confusion matrix.
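As a minimal sketch of how the target variable and the 70/30 split described above might be prepared, the Python below computes Igeo in its commonly used form, Igeo = log2(Cn / (1.5 × Bn)), where Cn is the measured Cd concentration and Bn is the geochemical background value. The abstract does not state the background value, the class threshold, or the splitting procedure, so the threshold of 0, the seeded random shuffle, and all function names are assumptions made for illustration only.

```python
import math
import random

# The factor 1.5 allows for natural variation in the background
# concentration in the commonly used form of the index.
BACKGROUND_FACTOR = 1.5


def igeo(concentration, background):
    """Geoaccumulation Index: log2(C_n / (1.5 * B_n))."""
    if concentration <= 0 or background <= 0:
        raise ValueError("concentrations must be positive")
    return math.log2(concentration / (BACKGROUND_FACTOR * background))


def binary_label(igeo_value, threshold=0.0):
    """ASSUMPTION: Igeo above `threshold` is labeled contaminated (1).

    The study's actual class boundaries are not given in the abstract.
    """
    return 1 if igeo_value > threshold else 0


def split_train_test(samples, train_fraction=0.7, seed=42):
    """Shuffle a copy of `samples` and cut it into train and test parts.

    ASSUMPTION: a simple seeded random split; the study may have used a
    different (for example stratified) procedure.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    # A concentration of 3x the background gives log2(3 / 1.5) = 1.0.
    print(igeo(3.0, 1.0))
    train, test = split_train_test(range(100))
    print(len(train), len(test))
```

With 100 samples and a 0.7 training fraction, the split yields 70 training and 30 testing samples, matching the proportions reported above.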
Among the tested models, the DNN-Opt2 variant achieved the highest predictive performance, with AUC = 0.858, ACC = 73.33%, Kappa = 0.47, and RMSE = 0.45. The resulting contamination potential maps, particularly the one derived from the RBFN model, classify the region into five contamination risk levels: very low, low, moderate, high, and very high. This spatial information is critical not only for environmental management but also for assessing risks to groundwater quality and to the structural integrity of buildings located in high-risk zones. The study demonstrates that deep learning can improve predictive accuracy in heavy metal contamination mapping and underscores its practical relevance to civil and environmental engineering.
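To make the reported metrics concrete, the following Python sketch shows how ACC, the Kappa coefficient, RMSE, and AUC are conventionally computed for a binary classifier. It assumes a two-class target and is not the authors' evaluation code; the function names and the confusion-matrix counts in the usage example are hypothetical, chosen only to total 30 test samples (30% of 100), and are not taken from the study.

```python
import math


def accuracy(tp, fn, fp, tn):
    """Share of test samples classified correctly."""
    return (tp + tn) / (tp + fn + fp + tn)


def cohen_kappa(tp, fn, fp, tn):
    """Cohen's Kappa: agreement beyond what chance alone would give."""
    n = tp + fn + fp + tn
    observed = (tp + tn) / n
    # Chance agreement from the row (actual) and column (predicted) totals.
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / (n * n)
    if expected == 1.0:
        raise ValueError("Kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


def rmse(y_true, y_prob):
    """Root mean square error between 0/1 labels and predicted probabilities."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_prob)]
    return math.sqrt(sum(squared) / len(squared))


def auc(y_true, y_prob):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half. Quadratic in sample count, fine for small test sets.
    """
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # HYPOTHETICAL counts for a 30-sample test set, for illustration only.
    tp, fn, fp, tn = 11, 4, 4, 11
    print(round(accuracy(tp, fn, fp, tn), 4))
    print(round(cohen_kappa(tp, fn, fp, tn), 4))
```

Because ACC and Kappa depend only on hard class assignments while RMSE and AUC use the predicted probabilities, a model can rank samples well (high AUC) and still show moderate ACC and Kappa at a fixed decision threshold.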