TY - JOUR
T1 - Spectroscopic Diagnosis of Arsenic Contamination in Agricultural Soils
AU - Shi, Tiezhu
AU - Liu, Huizeng
AU - Chen, Yiyun
AU - Fei, Teng
AU - Wang, Junjie
AU - Wu, Guofeng
N1 - This study was supported by the China Postdoctoral Science Foundation (No. 2016M602521), the Science and Technology Bureau of Suzhou (No. SYN201309), the Scientific Research Foundation for Newly High-End Talents of Shenzhen University, the Basic Research Program of Shenzhen Science and Technology Innovation Committee (No. JCYJ20151117105543692), and the Shenzhen Future Industry Development Funding Program (No. 201507211219247860).
Publisher Copyright:
© 2017 by the authors. Licensee MDPI, Basel, Switzerland.
PY - 2017/5/4
Y1 - 2017/5/4
AB - This study investigated the abilities of pre-processing, feature selection, and machine-learning methods for the spectroscopic diagnosis of soil arsenic contamination. The spectral data were pre-processed using Savitzky-Golay smoothing, first and second derivatives, multiplicative scatter correction, standard normal variate, and mean centering. Principal component analysis (PCA) and the RELIEF algorithm were used to extract spectral features. Machine-learning methods, including random forests (RF), artificial neural networks (ANN), and radial basis function- and linear function-based support vector machines (RBF- and LF-SVM), were employed to establish diagnosis models. Model accuracies were evaluated and compared using overall accuracies (OAs). The statistical significance of the differences between models was evaluated using McNemar’s test (Z value). The results showed that the OAs varied with the combination of pre-processing, feature selection, and classification methods. Feature selection methods could improve the modeling efficiencies and diagnosis accuracies, and RELIEF often outperformed PCA. The optimal models established by RF (OA = 86%), ANN (OA = 89%), RBF-SVM (OA = 89%), and LF-SVM (OA = 87%) showed no statistically significant difference in diagnosis accuracy (Z < 1.96, p > 0.05). These results indicated that it was feasible to diagnose soil arsenic contamination using reflectance spectroscopy. An appropriate combination of multivariate methods was important for improving diagnosis accuracies.
KW - Feature selection
KW - Heavy metal contamination
KW - Machine-learning
KW - Spectral pre-processing
KW - Visible and near-infrared reflectance spectroscopy
UR - http://www.scopus.com/inward/record.url?scp=85019162183&partnerID=8YFLogxK
U2 - 10.3390/s17051036
DO - 10.3390/s17051036
M3 - Journal article
C2 - 28471412
AN - SCOPUS:85019162183
SN - 1424-8220
VL - 17
JO - Sensors (Switzerland)
JF - Sensors (Switzerland)
IS - 5
M1 - 1036
ER -