Abstract
Exposure to environmental cadmium increases health risks for residents. Early detection of urinary metabolic changes using high-resolution mass spectrometry and machine learning algorithms would help predict adverse health effects. Here, we applied machine learning approaches to screen for potential biomarkers of cadmium exposure in 403 urine samples. In positive and negative ionization modes, 4207 and 3558 features were extracted, respectively. We compared seven machine learning algorithms and found that the extreme gradient boosting (XGBoost) and random forest (RF) classifiers showed better accuracy and predictive performance than the others. Following 5-fold cross-validation, the area under the curve (AUC) of the XGBoost classifier was 0.93 for both positive and negative ionization modes. For the RF classifier, the AUCs were 0.80 and 0.84 for positive and negative ionization modes, respectively. We then identified a biomarker panel based on the XGBoost and RF classifiers. Incorporating machine learning models into urine analysis by high-resolution mass spectrometry could allow convenient assessment of cadmium exposure.
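The 5-fold cross-validated AUC reported in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the nearest-centroid scorer is a deliberately simple stand-in for the XGBoost and RF classifiers, and every function name (`roc_auc`, `stratified_folds`, `centroid_scores`, `cross_validated_auc`) is hypothetical. It only shows how a per-fold AUC is computed on held-out samples and then averaged.

```python
import random
from statistics import mean


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, rng):
    """Split sample indices into k folds with similar class proportions."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def centroid_scores(X_train, y_train, X_test):
    """Stand-in scorer (assumption, not XGBoost/RF): higher score = closer to the class-1 centroid."""
    def centroid(cls):
        rows = [x for x, y in zip(X_train, y_train) if y == cls]
        return [mean(col) for col in zip(*rows)]

    def sqdist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    c0, c1 = centroid(0), centroid(1)
    return [sqdist(x, c0) - sqdist(x, c1) for x in X_test]


def cross_validated_auc(X, y, k=5, seed=0):
    """Fit on k-1 folds, score the held-out fold, and average the k AUC values."""
    rng = random.Random(seed)
    aucs = []
    for fold in stratified_folds(y, k, rng):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        scores = centroid_scores(
            [X[i] for i in train], [y[i] for i in train], [X[i] for i in fold]
        )
        aucs.append(roc_auc([y[i] for i in fold], scores))
    return mean(aucs)


# Synthetic stand-in for a feature table: 200 samples, 20 features,
# of which the first 3 are shifted in the "exposed" class (label 1).
data_rng = random.Random(42)
y = [i % 2 for i in range(200)]
X = [
    [data_rng.gauss(1.5 * label if j < 3 else 0.0, 1.0) for j in range(20)]
    for label in y
]

mean_auc = cross_validated_auc(X, y, k=5)
print(f"mean 5-fold AUC: {mean_auc:.2f}")
```

In practice, the scorer would be replaced by the classifier under evaluation, with the same fold assignments reused across all candidate models so that their AUC values are comparable.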
Original language | English |
---|---|
Pages (from-to) | 5184-5188 |
Number of pages | 5 |
Journal | Chinese Chemical Letters |
Volume | 33 |
Issue number | 12 |
Early online date | 7 Mar 2022 |
Publication status | Published - Dec 2022 |
Scopus Subject Areas
- Chemistry(all)
User-Defined Keywords
- Cadmium exposure
- High-resolution mass spectrometry
- Human urine
- Machine learning
- Metabolic profiles