TY - JOUR
T1 - Prediction of Total Soluble Solids in Apricot Using Adaptive Boosting Ensemble Model Combined with NIR and High-Frequency UVE-Selected Variables
AU - Gao, Feng
AU - Xing, Yage
AU - Li, Jialong
AU - Guo, Lin
AU - Sun, Yiye
AU - Shi, Wen
AU - Yuan, Leiming
N1 - This study was funded by the National Natural Science Foundation of China (32160694 and 62305253), Wenzhou Science and Technology Specialist Project (X2023011), Science and Technology Plan Project of Wenzhou Municipality (N2023008 and G20220037), and Wenzhou Major Technological Innovation and Research Project (ZZN2023004).
Publisher Copyright:
© 2025 by the authors.
PY - 2025/4/1
Y1 - 2025/4/1
N2 - Total soluble solids (TSSs) are a key maturity indicator and quality determinant in apricots, influencing harvest timing and postharvest management decisions. This study develops a framework that integrates adaptive boosting (Adaboost) ensemble learning with high-frequency spectral variables selected by uninformative variable elimination (UVE) for the rapid, non-destructive prediction of apricot TSSs. Near-infrared (NIR) spectra (1000–2500 nm) were acquired, screened for outliers by robust principal component analysis (ROBPCA), and normalized by z-score. Subsequent data processing comprised three steps: (1) 100 consecutive runs of UVE identified characteristic wavelengths, which were grouped by selection frequency into high-frequency (≥90 times), medium-frequency (30–90 times), and low-frequency (≤30 times) subsets; (2) an optimal base partial least squares regression (PLSR) model was developed for each wavelength subset; and (3) adaptive weight optimization was performed with the Adaboost ensemble algorithm. The results showed that (1) the model built on high-frequency wavelengths outperformed both the full-spectrum model and the full-characteristic-wavelength model, and (2) the optimized UVE-PLS-Adaboost model achieved the best performance (R = 0.889, RMSEP = 1.267, MAE = 0.994). These results indicate that the UVE-Adaboost fusion method improves prediction accuracy and generalization ability through multi-dimensional feature optimization and model weight allocation. The proposed framework enables the rapid, non-destructive detection of apricot TSSs and provides a reference for the quality evaluation of other fruits in agricultural applications.
KW - apricot
KW - ensemble learning
KW - feature selection optimization
KW - spectral analysis
KW - total soluble solids
UR - http://www.scopus.com/inward/record.url?scp=105002221494&partnerID=8YFLogxK
UR - https://www.mdpi.com/1420-3049/30/7/1543
U2 - 10.3390/molecules30071543
DO - 10.3390/molecules30071543
M3 - Journal article
AN - SCOPUS:105002221494
SN - 1420-3049
VL - 30
JO - Molecules
JF - Molecules
IS - 7
M1 - 1543
ER -