Abstract: [Objective] This study aimed to develop a rapid, non-destructive method for determining the moisture content of hawthorn fruits using hyperspectral imaging (HSI) combined with machine learning algorithms. By evaluating the effects of fruit orientation and spectral range, the research provides theoretical insight and technical support for real-time moisture monitoring and intelligent fruit sorting. [Methods] A total of 458 fresh hawthorn samples, representing various regions and cultivars, were collected to ensure diversity and robustness. Hyperspectral images were acquired in two spectral ranges: visible-near-infrared (VNIR, 400-1000 nm) and short-wave infrared (SWIR, 940-2500 nm). A threshold segmentation algorithm was used to extract the region of interest (ROI) from each image, and the average reflectance spectrum of the ROI served as the raw input data. To enhance spectral quality and reduce noise, five preprocessing techniques were applied: Savitzky-Golay (SG) smoothing, multiplicative scatter correction (MSC), standard normal variate (SNV), first derivative (FD), and second derivative (SD). Four regression algorithms were then used to build predictive models: partial least squares regression (PLSR), support vector regression (SVR), random forest (RF), and multilayer perceptron (MLP). The models were evaluated under different fruit orientations (stem side facing downward, upward, or sideways, plus a combined set of all three) and spectral ranges (VNIR, SWIR, and VNIR+SWIR). To further reduce the dimensionality of the hyperspectral data and minimize redundancy, four feature selection methods were applied: successive projections algorithm (SPA), competitive adaptive reweighted sampling (CARS), variable iterative space shrinkage approach (VISSA), and discrete wavelet transform combined with stepwise regression (DWT-SR). The DWT-SR method used the Daubechies 6 (db6) wavelet basis function at a decomposition level of 1.
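Two of the preprocessing steps listed above, SNV and the first derivative, can be illustrated with a minimal sketch. The abstract does not state how the derivative was computed or which standard-deviation convention SNV used, so the forward finite difference and the population standard deviation below are assumptions, and the five-band spectrum is hypothetical illustration data rather than a measured hawthorn spectrum.

```python
from statistics import mean, pstdev


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own
    mean and standard deviation (population SD assumed here)."""
    mu = mean(spectrum)
    sd = pstdev(spectrum)
    if sd == 0:
        raise ValueError("flat spectrum: SNV is undefined")
    return [(r - mu) / sd for r in spectrum]


def first_derivative(spectrum, wavelengths):
    """Forward finite-difference first derivative with respect to wavelength.
    The result has one point fewer than the input."""
    if len(spectrum) != len(wavelengths):
        raise ValueError("spectrum and wavelengths must have equal length")
    return [
        (spectrum[i + 1] - spectrum[i]) / (wavelengths[i + 1] - wavelengths[i])
        for i in range(len(spectrum) - 1)
    ]


# Hypothetical mean ROI spectrum (reflectance) and band centres in nm.
wavelengths = [940.0, 950.0, 960.0, 970.0, 980.0]
reflectance = [0.50, 0.48, 0.44, 0.41, 0.42]

snv_spectrum = snv(reflectance)
fd_spectrum = first_derivative(reflectance, wavelengths)
```

In practice a derivative is often combined with Savitzky-Golay smoothing to limit noise amplification; whether the study did so is not stated.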
[Results and Discussions] Both fruit orientation and spectral range had a significant impact on model performance. The best predictions were obtained with the stem side of the fruit facing downward, using the SWIR range (940-2500 nm) and FD preprocessing. Under these conditions, the SVR model achieved the highest predictive accuracy, with a prediction-set coefficient of determination (R²ₚ) of 0.8605, mean absolute error (MAEₚ) of 0.7111, root mean square error (RMSEₚ) of 0.9142, and residual prediction deviation (RPD) of 2.6776. Further feature reduction with the DWT-SR method selected 17 key wavelengths. Despite the reduced input size, the SVR model built on these features retained strong predictive capability, achieving R²ₚ = 0.8571, MAEₚ = 0.6692, RMSEₚ = 0.9252, and RPD = 2.6457. These findings indicate that the DWT-SR method effectively balances dimensionality reduction with model performance. The results show that the SWIR range contains more moisture-relevant spectral information than the VNIR range, and that first derivative preprocessing significantly improves the correlation between spectral features and moisture content. The SVR model proved particularly well suited to handling nonlinear relationships in small datasets. In addition, the DWT-SR method reduced data dimensionality efficiently while preserving key information, making it well suited to real-time industrial use. [Conclusions] Hyperspectral imaging combined with appropriate preprocessing, feature selection, and machine learning techniques offers a promising and accurate approach for non-destructive moisture determination in hawthorn fruits. This method provides a valuable reference for quality control, moisture monitoring, and automated fruit sorting in the agricultural and food processing industries.
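The four reported metrics can be computed from reference and predicted moisture values as in the sketch below. The abstract does not give the formulas used, so the definitions here are assumptions: RPD is taken as the population standard deviation of the reference values divided by RMSE. Under that convention R² = 1 - 1/RPD², and the reported pairs (0.8605 with 2.6776, 0.8571 with 2.6457) are numerically consistent with it. The sample values are hypothetical illustration data, not the study's measurements.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, MAE, RMSE and RPD for paired reference/predicted values.
    RPD uses the population SD of y_true (an assumed convention)."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in residuals)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    rmse = math.sqrt(ss_res / n)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(e) for e in residuals) / n,
        "rmse": rmse,
        "rpd": math.sqrt(ss_tot / n) / rmse,
    }


def r2_from_rpd(rpd):
    """R2 implied by RPD when RPD = population SD / RMSE."""
    return 1.0 - 1.0 / rpd ** 2


# Hypothetical moisture contents (%) and model predictions.
y_true = [70.0, 72.0, 74.0, 76.0, 78.0]
y_pred = [70.5, 71.5, 74.5, 75.5, 78.5]
metrics = regression_metrics(y_true, y_pred)
```

If RPD were instead computed with the sample standard deviation (n - 1 in the denominator), the identity would hold only approximately.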