Non-Destructive Detection Method of Apple Watercore: Optimization Using Optical Property Parameter Inversion and MobileNetV3

Chen, Zihan; Wang, Haoyun; Wang, Jufei; Xu, Huanliang; Mei, Ni; Zhang, Sixu

doi:10.3390/agriculture14091450


by Zihan Chen ¹, Haoyun Wang ¹,*, Jufei Wang ²,³,*, Huanliang Xu ¹, Ni Mei ¹ and Sixu Zhang ¹

¹ College of Artificial Intelligence, Nanjing Agricultural University, Nanjing 210031, China

² Key Laboratory of Intelligent Agricultural Equipment in Jiangsu Province, Nanjing 210031, China

³ College of Engineering, Nanjing Agricultural University, Nanjing 210031, China

* Authors to whom correspondence should be addressed.

Agriculture 2024, 14(9), 1450; https://doi.org/10.3390/agriculture14091450

Submission received: 21 July 2024 / Revised: 11 August 2024 / Accepted: 22 August 2024 / Published: 25 August 2024

(This article belongs to the Section Agricultural Product Quality and Safety)


Abstract:

Current methods for detecting apple watercore are expensive and potentially damaging to the fruit. To determine whether different batches of apples are suitable for long-term storage or long-distance transportation, and to classify the apples according to quality level to enhance the economic benefits of the apple industry, it is essential to conduct non-destructive testing for watercore. This study proposes an innovative detection method based on optical parameter inversion and the MobileNetV3 model. Initially, a three-layer plate model of apples was constructed using the Monte Carlo method to simulate the movement of photons inside the apple, generating a simulated brightness map of photons on the apple’s surface. This map was then used to train the MobileNetV3 network with dilated convolution, resulting in a pre-trained model. Through transfer learning, this model was applied to measured spectral data to detect the presence of watercore. Comparative experiments were conducted to determine the optimal transfer strategy for the frozen layers, achieving model accuracy rates of 99.13%, 97.60%, and 95.32% for two, three, and four classifications, respectively. Furthermore, the model parameters were low at 7.52 M. Test results of this study confirmed the effectiveness and lightweight characteristics of the method that combines optical property parameter inversion, the DC-MobileNetV3 model, and transfer learning for detecting apple watercore. This model provides technical support to detect watercore and other internal diseases in apples.

Keywords:

apple watercore; optical property parameter inversion; MobileNetV3; transfer learning

1. Introduction

Apples are the third most abundantly produced fruit worldwide. China is responsible for more than 50 percent of the global apple production, and apples are one of the main cash crops in the country [1]. With improvements in their quality of life, consumers are demanding higher quality apples. Apples with watercore are translucent with increased sweetness [2]. Watercore is a physiological phenomenon that occurs near the apple core and the surrounding flesh. Translucent clumps form in and around the centre of the apple flesh, similar to the lines found in pineapples [3]. This phenomenon is caused by the accumulation of sorbitol in intercellular spaces [4]. Watercore apples have become a consumer preference [5]. However, watercore causes the tissue in the apple core to become fragile and vulnerable to microbial attack [6]. The accumulation of water and sugar in the fruit heart tissue caused by watercore causes these areas to be more prone to oxidative reactions, leading to browning [7]. Browning affects both appearance and taste. In recent years, numerous non-destructive techniques for detecting watercore have been developed. X-ray CT and MRI technologies [8] and near-infrared spectroscopy (NIR) [9] have enhanced the efficiency and accuracy of watercore detection. However, the complexity of these detection technologies and the limitations of the detection facilities constrain online detection methods. The professional instruments required for X-ray and NIR have high operational complexity and are expensive. The average price of instruments required for these detection methods is approximately USD 100,000 to USD 200,000 [10,11]. To address this, we propose a method based on the inversion of optical parameters and MobileNetV3 for the detection of watercore in apples. This can effectively reduce the misclassification and improper preservation of watercore apples owing to the limitations of current detection technology in the classification process.

The optical properties of fruits include absorption and scattering coefficients [12]. The absorption coefficient primarily reflects the properties and concentrations of the chemical substances in fruit. The scattering coefficient represents the internal structure of fruits and vegetables. As light propagates through tissues, its absorption and multiple scattering carry information about the internal quality of fruits and vegetables, which is detected at the surface. The differences in structure between diseased and healthy apples cause variations in their absorption and scattering coefficients. Consequently, differences in optical parameters can effectively distinguish between diseased and healthy apples. The influence of optical characteristics on fruit quality has been extensively studied. Wang et al. [13] reviewed the current research progress in major non-destructive testing technologies for food and agricultural products and proposed that those based on optical characteristic parameters have unique advantages in fruit quality detection. Pan et al. [14] investigated methods for fruit and vegetable quality detection using optical characteristic parameters and demonstrated their effective application in this field. In their study, owing to the high financial costs and number of researchers needed for sample collection, measured data were insufficient, and the extracted data were easily affected by many factors, which limited the measurement accuracy of the model. However, a large amount of noiseless data can be effectively obtained using photon transmission simulations.

The Monte Carlo method [15], also known as the statistical simulation method, is a probabilistic calculation method based on statistical theory. It uses random (or pseudo-random) numbers to address various computational problems. Photon movement within the apple tissue is a stochastic process, making the Monte Carlo method suitable for simulating photon states in apples. Guan et al. [16] introduced a method based on the Monte Carlo method for the fast reconstruction of optical properties, such as absorption and scattering coefficients of diffuse thin tissues. Zhang et al. [17] developed multi-layer Monte Carlo models to explore the optical properties of blueberry pulp and peel as well as the interaction of light with fruit tissues. Based on this, owing to the transparent nature of the core and pulp of watercore apples, the Monte Carlo method can be used to detect optical characteristic parameters to identify watercore in apples. Establishing the relationship between optical property parameters and apple quality can enable methods such as machine learning to process the image data to detect whether an apple has watercore.

Mahesh et al. employed machine learning-based image classification and recognition technology to detect, identify, and classify crop diseases [18,19]. Guan et al. [20] used discriminant analysis and the Bayesian discriminant method to identify and classify rice leaf diseases and achieved an identification accuracy of up to 97.2%. Although machine learning has a high accuracy rate, it requires manual feature extraction, which lacks generalisation capability and incurs high labour and time costs. As a typical deep learning method, convolutional neural networks (CNNs) [21] have become the predominant method for plant disease recognition due to their strengths in image recognition and classification. Anagnostis et al. [22] used CNNs to identify walnut leaf anthracnose and achieved an accuracy of 98.7%. Rachmad et al. [23] utilised an improved CNN model for disease identification in maize leaves. The accuracy of CNNs is closely linked to the quality of datasets, with networks trained on high-quality datasets often exhibiting better accuracy. However, the collection of large and diverse datasets of plant diseases is challenging. In our study, the acquisition of apple hyperspectral data and optical characteristic parameters was difficult. This problem can be solved by using transfer learning [24]. Transfer learning completes model training by pre-training using a large dataset and then applying the trained model to the measured data. Transfer learning can significantly bolster the resilience of CNN classifiers in identifying plant diseases. This involves retraining the pre-trained model on a smaller dataset, resulting in improved performance. Deng et al. [25] detected carrot surface defects by combining ShuffleNet and transfer learning and achieved an average detection accuracy of 99.82%. Long et al. [26] employed the ImageNet dataset to classify four different Camellia leaf diseases using the AlexNet model and transfer learning and achieved a classification accuracy of up to 96.53%. These results illustrate that transfer learning can markedly enhance both the model convergence speed and classification performance. Research on apple watercore detection using deep learning models has addressed the limitations of manual feature extraction [27], enabling automatic feature extraction of apples with watercore through learning with high identification accuracy and speed. However, some limitations remain: most research schemes based on deep learning models use traditional CNNs, which often require numerous parameters and complex network structures to ensure high recognition accuracy. This leads to high computational demands and weak real-time performance. Currently, there is a strong emphasis on enhancing CNNs for deployment in mobile and embedded devices.

Some researchers have suggested the use of lightweight networks aimed at reducing the model parameter quantity and complexity while maintaining accuracy, such as ShuffleNetv1 [28], ShuffleNetv2 [29], MobileNetv1 [30], MobileNetv2 [31], and MobileNetV3 [32]. Yang et al. [33] introduced a crop disease classification model that integrated MobileNet and Inception V3, incorporating transfer learning techniques to detect plant leaf diseases using mobile devices. Si et al. [34] utilised MobileNetV3 and transfer learning combined with a weight-contrast transfer strategy to detect surface defects on apples. These studies demonstrate the feasibility of combining lightweight network models and transfer learning for fruit and plant disease detection. The focus of this study was to further improve the MobileNetV3, select an appropriate transfer method, and enhance detection accuracy while reducing the number of model parameters needed.

In this study, apples with different degrees of watercore were classified according to the ratio of watercore to the equatorial section using the section pictures. Monte Carlo simulation was used to obtain a simulated brightness map of the apple surface with photon overflow. In addition, a method for the non-destructive detection of apple watercore was developed by introducing dilated convolution into MobileNetV3 and combining it with transfer learning. The specific plans and innovation points were as follows:

(1) A hyperspectral image of the apple was acquired using a hyperspectral imaging system, and the optical characteristic parameters of the apple were calculated using a double-integrating sphere system. Apples were categorised according to the proportion of the level of watercore in each section.

(2) A three-layer model of an apple comprising peel, pulp, and core layers was constructed. The Monte Carlo method was employed to simulate the trajectory of photons within the apple tissue, and a brightness map of the apple when the photons flowed over the apple surface under the surface light source was generated to create a simulated image dataset.

(3) Dilated convolution was introduced into the last three bottleneck modules of MobileNetV3 to enlarge the receptive field and obtain more detailed information.

(4) The MobileNetV3 network model was combined with transfer learning, and a pretrained model was developed by inputting simulation images into the network to extract the watercore apple features. Subsequently, this model was fine-tuned using measured hyperspectral data from apples. In the pre-training stage, the proposed model was compared with other classical deep learning networks to verify the superiority of the DC-MobileNetV3 network in terms of accurately detecting apple watercore and model parameter efficiency. Different transfer learning methods were evaluated, and the optimal transfer strategy that achieved the highest accuracy with the fewest parameters was selected.

The innovation of this study is that we combined hyperspectral, MobileNetV3, and transfer learning technologies and proposed a detection method to identify apple watercore using optical feature parameters. This study provides a theoretical foundation for the efficient non-destructive testing of internal crop lesions. Furthermore, these findings demonstrate the efficacy of combining transfer learning with lightweight networks and validate the feasibility of generating simulation data using the Monte Carlo method and conducting model pre-training.

2. Materials and Methods

2.1. Data Acquisition

To acquire optical imagery samples of watercore apples, 134 Aksu apples grown in the Aksu Region of Xinjiang were selected, all of which were normal apples with smooth surfaces, uniform sizes, and no obvious damage. The apples were stored in a thermostat BIOBase-BACC-310 (Shandong BIOBase Bio-industry Co., Ltd., Jinan, China) at a temperature maintained between 2 °C and 4 °C before testing. Before data collection, apples were stored at room temperature for three hours to allow the temperature of the apples to reach 20 °C.

For hyperspectral data acquisition, an HSI visible light-near infrared (VNIR) 0001 hyperspectral imaging system (Shanghai Wuling Photoelectric Technology Co., Ltd., Shanghai, China) was used, and the acquisition band was 373.79~1029.22 nm. The obtained hyperspectral image data were exported using HSI analyser V1.0 software. Hyperspectral image data acquisition was performed before the apples were sliced, with the acquisition focusing on the whole apple. As there were differences in the distances between each part of the apple and the hyperspectral camera, a section of 100 × 100 pixels with approximately the same distance between the apple and the lens of the hyperspectral camera was selected as the region of interest. This was used as the measured data for the experiment.

Apples were sliced along the equatorial plane and the cross-section was photographed to obtain a section map. Subsequently, an automatic binarization algorithm was used to obtain the ratio of the watercore plane to the equatorial plane of the apple. The samples were classified based on the proportion of the watercore area in relation to the entire equatorial plane, using the classification standards [35] detailed in Table 1. Forty non-watercore and 94 watercore samples were included in this batch. Cross-section images of apples with different degrees of watercore are shown in Figure 1.

The reflectance and transmittance of the peel, pulp, and core of this batch of apples were measured using a double-integrating sphere system (Shanghai Wuling Optoelectronics Technology Co., Ltd., Shanghai, China). The structure of the instrument is shown in Figure 2. The instrument mainly comprises a diaphragm, reflection sphere, sample holder, transmission sphere, and optical fibre. The spectrometers used were an NIRez near-infrared spectrometer (900–1700 nm) and an SE1040 25 VNIR visible light spectrometer (350–1020 nm). Near-infrared and visible data were collected using Spectrasmart V2.8 software. The number of scans averaged was set to three, the boxcar width was set to 10, and the strength correction was cancelled. The apple samples were placed at the entrance of the double-integrating sphere. The light source illuminated the sample, and the reflected and transmitted light signals were recorded. Six parameters were collected: the apple peel absorption coefficient μ_{a1}, apple flesh absorption coefficient μ_{a2}, apple core absorption coefficient μ_{a3}, apple peel scattering coefficient μ_{s1}, apple flesh scattering coefficient μ_{s2}, and apple core scattering coefficient μ_{s3}. The optical properties of the apple peel, flesh, and core were obtained from the measured reflectance and transmittance by iterative calculation using the inverse adding-doubling (IAD) method. The IAD method proposed by Prahl is an algorithm used to determine the optical parameters in conjunction with an integrating sphere system [36].

This method can be used to calculate the optical properties of another sample with the same optical properties but different thickness, when the optical parameters of the sample are known [37]. Based on the measured transmittance and reflectance of each layer of the apple, the actual optical characteristic parameters were calculated using the IAD method. The absorption coefficients of the peel, pulp, and core were μ_{a1} (0.4–6.5 mm⁻¹), μ_{a2} (0.03–8.70 mm⁻¹), and μ_{a3} (0.01–2.70 mm⁻¹), respectively. The scattering coefficients of the peel, pulp, and core were μ_{s1} (1.69–240 mm⁻¹), μ_{s2} (0.01–70.00 mm⁻¹), and μ_{s3} (0.01–30 mm⁻¹), respectively. The median of the integer values that occurred most frequently in the statistical sample was calculated and used as the input for the photon transport simulation parameters.

2.2. Simulation Data Collection

2.2.1. Three-Layer Model of Apples

Simulation data acquisition was performed using the Monte Carlo method. Briefly, the Monte Carlo method is used to construct a random probability model that is consistent with the actual physical process by generating a series of random numbers using experimental calculations and taking them as the solution to the problem. Based on the findings of Xu et al., the surface light source incidence method [38] was adopted in this study, and a vertical surface light source was selected. Compared with the point light source, more image and spectrum information can be obtained under the area light source conditions. The test error is lower, the photon transmission trajectory better simulates the actual situation, and the universality of the model is improved. The incident direction of all photons was assumed to be along the Z-axis. Apple watercore predominantly occurs near the vascular bundle of the fruit ventricles at deep depths, but it can also occur in all areas inside the apple. Therefore, we established a three-layer tissue plate model consisting of the peel, pulp, and core, in which the core layer was regarded as the pulp tissue layer with an infinite thickness [39], and only the thicknesses of the peel and pulp layers were set. In our three-layer plate model, photons entering the apple move in different directions due to reflection and scattering. When the number of incident photons is sufficiently large, their motion spreads throughout the apple. Since watercore can be present in all areas of the apple, when photons pass through the watercore region, the characteristics of this region facilitate photon overflow, resulting in higher photon content in the brightness diagram on the surface of the apple. Figure 3 shows the structure of the three-layer plate model and a schematic of the photon motion trajectory.

As shown in Figure 3, the motion state of a photon comprises five main steps: initialisation, calculation of photon direction and step size, overstepping judgment, out-of-bound judgment, and photon extinction. The main difference between the Monte Carlo simulation of the three-layer plate apple model and the standard Monte Carlo simulation was the thickness of the medium. When a photon crosses the boundary, the normal vector of the point on the surface is solved for refraction and reflection, and the refraction and reflection angles are calculated. Because a photon may cross a boundary as it moves, it is necessary to determine the motion state of the photon during the current step-size movement, which requires the following steps:

(1) Determine whether the photon crosses the boundary. If the boundary is not crossed, the photon is absorbed and scattered, and its weight is updated. If the boundary is crossed, determine whether the photon escapes from the apple tissue surface;
(2) Determine whether the photon overflows. If it does not overflow, determine whether it is refracted or reflected and update the weight accordingly. If it overflows, continue to step (3);
(3) Determine whether this is the last photon. If it is the last photon, the operation is terminated.

If it is not the last photon, photon movement is initialised and continued in steps. After moving through one step, the photon continues to move through random steps repeatedly, and absorption and scattering occur. Two outcomes are possible: either the photon escapes from the apple tissue owing to reflection or transmission at the peel layer boundary, or the photon is absorbed by the apple tissue owing to insufficient weight.

For photons without overflow, the weight value is judged against the set threshold weight W_{end} (1 × 10⁻⁵ in this experiment), and Russian roulette is then used. First, the photon is given an integer n, and then a uniformly distributed random number \hat{U} is generated. If \hat{U} × n < 1, it is determined whether the photon is dead; otherwise, the weight of the photon is set to n × W_{end}, and the next step is continued. Figure 4 shows a flowchart of the simulated photon-transmission process.
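The roulette step can be illustrated with a minimal Python sketch. It implements the conventional weight-conserving form of Russian roulette used in many photon transport codes, which is an assumption here: a photon whose weight falls below the threshold survives with probability 1/n and has its weight multiplied by n, and is otherwise terminated. The survival condition and the surviving weight may differ in detail from the rule used in this study. The function name, the default n = 10, and the injectable random source `rng` are hypothetical choices made for illustration.

```python
import random

W_END = 1e-5  # threshold weight quoted in the text


def russian_roulette(weight, n=10, rng=random.random):
    """Return the photon weight after roulette; 0.0 means the photon is terminated.

    Conventional scheme (assumption, not necessarily the authors' exact rule):
    below the threshold, the photon survives with probability 1/n and its
    weight is multiplied by n, so the expected weight is unchanged.
    """
    if weight >= W_END:
        return weight        # heavy enough, roulette is not triggered
    if rng() * n < 1.0:      # true with probability 1/n
        return weight * n    # survivor carries the weight of the n candidates
    return 0.0               # photon is terminated
```

Passing a fixed `rng` (for example `lambda: 0.05`) makes the behaviour reproducible when checking the two branches.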

2.2.2. Intersection Point Calculation

In standard Monte Carlo simulations, the thickness of the medium is ignored. However, in this model, the thickness was included. Consequently, locating the photon’s incident point required finding the intersection between a line and a plane rather than the intersection of two lines, which made it more difficult. Although the exact intersection point can be determined using analytical geometry, it is computationally inefficient. Instead, the bisection method was employed to determine the incident point of the photon. Generally, the incident position can be determined after two cycles [40]. The algorithm can be represented in pseudocode as follows:

As shown in Algorithm 1, first, we obtained the current position coordinate, current_pos, of the photon and the position coordinate, previous_pos, of the photon from the previous step. Then, the current layer where the photon was located was determined, followed by the previous layer, where the photon was located in the previous step. If the current_layer and previous_layer were not the same, this indicated that the photon had crossed the layer, and the specific crossing position needed to be determined. A bisection method was used to determine the cutoff point between the current_pos and the previous_pos. The midpoint of the current_pos and the previous_pos was calculated, and the layer midpoint_layer, where the midpoint was located, was determined. If the midpoint_layer was the same as the current_layer, then the midpoint was located in the photon’s current layer, and the current_pos was updated to reflect the midpoint. If the midpoint_layer was the same as the previous_layer, the midpoint was located in the previous layer, and the previous_pos was updated to reflect the midpoint. The above steps were repeated to gradually reduce the range between the current_pos and the previous_pos until the cutoff point was determined.

Algorithm 1: Junction calculation

Input: photon
Output: Photon out of bounds position
current_pos = photon.current.position
previous_pos = photon.current-1.position
current_layer = current_pos.current.layer
previous_layer = previous_pos.current.layer
if abs(current_layer - previous_layer) == 1:
    while True:
        midpoint = (current_pos + previous_pos)/2
        midpoint_layer = midpoint.current.layer
        if current_layer == midpoint_layer:
            current_pos = midpoint
        else:
            previous_pos = midpoint
        if midpoint_layer == current_layer or midpoint_layer == previous_layer:
            break
    end while
    return midpoint
end if

When the cutoff point was found, the midpoint position was returned as the photon-crossing position. By continuously calculating the midpoint and updating the position, the algorithm determined the position at which a photon moved from one layer to another. This bisection approach ensures the efficiency and accuracy of the calculation and is suitable for locating the out-of-bound positions of the photons in the three-layer apple model.
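A minimal Python sketch of the bisection idea is given here for a slab geometry. It is an illustration under stated assumptions rather than the implementation used in this study: layers are treated as horizontal slabs separated by interface depths along the Z-axis, positions are (x, y, z) tuples, and the loop stops once the bracketing interval is narrower than a tolerance `tol`. The helper names `layer_of` and `crossing_point`, the tolerance, and the iteration cap are hypothetical.

```python
def layer_of(z, boundaries):
    """Index of the slab containing depth z; boundaries are ascending interface depths."""
    for index, depth in enumerate(boundaries):
        if z < depth:
            return index
    return len(boundaries)


def crossing_point(prev_pos, cur_pos, boundaries, tol=1e-6, max_iter=60):
    """Bisect the segment prev_pos -> cur_pos until it brackets an interface within tol.

    Returns None when both ends lie in the same layer.
    """
    a, b = prev_pos, cur_pos
    start_layer = layer_of(a[2], boundaries)
    if start_layer == layer_of(b[2], boundaries):
        return None
    for _ in range(max_iter):
        mid = tuple((p + q) / 2 for p, q in zip(a, b))
        if layer_of(mid[2], boundaries) == start_layer:
            a = mid   # midpoint is still in the starting layer
        else:
            b = mid   # midpoint has already left the starting layer
        if abs(b[2] - a[2]) < tol:
            break
    return tuple((p + q) / 2 for p, q in zip(a, b))
```

With a 0.2 mm peel and a 40 mm pulp layer, the interface depths would be `[0.2, 40.2]`, and `crossing_point((0, 0, 0.1), (1, 1, 0.5), [0.2, 40.2])` converges to a point close to the peel-pulp interface at z = 0.2 mm.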

2.2.3. Photon Cross-Layer Constraint Algorithm

Monte Carlo simulations incorporate randomness. The concept of model thickness was introduced in Section 2.2.1. Photons can traverse arbitrarily through the three-layer tissue or overflow from the pericarp layer. When a photon crosses a layer, its motion is affected by the properties of both layers. Given the thinness of the apple peel layer, a photon may take a step length that spans multiple layers. Therefore, constraints must be imposed on this layer. Specifically, when a photon is poised to cross multiple layers, the direction of motion of the photon should be determined and the photon should be forced to traverse only one layer. The constraint algorithm is shown below.

As shown in Algorithm 2, first, we obtained the current position coordinates, current_pos, of the photon and the position coordinates, previous_pos, of the photon from the previous step. Then, we determined the current layer, current_layer, of the photon and the previous layer, previous_layer. Next, we checked whether the current_layer was adjacent to the previous layer (i.e., abs (current_layer − previous_layer) = 1). If they were adjacent, the midpoint between the current_pos and the previous_pos was calculated. The precise position of the midpoint was re-calculated using the photon-specific calculation method. Then, the midpoint_layer, where the new midpoint was located, was determined. The current_pos of the newly calculated midpoint and the current_layer where the new current_pos was located were updated. These steps were repeated until the current layer, current_layer, of the photon was no longer adjacent to the previous_layer.

Algorithm 2: Photon transboundary confinement

Input: photon
Output: Ensure the photon crosses the layer boundary correctly
current_pos = photon.current.position
previous_pos = photon.current-1.position
current_layer = current_pos.current.layer
previous_layer = previous_pos.current.layer
while abs(current_layer - previous_layer) == 1 do
    midpoint = (current_pos + previous_pos)/2
    midpoint = Junction calculation(photon)
    current_layer = midpoint.current.layer
    current_pos = midpoint
end while

This algorithm iteratively approximates the position using bisection to ensure that the photons pass the demarcation point when crossing the boundaries of the different layers of the apple. The key to this process is to continuously reduce the distance between the current and previous positions of the photon until it is confirmed that the photon has crossed the layer boundary. Through accurate calculations and position updates, the algorithm effectively constrains the cross-layer behaviour of photons, ensuring that they do not cross multiple layers of the three-layer apple model in a single step.
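One simple way to enforce a one-layer-per-step constraint in a slab geometry is sketched here in Python. This is a simplified stand-in and not Algorithm 2 itself: instead of repeated junction calculation, it truncates a step that would span several layers at the first interface it meets, using linear interpolation along the step. The slab layout (interfaces at fixed depths along the Z-axis) and the function names are assumptions made for illustration.

```python
import bisect


def layer_of(z, boundaries):
    """Index of the slab containing depth z; boundaries are ascending interface depths."""
    return bisect.bisect_right(boundaries, z)


def constrain_to_one_layer(prev_pos, cur_pos, boundaries):
    """Truncate a step that would span several layers at the first interface it meets."""
    z0, z1 = prev_pos[2], cur_pos[2]
    start, end = layer_of(z0, boundaries), layer_of(z1, boundaries)
    if abs(end - start) <= 1:
        return cur_pos  # at most one interface is crossed, nothing to do
    # First interface along the direction of travel (deeper or shallower).
    z_hit = boundaries[start] if z1 > z0 else boundaries[start - 1]
    t = (z_hit - z0) / (z1 - z0)  # fraction of the step travelled before the hit
    return tuple(p + t * (q - p) for p, q in zip(prev_pos, cur_pos))
```

Stopping at the interface leaves the refraction or reflection decision to the next boundary check, which keeps the thin peel layer from being skipped by a long step.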

2.2.4. Absorption and Scattering of Photons

When using the Monte Carlo method to simulate photon transmission, photons move randomly step-wise after entering biological tissues and collide with particles in the tissue, causing them to be absorbed and scattered. Absorption attenuates the photon weight, and scattering alters the direction of motion. To determine whether a photon dies after absorption or scattering, or the direction in which it continues to move if it survives, it is necessary to calculate the weight of the photon, the azimuth angle, and the scattering angle. The formula for calculating the photon residual weight ω′ is:

ω^{'} = ω \frac{μ_{s}}{μ_{a} + μ_{s}}

(1)

where ω′ is the residual weight of the photon after it has undergone absorption and scattering, ω is the weight of the photon in the previous step, μ_{a} is the absorption coefficient of the tissue, and μ_{s} is the scattering coefficient of the tissue.

Collisions between the photon and the tissue alter the direction of the photon, and the subsequent transmission direction of the photon is determined by the azimuth angle, β \in [0, 2 π], and the scattering angle, θ \in [0, π]; the calculation formula is as follows:

β = 2 π ε

(2)

where β follows a uniform distribution and ε is a random variable uniformly distributed over the interval [0, 1]. The cosine of the deflection angle, \cos θ, follows the Henyey–Greenstein distribution [41], as shown below:

p (\cos θ) = \frac{1 - g^{2}}{2 {(1 - 2 g \cos θ + g^{2})}^{3/2}}

(3)

where θ is the scattering angle, which is the angle between the original and scattered direction of the photon, cos θ is the cosine of the scattering angle, p(cos θ) is the probability distribution function for the cosine of the scattering angle, and g is the anisotropy factor, which ranges from −1 to 1.

Sampling the Henyey–Greenstein function yields θ using the following formula:

θ = \begin{cases} \arccos \left( \frac{1 + g^{2} - {\left( \frac{1 - g^{2}}{1 - g + 2 g ε} \right)}^{2}}{2 g} \right) & (g \neq 0) \\ \arccos (2 ε - 1) & (g = 0) \end{cases}

(4)

where θ is the scattering angle, ε is a random variable uniformly distributed over the interval [0, 1], and g is the anisotropy factor.
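Equations (2) and (4) can be sketched in Python as follows. This is a minimal illustration using only the standard library; the function names are hypothetical, and the clamp on cos θ is an added guard against floating-point rounding rather than part of the formulas. The sketch assumes −1 < g < 1.

```python
import math
import random


def sample_azimuth(eps):
    """Equation (2): azimuth angle, uniform on [0, 2*pi]."""
    return 2.0 * math.pi * eps


def sample_scattering_angle(eps, g):
    """Equation (4): scattering angle drawn from the Henyey-Greenstein distribution."""
    if g == 0.0:
        cos_theta = 2.0 * eps - 1.0           # isotropic scattering
    else:
        frac = (1.0 - g * g) / (1.0 - g + 2.0 * g * eps)
        cos_theta = (1.0 + g * g - frac * frac) / (2.0 * g)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding error
    return math.acos(cos_theta)                 # lies in [0, pi]


def sample_direction_angles(g, rng=random.random):
    """Draw one (azimuth, scattering angle) pair with independent random numbers."""
    return sample_azimuth(rng()), sample_scattering_angle(rng(), g)
```

A useful sanity check is that the mean of cos θ over many samples approaches g, so with g = 0.9 (the value used later in the simulations) most photons are scattered in a strongly forward direction.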

The Monte Carlo method was used to simulate the brightness map of photons that overflowed onto the apple surface. Apples with watercore have a transparent interior and lower absorption and scattering coefficients, allowing photons to penetrate the apples and overflow the surface more easily. Adjusting the absorption and scattering coefficients can simulate a brightness map displaying the overflow of photons onto the surface of apples with or without watercore. The interval range was determined according to the number of statistical samples, and the median value in each interval was set as the input for the simulation parameters.

A uniformly distributed set of 5 × 4 × 4 × 5 × 5 × 4 = 8000 optical parameter combinations was used to determine the optical parameter values for the simulation test, in which μ_{a1} was divided into five categories, μ_{a2} into four categories, μ_{a3} into four categories, μ_{s1} into five categories, μ_{s2} into five categories, and μ_{s3} into four categories. The specific classifications are listed in Table 2.
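The size of this parameter grid can be checked with a short Python sketch that enumerates every combination of category indices. The actual category values are those of Table 2 and are not reproduced here, so the sketch works with zero-based indices only; the dictionary and function names are hypothetical.

```python
from itertools import product

# Number of categories per optical parameter, as stated in the text.
LEVELS = {"mu_a1": 5, "mu_a2": 4, "mu_a3": 4, "mu_s1": 5, "mu_s2": 5, "mu_s3": 4}


def parameter_grid(levels=LEVELS):
    """Yield every combination of category indices as a dict keyed by parameter name."""
    names = list(levels)
    for combo in product(*(range(levels[name]) for name in names)):
        yield dict(zip(names, combo))
```

Each yielded dictionary would be mapped to the corresponding median values from Table 2 before being passed to the photon transport simulation.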

When setting the thickness of the three layers of apple tissue, the measured thickness of the peel layer was 0.15–0.25 mm, which had a negligible impact on the feature extraction of the core layer; therefore, only a single value of 0.2 mm was set. The measured thickness of the pulp layer ranged from 37 mm to 48 mm. For the simulations, two evenly distributed values were chosen: 40 and 45 mm. The anisotropy factor g was set to 0.9, the refractive index of air n₁ was set to 1, and the refractive indices of the peel (n₂), pulp (n₃), and core (n₄) were all set to 1.38. The number of simulated photons was 1 × 10⁵. The brightness map of the photons that flowed over the apple surface, obtained by substituting the optical characteristic parameters into the three-layer apple model, is shown in Figure 5.

The simulated image was 20 × 20 pixels, smaller than the measured spectral image but with a similar brightness distribution. It shows the photon overflow on the apple surface, which reflects the presence or absence of watercore, and can therefore be used as simulated data for model pre-training.

2.3. Data Augmentation

Data augmentation is a commonly used data processing technique [42] whose main function is to prevent overfitting and improve the generalization ability of the model. In this study, the image size was set to 224 × 224 × 3, and several augmentation methods were used, including horizontal flipping, mirror flipping, and cropping. The specific operations are as follows:

For the training set, the images were randomly cropped and the cropped images were resized to 224 × 224 pixels. The diversity of the images was increased by flipping them horizontally with 50% probability and then mirroring the flipped images with 50% probability.
For the test set, the longer side of the image was resized to 256 pixels with the aspect ratio kept constant, and a 224 × 224 pixel region was cropped from the centre of the resized image. This keeps the input size of the test set consistent with that of the training set while leaving its image features unchanged, so that the performance of the model can be evaluated accurately.
For both the training and test sets, the pixel values of the images were normalised using the mean and standard deviation values obtained from the ImageNet dataset. This gives the data a distribution that is better suited to training before they are fed into the neural network.
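The normalisation and centre-cropping steps can be sketched in plain Python. The per-channel constants are the widely used ImageNet statistics for RGB values scaled to [0, 1]; that the authors used exactly these numbers is an assumption, and the helper names are hypothetical.

```python
# Commonly used ImageNet channel statistics for RGB values scaled to [0, 1].
# Assumption: the text does not list the exact constants used in the study.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalise_pixel(rgb):
    """Normalise one RGB pixel (values in [0, 1]) channel by channel."""
    return tuple((c - m) / s for c, m, s in zip(rgb, IMAGENET_MEAN, IMAGENET_STD))


def centre_crop_box(width, height, size=224):
    """Return (left, top, right, bottom) of a size x size crop centred in the image."""
    left = (width - size) // 2
    top = (height - size) // 2
    return left, top, left + size, top + size
```

In a PyTorch pipeline these operations would normally be delegated to the image transform utilities of the framework rather than written by hand.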

These data augmentation methods help the model to better learn the important features in the simulated brightness map and improve the accuracy and robustness of the model in detecting apple watercore.

2.4. Experimental Method

2.4.1. MobileNetV3 Model and Dilated Convolution

MobileNetV3 is a network architecture introduced by Howard et al. in 2019 as an improvement over MobileNetV1 and MobileNetV2. It comes in two variants, MobileNetV3_large and MobileNetV3_Small, which target high and low computing and storage budgets, respectively. Because Swish consumes too many resources on mobile devices, MobileNetV3 replaces it with h-Swish, which closely approximates Swish while greatly reducing the amount of computation required.

Because h-Swish is an approximation of Swish, it eliminates the need for complex exponential operations during computation, significantly reducing the computational load. Retaining the computational simplicity of the rectified linear unit (ReLU) while offering smooth nonlinear characteristics similar to Swish, h-Swish helps the model capture more complex patterns, thereby improving the detection of watercore disease. Additionally, it enhances inference speed and reduces computational overhead while keeping the model lightweight, which is especially important for processing hyperspectral images.
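The relationship between the two activations can be made concrete with a small sketch using their standard definitions, Swish(x) = x · sigmoid(x) and h-Swish(x) = x · ReLU6(x + 3) / 6. The function names are chosen here for illustration.

```python
import math


def relu6(x: float) -> float:
    """ReLU clipped to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


def swish(x: float) -> float:
    """Swish activation: x * sigmoid(x), which needs an exponential."""
    return x / (1.0 + math.exp(-x))


def h_swish(x: float) -> float:
    """Hard-Swish: piecewise approximation of Swish without an exponential."""
    return x * relu6(x + 3.0) / 6.0
```

h-Swish is exactly zero for x ≤ -3 and equal to x for x ≥ 3; in between it follows Swish closely while using only additions, multiplications, and clipping.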

Adding the squeeze-and-excitation (SE) structure [43] to MobileNetV3 allows the model to focus on important feature information and ignore irrelevant information during image recognition. Because the channel weights are adjusted continuously, important features can be extracted under different circumstances, which improves the classification accuracy. The model can therefore better identify the key information in the surface brightness map of an apple, determine whether the apple has watercore, and grade the degree of damage from the watercore. Moreover, the number of channels in the expansion layer containing the SE structure was reduced, which improved the accuracy of the model without affecting the overall latency. The model can thus quickly and accurately identify whether an apple has watercore.
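A minimal, framework-free sketch of the channel reweighting performed by an SE block is given below: global average pooling (squeeze), two small fully connected layers (excite), and a per-channel rescaling. The weights are placeholders supplied by the caller, and the hard-sigmoid gate follows the usual MobileNetV3 design; neither is taken from the trained model in this study.

```python
def hard_sigmoid(x):
    """Piecewise-linear gate in [0, 1]: ReLU6(x + 3) / 6."""
    return min(max(x + 3.0, 0.0), 6.0) / 6.0


def linear(vec, weights, bias):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]


def squeeze_excite(feature_maps, w1, b1, w2, b2):
    """Reweight the channels of `feature_maps` (a list of H x W nested lists)."""
    # Squeeze: global average pooling gives one number per channel.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_maps]
    # Excite: bottleneck layer with ReLU, then expansion with a hard-sigmoid gate.
    hidden = [max(h, 0.0) for h in linear(squeezed, w1, b1)]
    scales = [hard_sigmoid(s) for s in linear(hidden, w2, b2)]
    # Scale: multiply every value of a channel by that channel's gate.
    return [[[v * s for v in row] for row in ch] for ch, s in zip(feature_maps, scales)]
```

Channels whose gate is close to 1 pass through almost unchanged, whereas channels with a gate close to 0 are suppressed.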

Figure 6 illustrates the MobileNetV3 network, which begins with a 1 × 1 convolution to expand channels. Depthwise (Dwise) separable convolutions follow, reducing computation by separating channel operations. The SE module enhances features through global average pooling and channel reweighting, with skip connections to match dimensions. Finally, 1 × 1 convolutions, global average pooling, and fully connected (FC) layers classify the output, with the Non-Local module further improving performance.

The MobileNetV3 network model can be optimised without compromising recognition accuracy. This optimization increases the recognition rate while reducing the number of required model parameters. In this study, the smaller variant, MobileNetV3_Small, was selected. The structure of the MobileNetV3_Small network is presented in Table 3.

Dilated convolution is a variation of traditional convolution in which the receptive field is expanded by inserting gaps (holes) between the kernel elements [44]. When learning the apple surface brightness map, the network can better capture wide-ranging brightness variations and global information. Dilated convolution preserves detailed information, such as light spots and shadows, while enlarging the receptive field. Compared with large kernels, it is more efficient, requiring fewer parameters and computations to achieve the same receptive field. This efficiency helps reduce the complexity of the model for detecting watercore in apples.

In this study, dilated convolution was introduced into the last three bneck modules of the MobileNetV3 network, with dilation rates R of 2, 2, and 4, respectively. The effect of dilated convolution is shown in Figure 7, where the red dots represent the convolutional points, and the blue grids indicate the receptive fields.
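The growth of the receptive field with the dilation rate can be illustrated with a short sketch. A kernel of size k with dilation rate R spans k + (k − 1)(R − 1) input positions; the one-dimensional convolution below is a simplified stand-in for the two-dimensional layers of the network, and the kernel size used in the usage note is only an example.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Number of input positions spanned by a dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation):
    """Valid (no padding) 1-D convolution whose taps are `dilation` samples apart."""
    span = (len(kernel) - 1) * dilation
    return [
        sum(kernel[j] * signal[i + j * dilation] for j in range(len(kernel)))
        for i in range(len(signal) - span)
    ]
```

For example, a 3-tap kernel covers 5 positions at R = 2 and 9 positions at R = 4 while still using only three weights.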

2.4.2. Transfer Learning

Transfer learning [45] is a method of applying a trained model or algorithm to a new domain to address situations where a better model cannot be trained due to insufficient data. In transfer learning, the parameters of different layers can be frozen as needed. Freezing strategies are categorised into three types [46]: Freezing, Partial Freezing, and No Freezing.

In Freezing [47], the parameters of all layers in the transfer learning model are frozen. In Partial Freezing [48], only the parameters of some layers are frozen, while the parameters of the other layers are updated during training. In No Freezing [49], the parameters of all layers are updated during training.
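The three strategies differ only in which layers remain trainable, which the following sketch expresses as a list of per-layer flags. The function name, the strategy labels, and the convention that partial freezing unfreezes the deepest layers first are illustrative assumptions; in PyTorch, a flag of False would typically correspond to setting `requires_grad = False` on the parameters of that layer.

```python
def trainable_flags(num_layers, strategy, num_unfrozen=0):
    """Return one boolean per layer (shallow to deep): True means the layer is updated."""
    if strategy == "freezing":
        return [False] * num_layers
    if strategy == "no_freezing":
        return [True] * num_layers
    if strategy == "partial_freezing":
        if not 0 < num_unfrozen < num_layers:
            raise ValueError("num_unfrozen must be between 1 and num_layers - 1")
        # Keep the shallow layers frozen and update only the deepest ones.
        first_trainable = num_layers - num_unfrozen
        return [i >= first_trainable for i in range(num_layers)]
    raise ValueError(f"unknown strategy: {strategy}")
```

With eight layers and five of them unfrozen, for instance, the first three flags are False and the last five are True.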

In this study, the pre-training dataset consisted of simulated brightness maps of photons overflowing the surface of an apple, which were obtained by simulation according to the optical parameter values of the measured apple samples. Their distribution was similar to that of the measured hyperspectral images, and the simulation dataset was larger than the measured dataset; therefore, the model-based transfer learning method was used. After pre-training was completed, the Freezing, Partial Freezing, and No Freezing transfer learning methods were compared, and the method with the highest accuracy was selected for model transfer so that the model could adapt to the measured dataset and accurately detect apple watercore with a small number of parameters.

2.4.3. Comparison Algorithm

To comprehensively evaluate the performance of our model in detecting apple watercore, we selected multiple comparison models, including SVM from machine learning, heavyweight networks such as ResNet50, VGG16, and InceptionNetV3, as well as lightweight networks like ShuffleNetV2 and MobileNetV3_large. The following are the characteristics of each model:

SVM [50] is a classical machine learning method that excels at handling small samples and high-dimensional data. It is used to compare the improvement of deep learning models in feature extraction and classification tasks.
ResNet50 [51] is a deep residual network that addresses the vanishing gradient problem in deep neural networks by introducing residual connections. It serves as a good reference for evaluating the performance of new models in complex tasks.
VGG16 [52] increases the depth of the network by stacking small convolutional kernels to enhance feature extraction capabilities. It performs well in large-scale image classification tasks and is a classic benchmark in comparative experiments.
ShuffleNetV2 is a lightweight neural network whose efficiency and low computational requirements make it ideal for comparison with MobileNetV3_Small, especially in resource-constrained environments.
InceptionNetV3 [53] utilises the Inception module, whose complex architecture and multi-scale feature extraction capabilities provide a contrast with our model in terms of feature diversity and complexity.
MobileNetV3_large is another version of MobileNetV3, and comparing its performance with our model helps us understand the trade-off between model complexity and performance.

2.4.4. Evaluation Index

Accuracy was used to assess the proportion of correctly predicted samples, that is, the ratio of the number of correctly identified samples to the total number of samples in a classification task. Precision, recall, and the F1-score were additionally used to evaluate the performance of the model. Precision measures the correctness of positive predictions, that is, the proportion of correctly predicted positive samples among all samples predicted as positive. Recall measures the ability of the model to identify actual positive samples, that is, the proportion of correctly predicted positive samples among all actual positive samples. The F1-score is the harmonic mean of precision and recall and reflects the robustness of the model. These evaluation metrics are calculated as follows:

\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}

(5)

\mathrm{Precision} = \frac{TP}{TP + FP}

(6)

\mathrm{Recall} = \frac{TP}{TP + FN}

(7)

F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}

(8)

where TP is the number of true-positive samples, FP is the number of false-positive samples, FN is the number of false-negative samples, and TN is the number of true-negative samples.
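Equations (5)–(8) translate directly into code. The sketch below computes all four metrics from the confusion counts; the function name and the convention of returning 0 when a denominator is zero are assumptions made for illustration, and the counts in the usage note are invented.

```python
def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute accuracy, precision, recall, and F1 from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Return 0.0 instead of dividing by zero when no positives are predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

With the illustrative counts TP = 8, TN = 9, FP = 1, and FN = 2, the accuracy is 17/20, the precision is 8/9, the recall is 8/10, and the F1-score is 16/19.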

2.5. Experimental Environment and Parameter Setting

Simulation data were generated using MATLAB R2015b on Windows 11. The MobileNetV3 network was trained with Python 3.7.16 and PyTorch 1.13.1. The hardware environment was a 12th Gen Intel(R) Core(TM) i7-12700H CPU (2.30 GHz) with 16 GB of RAM. The graphics processors were an NVIDIA GeForce RTX 3060 and Intel(R) Iris(R) Xe Graphics, with video memory (VRAM) sizes of 6 GB and 7.8 GB, respectively.

The processed images were converted to tensors and normalised to a 224 × 224 RGB image size. The number of input images was 30,000, and the training and test sets were divided at a ratio of 4:1 for model training. The batch size was 256, the learning rate was 0.0001, the Adam optimiser was used, and training was completed after 2000 iterations. For transfer learning, the measured data were randomly divided into a training set and a test set at a ratio of 3:1. The measured images were segmented into images of 30 × 30 pixels, giving a total of 1340 images (134 × 10). The classification results of the transfer learning model were taken after 200 iterations.
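The split sizes implied by these ratios can be worked out with a small sketch. The helper names and the fixed random seed are assumptions; the text does not state how the random division was implemented.

```python
import random


def split_counts(total: int, train_parts: int, test_parts: int):
    """Return (train, test) sizes for a train_parts:test_parts split of `total` items."""
    train = total * train_parts // (train_parts + test_parts)
    return train, total - train


def random_split(items, train_parts: int, test_parts: int, seed: int = 0):
    """Shuffle a copy of `items` and cut it at the train/test boundary."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train, _ = split_counts(len(shuffled), train_parts, test_parts)
    return shuffled[:n_train], shuffled[n_train:]
```

A 4:1 split of the 30,000 simulated images gives 24,000 training and 6000 test images, and a 3:1 split of the 1340 measured images gives 1005 training and 335 test images.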

3. Results and Discussion

3.1. Data Augmentation Effect and Comparative Analysis

This study employed four data augmentation techniques [54]: random horizontal flipping, mirror flipping, centre cropping, and normalisation using the mean and standard deviation values obtained from the ImageNet dataset. Data augmentation plays a crucial role in improving how well a model trained on simulated data generalises to real-world scenarios and in mitigating the risk of overfitting. By introducing diverse transformations into the training data, data augmentation exposed the model to a wider range of inputs during training, thereby improving its performance on unseen data. The added randomness and variability prevent the model from memorising specific details of the training data (overfitting) and ultimately enhance its ability to handle the uncertainties in the collected data. This section compares the accuracy and loss values of the training and test sets with and without data augmentation.

Figure 8 illustrates the differences between the training and validation sets with and without data augmentation for the three-class classification case. As shown in Figure 8a, without data augmentation the accuracy on the training set was relatively low (82.61%), and the validation curve was highly unstable, with a highest accuracy of 75.78%. For the un-augmented model, the training accuracy exceeds the validation accuracy, indicating an overfitting problem in which the model generalises poorly to unseen data. This issue is resolved after applying data augmentation, demonstrating that data augmentation can effectively address overfitting and enhance the robustness and generalization ability of the model. After data augmentation, the accuracy reached 96.77% on the training set and 94.83% on the validation set; thus, the accuracy of the model was significantly improved. In addition, the loss decreased faster and converged more consistently between the training and validation sets. The initial loss was lower than that without data augmentation, and the loss was reduced to 0.0327 after 700 rounds (Figure 8b), with the curve stabilising after 600 rounds. Cang et al. [55] combined data augmentation and GAN technology to improve the accuracy of winter jujube quality detection by 13.27%, which likewise indicates that data augmentation can improve the accuracy of a model and reduce its loss. Overall, these results show that the model with data augmentation outperformed the model without it. Moreover, no overfitting occurred as the model accuracy increased and the loss decreased. Data augmentation thus leads to faster convergence and a higher generalisation ability.

3.2. Comparison of the Model Pre-Training Algorithms

To further verify the effectiveness of the improved MobileNetV3 network model proposed in this study, it was compared with the SVM machine learning method, the heavyweight networks ResNet50, VGG16, and InceptionNetV3, and the lightweight networks ShuffleNetV2 and MobileNetV3_large (Table 4). The comparison models were run with their original frameworks and parameter settings. The experiments were conducted in two-, three-, and four-class scenarios, and the results were obtained on a test set of simulated data.

Our method was superior in accuracy to the traditional machine learning, heavyweight network, and lightweight network models. In the binary classification, our model had an accuracy of 99.05% and a loss of 0.0291. Compared with SVM, ResNet50, VGG16, InceptionNetV3, ShuffleNetV2, and MobileNetV3_large, the accuracy improved by 1.19% to 18.89%, and the loss was reduced by 0.0457 to 0.365. The precision of our model improved by 1.24% to 18.67%, the recall by 0.66% to 19.46%, and the F1-score by 0.91% to 18.84%.

In the three-class classification, our model had an accuracy of 96.77% and a loss of 0.0327. Compared with the other models, the accuracy improved by 1.32% to 23.59%, and the loss was reduced by 0.0585 to 0.3988. The precision improved by 1.13% to 23.61%, the recall by 0.49% to 24.48%, and the F1-score by 0.96% to 23.69%.

In the four-class classification, our model had an accuracy of 94.45% and a loss of 0.1048. Compared with the other models, the accuracy improved by 0.83% to 22.46%, and the loss was reduced by 0.071 to 0.3771. The precision improved by 0.99% to 22.38%, the recall by 0.72% to 22.53%, and the F1-score by 0.77% to 22.19%. Across all classification cases, the accuracy and the other indices of our model were better than those of the comparison models.

Li et al. [56] proved that the internal features of apples can be learned by CNNs to realise the inversion of apple optical characteristic parameters. For apple watercore disease, changes in optical properties may be localised but can affect a larger area. Dilated convolution is adept at capturing these wide-ranging feature variations and thus, could identify the presence of watercore more accurately. Incorporating dilated convolution into MobileNetV3 enhances its feature extraction capabilities without significantly increasing its computational complexity, leading to improved prediction accuracy. MobileNetV3’s flexibility allows it to adapt to different input sizes and network depths. When combined with dilated convolution, it can be optimised to better detect the specific optical properties associated with apple watercore disease. Therefore, the DC-MobileNetV3 model is more suitable for pre-training.

To study the differences and trends in the accuracy and loss of the different models, the accuracy and loss curves over the first 2000 epochs were plotted (Figure 9). Figure 9a,c,e show that, in all three classification cases, the accuracy curve of the DC-MobileNetV3 model rose more rapidly than those of the other algorithms and remained stable after convergence.

The results showed that DC-MobileNetV3 has a higher accuracy whilst requiring fewer model parameters, and the introduction of the dilated convolution module extracted better image-related features and improved the accuracy of the model. Figure 9b,d,f show that as the number of iterations increased, the loss values of the other comparison models decreased slowly. The loss value of the DC-MobileNetV3 model decreased the fastest and quickly stabilised. This indicates the high robustness of the model. Combined with accuracy and loss convergence, these results suggest that the DC-MobileNetV3 model is optimal as a pre-trained model.

3.3. Comparison Experiments of Transfer Learning Methods

After pre-training, the pre-trained model was transferred to the measured data using transfer learning. The network layers and the fully connected layer of the DC-MobileNetV3 model were first frozen, and the layers were then unfrozen sequentially from deep to shallow [57]; the performance at each step was compared to select the optimal adjustment method. According to previous studies [58,59,60], freezing the network layers and unfreezing the last 10 layers of the model optimises its performance. As shown in Table 5, the number of model parameters gradually increased as more network layers were unfrozen. At the same time, the loss of the model decreased, and the accuracy, precision, recall, and F1-score increased. The model performance was optimal when unfreezing reached the fifth-to-last layer: the accuracy was 97.60%, the precision was 97.24%, the recall was 97.33%, the F1-score was 96.92%, and the number of model parameters was 7.52 M. Varying the number of unfrozen network layers led to different feature-learning behaviour. Freezing the other layers and training only the last five network layers was the most suitable Partial Freezing strategy in this experiment. In the following, different transfer strategies are compared to select the most suitable one for this experiment.

Table 6 compares the results of the different transfer learning methods for the different classification cases. The model performed best when part of the network was frozen, with an accuracy of 99.13% for two-class, 97.60% for three-class, and 95.32% for four-class classification. With no network layers frozen, the accuracies were 98.55%, 96.20%, and 93.97% for the two-class, three-class, and four-class classifications, respectively. When all network layers were frozen, the model performed the worst, with accuracies of 92.25%, 89.70%, and 87.53%, respectively. Freezing is the least effective of the three transfer learning methods because, although the simulated data used to pre-train the model are as close as possible to the real data, some differences remain, and loading only the model weights without updating the network layer parameters is not as effective as the other two methods. Yang et al. [61] reached the same conclusion in their study on crop disease sorting detection. By comparing different transfer learning and non-transfer methods, it was demonstrated that partial freezing of the network layers resulted in better performance than full freezing or no freezing. This finding further supports our conclusion that partial freezing of the network layers is more effective.

3.4. Ablation Experiment

In this study, simulated luminance maps based on the inversion of optical properties were used to pre-train the model, and dilated convolution and transfer learning were introduced based on MobileNetV3. To verify the influence of dilated convolution and transfer learning on the experimental results, ablation experiments were performed on the measured dataset, and the results are listed in Table 7.

Introducing dilated convolution into the last three bneck modules of the model improved the accuracy by 8.75% in the two-class classification, 8.21% in the three-class classification, and 8.92% in the four-class classification, but increased the number of model parameters by 21.5 M. The results show that brightness information can be better extracted by expanding the receptive field. Xu et al. [62] also introduced dilated convolution for apple leaf disease identification and observed similar results, which indicates the reliability of dilated convolution in expanding receptive fields and improving model accuracy. After the introduction of transfer learning, the accuracy increased to 99.13% for the two-class classification, which was 17.9% and 8.55% higher than that of MobileNetV3 and MobileNetV3 with dilated convolution, respectively. For the three-class classification, the accuracy reached 97.60%, which was 18.84% and 10.63% higher than that of MobileNetV3 and MobileNetV3 with dilated convolution, respectively. For the four-class classification, the accuracy reached 95.32%, which was 22.89% and 13.97% higher than that of MobileNetV3 and MobileNetV3 with dilated convolution, respectively. In addition, after the introduction of transfer learning, the number of parameters increased by only 4.98 M compared with that of MobileNetV3 and decreased by 16.58 M compared with that of MobileNetV3 with dilated convolution.

The model with the transfer learning method exhibited the best performance, with an accuracy of 99.13%, precision of 98.67%, recall rate of 98.51%, and an F1-score of 98.70% for binary classification. For the three-class category, the accuracy rate was 97.60%, the precision rate was 97.19%, the recall rate was 97.23%, and the F1-score was 97.41%. For the four-class category, the accuracy rate was 95.32%, the precision rate was 94.91%, the recall rate was 94.95%, and the F1-score was 95.13%. These results show that the transfer learning method effectively improves the detection of the apple watercore. Dou et al. [63] reached a similar conclusion in their study on the classification of the degree of citrus Huanglongbing, proving the effectiveness of transfer learning in improving model accuracy.

3.5. Confusion Matrix

The confusion matrix is typically used to analyse the classification and prediction results of a model in machine learning, and the performance of the classifier is closely related to the final performance of the model.

The confusion matrices shown in Figure 10 are derived from the MobileNetV3 + dilated convolution + transfer learning model. In the figure, each row of the confusion matrix represents the predicted class and each column represents the actual class. Each value indicates the probability that a sample of the column (actual) class is predicted as the row class, with the diagonal values representing correct predictions. In the three-class classification, the categories are normal, mild, and severe watercore. In the four-class classification, the categories are normal, mild, moderate, and severe watercore.
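A confusion matrix with this layout (rows for predicted classes, columns for actual classes) can be built as in the sketch below. The function names and the column-wise normalisation are illustrative assumptions, and the labels in the usage note are invented rather than taken from Figure 10.

```python
def confusion_matrix(actual, predicted, num_classes):
    """Count matrix with rows = predicted class and columns = actual class."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for a, p in zip(actual, predicted):
        matrix[p][a] += 1
    return matrix


def column_normalise(matrix):
    """Divide each column by its total so that every actual class sums to 1."""
    n = len(matrix)
    totals = [sum(matrix[r][c] for r in range(n)) for c in range(n)]
    return [
        [matrix[r][c] / totals[c] if totals[c] else 0.0 for c in range(n)]
        for r in range(n)
    ]
```

For actual labels [0, 0, 1, 1, 2, 2] and predictions [0, 0, 1, 2, 2, 2], the single off-diagonal count lies in row 2, column 1: one sample of actual class 1 was predicted as class 2.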

Figure 10a demonstrates that, in the two-class classification, the model recognition accuracy reached 100%, which means that the presence or absence of watercore can be accurately detected. In the three-class condition, a severe watercore sample was misidentified as moderate watercore. In the four-class condition, one mild watercore sample was misclassified as moderate, one moderate watercore sample was misclassified as severe, and one severe watercore sample was misclassified as moderate. Further analysis revealed that the misclassified images lay at the boundary between two categories. The optical property parameters and DC-MobileNetV3 can be used to determine the degree of watercore in apples; however, the recognition accuracy depends on how the different degrees of watercore are defined.

Class imbalance, where one class significantly outnumbers the others, can skew the interpretation of a model’s performance. In such cases, relying solely on accuracy can be misleading, as a classifier might achieve high accuracy by predominantly predicting the majority class and neglecting the minority class. To address this, receiver operating characteristic (ROC) curves [64] are used as a more robust method for evaluating classifier performance in the presence of class imbalance. Unlike accuracy, ROC curves are not affected by class imbalance, as they focus on the trade-off between the true positive rate (TPR) and false positive rate (FPR). This allows for a more comprehensive assessment of the model’s performance across all classification thresholds. By plotting the ROC curves of multiple classifiers, their performances can be visually compared, with the area under the curve (AUC) [65] serving as a single, interpretable metric. The larger the AUC, the better the classifier’s overall performance, providing a reliable basis for model comparison even in imbalanced datasets.
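The AUC can be computed without drawing the curve, because it equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below uses this pairwise formulation together with a one-vs-rest relabelling for multi-class problems; the function names are hypothetical and this is not necessarily how the AUC values in the study were obtained.

```python
def roc_auc(labels, scores):
    """AUC for binary labels (1 = positive, 0 = negative) via pairwise comparison."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # A positive scored above a negative counts 1, a tie counts 0.5.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def one_vs_rest_labels(class_ids, positive_class):
    """Turn multi-class labels into binary labels for one class against the rest."""
    return [1 if c == positive_class else 0 for c in class_ids]
```

For labels [0, 0, 1, 1] and scores [0.1, 0.4, 0.35, 0.8], three of the four positive-negative pairs are ranked correctly, giving an AUC of 0.75.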

The results of the three-class and four-class classifications offer a more nuanced reflection of the model’s performance across varying levels of complexity. These approaches allow for a deeper understanding of how the model handles more intricate classification tasks, providing insights that are more aligned with real-world scenarios. By using ROC curves for three- and four-class classifications, we can visually compare the model’s performance across tasks of varying complexity. The distinguishing characteristics of watercore are most prominent between two and four categories. Beyond four classes, the features between categories may become more similar and harder to differentiate, increasing model complexity while compromising classification accuracy.

As shown in Figure 11, the abscissa of the ROC curve is the FPR and the ordinate is the TPR. Both curves lie close to the upper left corner of the graph, and the AUC values for the three-class and four-class classifications were 1.00 and 0.99, respectively, indicating that the model has high accuracy and strong generalization ability in both cases. This is attributed to the fact that MobileNetV3 can accurately distinguish the optical properties of different classes, whereas dilated convolution is effective for detecting changes in the optical properties of apple watercore because the feature changes in the lesion region may be local but have a significant impact on the whole image. In addition, through pre-training on the simulated data, the model learned to recognise the optical characteristics of watercore before practical application.

Through transfer learning and fine-tuning of the measured dataset, the model can better adapt to the characteristics of the actual data and further improve the classification accuracy. The model exhibited high sensitivity and specificity for detecting different watercore severities in apples.

4. Conclusions

Apples with watercore have higher economic value owing to their unique texture and sweetness; however, severe cases can lead to a higher incidence of browning. Thus, non-destructive detection methods for watercore are crucial for minimising economic losses and enhancing revenue. This study proposes a MobileNetV3 model enhanced by optical parameter inversion and transfer learning for effective apple watercore detection. In this method, a hyperspectral image of an apple was acquired using a hyperspectral imaging system, and the optical characteristic parameters of the apple were calculated using a double-integrating sphere system. Through Monte Carlo simulations, a three-layer plate model of an apple was constructed, and a simulated brightness map was generated by simulating the movement of photons in the apple tissue. The MobileNetV3 model was improved by adding dilated convolution and using simulation data to pre-train the model.

Based on the simulation data, the accuracy of DC-MobileNetV3 reached 99.05%, 96.77%, and 94.45% for two-, three-, and four-class classifications, respectively. At 18.89 M, the model was the smallest of the models compared, and its other evaluation indicators were also better than those of the other models. Transfer learning was then performed on the measured dataset, and the transfer method was fine-tuned. Comparative experiments showed that the transfer effect was best when only the last five layers were unfrozen and all other layers were frozen. In the two-class classification, the accuracy reached 99.13%, and the number of model parameters was 7.52 M. These results show that the proposed apple watercore detection method is both lightweight and highly accurate, effectively leveraging Monte Carlo simulations of photon movement and the efficiency of MobileNetV3 in processing hyperspectral images for precise disease detection. Collectively, our results demonstrate the effectiveness of the DC-MobileNetV3 model based on the inversion of the optical property parameters for detecting watercore in apples. Therefore, the proposed method is suitable for the accurate non-destructive detection of apple watercore.

Furthermore, this method helps reduce economic losses and improves the economic efficiency of apples. In future research, the model will be applied to additional types of internal diseases, and we will investigate the use of more accessible and widely available equipment to enhance the practicality of the detection process. Higher-performance photon transmission models will also be constructed for different varieties of apples to obtain higher-quality simulation brightness maps and improve the effectiveness of internal disease detection in apples.

Author Contributions

Z.C.: Conceptualization; Writing—original draft; Writing—review and editing; investigation; methodology. H.W.: Writing—review and editing; methodology; funding acquisition; project administration. J.W.: Writing—review and editing; methodology; project administration. H.X.: Methodology; project administration; funding acquisition. N.M.: Investigation; data curation. S.Z.: Investigation; data curation. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China “Inversion of optical parameters of multilayer tissue of 3D apple model for hyperspectral quality detection” [grant number 31601545].

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Data will be made available on request.

Acknowledgments

This research was conducted at the College of Artificial Intelligence, Nanjing Agricultural University, Nanjing, Jiangsu, China.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Abbreviations

NIR, Near-Infrared Spectroscopy; CNNs, Convolutional Neural Networks; VNIR, Visible and Near-Infrared; IAD, Inverse Adding-Doubling; SE, Squeeze-and-Excitation; F1-score, Harmonic Mean of Precision and Recall; TP, Number of True-Positive Samples; FP, Number of False-Positive Samples; FN, Number of False-Negative Samples; TN, Number of True-Negative Samples; FC, Fully Connected; Dwise, Depthwise; NL, Nonlinear Activation Function Used; HS, Hard-Swish Activation Function; ReLU, Rectified Linear Unit; RE, ReLU Activation Function; NBU, Not Be Used; ROC, Receiver Operating Characteristic; TPR, True Positive Rate; FPR, False Positive Rate; AUC, Area Under the Curve.

Figure 1. Cross-section images of the equatorial plane of apples with (a) mild watercore, (b) moderate watercore, and (c) severe watercore.

Figure 2. Schematic diagram of the integrating sphere system. 1. Light source. 2. Diaphragm. 3. Reflection sphere. 4. Sample holder. 5. Apple sample. 6. Transmission sphere. 7. Optical fibre. 8. Spectrometer. 9. PC.

Figure 3. Schematic diagram of the three-layer biological tissue model and photon trajectories.

Figure 4. Photon transmission process based on the Monte Carlo method.

Figure 5. Simulated brightness map of apples.

Figure 6. MobileNetV3 network structure.

Figure 7. Diagram of dilated convolution. (a) R = 1: Standard convolution with no dilation. (b) R = 2: Dilated convolution with an expansion rate of 2. (c) R = 4: Dilated convolution with an expansion rate of 4.
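The effect of the expansion rate in Figure 7 can be stated as a short worked formula. This is the standard relation for dilated convolution; the 3 × 3 kernel size is an assumption made here for illustration, as the caption does not state it. A $k \times k$ kernel with expansion rate $R$ covers an effective window of side

$$
k_{\mathrm{eff}} = k + (k - 1)(R - 1),
$$

so that for $k = 3$:

$$
R = 1:\ k_{\mathrm{eff}} = 3, \qquad R = 2:\ k_{\mathrm{eff}} = 5, \qquad R = 4:\ k_{\mathrm{eff}} = 9 .
$$

The number of weights stays at $k^2 = 9$ in all three cases; only the spacing between sampled positions grows.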

Figure 8. Accuracy and loss curve comparison with or without data augmentation operations for the three classification conditions. (a) Comparison of the accuracy curves of the models with or without data augmentation; (b) comparison of the loss curves of the models with or without data augmentation.

Figure 9. Comparison of the accuracies and losses of the pre-training network models. (a) Accuracies in the two-class classification case. (b) Losses in the two-class classification case. (c) Accuracies in the three-class classification case. (d) Losses in the three-class classification case. (e) Accuracies in the four-class classification case. (f) Losses in the four-class classification case.

Figure 10. Confusion matrices for the different classification cases. (a) Two-class (b) Three-class (c) Four-class.

Figure 11. ROC curves of the measured results. (a) ROC curve of the three-class classification (solid line); (b) ROC curve of the four-class classification (dotted line). The dotted lines indicate the ROC curves for different class categorizations.

Table 1. Standard criteria for each category in the case of multiple categories.

Classification	Classification Criteria
Two-class	Area = 0; Area ≠ 0
Three-class	Area = 0; Area ≤ 0.15; Area > 0.15
Four-class	Area = 0; Area ≤ 0.1; 0.1 < Area ≤ 0.2; Area > 0.2
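The criteria in Table 1 can be sketched as a small labelling function. The function name and the 0-based class indices are hypothetical, and "Area" is assumed here to be a non-negative watercore area measure on the same scale as the thresholds in the table; the table itself does not state the unit.

```python
def watercore_class(area, n_classes):
    """Return the 0-based class index for a watercore area (Table 1 criteria).

    Class 0 is always "no watercore" (Area = 0); higher indices correspond
    to larger watercore areas.
    """
    if area < 0:
        raise ValueError("area must be non-negative")
    # Inclusive upper bound of every class except the last one.
    upper_bounds = {
        2: [0.0],
        3: [0.0, 0.15],
        4: [0.0, 0.1, 0.2],
    }[n_classes]
    for index, bound in enumerate(upper_bounds):
        if area <= bound:
            return index
    return len(upper_bounds)

if __name__ == "__main__":
    for scheme in (2, 3, 4):
        print(scheme, [watercore_class(a, scheme) for a in (0.0, 0.05, 0.15, 0.25)])
```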

Table 2. Optical parameter classifications.

Optical Parameter	Interval Range	Value (mm⁻¹)
$\mu_{a1}$	[0.40, 1.60)	1.00
	[1.60, 2.80)	2.20
	[2.80, 4.00)	3.40
	[4.00, 5.20)	4.60
	[5.20, 6.40)	5.80
$\mu_{a2}$	[0.03, 2.20)	1.10
	[2.20, 4.40)	3.30
	[4.40, 6.60)	5.50
	[6.60, 8.70)	7.70
$\mu_{a3}$	[0.01, 0.65)	0.30
	[0.65, 1.30)	0.90
	[1.30, 1.95)	1.50
	[1.95, 2.70)	2.10
$\mu_{s1}$	[1.69, 60.00)	30.00
	[60.00, 75.00)	67.50
	[75.00, 90.00)	82.50
	[90.00, 110.00)	100.00
	[110.00, 260.00)	190.00
$\mu_{s2}$	[0.01, 15.00)	7.50
	[15.00, 30.00)	22.50
	[30.00, 45.00)	37.50
	[45.00, 60.00)	52.50
	[60.00, 75.00)	67.50
$\mu_{s3}$	[0.01, 7.50)	5.00
	[7.50, 15.00)	12.50
	[15.00, 22.50)	20.00
	[22.50, 30.00)	27.50
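The lookup implied by Table 2 (a measured coefficient falls into a half-open interval and is replaced by that interval's listed value) can be sketched as follows. The dictionary keys and the function name are hypothetical; the interval edges and values are transcribed from the table as given.

```python
from bisect import bisect_right

# name -> (interval edges, value assigned to each interval), all in mm^-1.
TABLE_2 = {
    "mu_a1": ([0.40, 1.60, 2.80, 4.00, 5.20, 6.40], [1.00, 2.20, 3.40, 4.60, 5.80]),
    "mu_a2": ([0.03, 2.20, 4.40, 6.60, 8.70], [1.10, 3.30, 5.50, 7.70]),
    "mu_a3": ([0.01, 0.65, 1.30, 1.95, 2.70], [0.30, 0.90, 1.50, 2.10]),
    "mu_s1": ([1.69, 60.00, 75.00, 90.00, 110.00, 260.00],
              [30.00, 67.50, 82.50, 100.00, 190.00]),
    "mu_s2": ([0.01, 15.00, 30.00, 45.00, 60.00, 75.00],
              [7.50, 22.50, 37.50, 52.50, 67.50]),
    "mu_s3": ([0.01, 7.50, 15.00, 22.50, 30.00], [5.00, 12.50, 20.00, 27.50]),
}

def representative_value(name, measured):
    """Map a measured coefficient to the value listed for its interval."""
    edges, values = TABLE_2[name]
    # Intervals are half-open: [lower, upper).
    if not edges[0] <= measured < edges[-1]:
        raise ValueError(f"{name}={measured} is outside the range of Table 2")
    return values[bisect_right(edges, measured) - 1]

if __name__ == "__main__":
    print(representative_value("mu_a1", 1.7))
    print(representative_value("mu_s1", 59.9))
```

Using `bisect_right` keeps a value that sits exactly on a boundary, such as 1.60 for the first absorption coefficient, in the interval that starts at that boundary.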

Table 3. MobileNetV3_small network structure.

Input	Operator	Exp Size	Output	SE	NL	Step Size
224 × 224 × 3	conv2d,3 × 3	-	16	-	HS	2
112 × 112 × 16	bneck,3 × 3	16	16	√	RE	2
56 × 56 × 16	bneck,3 × 3	72	24	-	RE	2
28 × 28 × 24	bneck,3 × 3	88	24	-	RE	1
28 × 28 × 24	bneck,5 × 5	96	40	√	HS	2
14 × 14 × 40	bneck,5 × 5	240	40	√	HS	1
14 × 14 × 40	bneck,5 × 5	240	40	√	HS	1
14 × 14 × 40	bneck,5 × 5	120	48	√	HS	1
14 × 14 × 48	bneck,5 × 5	144	48	√	HS	1
14 × 14 × 48	bneck,5 × 5	288	96	√	HS	2
7 × 7 × 96	bneck,5 × 5	576	96	√	HS	1
7 × 7 × 96	bneck,5 × 5	576	96	√	HS	1
7 × 7 × 96	conv2d,1 × 1	-	576	√	HS	1
7 × 7 × 576	pool,7 × 7	-	-	-	-	1
1 × 1 × 576	conv2d,1 × 1, NBU	-	1024	-	HS	1
1 × 1 × 1024	conv2d,1 × 1, NBU	-	k	-	-	1

Note: √ indicates that the layer includes an SE (Squeeze-and-Excitation) module; - indicates that the layer does not include an SE module; NBU stands for "not be used"; Exp Size is the output dimension of the first 1 × 1 convolution in bneck, which raises the channel dimension; NL is the nonlinear activation function used; RE denotes the ReLU activation function and HS denotes the Hard-Swish activation function.

Table 4. Comparison of model pre-training algorithms.

Model	Classification	Loss	Accuracy (%)	Precision (%)	Recall (%)	F1-Score (%)	Model Size (M)
Our Method	Two-class	0.0291	99.05	98.43	98.58	98.29	18.89
	Three-class	0.0327	96.77	96.30	96.45	96.12
	Four-class	0.1048	94.45	94.12	94.06	93.87
SVM	Two-class	0.3941	80.16	79.76	79.21	79.45	114.6
	Three-class	0.4315	73.18	72.69	72.17	72.43
	Four-class	0.4819	71.99	71.74	71.53	71.68
ResNet50	Two-class	0.1032	96.14	95.53	95.74	96.03	97.7
	Three-class	0.1396	94.43	93.82	94.06	94.11
	Four-class	0.2311	92.43	92.16	92.63	91.95
VGG16	Two-class	0.1504	96.69	96.47	96.31	96.48	526.3
	Three-class	0.1685	94.81	94.59	94.43	94.61
	Four-class	0.2381	92.12	91.88	91.42	91.92
InceptionNetV3	Two-class	0.1283	94.28	94.10	93.85	93.71	91.16
	Three-class	0.1611	93.19	93.01	92.94	92.88
	Four-class	0.2524	91.98	91.70	91.59	91.29
ShuffleNetV2	Two-class	0.1471	93.83	83.31	93.22	93.48	28.8
	Three-class	0.1936	91.64	91.12	92.04	91.99
	Four-class	0.2736	87.76	88.14	87.31	87.14
MobileNetV3-large	Two-class	0.0748	97.86	97.19	97.92	97.38	20.6
	Three-class	0.0912	95.45	95.17	95.96	95.16
	Four-class	0.1758	93.62	93.13	93.34	93.10

Table 5. Comparison of experimental results of different freezing layer methods on the test set in the case of three-class classification.

Number of Unfrozen Network Layers	Loss Value	Accuracy (%)	Precision (%)	Recall (%)	F1-Score (%)	Params (M)
1	0.0308	97.20	96.65	97.12	96.89	7.54
2	0.0304	97.60	97.24	97.33	96.92	7.54
3	0.0304	97.60	97.24	97.33	96.92	7.54
4	0.0304	97.60	97.24	97.33	96.92	7.54
5	0.0304	97.60	97.24	97.33	96.92	7.52
6	0.0322	96.70	96.18	96.27	96.23	7.50
7	0.0322	96.70	96.18	96.27	96.23	7.47
8	0.0349	95.90	95.47	95.25	95.28	7.49
9	0.0322	96.70	96.18	96.27	96.23	7.46
10	0.0349	95.90	95.47	95.25	95.28	7.43
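How a single setting might be picked from Table 5 can be sketched as below. The rows are transcribed from the table (unfrozen layers, loss, accuracy, parameters). The selection rule, highest accuracy with ties broken by the smaller parameter count, is an assumption made for illustration rather than a procedure stated by the authors.

```python
# (unfrozen layers, loss, accuracy %, params M), transcribed from Table 5.
ROWS = [
    (1, 0.0308, 97.20, 7.54),
    (2, 0.0304, 97.60, 7.54),
    (3, 0.0304, 97.60, 7.54),
    (4, 0.0304, 97.60, 7.54),
    (5, 0.0304, 97.60, 7.52),
    (6, 0.0322, 96.70, 7.50),
    (7, 0.0322, 96.70, 7.47),
    (8, 0.0349, 95.90, 7.49),
    (9, 0.0322, 96.70, 7.46),
    (10, 0.0349, 95.90, 7.43),
]

def best_setting(rows):
    """Pick the row with the highest accuracy; break ties by fewer params."""
    return max(rows, key=lambda row: (row[2], -row[3]))

if __name__ == "__main__":
    layers, loss, accuracy, params = best_setting(ROWS)
    print(layers, accuracy, params)  # 5 97.6 7.52
```

Under this rule, settings 2 to 5 tie on accuracy and loss, and the five-layer setting is selected because it has the fewest parameters among them.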

Table 6. Comparison of different transfer learning methods on the test set of measured data.

Transfer Learning Methods	Classification	Loss Value	Accuracy (%)	Precision (%)	Recall (%)	F1-Score (%)	Params (M)
Freezing of all network layers	Two-class	0.1768	92.25	91.53	91.74	91.48	4.24
	Three-class	0.2136	89.70	89.82	89.25	88.46	4.24
	Four-class	0.2415	87.53	87.12	87.09	86.94	4.24
Partial freezing of the network layers	Two-class	0.0288	99.13	98.67	98.51	98.70	7.52
	Three-class	0.0304	97.60	97.19	97.23	97.41	7.52
	Four-class	0.0982	95.32	94.91	94.95	95.13	7.52
No freezing of network layers	Two-class	0.0341	98.55	98.23	98.36	97.92	7.54
	Three-class	0.0376	96.20	95.97	96.68	96.45	7.54
	Four-class	0.0419	93.97	93.60	93.34	93.65	7.54

Table 7. Ablation test results of measured data.

Methods	Classification	Loss Value	Accuracy (%)	Precision (%)	Recall (%)	F1-Score (%)	Params (M)
MobileNetV3	Two-class	0.3251	81.23	81.68	81.80	81.42	2.54
	Three-class	0.4069	78.76	78.35	78.44	78.94	2.54
	Four-class	0.4782	72.43	72.02	71.78	72.11	2.54
MobileNetV3 + Dilated convolution	Two-class	0.2471	90.58	90.26	89.97	90.12	24.10
	Three-class	0.2949	86.97	86.64	86.55	86.34	24.10
	Four-class	0.3182	81.35	81.03	81.52	80.93	24.10
MobileNetV3 + Dilated convolution + transfer learning	Two-class	0.0288	99.13	98.67	98.51	98.70	7.52
	Three-class	0.0304	97.60	97.19	97.23	97.41	7.52
	Four-class	0.0982	95.32	94.91	94.95	95.13	7.52

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Chen, Z.; Wang, H.; Wang, J.; Xu, H.; Mei, N.; Zhang, S. Non-Destructive Detection Method of Apple Watercore: Optimization Using Optical Property Parameter Inversion and MobileNetV3. Agriculture 2024, 14, 1450. https://doi.org/10.3390/agriculture14091450
