#### *3.3. Radiative Transfer Modelling*

Radiative transfer modelling is a physically based approach that uses physical laws to simulate the interaction of electromagnetic radiation with vegetation (e.g., reflection, transmission, and absorption) [180]. In the forward mode, radiative transfer models (RTMs) simulate vegetation spectra (e.g., leaf reflectance and transmittance) from vegetation biophysical and biochemical properties (e.g., chlorophyll and water contents); in the inverse mode, they retrieve these variables from spectral measurements [181]. PROSAIL is one of the most widely used RTMs. It integrates the leaf-level PROSPECT model with the canopy-level SAIL model and simulates canopy reflectance from leaf properties (e.g., chlorophyll and water contents), canopy structural parameters (e.g., LAI and leaf angle), and soil reflectance [18].

PROSAIL has also been used in agricultural environments for investigating crop and soil properties. For instance, Casa and Jones [182] inverted PROSAIL and a ray-tracing canopy model with spectroradiometer-measured hyperspectral reflectance data and imaging spectrometer-acquired hyperspectral image data, respectively, to estimate canopy LAI, and evaluated factors influencing the estimation accuracy (e.g., the non-homogeneous surface caused by the crop row structure). Richter et al. [98] utilized PROSAIL for estimating LAI, fCover, canopy chlorophyll, and water content from hyperspectral images and compared its performance to other methods (e.g., artificial neural network). Richter et al. [183] applied PROSAIL to investigate similar vegetation variables and analyzed the accuracy and efficiency of this method. Wu et al. [184] examined the sensitivity of vegetation indices to vegetation chlorophyll content using simulated results from the PROSPECT model and suggested a few well-performing indices. Locherer et al. [74] estimated vegetation LAI using the PROSAIL model and multi-source hyperspectral images and tested several techniques used in the inversion process (e.g., different cost functions and types of averaging methods). Yu et al. [37] estimated a range of vegetation phenotyping variables (e.g., LAI and leaf chlorophyll) using hyperspectral imagery and PROSAIL and examined the sensitivity of different spectral ranges to the parameters in the PROSAIL model.

Compared with the regression models discussed in previous sections, RTMs have been used less often in the literature for investigating agricultural features, mainly because of their high model complexity and computational intensity. For instance, a wide range of parameters needs to be considered in an RTM (e.g., chlorophyll, carotenoids, water contents, leaf area index, leaf angles, solar angles, and soil reflectance, along with other parameters, in the PROSAIL model), and users need to apply additional techniques (e.g., merit functions and look-up tables) to facilitate the forward and inversion operations of the model. In addition, RTMs require much more computing time than regression models to predict the target vegetation variables. However, it is also well known that regression models tend to be site and time specific and are not readily transferable to other geographical regions or to different times at the same site [166]. In contrast, RTMs are more transferable, owing to the fact that they are based on physical laws and do not require training data for rebuilding the model. In addition, an RTM is capable of estimating a range of vegetation properties in one model, while regression models typically can estimate only one variable [36,185].
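
The look-up table and merit function steps mentioned above can be illustrated with a minimal Python sketch. It is not PROSAIL: `toy_canopy_reflectance` is a hypothetical two-parameter stand-in (LAI and leaf chlorophyll) with made-up coefficients, introduced here only so that the example runs on its own. The grid ranges, the RMSE merit function, and averaging the ten best-matching entries are likewise illustrative assumptions rather than settings taken from the cited studies.

```python
import numpy as np


def toy_canopy_reflectance(lai, cab, wavelengths):
    """Toy stand-in for an RTM forward run (NOT PROSAIL; coefficients are invented)."""
    lai = np.asarray(lai, dtype=float)[..., None]
    cab = np.asarray(cab, dtype=float)[..., None]
    # Pigment absorption weight: close to 1 in the visible, close to 0 past the red edge.
    k_cab = 1.0 / (1.0 + np.exp((wavelengths - 710.0) / 15.0))
    r_veg = 0.5 * np.exp(-0.04 * cab * k_cab)              # dense-vegetation reflectance
    r_soil = 0.1 + 0.2 * (wavelengths - 400.0) / 600.0     # linear soil background
    gap = np.exp(-0.5 * lai)                               # Beer's-law gap fraction
    return r_veg * (1.0 - gap) + r_soil * gap


def build_lut(lai_grid, cab_grid, wavelengths):
    """Forward mode: simulate one spectrum per (LAI, Cab) combination."""
    lai, cab = np.meshgrid(lai_grid, cab_grid, indexing="ij")
    params = np.column_stack([lai.ravel(), cab.ravel()])
    spectra = toy_canopy_reflectance(params[:, 0], params[:, 1], wavelengths)
    return params, spectra


def invert_lut(observed, params, spectra, n_best=10):
    """Inverse mode: rank LUT entries by RMSE and average the n_best parameter sets."""
    rmse = np.sqrt(np.mean((spectra - observed) ** 2, axis=1))
    best = np.argsort(rmse)[:n_best]
    return params[best].mean(axis=0), rmse[best[0]]


wavelengths = np.arange(400.0, 1001.0, 10.0)               # 400-1000 nm in 10 nm steps
params, spectra = build_lut(
    np.linspace(0.5, 6.0, 56), np.linspace(10.0, 80.0, 71), wavelengths
)

# Simulated "measurement": known parameters plus a little sensor noise.
rng = np.random.default_rng(0)
observed = toy_canopy_reflectance(3.0, 40.0, wavelengths)
observed = observed + rng.normal(0.0, 0.002, wavelengths.size)

estimate, best_rmse = invert_lut(observed, params, spectra)
print(f"LAI ~ {estimate[0]:.2f}, Cab ~ {estimate[1]:.1f}, best RMSE = {best_rmse:.4f}")
```

With a real RTM, the forward simulation of every table entry is the expensive step, which is one reason the look-up table is typically built once and reused across pixels. The choice of merit function and of how many best solutions to average are the kinds of inversion settings that would need tuning for a given dataset.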

#### *3.4. Machine Learning and Deep Learning*

Machine learning algorithms, including support vector machines (SVM) and RF, are powerful tools for analyzing hyperspectral information since they can process a large number of variables (e.g., spectral reflectance and vegetation indices) efficiently [186]. Machine learning has been widely used in the remote sensing field for estimating properties of ground features or classifying different ground covers [36,114,187]. Researchers have also applied different machine learning algorithms to hyperspectral images for agricultural applications. SVM has been commonly used in previous research for prediction or classification purposes. For instance, Honkavaara et al. [123] estimated crop biomass using SVM and UAV-acquired hyperspectral imagery. Bostan et al. [51] utilized SVM for classifying different crop types and achieved high classification accuracy. Ran et al. [93] used KNN and SVM classifiers for investigating tillage practices in agricultural fields and compared their performances. RF is another commonly used algorithm for investigating agricultural features with hyperspectral imagery. For instance, Gao et al. [188] successfully classified weed and maize using RF and lab-based hyperspectral images. Using ground-based hyperspectral reflectance data acquired by an ASD spectroradiometer, Siegmann and Jarmer [189] evaluated the performance of RF, SVM, and PLSR for estimating crop LAI and confirmed the good performance of RF. Similarly, using hyperspectral reflectance, Adam et al. [190] attempted to detect maize disease with the RF model. Overall, machine learning models have generally performed robustly in investigating agricultural features using hyperspectral imagery.

Deep learning is a subset of machine learning and extends machine learning by adding more "depth" (i.e., hierarchical representation of the dataset) in the model [191,192]. It has become a popular approach in recent years for recognizing patterns in remote sensing images and thus for investigating various ground features. Deep learning has been commonly used in the remote sensing field for image classification, such as land cover classification [193–195] and the identification of ground features (e.g., buildings) [196]. Deep learning has also been applied to precision farming to solve complicated issues. Existing studies include, for example, the estimation of crop yield using CNN and multispectral images together with climate data [197], plant disease detection using CNN and smartphone-acquired images [198], crop classification using 3-D CNN and multi-temporal multispectral images [199], and classification of agricultural land cover using a deep recurrent neural network and multi-temporal SAR images [200]. Kamilaris and Prenafeta-Boldú [191] reviewed applications of deep learning in agriculture and food production, although not all of the reviewed studies used remote sensing images. Singh et al. [201] reviewed a range of deep learning methods and their applications, specifically in plant phenotyping. To date, deep learning has not been well explored for processing and analyzing remote sensing images, especially hyperspectral images, for agricultural applications. Considering the capacity of deep learning for studying feature patterns in images and the rich information in hyperspectral imagery, the integration of the two has a wide range of potential agricultural applications (e.g., crop classification, weed monitoring, crop disease detection, and plant stress evaluation). Further research in these areas is warranted.

Both machine learning and deep learning are capable of processing multi-source and multi-type data [202]. For instance, besides multi-type remote sensing images (e.g., optical, thermal, LiDAR, and Radar), other sources of data, such as weather, irrigation, and historical yield information, can also be incorporated in the modelling process for a possibly better evaluation of targeted agricultural features [203]. Although machine learning and deep learning models are powerful, it is critical to keep in mind that these models require large numbers of high-quality training samples to achieve robust performance [202]. Insufficient training datasets or data with issues (e.g., data incompleteness, noise, and biases) may degrade model performance.

In summary, different analytical methods (e.g., linear regression, advanced regression, machine learning and deep learning, and RTM) have different levels of complexity, performance, and transferability. More detailed comparisons of these methods are listed in Table 7. Overall, linear regression is the easiest method to use, and its performance is generally acceptable, although it can be highly influenced by the choice of predictor variables and the quality of the sample data. Advanced regression (e.g., PLSR) mostly performs better than linear regression since it involves multiple variables in the model and is less sensitive to data noise. RTM (e.g., PROSAIL) is capable of producing multiple data products (e.g., chlorophyll, water, and LAI) with reasonably high accuracies. One essential advantage of this method is its high transferability. However, it has the highest complexity, as it requires a wide range of parameters and extensive programming. In terms of machine learning, many algorithms, such as RF and SVM, are well established and mostly performed well in previous studies. Some programming and model adjustments are needed for this method to achieve optimal performance. Deep learning is a relatively new method and has become increasingly popular in recent years. Appropriate model design and programming are critical for this approach. It also requires a substantial amount of training data and computing resources to achieve good model performance.


