1. Introduction
The ability to sense and quantify spatial variability in parameters of interest within a field is a key component of precision agriculture [1]. In situ and proximal sensing are commonly used for real-time control of agricultural inputs. Remote sensing is suitable for prescriptive management, where measurements are used to build prescription maps that are in turn used to control equipment as it traverses a field. Remote sensing is currently among the most widely studied topics in precision agriculture [2] and the recent advances in small unmanned aircraft systems (sUAS) and miniaturized sensors have provided new tools applied to remote sensing research [3,4]. Remote sensing using sUAS has covered a wide range of applications including sensing biomass and nitrogen status [5], monitoring wheat production [6], and monitoring rangelands [7]. UASs provide a versatile method for remote data collection with a relatively high spatiotemporal resolution when compared to conventional satellite- and ground-based methods [8].
Multispectral, thermal, or visible light cameras are most commonly deployed for sUAS-based remote sensing [9,10,11,12,13,14]. Most of the commercially available sensors are designed to work in one or two ranges of wavelengths to reduce sensor cost and data processing complexity. Typically, a small set of narrow-band ranges that are sensitive to one or more field parameters is selected to create an index [15,16]. A ubiquitous index in crop production is the normalized difference vegetation index (NDVI), which typically uses red and near-infrared (NIR) light to estimate crop vigor. While relatively simple to apply, vegetation indices tend to correlate with a myriad of parameters, which makes distinguishing the actual source of variability difficult.
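As an illustration of how such an index is formed, NDVI is conventionally computed as (NIR - Red) / (NIR + Red). The following minimal Python sketch assumes band reflectances expressed as fractions between 0 and 1. The function name `ndvi` and the handling of a zero denominator are illustrative assumptions and are not part of this study.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from two band reflectances.

    nir, red: reflectance in the near-infrared and red bands
    (assumed here to be fractions between 0 and 1).
    """
    total = nir + red
    if total == 0:
        # Assumption: treat a zero denominator as invalid input.
        raise ValueError("NIR + Red must be non-zero")
    return (nir - red) / total


# Example: strong NIR reflectance relative to red gives a high index value.
example_value = ndvi(0.5, 0.1)
```

Because the index collapses two bands into a single number, very different surface conditions can yield the same value, which is the ambiguity noted above.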
Portable spectrometers are relatively inexpensive tools that can be used to measure a continuous complete spectrum across a wide range of wavelengths. Recent advances in portability and control have led to the ability to mount spectrometers on sUAS platforms [17,18]. In these studies, two identical spectrometers (STS, Ocean Optics) were deployed. One spectrometer was oriented towards the ground and measured the reflectance from a reference white target. The other was mounted on a UAS to measure reflectance from land targets. The ratio of the land target reflectance to the reference white target reflectance was taken as the compensated reflectance of the land target. Unlike hyperspectral cameras, spectrometers only collect a single spatial measurement representing a circular or elliptical area. Equipment costs and data processing requirements are substantially reduced when using spectrometers versus hyperspectral cameras in instances where spatial resolution is not important.
For lab-based spectrometry, measurements are taken under controlled light conditions, an advantage that does not exist for UAS-deployed spectrometers under field conditions with frequent changes in ambient light. Experiments that collect spectral measurements are typically conducted during favorable conditions such as full sun around solar noon in order to reduce the effect of ambient light changes on measurements and maximize reflectance [19]. Ambient light variability caused by atmospheric conditions reduces the accuracy of measurements derived from spectral data [20]. Thus, spectral measurement systems typically require some form of field calibration to account for ambient light conditions. Calibration of spectral measurement systems is challenging due to the large number of factors that can influence spectral response [21]. Targets with known reflectivity are a vital element in a typical calibration process [22]. The empirical line method is one of the common approaches for calibrating spectral data against variable illumination. In this approach, tarps or panels with known reflectivity are placed in a field during data collection. By finding the relationship between known reflectance values and the raw intensity measurements of the sensor, an equation is obtained and then applied to all measurements [17,18]. The data collection period is limited since the changing Sun angle during data acquisition affects the reflectance [23]. Transient cloud cover can also substantially affect the amount of ambient light present over short durations. Another shortfall more specific to hyperspectral imaging is the practical limitation of having tarps or other reference targets in all images, especially when high resolution data is desired or a large area is covered [24].
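The empirical line method described above can be sketched as an ordinary least-squares line relating raw counts to known panel reflectance at a single wavelength. The Python example below is a minimal illustration under stated assumptions: the function names, the panel values, and the choice of a straight-line model fitted independently per wavelength are hypothetical and are not taken from the cited studies.

```python
def fit_empirical_line(panel_counts, panel_reflectance):
    """Least-squares fit of reflectance = gain * counts + offset.

    panel_counts: raw sensor intensities measured over reference panels.
    panel_reflectance: known reflectance of the same panels (0 to 1).
    Both lists refer to one wavelength; repeat the fit for each wavelength.
    """
    n = len(panel_counts)
    mean_x = sum(panel_counts) / n
    mean_y = sum(panel_reflectance) / n
    sxx = sum((x - mean_x) ** 2 for x in panel_counts)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(panel_counts, panel_reflectance))
    gain = sxy / sxx
    offset = mean_y - gain * mean_x
    return gain, offset


def apply_empirical_line(counts, gain, offset):
    """Convert raw counts to estimated reflectance using the fitted line."""
    return [gain * c + offset for c in counts]


# Hypothetical panel measurements at one wavelength.
gain, offset = fit_empirical_line([1000, 5000, 9000], [0.1, 0.5, 0.9])
estimated = apply_empirical_line([3000, 7000], gain, offset)
```

The fitted line is only valid while illumination matches the conditions under which the panels were measured, which is why a changing Sun angle or transient cloud cover limits the usable data collection period.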
Devising a method that can keep track of ambient light changes while measuring the raw reflectance from a spectral target using an uncalibrated spectrometer would be useful in precision agriculture research and on-farm applications where ambient light conditions cannot be controlled. By automating this measurement process through concurrent ambient light detection, a compensated reflectance can be obtained for every single wavelength in the spectrum at a low cost and under various ambient light conditions [25,26].
The resulting spectra can be analyzed partially or entirely to estimate different agricultural indices in a field. Nevertheless, calibrating these sensors for various ambient light conditions and avoiding saturation remain a challenge. Ground-based field spectrometers are mostly limited to data collection in a specific period and ambient light condition [27,28,29,30] and use repeated reference measurements from calibration tarps [31]. This process does not scale well to UAS-based applications where large areas are covered and extended time periods with changing ambient light conditions are necessary to acquire measurements.
An alternative to processing a small number of wavelengths into an index is to use the full measured spectrum. Supervised machine learning algorithms are a convenient way to model spectral data. Spectral data are collected from samples with known parameters and used to train the model through a variety of techniques. A subset of data is withheld from training and used to validate or test the model. Different machine learning algorithms have already been used for classification of hyperspectral images [32], weed detection [33], plant disease detection [34], biotic stress detection [35], water quality monitoring [36], human learning [37,38], and many other applications. Several studies focused on developing algorithms and methods for feature selection to reduce the dimensionality of very large datasets [39,40]. Compressing the dataset into a smaller set of components reduces processing time and avoids overfitting the model to the data [41].
Applying machine learning has become less difficult due to advances in computational software, such as MATLAB, that include graphical interfaces for organizing and processing data. Models from one-dimensional spectral data (e.g., intensity vs. wavelength) derived from a few thousand samples can be trained in several minutes using a personal computer. The speed at which models can be trained and validated makes testing a wide range of models feasible. In [42], an ensemble approach to machine learning was used to classify moisture content (MC) of bare soil and wheat stalk residues from spectral data collected in a laboratory-controlled experiment. Twenty turn-key models available in MATLAB (R2015b, The Mathworks, Natick, MA, USA) were trained and used to classify MC at seven levels between 3.3% and 30% on a gravimetric basis. Performance varied between 35% and 96% classification accuracy, depending on the model used, and several models had large deviations in performance when classifying soil versus stalk moisture content. Results indicated that choosing a model solely from past performance in literature may not yield optimal performance for a new dataset.
Variability in ambient light conditions and in the performance of classification methods remains a challenge despite substantial progress in using hyperspectral cameras and spectrometer-based remote sensing to classify agricultural parameters. Many agricultural parameters (e.g., moisture content) are dynamic, which makes collecting comprehensive datasets of spectral data paired with reference measurements expensive. Calibration targets with parameters that do not change under varying ambient conditions are a useful first step in testing new sensing and data processing methods before moving on to more complex scenarios.
Objectives
The application area of this work is remote sensing in precision agriculture using portable spectrometers. Previous work using spectrometers under variable ambient light conditions revealed the need to compensate for ambient light to optimize instrument sensitivity and improve the feasibility of classifying targets from spectral signatures [14,43]. This study aimed to expand upon the previous work by devising methods to update integration time (measurement period) and incorporate calibrated irradiance measurements. Specific objectives included:
Fabricate a set of “grayscale” calibration targets and quantify their spectral reflectance relative to a calibration standard.
Develop methods for adjusting integration time and incorporating irradiance measurements to automate ambient light compensation.
Test the ability of the system to classify different targets under a wide range of ambient light conditions.
The grayscale targets presented in this study were used in lieu of agriculture targets to simplify testing ambient light compensation methods prior to moving on to more complex scenarios.
2. Materials and Methods
2.1. Instrumentation
Two spectral measurement systems were deployed for data collection: an ambient light system for collecting downwelling solar irradiance and a reflectance system for collecting upwelling reflectance measurements from targets located underneath the sensors (Figure 1). Each system consisted of three Ocean Optics STS spectrometers in the ultraviolet (UV), visible (VIS), and near-infrared (NIR) ranges; a Raspberry Pi 3 (RPi) embedded computer (Model B V1.2, Raspberry Pi Foundation, Cambridge, United Kingdom); and a custom 3D printed plastic enclosure for mounting each system to a test stand. The test stand aligned the reflectance system 1 m above reflectance targets and positioned the ambient light system directly above the reflectance system.
The UV (STS-UV-L-25-400-SMA, Ocean Optics, Largo, FL, USA), VIS (STS-VIS-L-50-400-SMA, Ocean Optics, Largo, FL, USA), and NIR (STS-NIR-L-25-400-SMA, Ocean Optics, Largo, FL, USA) spectrometers used in the ambient light system were equipped with a direct-attach cosine corrector (CC-3-DA, Ocean Optics, Largo, FL, USA) and factory calibrated to convert raw intensity measurements to units of energy (μJ). The cosine corrector provided a 180° field-of-view (FOV) facing upward and normal to the ground. The UV and NIR spectrometers had an optical resolution of 1.5 nm and the VIS spectrometer had an optical resolution of 3 nm. Spectrometer integration times were fixed at 1000 ms for the UV and NIR spectrometers and 180 ms for the VIS spectrometer. Ambient light spectrometer configurations were selected based on the manufacturer’s recommendation.
The UV (STS-UV-L-100-400-SMA, Ocean Optics, Largo, FL, USA), VIS (STS-VIS-L-100-400-SMA, Ocean Optics, Largo, FL, USA), and NIR (STS-NIR-L-100-400-SMA, Ocean Optics, Largo, FL, USA) spectrometers used in the reflectance system were equipped with a direct-attach collimating lens (74-DA, Ocean Optics, Largo, FL, USA). The 100 μm slit combined with the collimating lens produced an elliptical FOV with a semi-major axis length of 9 cm and a semi-minor axis length of 4 cm. All three spectrometers used in the reflectance system had an optical resolution of 6 nm. Spectrometer integration time varied continuously as described in Section 2.3. Reflectance spectrometer configurations were selected based on the manufacturer’s recommendation. Spectrometer specifications for the ambient light and reflectance systems are summarized in Table 1.
2.2. Reflectance Targets
Five 0.3 × 0.3 m birch plywood targets painted in varying shades of gray (Glidden Premium Exterior Acrylic Flat Base GL6111 and GL6112, Glidden, Pittsburgh, PA, USA) and one target laminated with a 0.8 mm thick sheet of polytetrafluoroethylene (PTFE) were fabricated as reflectance targets to be placed underneath the spectrometers. Each painted target, labeled T1 through T5 in Figure 2, received two coats of white primer and two coats of paint. The PTFE target, labeled T6, was previously fabricated for a separate study and contained a threaded insert for mounting on a tripod. Targets were offset roughly 12 cm from the center of the reflectance spectral measurement system to ensure that the threaded insert in T6 was not within the FOV when collecting reflectance data.
The relative reflectance of each target was quantified to determine if unique spectral signatures other than changes in average intensity existed, which would oversimplify target classification. A Spectralon calibration standard (WS-1-SL, Ocean Optics, Largo, FL, USA) was used as a benchmark to represent 100% relative reflectance. Actual reflectivity of the Spectralon standard was specified at 99% between 400 and 1500 nm and greater than 96% between 250 and 400 nm. Thus, any non-linearity in reflectivity of the Spectralon calibration target was ignored in this study. A halogen light source (HL-2000-FHSA, Ocean Optics, Largo, FL, USA) was used to illuminate a portion of the target through optical fibers in a backscatter reflectance probe (QR200-12-MIXED, Ocean Optics, Largo, FL, USA). Light reflected from the targets entered separate sets of optical fibers that were fed into two spectrometers (HR4000-7-VIS-NIR & NIRQuest512, Ocean Optics, Largo, FL, USA). The combined spectrometers enveloped the wavelengths observed using the STS UV, VIS, and NIR spectrometers and overlapped at 900 nm. However, spectral response below 400 nm was clipped due to poor sensitivity of the HR4000 spectrometer at shorter wavelengths that resulted in excessive noise in relative reflectance measurements. Nine spectral measurements were taken at uniformly spaced locations across each reflectance target and averaged to quantify relative reflectance response as a function of wavelength.
2.3. Reflectance Spectrometer Integration Time
Integration time refers to the period over which a spectrometer detector collects light. Increasing the integration time has the effect of applying a gain to the spectral signal, making weak signals more distinct or unique features more discernable. However, increasing integration time by an excessive amount reduces sampling rate and will eventually lead to saturation in spectral data at one or more wavelengths when the maximum charge that can be stored in an individual pixel has been reached. A saturated measurement is not useful for signal classification. Hence, the optimal scenario is for each measurement to be taken with the maximum integration time that does not result in saturation. In practice, a buffer between the maximum intensity of any wavelength in a spectrum and the saturation level should be maintained to accommodate noise and other uncontrolled processes that could result in saturation.
Reflectance intensity varies when the ambient light condition changes (e.g., due to varying cloud coverage and angle of illumination). A fixed target will produce varying spectral responses using a passive spectrometer if integration time is set constant. A method to automatically update integration time based on the ambient light condition and the spectral response from the previous measurement was devised to optimize spectrometer sensitivity. The process started with setting an initial integration time on each reflectance spectrometer and recording a measurement. A Python script continuously running on a Raspberry Pi then read the most recent measurement. Outliers in the spectral data due to hot pixels (defective pixels that always return a saturated value) were detected and removed. The maximum intensity of the spectrum was determined and compared to the maximum possible intensity without saturation. All STS spectrometers used an identical linear imaging sensor (ELIS-1024, Panavision SVI, Woodland Hills, CA, USA) and intensity at each wavelength was reported as a 14-bit integer value ranging from 0 to 16,383. The units associated with this measurement are referred to herein as counts since they represent the raw output of the analog-to-digital conversion process used to quantify the charge at each pixel of the linear imaging sensor. The maximum desired intensity was set to 12,000 counts to provide a threshold in the event ambient light conditions between measurements were rapidly increasing.
The function used to automatically update integration time is shown in Equation (1).
$$IT_{n+1} = \frac{I_{\mathrm{target}}}{I_{\mathrm{max}}} \, IT_{n} \qquad (1)$$
$IT_{n+1}$ represents the integration time for the next measurement in units of milliseconds.
$I_{\mathrm{max}}$ is the maximum intensity observed in counts for all wavelengths in the current measurement.
$I_{\mathrm{target}}$ is the maximum desired intensity in counts, set to 12,000 for this study.
$IT_{n}$ is the integration time for the current spectral measurement in milliseconds.
The initial integration time prior to the first measurement was set low enough to not result in saturation at any wavelength (UV: 100 ms; VIS: 35 ms; NIR: 100 ms). In the event that a subsequent measurement exhibited saturation, integration time was reset to the initial value and the process of determining the optimal integration time restarted.
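The update rule in Equation (1), together with the saturation reset described above, can be sketched in Python as follows. This is a minimal illustration and not the script used in the study: the function names, the representation of a spectrum as a list of counts, and the assumption that hot pixel indices are known in advance are all hypothetical.

```python
SATURATION_COUNTS = 16383  # 14-bit full scale reported for the STS sensor
TARGET_COUNTS = 12000      # maximum desired intensity used in this study


def remove_hot_pixels(spectrum, hot_pixel_indices):
    """Drop defective pixels; indices are assumed to be known beforehand."""
    skip = set(hot_pixel_indices)
    return [c for i, c in enumerate(spectrum) if i not in skip]


def next_integration_time(spectrum, current_ms, initial_ms,
                          target=TARGET_COUNTS, saturation=SATURATION_COUNTS):
    """Return the integration time (ms) to use for the next measurement.

    spectrum: counts per wavelength with hot pixels already removed.
    current_ms: integration time used for this measurement.
    initial_ms: conservative starting value (e.g., 35 ms for VIS).
    """
    peak = max(spectrum)
    if peak >= saturation:
        # Saturated measurement: restart from the initial integration time.
        return initial_ms
    if peak <= 0:
        # Assumption: keep the current setting if no signal was recorded.
        return current_ms
    # Scale so the next peak is expected to land near the target intensity.
    return current_ms * target / peak


cleaned = remove_hot_pixels([16383, 3000, 6000], hot_pixel_indices=[0])
updated_ms = next_integration_time(cleaned, current_ms=50.0, initial_ms=35.0)
```

In this example the observed peak of 6000 counts is half the target, so the integration time doubles. The gap between the 12,000-count target and the 16,383-count ceiling is the buffer that absorbs a rapid increase in ambient light between consecutive measurements.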
2.4. Data Collection
Data were collected over five days during September 2017 (9/14/17, 9/15/17, 9/18/17, 9/19/17, and 9/21/17). The ambient light and reflectance systems mounted to the test stand were installed on the roof of the Charles E. Barnhart Building in Lexington, Kentucky (38.027030 N, 84.509641 W). The test stand was oriented to provide an unobstructed line of sight to the Sun so that shadows from the test stand or surrounding objects would not be cast on the targets or ambient light system. Samples were collected in ten-second intervals over a duration of 2 to 3 h on each day. Each sample included three separate measurements that were stored in a tab-delimited text file and averaged to form the sample. The time of measurement and the serial number of the spectrometer were used to define the filenames of text files that stored raw spectral measurements. This filename scheme helped facilitate tracking measurements between the six spectrometers over time. Roughly 3900 pairs of ambient light and reflectance measurements were collected using each spectrometer across all targets.
2.5. Compensating for Variable Ambient Light
Ambient light measurements were calibrated from raw intensity in counts to units of energy using a look-up table containing coefficients for different wavelengths provided by the spectrometer manufacturer. Equation (2) was used to apply the calibration.
$$E(\lambda) = \frac{I_{a}(\lambda)}{C(\lambda)} \qquad (2)$$
$E(\lambda)$ is the calibrated measurement in units of microjoules.
$I_{a}(\lambda)$ is the raw ambient light measurement intensity in counts.
$C(\lambda)$ is the calibration data in units of counts microjoule$^{-1}$.
$\lambda$ is the individual wavelength.
Three successive compensation modes were considered for incorporating the effect of ambient light into reflectance measurements and each mode was evaluated based on the prediction accuracy when classifying targets using machine learning algorithms described in Section 2.7. The automatic integration time method described in Section 2.3 was considered as the first ambient light compensation method (M1). Updating the integration time based on the previous sample optimized the sensitivity of the reflectance spectrometers to the current ambient light conditions. The second ambient light compensation method (M2) divided the resulting intensity value from the reflectance spectrometer by the current integration time ($IT_{n}$) in units of milliseconds to produce intensity relative to integration time. Because all spectra measured using compensation method M1 were anticipated to have similar average intensities, dividing by the integration time would rescale the spectra to have average intensities similar to if a fixed integration time had been used but without sacrificing sensitivity. The third ambient light compensation method (M3) incorporated the calibrated ambient light energy measurements by wavelength as shown in Equation (3).
$$R_{c}(\lambda) = \frac{I_{r}(\lambda) - 1500}{IT_{n} \, E(\lambda)} \qquad (3)$$
$R_{c}(\lambda)$ is the compensated reflectance measurement in units of counts ms$^{-1}$ μJ$^{-1}$.
$I_{r}(\lambda)$ is the raw reflectance intensity in counts, $IT_{n}$ is the current integration time in milliseconds, and $E(\lambda)$ is the calibrated ambient light energy in microjoules. The quantity 1500 was the average dark signal present when no light entered the spectrometer and was subtracted from the raw reflectance intensity to remove the offset and provide a zero value when no light was present. By incorporating ambient light energy, compensation method M3 was expected to improve classification accuracy of similar targets when ambient light spectra changed due to uncontrolled external conditions (e.g., cloud coverage, sun angle).
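The calibration in Equation (2) and compensation methods M2 and M3 can be sketched in Python as shown below. This is an illustrative sketch only, with the arithmetic inferred from the units stated in the text. The function names are hypothetical, and the code assumes that the ambient light and reflectance spectra have already been resampled onto a common set of wavelengths, a step that is not described here.

```python
DARK_COUNTS = 1500  # average dark signal reported in the text


def calibrate_ambient(raw_counts, counts_per_microjoule):
    """Equation (2): convert raw ambient counts to energy in microjoules."""
    return [raw / cal for raw, cal in zip(raw_counts, counts_per_microjoule)]


def compensate_m2(reflectance_counts, integration_ms):
    """M2: intensity relative to integration time (counts per ms)."""
    return [c / integration_ms for c in reflectance_counts]


def compensate_m3(reflectance_counts, integration_ms, ambient_microjoules):
    """M3: dark-corrected counts per ms per microjoule, by wavelength.

    Assumes both spectra share the same wavelength grid (hypothetical step).
    """
    return [(c - DARK_COUNTS) / (integration_ms * e)
            for c, e in zip(reflectance_counts, ambient_microjoules)]


# Hypothetical two-wavelength example.
ambient = calibrate_ambient([2000, 4000], [100.0, 200.0])
m2 = compensate_m2([6000, 12000], integration_ms=60.0)
m3 = compensate_m3([7500, 11500], integration_ms=50.0,
                   ambient_microjoules=ambient)
```

M2 applies the same divisor to every wavelength, so it rescales a spectrum without changing its shape, whereas M3 applies a separate divisor to each wavelength and can therefore alter the spectral shape presented to the classifier.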
2.6. Spectral Data Preprocessing
Measurements from each spectrometer covered a distinct range of wavelengths in rough increments of 0.5 nm. The actual spectral ranges for the UV, VIS, and NIR spectrometers were 184–667 nm, 338–825 nm, and 634–1124 nm, respectively. Since data at many of the wavelengths were likely to be highly correlated, partial least squares (PLS) regression was used to reduce the dimensionality of the dataset, solve collinearity issues, and speed up the machine learning classification process. PLS regression reduced the number of input parameters (wavelengths) by representing the full spectrum with a small set of regression components. The optimal number of regression components was obtained using two parameters: the estimated mean squared prediction error when classifying a target and the variance explained in the output variable (target) by the input data (spectral response). The number of regression components at which a high variance in the output variable was explained with a low prediction error was considered the optimal number of input components. The PLS regression method and associated optimization were conducted using MATLAB (R2017a, The Mathworks, Natick, MA, USA).
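The selection of the number of components from the two criteria above can be illustrated with a simple rule. The Python sketch below assumes that the prediction error and cumulative variance explained have already been computed for each candidate number of components. The thresholds `min_variance` and `mse_tolerance`, the function name, and the example values are hypothetical, since the text does not state numerical cut-offs.

```python
def choose_component_count(mse_by_count, variance_by_count,
                           min_variance=0.95, mse_tolerance=0.05):
    """Pick the smallest number of PLS components meeting both criteria.

    mse_by_count[k - 1]: estimated mean squared prediction error
        when k components are used.
    variance_by_count[k - 1]: cumulative fraction of response variance
        explained when k components are used.
    min_variance, mse_tolerance: hypothetical thresholds for illustration.
    """
    best_mse = min(mse_by_count)
    pairs = zip(mse_by_count, variance_by_count)
    for k, (mse, variance) in enumerate(pairs, start=1):
        explains_enough = variance >= min_variance
        error_near_minimum = mse <= best_mse * (1.0 + mse_tolerance)
        if explains_enough and error_near_minimum:
            return k
    # Fall back to the component count with the lowest prediction error.
    return mse_by_count.index(best_mse) + 1


# Hypothetical values for one to five components.
n_components = choose_component_count(
    [0.90, 0.40, 0.12, 0.101, 0.10],
    [0.50, 0.80, 0.96, 0.97, 0.975],
)
```

Preferring the smallest qualifying count keeps the predictor matrix compact, which is the stated purpose of the dimensionality reduction.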
2.7. Target Classification using Machine Learning
The Classification Learner app in MATLAB was used to train 22 different turn-key machine learning algorithms to classify targets based on pre-processed reflectance spectra. An ensemble approach was used here rather than targeting a particular algorithm since the underlying methodology was not of particular interest to this study. The algorithms are generally categorized as decision trees, discriminant analysis, support vector machines (SVM), nearest neighbor classifiers, and ensemble classifiers. Pre-processed spectral data were fed into individual algorithms as a matrix where columns represented regression components (predictors) and rows represented instances of each measurement. The last column (response) was allocated to target codes (T1 through T6). The dataset was randomly subdivided into a training dataset (70%), a validation dataset (15%), and a testing dataset (15%). The training dataset was used to develop the prediction model. The validation dataset was used to determine how well the model had been trained based on the expected output. Model properties, such as classification error and overfitting index, were estimated during the validation step to determine if sufficient data had been used to train the model. The testing dataset was used to quantify the classification accuracy of the model by comparing frequency of correct classifications on data not used to train or validate the model. Each model was trained five times with randomly distributed training, validation, and testing data to assess variability when training the model from a finite number of samples. Models for each spectrometer type (UV, VIS, and NIR) and ambient light compensation method (M1, M2, and M3) were trained independently to determine the best performing combinations with respect to different machine learning algorithms.
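The repeated random 70/15/15 partition and the accuracy measure described above can be sketched in Python using only the standard library. This is an illustration rather than the MATLAB workflow used in the study. The function names, the use of integer seeds to produce five different partitions, and the placeholder sample indices are assumptions.

```python
import random


def split_dataset(samples, seed, fractions=(0.70, 0.15, 0.15)):
    """Randomly partition samples into training, validation, and test sets."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = round(fractions[0] * len(shuffled))
    n_val = round(fractions[1] * len(shuffled))
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder forms the test set
    return train, val, test


def accuracy(predicted, actual):
    """Fraction of samples whose predicted target code matches the label."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Five repetitions with different random partitions (seeds are arbitrary).
samples = list(range(100))  # placeholder indices for spectral samples
partitions = [split_dataset(samples, seed) for seed in range(5)]
example_accuracy = accuracy(["T1", "T2", "T3", "T3"],
                            ["T1", "T2", "T3", "T4"])
```

Repeating the partition with different seeds gives a spread of test accuracies for each model, which is what allows variability due to the finite sample to be assessed.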
2.8. Statistical Analysis
Three spectrometers and three compensation modes were considered in this experiment. The goal was to determine whether spectrometer type and ambient light compensation method produced significant differences in target classification accuracy. The optimal machine learning algorithm was tested for each combination of compensation mode and spectrometer type. The experiment was set up with a factorial design using spectrometer type and ambient light compensation method (3 × 3). The classification accuracy results were subjected to analysis of variance (ANOVA) and a multiple comparison test was conducted using the anova2 and multcompare functions in MATLAB (R2017a), respectively. The anova2 function tested for significant differences in factors (i.e., spectrometer type and ambient light compensation method) and their interactions. A significance level of 0.05 was used for ANOVA. The multcompare function used the output of the anova2 function to determine which pairs of factor levels were significantly different by applying Tukey’s honest significant difference (HSD) procedure. The null hypothesis was that spectrometer type and compensation mode had no significant effect on the prediction accuracy of the optimal model.
4. Discussion
While not explicitly hypothesis driven, the underlying assumption of this experiment was that dividing spectral measurements by their integration time (M2) and normalizing individual wavelengths using ambient light measurements (M3) would improve prediction accuracy when classifying multiple “grayscale” targets across a wide range of ambient light conditions. The results showed that simply optimizing integration time to produce the most sensitive measurement (M1) was the best approach to maximize prediction accuracy.
The results for compensation mode M3 were not surprising given that a second set of calibrated instruments was used to collect the ambient light measurements. The ambient light spectrometers had different optical resolutions from the reflectance spectrometers and, although they report the same wavelengths, incoming light was not distributed across the sensor in the same manner. It might have been more appropriate to simply compute the average ambient light energy from the ambient light spectrometers before applying the normalization rather than by individual wavelength, but the method used in this experiment was chosen to be consistent with existing literature [17,18]. Another potential source of uncertainty is that the integration times of the ambient light spectrometers were fixed while those of the reflectance spectrometers varied. This resulted in measurements over different periods that may not capture the same variability in ambient light conditions.
The results for compensation mode M2 were not expected given that dividing by the integration time is a scalar operation. A plausible explanation is that the signal that distinguished the targets is not the average intensity but the variability between wavelengths. It is unlikely that the ambient light spectrometers incorrectly applied the desired integration time or incorrectly reported the actual integration time. The small reduction in prediction accuracy may have been due to the rounding that occurred when using integer operations.
While the difference in prediction accuracy between the NIR and the UV/VIS spectrometers was significant, the actual amount was small. Much of this difference can likely be attributed to the targets used. The painted targets did not reflect light uniformly as compared to the Spectralon calibration standard. The most obvious discrepancies between the targets occurred in the UV and VIS ranges, hence the better performance by these spectrometers. A set of “grayscale” calibrated standards with more uniform reflectance would better reveal differences in spectrometer performance. Ultimately, the actual target will define which type of spectrometer should be used for remote sensing. Future work should use more challenging targets, such as crops in a breeding study or soils for moisture analysis, rather than simple “grayscale” targets.
The best performing machine learning methods for classifying targets presented in this study should not be considered optimal for all scenarios. The simplicity of the “grayscale” targets likely masked the true difficulty in classifying parameters in natural targets. Previous work [14] did show that models developed using support vector machines and ensemble bagged trees perform well on agricultural targets, but several of the well-performing models presented here previously failed when using agricultural targets. This emphasizes the importance of not selecting a machine learning model solely based on performance in one domain and further reinforces the need to test ambient light compensation techniques using actual targets for a given application.