1. Introduction
Variety classification plays a pivotal role in assessing yield and optimizing rice-field cultivation. Different rice varieties possess distinctive characteristics, including growth habits [1,2,3], nutrient requirements [4,5,6], disease resistance [7,8,9,10], and tolerance to environmental stressors, such as drought, heat, and pests [11,12]. Identifying the best-performing varieties in each region allows farmers to enhance their yields and increase profitability. Accurate variety classification leads to more precise yield estimation. Additionally, diverse rice varieties require specific management practices, including fertilization and irrigation. Tailoring these practices to the particular needs of each variety, made possible by accurate variety prediction, enhances crop health and boosts yields. Variety classification is also crucial for predicting plant capabilities for handling threats such as pests and diseases [13]. Moreover, yield and pest/disease management are critical considerations in agricultural insurance assessments, underscoring the significance of variety classifications in the initial stages of such evaluations. For this purpose, researchers have made considerable efforts to develop remote sensing technology, including the utilization of drones or unmanned aerial vehicles (UAVs), which have made significant strides in agriculture [14,15,16,17].
UAV-based remote sensing has grown rapidly over the last decade. UAVs offer high spatial resolution for agriculture together with flexible, low-cost monitoring, especially for frequent or scheduled surveys [18,19,20,21,22,23,24]. In line with this, the range of sensors available for UAVs is also increasing [25,26]. The multispectral sensor is one of the sensors developed as a UAV payload; it imitates the multispectral imagers on satellite platforms, such as the Operational Land Imager (OLI) on Landsat 8 or the Multispectral Instrument (MSI) on Sentinel-2, but with finer spatial resolution. Multispectral sensors attached to UAVs are used in agriculture primarily to calculate vegetation indices, such as the traditional normalized difference vegetation index (NDVI), to determine the condition of plants. The NDVI derived from a UAV-mounted multispectral sensor is accurate, with high coefficients of determination when compared with direct measurement [27]. The leaf chlorophyll index (LCI) is another vegetation index used to estimate leaf chlorophyll content, an indicator of photosynthetic activity. The LCI relies on reflectance values from red-edge and near-infrared wavelengths, which are sensitive to the chlorophyll content in plant leaves [28,29,30,31]. Research indicates that the LCI has the potential to offer more accurate estimates of chlorophyll content than other indices such as the NDVI and the chlorophyll vegetation index (CVI) [27,28,29,30,31].
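As a minimal sketch of how such indices are derived from band reflectance, the Python example below computes the NDVI, the green NDVI (GNDVI), and the LCI for a single pixel. The LCI form used here, (NIR − RedEdge) / (NIR + Red), is a commonly used definition and is an assumption, as the text does not state the exact formula; the function names and reflectance values are hypothetical and are not data from this study.

```python
def normalized_difference(a: float, b: float) -> float:
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    denom = a + b
    return (a - b) / denom if denom != 0 else 0.0


def vegetation_indices(green: float, red: float, red_edge: float, nir: float) -> dict:
    """Compute three vegetation indices from per-band reflectance (0 to 1)."""
    ndvi = normalized_difference(nir, red)      # (NIR - Red) / (NIR + Red)
    gndvi = normalized_difference(nir, green)   # (NIR - Green) / (NIR + Green)
    # Assumed LCI definition: (NIR - RedEdge) / (NIR + Red)
    lci_denom = nir + red
    lci = (nir - red_edge) / lci_denom if lci_denom != 0 else 0.0
    return {"NDVI": ndvi, "GNDVI": gndvi, "LCI": lci}


# Hypothetical reflectance values for one canopy pixel (illustrative only).
sample = vegetation_indices(green=0.08, red=0.05, red_edge=0.25, nir=0.45)
for name, value in sample.items():
    print(f"{name}: {value:.3f}")
```

In practice the same arithmetic would be applied per pixel to whole reflectance rasters rather than to single values.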
Existing studies have employed approaches such as extreme learning machine (ELM), support vector machine (SVM), and random forest (RF) algorithms [32,33] to accurately classify paddy growth stages and map the spatial distribution of paddy rice fields. These studies collectively suggest that the SVM and RF algorithms may also be suitable for classifying different paddy varieties using remote sensing data. However, the classification of paddy rice varieties using UAV data has not been extensively explored. Although machine learning techniques have been widely studied for topics such as yield estimation, their application to rice variety classification using multispectral data from UAVs is still limited. Wang et al. [34] investigated hyperspectral imaging (HSI) to distinguish rice varieties and assess their quality, successfully generating a classification map that visualized distinct rice varieties. However, it is essential to note that their analysis was limited to rice grains. Another study by Darvishsefat et al. [35] evaluated the spectral reflectance of rice varieties using hyperspectral remote sensing, demonstrating the potential for accurately mapping the cultivated areas of rice varieties based on hyperspectral remotely sensed data. Nevertheless, multispectral sensors offer advantages over hyperspectral sensors, such as better performance, cost-effectiveness, and improved spatial coverage [36,37,38], and their reduced complexity and data processing requirements outweigh the benefits of the higher spectral resolution offered by hyperspectral data. Hyperspectral remote sensing, while capturing a more detailed spectral signature, requires sophisticated data processing techniques and considerable computational resources. Consequently, there is an opportunity for further research on integrating machine learning algorithms with the multispectral data collected by UAVs for rice variety classification.
In this study, we employed a multispectral sensor attached to a quadcopter UAV to classify paddy varieties and explore machine learning applications for classifying them. We evaluated several machine learning algorithms, including a neural network (NN), decision tree, SVM, RF, naïve Bayes, and logistic regression. Additionally, we examined the effectiveness of ensemble algorithms, such as AdaBoost, gradient boosting, and a combination of high-performing algorithms as stacked learners. The performance of these algorithms was assessed by classifying three rice varieties, INPARI-32, INPARI-33, and INPARI-43 [39,40,41], which are high-yield rice varieties developed by the Indonesian Agency for Agricultural Research and Development (IAARD) in collaboration with the International Rice Research Institute (IRRI), at three different growth stages: six, nine, and twelve weeks after planting (WAP). This study aimed to determine the most effective algorithm for accurately classifying rice varieties, considering features such as reflectance from multispectral bands and vegetation indices. By enhancing automated systems for rice variety classification, this study contributes to improved agricultural practices. It offers a reliable method for farmers, researchers, and stakeholders to efficiently identify and manage specific rice varieties at the optimal growth stage.
The structure of this article is as follows. The Introduction provides an overview of the research problem, its significance, and our research objectives. In the Methods section, we detail the methods and techniques employed in our study, including a description of the study location and instruments, data collection procedures, experimental setup, data analysis methods, feature selection, accuracy assessment procedures, and feature contribution analysis. We present the key findings of our research in the Results section and thoroughly discuss them in the Discussion section. Finally, we conclude our study and propose potential areas for improvement in the Conclusions and Future Work section.
4. Discussion
Every remote sensing technology has its advantages and disadvantages, and UAVs, the primary tool for monitoring rice varieties in this study, are no exception. Their most notable limitation is their limited capacity to cover extensive areas. This restricted spatial coverage may require additional flight missions, thereby creating logistical complexities when dealing with large areas. Satellite remote sensing addresses these challenges because it offers broad spatial coverage. Moreover, some satellite imagery already has a very high spatial resolution. One example is GeoEye-1, a commercial satellite operated by Maxar, with data also available through the European Space Agency (ESA), and included in the WorldView constellation, which consists of four WorldView satellites and GeoEye-1. It has a spatial resolution of 41 cm/pixel, which approaches that of a UAV. Its revisit time, which determines the temporal resolution, is 4.6 days, which is quite frequent. However, one major drawback of high-resolution satellite imagery is its high cost. Hence, UAVs remain advantageous in terms of spatial, spectral, and temporal resolution, even though they have limited spatial coverage. Furthermore, the development of UAVs has now entered the era of fixed-wing UAVs, which offer more efficient and broader coverage.
This study investigated the application of machine learning to classify different paddy rice varieties using UAV data. Our analysis involved exploring various algorithms to identify the most effective approach for this task. We also aimed to identify the growth stage of rice plants most suitable for reliable variety classification. Our findings revealed that machine learning algorithms perform differently depending on the growth stage of the plant, with twelve WAP found to be the optimal growth stage. Overall, the NN emerged as the most effective algorithm in this study.
Several studies have indicated that the growth stage of paddy rice plants affects transpiration, photosynthesis, and the response to soil water potential, as well as the impact of nitrogen application on yield [63,64]. In this study, we examined rice images captured at six, nine, and twelve WAP. During these stages, paddy rice undergoes significant physiological and morphological changes, which can affect its spectral reflectance. Our study revealed that twelve WAP is the optimal growth stage for rice variety classification, with the NN as the best machine learning algorithm. This is due to several factors, such as morphological differences, increased canopy development, stable feature representation, and reduced intraclass variability. By twelve WAP, paddy rice plants have reached a more advanced growth stage, and morphological differences among varieties are more pronounced. These differences can include variations in leaf shape, size, color, and overall plant structure, and they are reflected in differences in the spectral reflectance captured by the UAV sensor. Machine learning algorithms can leverage these distinctive features to classify different varieties more accurately. It is also worth noting that the intraclass variability at twelve WAP was the lowest, as shown in Figure 6. Such a reduction in intraclass variability plays a crucial role in improving classification accuracy, as it minimizes the overlap and confusion between different plant varieties. By reducing the variations within each class and enhancing the separability between classes, machine learning algorithms can achieve higher accuracy and reliability in the classification process. This observation is supported by previous studies, which highlighted the positive impact of reduced intraclass variability on classification performance. Fusheng et al. [65] reduced intraclass variability through instance-level embedding adaptation, significantly improving the classification accuracy of few-shot learning tasks. Hence, the combination of distinctive features and reduced intraclass variability at twelve WAP contributes to the improved accuracy of machine learning algorithms in classifying plant varieties.
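The link between intraclass variability and separability can be illustrated with a short Python sketch. It computes the per-class standard deviation of a single feature and a simple separability ratio (spread of the class means divided by the mean within-class spread). The ratio, the function names, and the reflectance values are illustrative assumptions, not the measure or the data used in this study.

```python
from statistics import mean, pstdev


def group_by_class(values, labels):
    """Collect the feature values belonging to each class label."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    return groups


def intraclass_spread(values, labels):
    """Per-class (population) standard deviation of one feature."""
    return {label: pstdev(vs) for label, vs in group_by_class(values, labels).items()}


def separability(values, labels):
    """Spread of the class means divided by the mean within-class spread."""
    groups = group_by_class(values, labels)
    within = mean([pstdev(vs) for vs in groups.values()])
    between = pstdev([mean(vs) for vs in groups.values()])
    return between / within if within > 0 else float("inf")


# Hypothetical Green reflectance for three varieties at two growth stages.
# Class means are identical; only the within-class spread differs.
labels = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
early = [0.10, 0.14, 0.18, 0.12, 0.16, 0.20, 0.14, 0.18, 0.22]
late = [0.13, 0.14, 0.15, 0.15, 0.16, 0.17, 0.17, 0.18, 0.19]

print("early:", intraclass_spread(early, labels), separability(early, labels))
print("late: ", intraclass_spread(late, labels), separability(late, labels))
```

With these made-up values, the stage with the smaller within-class spread yields the larger separability ratio, which is the effect described above.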
In line with the findings of this study, previous research by Tan et al. [66] also supported the superiority of the NN algorithm. They demonstrated that NNs outperformed maximum likelihood in land cover classification using Landsat multispectral data. Similarly, Etheridge et al. [67] conducted a classification study based on overall error rate. They found that the probabilistic NN exhibited the highest level of reliability, followed by the backpropagation and categorical learning networks. These findings highlight the consistent success of NN-based approaches for different classification tasks.
Other studies have explored the effectiveness of NNs in various agricultural applications. For example, Senan et al. [68], Bouguettaya et al. [69,70,71], and Ramesh and Vydeki [72] utilized deep learning techniques based on a convolutional neural network (CNN) for paddy leaf disease classification. Muthukumaran et al. [73], Amaratunga et al. [74], and Abdullah et al. [75] performed paddy yield prediction and forecasting using artificial NNs (ANNs). Although these studies did not specifically investigate the classification of rice varieties, the successful utilization of NNs in related agricultural tasks further emphasizes the efficiency of these algorithms.
NNs are suitable for classification because of their reliability and accuracy, which can surpass those of mainstream methods. Their main drawback is a longer training time. In our study, the NN trained faster than some ensemble algorithms but slower than the remaining algorithms: it required approximately 3 s, while the others required less than 0.2 s. Therefore, it is important to consider the increased training time when applying this algorithm. The ability of NNs to handle multiclass tasks and adapt through backpropagation sets them apart and suits the intricate classification of paddy rice varieties. On balance, the accuracy and adaptability of the NN outweigh its longer training time and make it a strong choice for paddy rice classification.
In this study, we explored the predictive power of various features for classifying different rice varieties and observed that their effectiveness varied. Among the features examined, Green reflectance emerged as the most influential in discerning rice varieties. Remarkably, this feature surpassed well-established vegetation indices, such as the LCI, GNDVI, NDVI, and LAI, in its ability to classify the rice varieties examined in the present study. This finding highlights the significance of this feature and its potential to capture the essential characteristics that differentiate rice varieties.
Our results shed light on the importance of considering a diverse range of features when undertaking variety classification. Although vegetation indices have traditionally been relied upon for this purpose, our results emphasize that alternative features, such as the Green feature, can offer superior performance and contribute significantly to the accurate classification of rice varieties.
The selection of Green reflectance as the most important feature in determining rice variety can be attributed to several factors. First, Green reflectance captures vital information about vegetation health and vigor, as it represents the amount of light reflected by the plants in the green wavelength range. This feature is particularly meaningful for rice varieties, as their growth and development rely heavily on chlorophyll content, which affects their overall health and productivity. Differences in Green reflectance can therefore indicate differences in chlorophyll content and vigor, which may be characteristic of specific rice varieties.
Second, the Green reflectance feature may possess unique spectral characteristics that are particularly discriminative for differentiating rice varieties. It can capture subtle differences in leaf structure, pigmentation, or physiological traits among the varieties, which are not adequately captured by other vegetation indices such as the LCI, GNDVI, NDVI, or LAI. These indices often focus on specific wavelengths or combinations of wavelengths, whereas Green reflectance provides a more comprehensive representation of the overall greenness of vegetation. Fu et al. [76] suggested that leaf greenness could potentially function as a convenient indicator for identifying genotypes with elevated photosynthetic capabilities.
In addition to its importance as a predictor, the Green reflectance parameter exhibits notably low intraclass variability. As mentioned earlier, the variability of a feature within each class provides valuable insight into its discriminative power. In the context of our study, Green reflectance demonstrated a significantly lower standard deviation across all growth stages when compared to other features. Figure 6 visually represents the distribution of Green reflectance and highlights its narrower variability. This reduced variability suggests that Green reflectance possesses distinct and consistent spectral characteristics that enable it to effectively differentiate among rice varieties. The ability to discern subtle variations in Green reflectance across growth stages makes it a reliable feature for classification. This finding emphasizes the importance of Green reflectance as a significant contributor to the classification process.
Additionally, the high contribution of the Green reflectance feature could also be influenced by the specific growth stage of the rice plants at twelve WAP. Different growth stages exhibit varying spectral signatures, and during this stage, the Green reflectance feature may have stronger discriminatory power in distinguishing rice varieties. It is essential to consider the growth dynamics of rice plants and the corresponding physiological changes that occur at different stages when interpreting the significance of these features. The visualization provided in Figure 8 further supports the importance of the Green reflectance feature, demonstrating its prominent position as an essential predictor. The potential decrease in the AUC of the classification in the absence of this feature highlights its critical role in achieving accurate and reliable classification results for rice varieties.
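The idea of measuring a feature's contribution by the loss in performance when it is absent can be sketched as follows. This is a simplified stand-in, not the procedure behind Figure 8: it uses a nearest-centroid classifier and accuracy instead of the NN and AUC, and the data are hypothetical values constructed so that Green separates the classes while NDVI does not.

```python
def fit_centroids(X, y):
    """Compute the mean feature vector of each class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, row):
    """Return the class whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(label):
        return sum((c - v) ** 2 for c, v in zip(centroids[label], row))
    return min(centroids, key=sq_dist)


def accuracy(X_train, y_train, X_test, y_test):
    """Fit on the training rows and return the fraction of correct test predictions."""
    centroids = fit_centroids(X_train, y_train)
    hits = sum(predict(centroids, row) == label for row, label in zip(X_test, y_test))
    return hits / len(y_test)


def drop_feature(X, index):
    """Return a copy of X without the column at the given index."""
    return [[v for i, v in enumerate(row) if i != index] for row in X]


def leave_one_feature_out(X_train, y_train, X_test, y_test, names):
    """Accuracy drop observed when each feature is removed in turn."""
    baseline = accuracy(X_train, y_train, X_test, y_test)
    drops = {}
    for i, name in enumerate(names):
        reduced = accuracy(drop_feature(X_train, i), y_train,
                           drop_feature(X_test, i), y_test)
        drops[name] = baseline - reduced
    return baseline, drops


# Hypothetical (Green, NDVI) samples for three varieties; not study data.
names = ["Green", "NDVI"]
X_train = [[0.10, 0.72], [0.11, 0.76], [0.20, 0.78],
           [0.21, 0.74], [0.30, 0.73], [0.31, 0.77]]
y_train = ["A", "A", "B", "B", "C", "C"]
X_test = [[0.12, 0.77], [0.09, 0.73], [0.19, 0.75],
          [0.22, 0.78], [0.29, 0.74], [0.32, 0.76]]
y_test = ["A", "A", "B", "B", "C", "C"]

baseline, drops = leave_one_feature_out(X_train, y_train, X_test, y_test, names)
ranked = sorted(drops, key=drops.get, reverse=True)
print("baseline accuracy:", baseline)
print("accuracy drop per removed feature:", drops)
print("ranking:", ranked)
```

A larger drop indicates a feature the classifier depends on more heavily. With real data, the same loop would wrap the chosen model and metric, and the evaluation would use held-out or cross-validated samples.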
These findings indicate that different features may be important for classifying different rice varieties. This information can be valuable for designing more effective machine learning algorithms for classifying rice varieties based on remote sensing data. For example, algorithms that weigh certain features more heavily for certain varieties may be able to achieve higher accuracy in classifying these varieties. Overall, this analysis provides important insights into the relationship between different features and classification of different rice varieties, which can inform future research in this area.
This study achieved its highest classification accuracy (0.644) using data from the twelfth WAP, which is notable within the context of this research methodology. To our knowledge, no directly comparable prior studies are available, so this outcome is promising and underscores the novelty of our approach, while leaving room for enhancement through further research. Future investigations should explore alternative vegetation indices in conjunction with multispectral bands to improve classification. Deep learning algorithms, renowned for their ability to automatically extract relevant features, hold promise for this task owing to advancements in hardware and software technologies and the abundance of available data. Additionally, evaluating a wider range of paddy rice varieties can bolster the model’s capabilities and enhance the robustness of the research for reliable analysis of the study area.