Article

Early-Stage Identification of Powdery Mildew Levels for Cucurbit Plants in Open-Field Conditions Based on Texture Descriptors

by Claudia Angélica Rivera-Romero 1, Elvia Ruth Palacios-Hernández 2, Osbaldo Vite-Chávez 1 and Iván Alfonso Reyes-Portillo 3,*
1 Unidad Académica de Ingeniería Eléctrica Plantel Jalpa, Universidad Autónoma de Zacatecas, Jalpa 99601, Mexico
2 Facultad de Ciencias, Universidad Autónoma de San Luis Potosí, San Luis Potosí 78295, Mexico
3 Academia de Ingeniería en Sistemas y Tecnologías Industriales, Universidad Politécnica de San Luis Potosí, San Luis Potosí 78295, Mexico
* Author to whom correspondence should be addressed.
Inventions 2024, 9(1), 8; https://doi.org/10.3390/inventions9010008
Submission received: 7 December 2023 / Revised: 27 December 2023 / Accepted: 28 December 2023 / Published: 3 January 2024

Abstract:
Constant monitoring is necessary for powdery mildew prevention in field crops because, as a fungal disease, it modifies the green pigments of the leaves and is responsible for production losses. Therefore, there is a need for solutions that assure early disease detection to realize proactive control and management of the disease. The methodology currently used for the identification of powdery mildew disease uses RGB leaf images to detect damage levels. In the early stage of the disease, no symptoms are visible, but this is the point at which the disease can be controlled before the symptoms appear. This study proposes the implementation of a support vector machine to identify powdery mildew on cucurbit plants using RGB images and color transformations. First, we use an image dataset that provides photos covering five growing seasons in different locations and under natural light conditions. Twenty-two texture descriptors computed from the gray-level co-occurrence matrix are calculated as the main features. The proposed damage levels are ’healthy leaves’, ’leaves in the fungal germination phase’, ’leaves with first symptoms’, and ’diseased leaves’. The implementation reveals that the accuracy in the L * a * b color space is higher than that obtained using the combined components, with an accuracy value of 94% and a Cohen’s kappa of 0.7638.

1. Introduction

Agriculture is a primary resource that involves a wide variety of crops grown under different environmental conditions. A large part of a country’s economy involves the export of agricultural products that are sold daily for human consumption. The diagnosis and prevention of pathologies in crops are required tasks in agriculture. The excessive use of pesticides, inappropriate farming practices, and the abandonment of plant-disease-infected regions are causing agricultural losses. In addition, farmers confront several problems every day, such as fungal plant diseases. Different plants are highly susceptible to damage by fungi. In the case of cucurbits, there are limited studies about the damage caused by diseases and pests. According to previous studies, fungal infections such as powdery mildew (PM) begin with spore germination [1,2,3]. This fungus is the most common type of disease found in open-field crops.
Various techniques involving mathematical and computational processes using information obtained from digital images can be used for disease detection. Currently, some methods use images for disease and pest classification in infected plants [4,5,6]. Plant disease detection based on image-processing technologies generally involves a methodology that includes plant disease image acquisition, image processing, image segmentation through the region of interest, feature extraction and selection, and the application to disease detection. An image provides sufficient information to identify characteristics that describe the severity and stage of the disease [7] because the leaves of the plant are the first organ that shows symptoms of a disease. Methods such as classification algorithms have been developed over the years for disease identification. These algorithms consist of feature extraction from images of plants experiencing problems at different disease stages.
In general, there is a need for early-stage identification in diseased plants when the first symptoms appear in order to take effective control measures against a fungus. This paper proposes a method for early PM detection in cucurbit leaves based on digital images according to predefined PM damage levels. Then, the problem becomes training a set of classifiers. The first stage consists of training the classifier to distinguish between two PM damage levels. The second stage involves a voting scheme that determines the PM damage level.
In the literature, some methodologies for detection applied to fungal diseases have been proposed. However, there are still open problems and unsolved issues related to the classification of powdery infection and prevention, such as the early detection of powdery mildew in cucurbit leaves. This scenario comprises the detection of the first symptoms of initial powdery mildew germination, which is a crucial phase for implementing management strategies to achieve eradication. Some studies only identified the disease when the plants have symptoms. However, the real problem is cases in which the plants have not yet shown symptoms. From this perspective, the innovations of the proposed methodology include (i) early symptom detection in natural conditions of crops with fungal disease; (ii) a feature extraction process in which the transformed and processed images are feature descriptors of, for example, the texture in an image; (iii) statistical feature selection is executed with the feature data to reduce the number of color components and descriptors; (iv) a nondestructive methodology for crop plants; (v) sample images are in natural lighting conditions; and (vi) disease detection and classification of powdery mildew infection in cucurbit leaves considering three phases of damage: the germination stage, the first symptoms, and when the leaves have the fungus.

Literature Review

Kumar et al. [8] introduced a novel exponential spider monkey optimization method to fix the significant features from a set of features generated using a subtractive pixel adjacency model. Through a support vector machine, plants were classified as diseased or healthy. The selected features for the spider monkey optimization increased the classification reliability to an accuracy of 92.12% with 82 selected features. A hybrid prediction model was developed by Lamba et al. [9] for predicting various levels of severity of blast disease using diseased plant images. This work was based on the percentage of leaf area affected by the disease. The features were extracted from an image dataset with a convolutional neural network approach. The classification accuracy of the severity level of blast disease was 97%. Kaya et al. [10] proposed a novel approach based on deep learning for plant disease detection by fusing RGB and segmented images. They considered two images as the input to a multiheaded dense-net-based architecture. They used the Plant Village database with 38 classes. The accuracy was 98.17%. Xu et al. [11] proposed a vision system with an integrated reflection–transmission image acquisition module, human–computer interaction module, and power supply module for rapid Huanglongbing (HLB) detection in the field. With six classes of identification (healthy; HLB pre-symptomatic; zinc, magnesium, or boron deficiency; or HLB-positive), a step-by-step classification model with four steps was used. The results showed that the model had an accuracy of 96.92% for all categories of samples and 98.08% for multiple types of HLB identification.
Sabat-Tomala et al. [12] used support vector machine and random forest as two machine learning algorithms to discriminate Solidago spp., Calamagrostis epigejos, and Rubus spp. using hyperspectral aerial images. Kasinathan et al. [13] proposed a method of insect detection based on morphological features. The classification used nine to twenty-four insect classes using shape features and machine learning techniques. The machine learning models applied for the comparison were support vector machine, K-nearest neighbors, artificial neural network, naïve Bayesian model, and convolutional neural network (CNN). The algorithm consisted of foreground extraction and insect contour detection.
Recently, Yag et al. [14] used a new hybrid plant leaf difference as a classification model, including a flower pollination algorithm, a support vector machine, and a convolutional neural classifier. The two-dimensional discrete wavelet transform used image datasets from apple, grape, and tomato plants for feature extraction. Fernandez et al. [15] conducted a study to find the spectral changes caused by the powdery mildew pathogen Podosphaera xanthii on cucumber leaves. They adapted principal component analysis to the spectral characteristics of healthy and diseased leaves. The authors used a linear support vector machine classifier, achieving an accuracy of 95%.

2. Materials and Methods

The general scheme, which involves a machine learning approach for the early detection of PM damage, is shown in Figure 1. Image acquisition and preprocessing are the first steps, which are followed by a feature extraction process through texture descriptors. The application selects the optimal features based on a comparison test. Binary classifiers achieve multiclassification in combination with a voting scheme and SVM blocks. Finally, the performance is evaluated with parameters that determine the optimal classification of PM damage level in cucurbit leaves.

2.1. Acquisition

We used an image database of cucurbit plants and leaves consisting of six distinct crops in diverse locations and natural conditions: San Luis Potosí (San Luis Potosí), Jalpa (Zacatecas), and Yuriria (Guanajuato) in Mexico. During September–December 2015 (21°69′42.1″ N, 102°97′34.5″ W and 20°08′08″ N, 101°01′52″ W), January–April 2016 (21°65′89.7″ N, 102°96′80.6″ W and 21°69′43″ N, 102°97′09.5″ W), September–November 2016 (20°21′92.7″ N, 101°10′11.6″ W), and March–June 2017 (21°59′75.7″ N, 103°01′52.3″ W), a sampling process was used to record the phenological data of the plants.
In open-field crops, irrigation systems were used, and preventive treatments were applied every three days for leafminers, whiteflies, spider mites, downy mildew, and powdery mildew. At the same frequency, leaves were imaged during the growing season in the morning under natural field crop conditions. The database consists of 51,260 images. Each leaf was sampled from the unfolded stage on the main stem to senescence, from the 1st true leaf to the 21st leaf of each plant. Image samples were collected on sampling days D 1 to D 19 . The camera of a mobile device was fixed on a plastic structure at a distance of 20 cm between the leaf blade and the device, with a resolution of 2448 × 3264 pixels (13 megapixels) in Joint Photographic Experts Group (JPEG) format and the RGB color space (red, green, and blue).

2.2. Proposed Powdery Mildew Damage Levels

Different plants are susceptible to fungal damage. In the case of cucurbit plants, there are limited studies on disease and pest damage. According to previous studies, fungal infections such as PM start with the germination of spores [1,2,3]. This disease has a spore germination cycle. The fungus appears on mature leaves when the plant is in the flowering and fruit development stages. The spore germination stage occurs when the infection structure is being formed. This process occurs over three to seven days before the first symptom becomes visible on the leaf surface. Some changes in the spore germination cycle at phenological stages S 1 to S 8 on sampling days D 1 to D 19 are considered as basic information for detecting different levels of leaf damage.
In Rivera-Romero et al. [16], a statistical analysis was conducted for determining a timeline (Figure 2) with sampling days and phenological growth stages with the visual assessment of PM signs and symptoms. At T 1 , leaf development ( S 1 ), lateral bud formation ( S 2 ), and inflorescence emergence ( S 5 ) during the first nine sampling days ( D 1 D 9 ) are considered. From D 10 to D 12 , between the main stages of flowering and fruit development ( S 6 and S 7 ), leaves with damage level T 2 are monitored. Leaves with damage level T 3 are found in the main stages of fruit formation ( S 7 ) between D 13 and D 16 . The main fruit and seed growth and ripening stages are shown in S 7 and S 8 and in leaves at T 4 from D 17 to D 19 days of sampling. Four levels of PM damage are then defined (Figure 3): T 1 for healthy leaves, T 2 for leaves with germinating spores, T 3 for leaves with early symptoms, and T 4 for diseased leaves.
Because cucurbit leaves have five lobes, a region of interest (ROI) was selected for analysis. The leaves were divided into six sections ( R 1 R 6 ) to investigate where the first symptoms were visible. A total of 465 leaves were then selected, of which 284 had first symptoms in the same region. Figure 4 shows the leaf division, where regions R 3 and R 4 have a higher incidence of first symptom appearance. This ROI selection is in agreement with the knowledge of local farmers, who confirmed that the first symptoms appear in the central upper lobe ( R 4 ) of basal and mature leaves at the flowering stage.

2.3. Preprocessing

Images were divided into four sets according to the assessed level of damage. Image samples were defined as I ( x , y ) , where x represents the number of rows, and y is the number of columns of a matrix, as shown in the ROI. The ROI image was defined as R ( s , t ) , where s are the rows, and t are the columns of a matrix that comprise the cropped image. All these images correspond to the R 4 region in the red, green, and blue (RGB) color space with a size of 200 × 200 pixels. The ROI image dataset consisted of 5906 samples: 3610 samples were used for the damage level T 1 , 760 samples for T 2 , 734 samples for T 3 , and 802 samples for T 4 . A contrast adjustment C ( p , q ) , where p is the row, and q is the column, in the range of [ 0.4 , 0.7 ] was employed to enhance the highlighting, followed by a spatial transformation to different color spaces T ( s , t ) , where s is the number of rows, and t is the number of columns in a matrix. Then, a color transformation was applied to the RGB color space samples to obtain different color spaces, separating each sample into all of its color components (CCs): gray levels ( G ( i , j ) ); HSV (hue (H), saturation (S), and value (V); H ( i , j ) ); L * a * b (luminance (L), chrominance * a (A), and chrominance * b (B); L ( i , j ) ); and YCbCr (luma component (Y), chroma blue difference (Cb), and chroma red difference (Cr); Y ( i , j ) ), where i is the number of rows and j represents the number of columns in a matrix. Thirteen processed images were extracted. Each processed sample in the different color spaces was separated and labeled into all of its CCs. With the color space transformation, the total dataset contained 76,778 samples. Figure 5 shows the different color spaces of an ROI image.
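The color space transformations above can be sketched in a few lines of NumPy. The snippet below is a minimal illustration, not the authors' implementation: it derives the gray-level component with the standard ITU-R BT.601 luma weights and the YCbCr components with the full-range JPEG convention; in practice, the HSV and L * a * b conversions would come from an image library such as OpenCV or scikit-image.

```python
import numpy as np

def rgb_to_gray(rgb):
    """Gray-level component G(i, j) via the ITU-R BT.601 luma weights."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return 0.299 * r + 0.587 * g + 0.114 * b

def rgb_to_ycbcr(rgb):
    """Full-range (JPEG) YCbCr conversion of an RGB image in [0, 255]."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return np.stack([y, cb, cr], axis=-1)

# A synthetic 200 x 200 RGB ROI sample, standing in for one R(s, t) crop
roi = np.random.default_rng(0).integers(0, 256, (200, 200, 3)).astype(float)
gray = rgb_to_gray(roi)      # one color component
ycbcr = rgb_to_ycbcr(roi)    # three color components: Y, Cb, Cr
```

Each such conversion yields one or three of the thirteen color component images from which the texture features are later extracted.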

2.4. Feature Extraction

In this study, the color components of each space were analyzed to obtain relevant information. From the gray-level co-occurrence matrix (GLCM), texture features were extracted. The GLCM is a statistical method that takes into account the spatial relationship of pixels.
A GLCM corresponds to a CC image considering the 255 gray levels, represented by the function P ( I , i , j , d , θ ) , where i represents the gray level at location ( x , y ) in image I ( x , y ) , and j represents the gray level of the pixel at a distance d = 1 from the location ( x , y ) with an orientation angle θ; the matrix is normalized with Equation (1) [17].
$$ p(i,j) = \frac{P(i,j,1,0)}{\sum_{i,j} P(i,j,1,0)} $$
Figure 6 shows the computation of a GLCM of an image with gray intensity levels, where the neighboring pixel pairs can be matched with four different reference angles (0°, 45°, 90°, and 135°). Figure 7 shows the GLCM matrices generated from the H ( i , j ) components in the HSV color space.
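Equation (1) can be illustrated with a small from-scratch sketch (not the authors' code): counts of horizontally adjacent gray-level pairs (d = 1, θ = 0°) are accumulated into a square matrix and then normalized so that the entries of p ( i , j ) sum to one.

```python
import numpy as np

def glcm_horizontal(img, levels=256):
    """Normalized GLCM p(i, j) for distance d = 1 and angle 0 degrees.

    Counts how often gray level i has gray level j as its right-hand
    neighbor, then divides by the total pair count, as in Equation (1).
    """
    img = np.asarray(img, dtype=np.intp)
    counts = np.zeros((levels, levels), dtype=float)
    np.add.at(counts, (img[:, :-1], img[:, 1:]), 1.0)
    return counts / counts.sum()

# Tiny 4-level example; in the paper, each 200 x 200 CC image gives one GLCM
tiny = np.array([[0, 0, 1, 1],
                 [0, 0, 1, 1],
                 [0, 2, 2, 2],
                 [2, 2, 3, 3]])
p = glcm_horizontal(tiny, levels=4)
```

The remaining three reference angles would be obtained the same way by pairing each pixel with its diagonal or vertical neighbor.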
The TDs contain some information about shade, texture, shape, and color, describing the distribution, homogeneity, contrast, constant color, intensity, and gray levels of brightness. The TD equations based on the GLCM are presented in Table 1. An explanation of the feature extraction process used in our approach is given in Figure 8.
A total of 260 features (20 TDs × 13 CCs) were extracted from the 76,778 GLCM matrices of the Color Component Image Dataset (CC-ID), which generated a texture dataset of 1,535,560 values, labeled with the abbreviated texture descriptor name followed by the color component: TDs-CCs. An example is diss BB , combining the texture feature “dissimilarity” and the blue component (BB) of the RGB color space image. The texture dataset was normalized using the minimum and maximum values of each row.
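The descriptor computation and the final normalization step can be sketched as follows. The formulas below are the standard Haralick-style definitions of four of the descriptors in Table 1 (contrast, dissimilarity, homogeneity, and energy) and may differ in notation from the paper's table; the row-wise min-max normalization matches the last step described above.

```python
import numpy as np

def texture_descriptors(p):
    """Four standard GLCM texture descriptors from a normalized GLCM p(i, j)."""
    i, j = np.indices(p.shape)
    d = i - j
    return {
        "contr": np.sum(p * d**2),           # contrast
        "dissi": np.sum(p * np.abs(d)),      # dissimilarity
        "homom": np.sum(p / (1.0 + d**2)),   # homogeneity
        "energ": np.sum(p**2),               # energy (angular second moment)
    }

def minmax_rows(x):
    """Row-wise min-max normalization of a feature matrix to [0, 1]."""
    lo = x.min(axis=1, keepdims=True)
    hi = x.max(axis=1, keepdims=True)
    return (x - lo) / np.where(hi > lo, hi - lo, 1.0)

# A uniform 4 x 4 GLCM as a stand-in for one computed co-occurrence matrix
p = np.full((4, 4), 1 / 16)
feats = texture_descriptors(p)
```

Applying the descriptor function to every GLCM of every color component image yields the 260-column feature matrix described above.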

2.5. Feature Selection

The feature selection process consists of finding the best set of features that allows us to differentiate the four levels of damage. Statistical methods are used for the selection process, the flow diagram of which is shown in detail in Figure 9 [16].
First, a Lilliefors test was performed to assess the normality of the feature dataset, followed by an analysis of variance (ANOVA) to obtain statistical significance values, which was then followed by Tukey’s test for multiple comparisons. The Lilliefors test compares the sample scores to a set of normally distributed scores with the same mean and standard deviation; the null hypothesis is that “the sample distribution is normal” [20,21,22]. The Lilliefors test returns an h value for each feature. Table 2 presents examples of three texture features (diss, homo, and idmn) in all color components, where h = 1 means that the data have a normal distribution, and h = 0 means that they do not. As a result, we discarded 64 texture features.
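The Lilliefors test is the Kolmogorov–Smirnov statistic computed against a normal distribution whose mean and standard deviation are estimated from the sample, with the null distribution corrected for that estimation. A self-contained way to sketch it is by Monte Carlo simulation of the null distribution, as below; this is an illustrative approximation rather than the tabulated test used in the paper (statistics packages such as statsmodels ship a ready-made lilliefors function).

```python
import numpy as np
from math import erf

# Standard normal CDF, vectorized over a NumPy array
phi = np.vectorize(lambda z: 0.5 * (1.0 + erf(z / 2 ** 0.5)))

def ks_stat_normal(x):
    """KS distance between the sample and a normal fit with estimated mean/std."""
    x = np.asarray(x, dtype=float)
    z = np.sort((x - x.mean()) / x.std(ddof=1))
    n = z.size
    cdf = phi(z)
    i = np.arange(1, n + 1)
    return max(np.max(i / n - cdf), np.max(cdf - (i - 1) / n))

def lilliefors_mc(x, n_sim=500, seed=0):
    """Monte Carlo p-value; reject = 1 rejects normality at the 5% level."""
    rng = np.random.default_rng(seed)
    d_obs = ks_stat_normal(x)
    d_null = np.array([ks_stat_normal(rng.standard_normal(len(x)))
                       for _ in range(n_sim)])
    p = float(np.mean(d_null >= d_obs))
    return (1 if p < 0.05 else 0), p

rng = np.random.default_rng(1)
reject_exp, p_exp = lilliefors_mc(rng.exponential(size=300))  # clearly non-normal
reject_norm, p_norm = lilliefors_mc(rng.normal(size=300))     # plausibly normal
```

Features whose distributions fail the normality screen would then be dropped before the ANOVA step, as in the selection pipeline above.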
Analysis of variance (ANOVA) is a statistical method used to test for differences between two or more mean values. ANOVA is applied to test general rather than specific differences between mean values; we used ANOVA to test the null hypothesis H 0 in Equation (2) that the average values of the four PM damage levels ( T 1 , T 2 , T 3 , and T 4 ) are equal for each texture characteristic.
$$ H_0 : \mu_{T_1} = \mu_{T_2} = \mu_{T_3} = \mu_{T_4} $$
The F statistic and p < 0.000001 were calculated for all texture features. Then, for each pair of damage levels ( T 1 versus T 2 , T 1 versus T 3 , T 1 versus T 4 , T 2 versus T 3 , T 2 versus T 4 , and T 3 versus T 4 ), a multiple comparison was performed with Tukey’s test [20]. Each level was labeled with a lowercase letter (“a”, “b”, “c”, or “d”): different letters were assigned when the mean values of two damage levels differed significantly; otherwise, the levels shared the same letter. If the same lowercase letter appeared in two, three, or four levels, there was no significant difference in the respective texture feature. Finally, 53 texture features had significantly different mean values among the four damage levels. Table 3 presents only the 17 texture features that were most sensitive to the discrimination of the four damage levels. Figure 10 shows the results of ANOVA and Tukey’s test for the diss BB feature, which presents mean values with significant differences between the four damage levels, and the auto A feature, which could separate only T 1 from T 2 , T 3 , and T 4 .
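The ANOVA step can be sketched with scipy.stats.f_oneway on synthetic data (the feature values and group means below are fabricated for illustration): the null hypothesis of Equation (2) is rejected for a texture feature when p < 0.000001, and a pairwise follow-up such as Tukey's HSD then labels which pairs of damage levels differ.

```python
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(0)
# Fabricated values of one texture feature for the four damage levels T1..T4
t1 = rng.normal(0.20, 0.02, 100)
t2 = rng.normal(0.30, 0.02, 100)
t3 = rng.normal(0.45, 0.02, 100)
t4 = rng.normal(0.60, 0.02, 100)

# Test H0: mu_T1 = mu_T2 = mu_T3 = mu_T4 (Equation (2))
f_stat, p_value = f_oneway(t1, t2, t3, t4)
keep_feature = p_value < 0.000001  # the paper's significance threshold
```

A feature is retained only if this omnibus test and the subsequent Tukey comparisons separate all four levels.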

2.6. Formation of the Feature Vectors

Ten feature vectors were created for the training, validation, and testing processes from the 53 features (TD) with a significant difference between the four classes (PM damage levels) considered. The features were listed in ascending order (significance value F) to form the first group of five vectors ( F 1 , …, F 5 ) containing six TDs comprising the different color space characteristics. The second group of vectors ( G 1 , …, G 5 ) contained the same number of TDs and components from the same color space (Table 4). The characteristics matrix was 35,436 × 6 texture features × 4 damage levels.
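A hypothetical sketch of the vector-formation step: features are ranked by their ANOVA F statistic (most discriminative first, one plausible reading of the ordering described above) and the top six descriptors form one feature vector. The feature names and F values below are invented for illustration only.

```python
# Hypothetical (feature name, ANOVA F value) pairs after selection
f_values = {
    "diss_BB": 512.3, "contr_L": 488.9, "energ_R": 430.1, "dvarh_Y": 401.7,
    "idmnc_V": 395.2, "corrm_GG": 380.4, "homom_A": 122.8, "auto_A": 95.1,
}

# Rank features by F value and keep six, mimicking one F-style vector
ranked = sorted(f_values, key=f_values.get, reverse=True)
feature_vector = ranked[:6]
```

The G-style vectors would be formed the same way, but restricting the candidate pool to descriptors from a single color space.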

2.7. Proposed Multiclass Classification Framework

The main objectives of the proposed framework are to implement support vector machines (SVMs) to classify PM damage levels ( T 1 – T 4 ) and predict the early phase. The multiclass problem is decomposed into multiple binary classification cases, called one-vs.-one, resulting in a comparison between each pair of classes. A block multiclassifier with k ( k − 1 ) / 2 binary classifiers (SVMs) was constructed, where k is the number of classes. Each SVM is trained with different kernels (polynomial, sigmoidal, and Gaussian radial basis functions) to find the optimal hyperplane [23]. Hyperplane minimization and the estimation of h are performed using $h_{est} = R^2 \|w\|^2 + 1$, where R is the diameter of the smallest sphere including all training data, and $\|w\|$ is the standard Euclidean norm of the weight vector. Therefore, an SVM classifies correctly when the parameters Γ (confidence interval) and h e s t , evaluated with different values, are minimized. In this study, 60% of the data were used for training and validation, and 40% were used for testing. Six binary SVMs ( M 1 , … , M 6 ) were trained with their corresponding two classes of input data for all the feature vectors defined above. Table 5 and Table 6 show the results for the SVMs trained with feature vectors whose components belong to different color spaces, F 1 , … , F 5 , and the same color space, G 1 , … , G 5 , respectively. In both tables, p is the degree of the polynomial, ω is the variable parameter for the sigmoidal function, and σ is the parameter for the radial basis function. The selected SVMs were those with the minimum values. The 2D graphs and 3D hyperplanes, along with the training, validation, and error results with different kernels and feature vectors, are shown in Figure 11 and Figure 12, respectively.
The parameters for the best SVMs (different and same color spaces) were established using the one-versus-one (OVO) method for all class combinations ( T 1 versus T 2 , T 1 versus T 3 , T 1 versus T 4 , T 2 versus T 3 , T 2 versus T 4 , and T 3 versus T 4 ). Using a four-block voting scheme ( V 1 , , V 4 ), the final ranking decision of the assigned classes is made. When classes have the same number of votes, the one with the lowest index is selected. Figure 13 describes the OVO method and the voting scheme and classes that define each block. The best results of the testing stage are depicted in Figure 14 and Figure 15 for the features of different color spaces and the same color space, respectively.
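The one-vs.-one voting rule described above can be sketched independently of the SVMs themselves: with k = 4 classes there are six binary decisions, each casting one vote for its winning class, and ties are broken toward the lowest class index. The pairwise decisions below are fabricated for illustration.

```python
def ovo_vote(pair_decisions, k=4):
    """Majority vote over the k(k-1)/2 one-vs.-one decisions.

    pair_decisions maps each class pair (a, b), a < b, to the winning
    class index. Ties are broken toward the lowest class index,
    as in the voting scheme described in the text.
    """
    votes = [0] * k
    for (a, b), winner in pair_decisions.items():
        assert winner in (a, b)
        votes[winner] += 1
    # Highest vote count wins; on a tie, the lower index is preferred
    return max(range(k), key=lambda c: (votes[c], -c))

# Fabricated outputs of the six binary SVMs M1..M6 for classes T1..T4 (0..3)
decisions = {(0, 1): 1, (0, 2): 2, (0, 3): 0,
             (1, 2): 2, (1, 3): 1, (2, 3): 2}
predicted = ovo_vote(decisions)  # class index 2, i.e., damage level T3
```

In the full system, each binary decision would come from the corresponding trained SVM evaluated on the input feature vector.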

2.8. Performance Evaluation

Metrics were calculated to evaluate the results. In this study, we used the confusion matrix, Cohen’s kappa coefficient, accuracy, sensitivity, false positive rate, F-statistic coefficients, and specificity to determine the performance of the proposed system. A confusion matrix allowed us to obtain the performance of the system in terms of the proportion of the total number of classified data: true positives ( T P ), the proportion of positive cases correctly identified; false positives ( F P ), the fraction of negative cases incorrectly classified as positive; true negatives ( T N ), the proportion of negative samples correctly classified; and false negatives ( F N ), the proportion of positive cases incorrectly distinguished [24,25,26]. The metrics were accuracy ( A C C ; Equation (3)), sensitivity ( S N ; Equation (4)), specificity ( S P ; Equation (5)), precision ( P R E C ; Equation (6)), false positive rate ( F P R ; Equation (7)), and F score ( F β ; Equation (8)) [23,27].
$$ ACC = \frac{TP + TN}{TP + TN + FN + FP} $$
$$ SN = \frac{TP}{TP + FN} = \frac{TP}{P} $$
$$ SP = \frac{TN}{TN + FP} = \frac{TN}{N} $$
where P is the positive total, and N is the negative total.
$$ PREC = \frac{TP}{TP + FP} $$
$$ FPR = \frac{FP}{TN + FP} = 1 - SP $$
$$ F_{\beta} = \frac{(1 + \beta^2)(PREC \cdot SN)}{\beta^2 \cdot PREC + SN} $$
where the natural values of β are 0.5, 1, and 2; β was set to 1 in this case. For the total of all the tests, the obtained metrics for system performance evaluation (percentages of A C C , S N , S P , P R E C , F P R , and F β ) are shown in Table 7 and Table 8. From the classified data of each test, we defined the confusion matrix ( n × m ), where the rows (n) indicate the damage levels, and the columns (m) are the classes predicted by the model. From this matrix, we can see when one class is confused with another. The diagonal components contain the sum of all the correct predictions, and the off-diagonal components reflect the errors of the misclassified data [24,25]. Cohen’s kappa coefficient (Equation (9)) is a statistical measurement of the interevaluator agreement for qualitative data or categorical variables. Its use in feature selection is suitable for testing the performance of models [28,29].
$$ kappa = \frac{d - q}{n - q} $$
where d is the sum of correctly classified data, and q is the sum of the products of each row and column total in the confusion matrix divided by the total number of samples n. Kappa lies in [0, 1], with the observed concordance graded by degree of agreement: negligible between k = 0 and k = 0.2, discreet between k = 0.21 and k = 0.4, moderate between k = 0.41 and k = 0.6, substantial between k = 0.61 and k = 0.8, and perfect between k = 0.81 and k = 1.
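The Section 2.8 metrics can be computed directly from a confusion matrix. The sketch below uses a hypothetical 4 × 4 matrix (not the paper's results) and implements Equations (3)–(9), with Cohen's kappa written exactly as (d − q)/(n − q).

```python
import numpy as np

def binary_metrics(cm, cls):
    """One-vs.-rest metrics for class index cls from confusion matrix cm.

    Rows of cm are the true damage levels; columns are predicted classes.
    """
    tp = cm[cls, cls]
    fn = cm[cls].sum() - tp
    fp = cm[:, cls].sum() - tp
    tn = cm.sum() - tp - fn - fp
    acc = (tp + tn) / cm.sum()        # accuracy, Equation (3)
    sn = tp / (tp + fn)               # sensitivity, Equation (4)
    sp = tn / (tn + fp)               # specificity, Equation (5)
    prec = tp / (tp + fp)             # precision, Equation (6)
    fpr = 1.0 - sp                    # false positive rate, Equation (7)
    f1 = 2 * prec * sn / (prec + sn)  # F-beta with beta = 1, Equation (8)
    return acc, sn, sp, prec, fpr, f1

def cohen_kappa(cm):
    """Cohen's kappa, Equation (9): (d - q) / (n - q)."""
    n = cm.sum()
    d = np.trace(cm)                  # correctly classified data
    q = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n
    return (d - q) / (n - q)

# Hypothetical 4-class confusion matrix (130 test samples per class)
cm = np.array([[120,   5,   3,   2],
               [  8, 110,   7,   5],
               [  4,   9, 112,   5],
               [  2,   6,   8, 114]], dtype=float)
```

Evaluating binary_metrics for each damage level and cohen_kappa over the full matrix reproduces the kind of per-class and overall scores reported in Tables 7 and 8.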

3. Results

The extracted ROI images were used to obtain texture descriptors in three different color spaces and gray images. We then defined feature vectors with components from different color spaces ( F 1 , … , F 5 ) and feature vectors with components from the same color space ( G 1 , … , G 5 ). All these vectors contained features from the resulting 53 selected features. We fed each feature vector into the trained multiclass SVM classifier, which categorized the input leaf image into four classes: healthy leaves, leaves with spore germination, leaves with the first symptoms, and diseased leaves: classes T 1 , T 2 , T 3 , and T 4 , respectively. A total of 130 samples of each class were used to test the system.

3.1. Different Color Space Feature Vectors

Table 9 shows the confusion matrices of the proposed early disease detection system for the feature vectors containing different color space characteristics. It shows the overall correctly classified and misclassified results of the defined disease levels for each feature vector.
Table 9 shows that the best success rate of the multiclass SVM classifiers with different color space feature vectors was 88.68% ( F 1 ). Still, all the feature vectors F 1 – F 5 could discriminate the four damage levels. For early detection, the optimal feature vector was F 1 . Table 7 lists the performance evaluation results (Section 2.8), showing that the best feature vector was F 1 because this vector resulted in the best accuracy value of 93.1% and a kappa of 0.7874, verifying the results in Table 10.

3.2. Same Color Space Feature Vectors

Table 11 contains the confusion matrices of the proposed early-disease detection system for the feature vectors with the same color space characteristics. The best success rate achieved with the multiclass SVM classifiers with feature vectors using components of the same space color was 89.76% ( G 3 ).
Among the feature vectors containing components of the same color space, the highest accuracies were achieved by G 3 and G 4 , with 94.4% and 91.4%, respectively (Table 8, in Section 2.8), with kappa = 0.7638 and kappa = 0.7835, respectively. These values are consistent with those presented in Table 11.
The final results and the features that are best for this diagnostic system are given in Table 10, where the best feature vector is G 3 . The L * a * b color space thus provides the best color components for identification based on the texture descriptors.
The features are combined according to the behavior of the pixels in the images. Gray-level co-occurrence matrix textural properties such as contr, energy, corrm, corrp, dvarh, dissi, and idmnc, combined with color components such as R, GG, V, L, Y, and G, are accurate descriptors for diseased leaves versus healthy leaves and the intermediate damage levels in general, without requiring any specification of the signs or symptoms. Our results constitute a textural analysis that has the potential to be developed into a valuable evaluation tool that improves the diagnostic assessment of cucurbit plants. The features formed with contr and dvarh in the RGB, L * a * b, YCbCr, and gray color spaces are texture descriptors that describe significant differences among the four detected damage levels of powdery mildew. Contrast (contr) is a good feature for powdery mildew disease. This feature agrees with the texture characteristic, having high contrast values for large texture changes. The variance in statistics is a measurement that describes the spread between gray levels in an image. In our case, the variance difference (dvarh) measures how far the gray levels in the GLCM were from the mean value. Energy (energ) is a characteristic of the RGB color space in our GLCM that describes significant differences among the four damage levels. This measurement represents the local uniformity of the gray levels. It is an excellent descriptor for differentiating between white spots and infectious disease without uniformity in the samples regarding the first signs and symptoms. Also, energy is the angular second moment representing the uniformity in an image. We interpreted this as powdery mildew disease causing localized heterogeneity in a disease-specific area on the leaf, while the spores cause heterogeneous disorder throughout the whole leaf image. Contrast measures the quantity of local changes in an image. It reflects the sensitivity of the textures to changes in intensity.
It returns a measure of the intensity contrast between a pixel and its neighborhood. Therefore, we considered high contrast as relevant for describing the signs of the fungal disease. Contrast is 0 for a constant image; our samples had a lot of variation in color. C. pepo L. leaves have local variation with consistently higher values. If a gray-scale difference occurs continually, the texture becomes coarse, and the contrast becomes large. Correlation is a descriptor that measures how correlated a pixel is to its neighborhood. It was used as a measure of the linear dependencies of gray tone in our image samples. Feature values range from −1 to 1, defining a perfect negative and a perfect positive correlation in the gray levels, respectively. The inverse difference moment normalized (idmnc) represents the differences between neighboring intensity values, normalized by the total number of discrete intensity values. This means that in C. pepo L. leaves, all gray values of each damage level are considered according to the 255 gray intensity values. The dissimilarity (dissi) in our samples showed the variability between the gray levels that describe each damage level. For instance, a leaf with powdery mildew infection in an advanced state would present white spots with the green color disappearing. In the case of the gray color space, all the descriptors (corrp, corrm, homom, contr, dvarh, idmnc, and dissi) coincide with the YCbCr color space in the Y and Cr components.

4. Discussion

The present study provides a reference for the detection of PM damage levels in cucurbits. A feature dataset proved to be the best model for detecting four levels of PM damage on cucurbit leaves under natural crop conditions, and the number of variables was reduced to minimize the calculation time. Using images characterized with texture descriptors, it was possible to obtain a diagnosis from the leaves. The images describe the color changes visible on the leaves when symptoms are present; the proposed texture descriptors are therefore the result of calculations that integrate variables derived from the incidence of gray levels in images of healthy and diseased leaves. A fungal disease modifies pigmentation by affecting the photosynthesis process, so an image can describe this condition. Our main idea was to use a combination of color components and texture descriptors to detect infection through leaf color and texture. This study identified healthy and diseased leaves, defined as T 1 for healthy leaves and T 4 for diseased leaves, and two intermediate damage levels, T 2 for leaves with germinating spores and T 3 for leaves showing the first symptoms. Currently, methodologies for identifying plant disease cases are applied when symptoms are visible on the leaves. However, early detection is the main problem and a main focus of studies on crops under field conditions; thus, an advantage would be gained by identifying the early stage T 2 of the disease. Germinating spores on leaves are not visible and cannot be detected using a single feature in image processing. As such, the implementation of the proposed features in future applications of sample classification for infected plants could allow the disease to be controlled in time. A texture descriptor is a measurement that reveals heterogeneity in an image that is difficult to see with the human eye.
Haralick textures [19] have been used in medical and biological research [18,30,31]. Haralick textures reveal the properties of the spatial distribution of a texture image. Computer diagnosis has been widely applied to characterize, quantify, and detect numerous plant conditions, such as the recognition of different leaves, medicinal plant classification, and the detection of plant diseases and pests, for instance, in winter wheat, maize, citrus, and soybean [4,32]. Texture descriptors such as energy, entropy, contrast, homogeneity, and correlation have often been used in the literature [18,30,31,32,33]. These descriptors are measures that describe some visible features of leaves. The combination of these features changed when the leaf was diseased with PM, which indirectly modified the pixels in an image through the changes in the pigments. In an infected leaf, the photosynthesis process slows, resulting in reductions in the chlorophyll and carotene contents. The image then contains different pixels with white or yellow spots according to the internal leaf conditions. Young healthy leaves are a lighter green, while mature leaves are darker green and show gray spots.
In some studies, image processing and other metrics have been used to identify differences in plant diseases. However, agreement on the damage levels of a fungal disease is still lacking. Researchers have focused on the discrimination of various diseases caused by insects, viruses, and other pathologies. For cucurbit plants, some studies involved monitoring fungal or viral diseases with chemical analysis. Machine learning models have been used to distinguish plant diseases, but according to a literature review, only a limited number of researchers have studied the detection of disease damage levels. To make the best decisions for the control and monitoring of disease and to enable protective measures to be implemented, the damage to plants over time must be measured with an optimal tool. In general, some methodologies can identify diseases, pests, viruses, and bacteria according to pathologies in different plants and crops. Image processing, spectroscopy, machine vision, remote monitoring, and hyperspectral imaging are tools that have been used for the identification of various visible symptoms and problems [34,35,36]. The white powdery mycelium covering the leaf surface modifies the pixels in an image, and a texture descriptor helps to identify the gray levels associated with disease damage. The proposed method calculates texture descriptors from ROI images in different color spaces with various scales and crop databases. As shown in the results, the fungal disease of cucurbit plants affects the spectral signature of leaves in different ways, depending on the internal structure and disease characteristics.
An image displays these changes through the modification of gray levels. This study provides evidence that the analysis of texture features in an image simplifies the detection of a fungus based on color intensity feature data. We based the developed methodology on the most relevant combinations of texture descriptors of a plant disease, together with the health status, proposed damage levels, and stage classification, to identify healthy and diseased cucurbit plants.
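To make the color-space separation step concrete, the sketch below converts an RGB patch to YCbCr; the function name and the BT.601 (JPEG full-range) coefficients are assumptions for illustration, since the paper does not list its exact conversion constants. Each resulting channel can then be quantized and fed to the GLCM computation.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Convert an RGB image (floats in [0, 1]) to YCbCr using BT.601 full-range weights."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  = 0.299 * r + 0.587 * g + 0.114 * b            # luma
    cb = 0.5 - 0.168736 * r - 0.331264 * g + 0.5 * b  # blue-difference chroma
    cr = 0.5 + 0.5 * r - 0.418688 * g - 0.081312 * b  # red-difference chroma
    return np.stack([y, cb, cr], axis=-1)

# A healthy leaf pixel (mostly green) next to a powdery whitish spot (fabricated values):
leaf = np.array([[[0.20, 0.60, 0.15], [0.90, 0.90, 0.85]]])
ycc = rgb_to_ycbcr(leaf)
```

The whitish pixel has near-neutral chroma (Cb and Cr close to 0.5), so most of its information sits in the Y channel, while the green pixel is pushed away from neutral in Cr.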
Pydipati et al. [4] used the color co-occurrence method based on hue, saturation, and color intensity characteristics, with uniformity, mean intensity, variance, correlation, product moment, inverse difference, entropy, and contrast, using stepwise discriminant analysis to differentiate normal and diseased citrus leaves for diseases such as greasy spot, melanosis, and scab. They obtained identification accuracies between 95% and 100%. In this case, the diseases were detected when the symptoms were located on the leaves and were only compared with the characteristics of each pathogen. A backpropagation multilayer perceptron neural network performed classifications with texture descriptors on different medicinal plants, with accuracy values of between 75% and 80%. Ehsanirad et al. [32] developed a GLCM using texture descriptors such as autocorrelation, contrast, correlation, dissimilarity, energy, and entropy. In addition, they performed principal component analysis (PCA) for classification based on the leaf recognition of thirteen types of plants. The overall accuracy was 78.46%, and with PCA, the accuracy was 98.46%. Similarly, Malegori et al. [33] focused on the identification of biofilms and layers of microorganisms coating surfaces of materials such as steel, plastic, and ceramics through image analysis, using defined texture descriptors and PCA to identify contaminated samples. Likewise, for plant identification, Kadir et al. [31] implemented a Bayesian classifier in combination with shape, color, and texture features; on the Flavia and Foliage datasets, accuracies of 95% and 97.19% were achieved, respectively. In other work, texture descriptors were used in applications in the medical area to evaluate the sensitivity of Haralick texture features to the gray levels used in the texture analysis: for glioma and prostate symptoms in patients by Brynolfsson et al. [18] and for the diagnosis of skin diseases such as allergic skin disorders and viral, bacterial, and fungal skin diseases by Arabi et al. [35].
In our work, the analyses focused on identifying different pathologies when leaves are diseased and on differences between disease stages using texture descriptors. Applying the concepts of texture descriptors, we developed an improved methodology that can be used with image datasets in a deeper approach that could provide the early detection of a fungal disease. All the methods above have in common a classification process applied to different problems with crop plants under different environmental conditions. Several methodologies present high classification accuracies, and some of the processes are robust and complex; nevertheless, they achieve the purpose of differentiating diseases. The main idea is to find methods that are adapted to different crops to enable prevention. Identifying plant damage in time is a complicated task, but with a discrimination method that measures damage status, early identification is possible, even when the damage is not visible. In this study, one of the limitations is the optical devices and experimental issues that affect the quality of the sample images. Depending on the environment and the real conditions in an open field, the hue of objects in an image may change with brightness. In real conditions, the sun, natural lighting, texture, distance from the camera, time, and weather can be influential factors that affect the image acquisition of plants. These are all parameters to consider for a sampling strategy that includes external conditions during data acquisition and problem identification. Therefore, the planned follow-up control will help identify leaves undergoing continuous changes during growth stages when a fungal disease or pathogen is present.
Two computational limits are noted: (i) the image preprocessing time required for feature extraction, which depends on the quality of the images and the proposed feature extraction method, and (ii) the time required for the training and validation process used for future classification. To select a binary classifier, it is necessary to identify the behavior of the samples in a cross-validation process. The classification process depends on the number of samples and the portion of training data used in the implemented machine learning method. Different studies have addressed diseases with different features and methods. In our approach, we selected image features as the proposed optimal texture descriptors for cucurbit leaves and, as a result, obtained high-accuracy performance with these features. We therefore consider our method useful for future studies and applications in plants with similar characteristics. With these results, some characterized color components converted into texture descriptors produced sufficient class separability to classify the proposed four PM damage levels and to identify the fungal disease at an early stage. Finally, 53 extracted features could differentiate the damage levels, according to the results of the statistical tests. In this work, we implemented algorithms based on RGB sample images, a contrast algorithm, color transformation, and GLCM calculation to obtain a series of texture descriptors as features. We then used statistical analysis to reduce these features and evaluate the ability of the models to differentiate the damage levels for each feature. A feature dataset emerged as the best model for detecting four levels of PM damage in cucurbit leaves. As a result, we reduced the number of variables to minimize computational time. Nevertheless, we considered images with features under highly variable outdoor lighting conditions. This study tested a methodology for identifying diseased leaves in open-field growing conditions.
Future applications will explore the utility of these results in outdoor conditions, using the proposed method to analyze similar leaves with different damage levels in a dataset.
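As a sketch of the classification stage, the example below trains an RBF-kernel SVM with cross-validation on synthetic stand-ins for the four damage-level classes; scikit-learn's SVC handles multiclass problems with the same one-versus-one scheme used in this work. The feature values here are fabricated for illustration only.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
# Synthetic stand-in for the texture-descriptor features of the four damage levels T1..T4:
# each class is a Gaussian cluster in a 6-dimensional feature space.
X = np.vstack([rng.normal(loc=k, scale=0.4, size=(40, 6)) for k in range(4)])
y = np.repeat([1, 2, 3, 4], 40)

# RBF-kernel SVM; SVC trains one-versus-one binary classifiers for multiclass data.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
scores = cross_val_score(clf, X, y, cv=5)  # 5-fold cross-validated accuracy
print(scores.mean())
```

Standardizing the features before the kernel evaluation matters in practice, since texture descriptors such as contrast and energy live on very different numeric scales.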

5. Conclusions

The control of fungal disease and its ecological impact is expensive, creating the need for a prevention point to reduce and optimize the application of chemical treatments. The visual monitoring of a fungal disease is difficult because a diseased leaf may be mistaken for a healthy leaf due to the absence of visible symptoms during spore germination. First, we obtained a collection of images of cucurbits under open-field conditions. Second, we processed the images via contrast adjustment and color transformation. We then performed a feature extraction process using texture descriptors of the sample images. Next, we employed statistical tools such as the Lilliefors test, one-way ANOVA, and Tukey's test to demonstrate the effectiveness of the method in assessing PM disease severity levels. Fifty-three texture descriptors from color components in the L*a*b, HSV, and YCbCr color spaces were found to be capable of showing potentially significant differences among the four PM damage levels. A sample dataset of the four cucurbit classes at different stages was used. The proposed methodology is suitable for disease detection in a variety of cucurbitaceous plants given the similarity in their growth stages and planting areas; however, this is a subject requiring further deep analysis and study. Technologies such as machine learning, big data, and the Internet of Things (IoT) could help with the sampling and data collection phases, given the wide range of varieties, environmental conditions, and numbers of samples, and taking into account all the parameters involved, such as climate, lighting, and optical devices; otherwise, such investigations could be laborious and limited. For other varieties and crops, the proposed method may not provide optimal detection, but it may contribute a feasible comparison strategy for future implementations under field conditions to detect different diseases.
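The statistical screening described above can be sketched as follows on synthetic data. SciPy's `tukey_hsd` requires SciPy >= 1.8, and Shapiro-Wilk stands in here for the Lilliefors normality test (which lives in statsmodels rather than SciPy); the group means are fabricated for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Synthetic values of one texture descriptor for the four damage levels T1..T4;
# T1 (healthy) is shifted away from the other three groups.
t1 = rng.normal(0.0, 1.0, 50)
t2 = rng.normal(2.5, 1.0, 50)
t3 = rng.normal(2.6, 1.0, 50)
t4 = rng.normal(2.7, 1.0, 50)

# Normality screen for each group (stand-in for the Lilliefors test).
for g in (t1, t2, t3, t4):
    _, p_norm = stats.shapiro(g)

# One-way ANOVA: is at least one group mean different?
f_stat, p_anova = stats.f_oneway(t1, t2, t3, t4)

# Tukey's HSD: which pairs of damage levels actually differ?
hsd = stats.tukey_hsd(t1, t2, t3, t4)
print(p_anova, hsd.pvalue[0, 1])
```

A descriptor is kept as a feature only when the ANOVA rejects equal means and the pairwise Tukey comparisons separate the damage levels of interest.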

Author Contributions

Conceptualization, C.A.R.-R. and E.R.P.-H.; methodology, C.A.R.-R. and E.R.P.-H.; software, C.A.R.-R. and E.R.P.-H.; formal analysis, C.A.R.-R., E.R.P.-H., O.V.-C. and I.A.R.-P.; investigation, C.A.R.-R., E.R.P.-H., O.V.-C. and I.A.R.-P.; visualization, O.V.-C. and I.A.R.-P.; supervision, O.V.-C. and I.A.R.-P.; project administration, C.A.R.-R. and E.R.P.-H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data are unavailable due to privacy and ethical restrictions.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
PM: Powdery mildew
RGB: Red, green, and blue
HSV: Hue, saturation, and value
L*a*b: Lightness, green-red chrominance, and blue-yellow chrominance
YCbCr: Luma component (Y) and chroma components (Cb and Cr)
ANOVA: Analysis of variance
ROI: Region of interest
GLCM: Gray-level co-occurrence matrix
CC: Color component
TD: Texture descriptor

References

  1. Barickman, T.C.; Horgan, T.E.; Wilson, J.C. Efficacy of fungicide applications and powdery mildew resistance in three pumpkin cultivars. Crop Prot. 2017, 101, 90–94. [Google Scholar] [CrossRef]
  2. Gudbrand, O.A. Methods for Detection of Powdery Mildew in Agricultural Plants with Hyperspectral Imaging. Master’s Thesis, Norwegian University of Life Sciences, Ås, Norway, 2017. [Google Scholar]
  3. Burdon, J.J.; Zhan, J. Climate change and disease in plant communities. PLoS Biol. 2020, 18, e3000949. [Google Scholar] [CrossRef] [PubMed]
  4. Pydipati, R.; Burks, T.; Lee, W. Identification of citrus disease using color texture features and discriminant analysis. Comput. Electron. Agric. 2006, 52, 49–59. [Google Scholar] [CrossRef]
  5. Camargo, A.; Smith, J. Image pattern classification for the identification of disease causing agents in plants. Comput. Electron. Agric. 66, 121–125. [CrossRef]
  6. Pawar, P.; Turkar, V.; Patil, P. Cucumber disease detection using artificial neural network. In Proceedings of the International Conference on Inventive Computation Technologies, ICICT, Coimbatore, India, 26–27 August 2016. [Google Scholar]
  7. Costa Lage, D.A.; Marouelli, W.A.; da S. S. Duarte, H.; Café-Filho, A.C. Standard area diagrams for assessment of powdery mildew severity on tomato leaves and leaflets. Crop Prot. 2015, 67, 26–34. [Google Scholar] [CrossRef]
  8. Kumar, S.; Sharma, B.R.; Sharma, V.K.; Sharma, H.; Bansal, J.C. Plant leaf disease identification using exponential spider monkey optimization. Sustain. Comput. Inform. Syst. 2020, 28, 100283. [Google Scholar] [CrossRef]
  9. Lamba, S.; Kukreja, V.; Baliyan, A.; Rani, S.; Ahmed, S.H. A Novel Hybrid Severity Prediction Model for Blast Paddy Disease Using Machine Learning. Sustainability 2023, 15, 1502. [Google Scholar] [CrossRef]
  10. Kaya, Y.; Gürsoy, E. A novel multi-head CNN design to identify plant diseases using the fusion of RGB images. Ecol. Inform. 2023, 75, 101998. [Google Scholar] [CrossRef]
  11. Xu, Q.; Cai, J.; Ma, L.; Tan, B.; Li, Z.; Sun, L. Custom-Developed Reflection–Transmission Integrated Vision System for Rapid Detection of Huanglongbing Based on the Features of Blotchy Mottled Texture and Starch Accumulation in Leaves. Plants 2023, 12, 616. [Google Scholar] [CrossRef]
  12. Sabat-Tomala, A.; Raczko, E.; Zagajewski, B. Comparison of support vector machine and random forest algorithms for invasive and expansive species classification using airborne hyperspectral data. Remote Sens. 2020, 12, 516. [Google Scholar] [CrossRef]
  13. Kasinathan, T.; Singaraju, D.; Uyyala, S.R. Insect classification and detection in field crops using modern machine learning techniques. Inf. Process. Agric. 2020, 8, 446–457. [Google Scholar] [CrossRef]
  14. Yağ, İ.; Altan, A. Artificial Intelligence-Based Robust Hybrid Algorithm Design and Implementation for Real-Time Detection of Plant Diseases in Agricultural Environments. Biology 2022, 11, 1732. [Google Scholar] [CrossRef] [PubMed]
  15. Fernández, C.I.; Leblon, B.; Wang, J.; Haddadi, A.; Wang, K. Cucumber powdery mildew detection using hyperspectral data. Can. J. Plant Sci. 2022, 102, 20–32. [Google Scholar] [CrossRef]
  16. Rivera-Romero, C.A.; Palacios-Hernández, E.R.; Trejo-Durán, M.; Rodríguez-Liñán, M.d.C.; Olivera-Reyna, R.; Morales-Saldaña, J.A. Visible and near-infrared spectroscopy for detection of powdery mildew in Cucurbita pepo L. leaves. J. Appl. Remote Sens. 2020, 14, 044515. [Google Scholar] [CrossRef]
  17. Mattonen, S.A.; Huang, K.; Ward, A.D.; Senan, S.; Palma, D.A. New techniques for assessing response after hypofractionated radiotherapy for lung cancer. J. Thorac. Dis. 2014, 6, 375–386. [Google Scholar]
  18. Brynolfsson, P.; Nilsson, D.; Torheim, T. Haralick texture features from apparent diffusion coefficient (ADC) MRI images depend on imaging and pre-processing parameters. Sci. Rep. 2017, 7, 4041. [Google Scholar] [CrossRef] [PubMed]
  19. Haralick, R.M.; Shanmugam, K.; Dinstein, I. Textural Features for Image Classification. IEEE Trans. Syst. Man Cybern. 1973, 3, 610–621. [Google Scholar]
  20. Johnson, R.A.; Wichern, D.W. Applied Multivariate Statistical Analysis, 6th ed.; Prentice Hall: Upper Saddle River, NJ, USA, 2007. [Google Scholar]
  21. Conover, W.J. Practical Nonparametric Statistics, 3rd ed.; John Wiley & Sons: Hoboken, NJ, USA, 1998. [Google Scholar]
  22. Lilliefors, H.W. On the kolmogorov-smirnov test for normality with mean and variance unknown. J. Am. Stat. Assoc. 1967, 62, 399–402. [Google Scholar] [CrossRef]
  23. Rumpf, T.; Mahlein, A.K.; Steiner, U.; Oerke, E.C.; Dehne, H.W.; Plümer, L. Early detection and classification of plant diseases with Support Vector Machines based on hyperspectral reflectance. Comput. Electron. Agric. 2010, 74, 91–99. [Google Scholar] [CrossRef]
  24. Deng, X.; Liu, Q.; Deng, Y.; Mahadevan, S. An improved method to construct basic probability assignment based on the confusion matrix for classification problem. Inf. Sci. 2016, 340–341, 250–261. [Google Scholar] [CrossRef]
  25. Salla, R.; Wilhelmiina, H.; Sari, K.; Mikaela, M.; Pekka, M.; Jaakko, M. Evaluation of the confusion matrix method in the validation of an automated system for measuring feeding behaviour of cattle. Behav. Process. 2018, 148, 56–62. [Google Scholar]
  26. Ma, J.; Du, K.; Zheng, F.; Zhang, L.; Gong, Z.; Sun, Z. A recognition method for cucumber diseases using leaf symptom images based on deep convolutional neural network. Comput. Electron. Agric. 2018, 154, 18–24. [Google Scholar] [CrossRef]
  27. Griffel, L.; Delparte, D.; Edwards, J. Using support vector machines classification to differentiate spectral signatures of potato plants infected with potato virus Y. Comput. Electron. Agric. 2018, 153, 318–324. [Google Scholar] [CrossRef]
  28. Ohsaki, M.; Wang, P.; Matsuda, K.; Katagiri, S.; Watanabe, H.; Ralescu, A. Confusion-matrix-based kernel logistic regression for imbalanced data classification. IEEE Trans. Knowl. Data Eng. 2017, 29, 1806–1819. [Google Scholar] [CrossRef]
  29. Vieira, S.M.; Kaymak, U.; Sousa, J.M.C. Cohen’s kappa coefficient as a performance measure for feature selection. In Proceedings of the International Conference on Fuzzy Systems, Barcelona, Spain, 18–23 July 2010; pp. 1–8. [Google Scholar]
  30. Pare, S.; Bhandari, A.K.; Kumar, A.; Singh, G.K. An optimal color image multilevel thresholding technique using grey-level co-occurrence matrix. Expert Syst. Appl. 2017, 87, 335–362. [Google Scholar] [CrossRef]
  31. Kadir, A. A model of plant identification system using GLCM, lacunarity and shen features. Res. J. Pharm. Biol. Chem. Sci. 2014, 5, 1–10. [Google Scholar]
  32. Ehsanirad, A.; Sharath Kumar, Y.H. Leaf recognition for plant classification using GLCM and PCA methods. Orient. J. Comput. Sci. Technol. 2010, 3, 31–36. [Google Scholar]
  33. Malegori, C.; Franzetti, L.; Guidetti, R.; Casiraghi, E.; Rossi, R. GLCM, an image analysis technique for early detection of biofilm. J. Food Eng. 2016, 185, 48–55. [Google Scholar] [CrossRef]
  34. Mukherjee, G.; Chatterjee, A.; Tudu, B. Study on the potential of combined GLCM features towards medicinal plant classification. In Proceedings of the 2016 2nd International Conference on Control, Instrumentation, Energy Communication (CIEC), Kolkata, India, 28–30 January 2016; pp. 98–102. [Google Scholar]
  35. Arabi, P.M.; Joshi, G.; Deepa, N.V. Performance evaluation of GLCM and pixel intensity matrix for skin texture analysis. Perspect. Sci. 2016, 8, 203–206. [Google Scholar] [CrossRef]
  36. Barbedo, J.G.A. Using digital image processing for counting whiteflies on soybean leaves. J. Asia-Pac. Entomol. 2014, 17, 685–694. [Google Scholar] [CrossRef]
Figure 1. Proposed methodology for PM damage level detection, where image collection is used for feature extraction and selection. A multiclassification is operated with the results of the classification process. In the end, a performance evaluation is conducted to verify the optimal classification.
Figure 2. A timeline of the sampling days and the phenological growth stages to identify PM damage levels. The phenological stages ( S 1 to S 8 ) and the sampling days ( D 1 to D 19 ) are considered basic information. Then, four PM damage levels are defined: T 1 for healthy leaves, T 2 for leaves with spores in germination, T 3 for leaves with the first symptoms, and T 4 for diseased leaves.
Figure 3. Visual evaluation of cucurbit leaves where four PM damage levels were defined: (a) T 1 : healthy leaves, (b) T 2 : leaves with spores in germination, (c) T 3 : leaves with the first symptoms, and (d) T 4 : diseased leaves.
Figure 4. Exploration by parts of the leaf for the selection of the region of interest (ROI): (a) division of the leaf, central part ( R 1 ), lower right lobe ( R 2 ), upper right lobe ( R 3 ), upper central lobe ( R 4 ), upper left lobe ( R 5 ) and lower left lobe ( R 6 ), (b) first symptoms at R 4 .
Figure 5. Preprocessing of the ROI images, starting with the color transformation and separation of color components (CCs): the sample image ( I ( x , y ) ) is the original image; the ROI analysis results in a new sample in RGB ( R ( s , t ) ); then a contrast adjustment ( C ( p , q ) ) is performed to obtain the transformation of the image ( T ( s , t ) ) in the different color spaces ( G ( i , j ) , L ( i , j ) , H ( i , j ) , Y ( i , j ) ) and the separation into color components.
Figure 6. Calculation of the GLCM matrix in a gray image. The distance is d = 1 , and the angle is θ = 0 : (a) gray image, (b) gray levels I ( x , y ) , and (c) GLCM matrix with the paired pixels g ( i , j ) .
Figure 7. Processed image ( I ( x , y ) and U ( s , t ) ); color transformation ( H ( s , t ) ); components H 1 ( s , t ) , H 2 ( s , t ) , and H 3 ( s , t ) ; and their GLCM matrices G 1 ( i , j ) , G 2 ( i , j ) , and G 3 ( i , j ) with 255 gray levels.
Figure 8. Process of feature extraction through the color component images.
Figure 9. The feature selection process consists of a Lilliefors test, then an analysis of variance, and Tukey's test.
Figure 10. Results of the ANOVA and Tukey's test: (a) mean values of the damage levels of diss-BB; (b) Tukey's test, where the means of the damage levels are significantly different; (c) mean values of the damage levels of auto-A; (d) Tukey's test, where the means of T 2 , T 3 , and T 4 are equal but significantly different from T 1 .
Figure 11. Kernel selection in the multiclassification system with the feature vectors in different color spaces with the optimal hyperplane: (a) linear kernel in 2D with diss B B versus cont V , (b) 3D optimal hyperplane, (c) training and validation data with the error in SVM T 1 versus T 2 , (d) polynomial kernel in 2D with auto V versus savg G , (e) 3D optimal hyperplane, (f) training and validation data with the error in SVM T 3 versus T 4 , (g) sigmoidal kernel in 2D with ener G G versus dvar A , (h) 3D optimal hyperplane, (i) training and validation data with the error in SVM T 2 versus T 3 , (j) radial basis function kernel in 2D with diss Y versus inf 1 B B , (k) 3D optimal hyperplane, and (l) training and validation data with the error in SVM T 2 versus T 4 .
Figure 12. Kernel selection in the multiclassification system with the feature vectors in different color spaces with the optimal hyperplane: (a) radial basis function kernel in 2D with auto V versus dent S , (b) 3D optimal hyperplane, (c) training and validation data with the error in the SVM T 3 versus T 4 , (d) linear kernel in 2D with idmn G versus diss G , (e) 3D optimal hyperplane, (f) training and validation data with the error in the SVM T 1 versus T 2 , (g) polynomial kernel in 2D with dvar C R versus homo Y , (h) 3D optimal hyperplane, (i) training and validation data with the error in the SVM T 2 versus T 4 , (j) radial basis function kernel in 2D with ener V versus entr S , (k) 3D optimal hyperplane, and (l) training and validation data with the error in the SVM T 1 versus T 2 .
Figure 13. One-versus-one multiclassification method. The main inputs are the support vectors ( s 1 , …, s 6 ), the validation data for each binary classifier M 1 , …, M 6 , and σ . Each block V 1 , …, V 4 contains the different support vector machines for multiple classification.
Figure 14. SVM binary classifiers: (a) test data F 1 and SVM-classified data, (b) test data F 2 and SVM-classified data, (c) test data F 3 and SVM-classified data, (d) test data F 4 and SVM-classified data, and (e) test data F 5 and SVM-classified data.
Figure 15. SVM binary classifiers with components of the same color space: (a) test data G 1 and SVM-classified data, (b) test data G 2 and SVM-classified data, (c) test data G 3 and SVM-classified data, (d) test data G 4 and SVM-classified data, and (e) test data G 5 and SVM-classified data.
Table 1. Texture descriptor (TD) equations [4,17,18,19].

| Texture Descriptor | TD | Equation |
|---|---|---|
| Autocorrelation | auto | $\sum_{i,j} (i \cdot j)\, p(i,j)$ |
| Contrast | cont | $\sum_{i,j} (i-j)^2\, p(i,j)$ |
| Correlation ¹ | corr | $\frac{\sum_{i,j} (i \cdot j)\, p(i,j) - \mu_x \mu_y}{\sigma_x \sigma_y}$ |
| Cluster Prominence ¹ | cpro | $\sum_{i,j} (i+j-\mu_x-\mu_y)^4\, p(i,j)$ |
| Cluster Shade ¹ | csha | $\sum_{i,j} (i+j-\mu_x-\mu_y)^3\, p(i,j)$ |
| Dissimilarity | diss | $\sum_{i,j} \lvert i-j\rvert\, p(i,j)$ |
| Energy | ener | $\sum_{i,j} p(i,j)^2$ |
| Entropy | entr | $-\sum_{i,j} p(i,j)\, \log_2 p(i,j)$ |
| Homogeneity ¹ | homo | $\sum_{i,j} \frac{p(i,j)}{1+(i-j)^2}$ |
| Maximum Probability ¹ | maxp | $\max_{i,j}\, p(i,j)$ |
| Sum of Squares | sosv | $\sum_{i,j} (i-\mu)^2\, p(i,j)$ |
| Sum Average | savg | $\sum_{i} i\, p_{x+y}(i)$ |
| Sum Variance | svar | $\sum_{i} (i-\mathrm{sent})^2\, p_{x+y}(i)$ |
| Sum Entropy | sent | $-\sum_{i} p_{x+y}(i)\, \log p_{x+y}(i)$ |
| Difference Variance ¹ | dvar | $\sum_{k} (k-\mu_{x-y})^2\, p_{x-y}(k)$ |
| Difference Entropy | dent | $-\sum_{i} p_{x-y}(i)\, \log_2 p_{x-y}(i)$ |
| Information Measure of Correlation 1 ² ³ | inf1 | $\frac{HXY - HXY1}{\max(HX, HY)}$ |
| Information Measure of Correlation 2 ² ³ | inf2 | $\sqrt{1-\exp[-2(HXY2-HXY)]}$ |
| Inverse Difference Normalized | indn | $\sum_{i,j} \frac{p(i,j)}{1+\lvert i-j\rvert /N}$ |
| Inverse Difference Moment Normalized | idmn | $\sum_{i,j} \frac{p(i,j)}{1+(i-j)^2/N^2}$ |

¹ $\mu_x$, $\mu_y$ and $\sigma_x$, $\sigma_y$ are the mean and standard deviation of $p_x$ and $p_y$, respectively; $N$ is the number of gray levels. ² $HXY$ = entr, where $HX$ and $HY$ are the entropies of $p_x$ and $p_y$, respectively. ³ $HXY1 = -\sum_{i,j} p(i,j)\log\{p_x(i)p_y(j)\}$ and $HXY2 = -\sum_{i,j} p_x(i)p_y(j)\log\{p_x(i)p_y(j)\}$.
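The descriptors above are all functionals of the normalized gray-level co-occurrence matrix p(i, j). A minimal NumPy sketch, assuming a single horizontal offset (0, 1) and a tiny synthetic image (real pipelines typically aggregate several offsets and directions):

```python
# Sketch of the GLCM and a subset of the texture descriptors in Table 1.
# Single offset (0, 1); the 4x4 image is synthetic, for illustration only.
import numpy as np

def glcm(img, levels):
    """Normalized gray-level co-occurrence matrix p(i, j) for offset (0, 1)."""
    P = np.zeros((levels, levels))
    for i, j in zip(img[:, :-1].ravel(), img[:, 1:].ravel()):
        P[i, j] += 1                       # count horizontal neighbor pairs
    return P / P.sum()                     # normalize to a joint probability

def descriptors(p):
    """A subset of the 22 texture descriptors computed from p(i, j)."""
    i, j = np.indices(p.shape)
    nz = p > 0                             # avoid log(0) in the entropy term
    return {
        "auto": float(np.sum(i * j * p)),               # autocorrelation
        "cont": float(np.sum((i - j) ** 2 * p)),        # contrast
        "diss": float(np.sum(np.abs(i - j) * p)),       # dissimilarity
        "ener": float(np.sum(p ** 2)),                  # energy
        "entr": float(-np.sum(p[nz] * np.log2(p[nz]))), # entropy
        "homo": float(np.sum(p / (1 + (i - j) ** 2))),  # homogeneity
        "maxp": float(p.max()),                         # maximum probability
    }

img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 2, 2, 2],
                [2, 2, 3, 3]])
p = glcm(img, levels=4)
feats = descriptors(p)
```

In practice, libraries such as scikit-image (`graycomatrix` / `graycoprops`) provide the same quantities with multi-angle support.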
Table 2. An example of the results of three features submitted to the Lilliefors test for each color component (G—gray; R—red; GG—green; BB—blue; H—hue; S—saturation; V—value; L—luminance; A—a*, red–green coordinate; B—b*, yellow–blue coordinate; Y—luma; CB—Cb, blue chrominance difference; CR—Cr, red chrominance difference). Each feature is shown at its four damage levels. If an h-value of "0" appears at any level, the feature is discarded for not satisfying the normality condition.

| TD | Level | G | R | GG | BB | H | S | V | L | A | B | Y | CB | CR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| diss | T1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | T2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 |
| | T3 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 |
| | T4 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 1 |
| homo | T1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | T2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | T3 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | T4 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| idmn | T1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | T2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | T3 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | T4 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
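The screening in Table 2 keeps a feature only if every damage level passes a normality test. The paper uses the Lilliefors test; as a rough stand-in, this sketch fits a normal to each sample and applies SciPy's Kolmogorov–Smirnov test (Lilliefors additionally corrects the critical values for the estimated parameters, which plain `kstest` does not). The two synthetic samples are illustrative.

```python
# Approximate normality screen (h = 1 keep, h = 0 discard), as in Table 2.
# KS against a fitted normal is used as a stand-in for the Lilliefors test.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def passes_normality(sample, alpha=0.05):
    """Return 1 if the sample is compatible with normality, else 0."""
    mu, sd = sample.mean(), sample.std(ddof=1)
    _, p = stats.kstest(sample, "norm", args=(mu, sd))
    return int(p > alpha)

normal_feature = rng.normal(0.5, 0.1, 200)   # plausible descriptor values
skewed_feature = rng.exponential(1.0, 200)   # clearly non-normal values

# A feature survives only if all four damage levels return h = 1.
print(passes_normality(normal_feature), passes_normality(skewed_feature))
```

For the exact test, `statsmodels.stats.diagnostic.lilliefors` implements the corrected critical values.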
Table 3. Examples of the results of Tukey's test by feature, listed in order of their ability to separate the four damage levels of PM.

| Feature | F Statistic | T1 | T2 | T3 | T4 |
|---|---|---|---|---|---|
| ener B | 184.7 | a | b | c | d |
| corr G | 174.7 | a | b | c | d |
| homo V | 171.2 | a | b | c | d |
| corr G | 158.6 | a | b | c | d |
| ener GG | 143.2 | a | b | c | d |
| ener V | 142.6 | a | b | c | d |
| dent A | 134.5 | a | b | c | d |
| sosv V | 71.4 | a | b | c | d |
| dvar A | 125.5 | a | b | c | d |
| idmn A | 124.4 | a | b | c | d |
| cpro GG | 122.4 | a | b | c | d |
| homo G | 119.4 | a | b | c | d |
| entr S | 112.7 | a | b | c | d |
| homo Y | 111.3 | a | b | c | d |
| cont L | 109.7 | a | b | c | d |
| dvar L | 109.7 | a | b | c | d |
| dvar GG | 105.9 | a | b | c | d |
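The F statistics in Table 3 come from a one-way ANOVA over the four damage levels; a larger F indicates better separation, and Tukey's HSD then confirms that all pairwise groups (a, b, c, d) are distinct. A sketch with synthetic per-level samples (means and spreads are illustrative, not the paper's data):

```python
# One-way ANOVA F statistic for one texture descriptor across levels T1..T4,
# as used to rank the features in Table 3. Group parameters are illustrative.
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(2)
# Simulated values of one descriptor (e.g. something like ener B) per level
T1 = rng.normal(0.80, 0.05, 50)
T2 = rng.normal(0.65, 0.05, 50)
T3 = rng.normal(0.50, 0.05, 50)
T4 = rng.normal(0.35, 0.05, 50)

F, p = f_oneway(T1, T2, T3, T4)  # large F => well-separated damage levels
```

A pairwise follow-up (e.g. `scipy.stats.tukey_hsd` or statsmodels' `pairwise_tukeyhsd`) assigns the distinct group letters a–d when every pair of levels differs significantly.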
Table 4. Formation of the feature vectors from combinations of six TD features belonging to different color spaces (F1, …, F5) and from features belonging to components of the same color space (G1, …, G5).

| Combination | Vector | Features |
|---|---|---|
| Different color space | F1 | auto V, dent S, svar V, savg L, sosv V, savg G |
| | F2 | entr R, homo R, idmn R, idmn CR, dvar R, cont R |
| | F3 | dvar CR, cont CR, idmn Y, idmn G, idmn GG, cont Y |
| | F4 | cont L, homo Y, entr S, homo G, cpro GG, idmn A |
| | F5 | cont A, dvar A, dent A, ener V, ener GG, corr L |
| Same color space | G1 | sent R, entr R, idmn GG, cont GG, diss BB, inf1 BB |
| | G2 | dent S, entr S, auto V, svar V, sosv V, ener V |
| | G3 | diss L, savg L, idmn A, cont A, dvar A, ener B |
| | G4 | diss Y, homo Y, corr Y, idmn CR, dvar CR, cont CR |
| | G5 | diss G, savg G, idmn G, cont G, dvar G, homo G |
Table 5. Support vector machines M1, …, M6 trained on the feature vectors from different color spaces (F1, …, F5) with linear, polynomial, sigmoidal, and radial basis function kernels for the selection of the SVM.

| Kernel | SVM | p / ω / σ | R² | h_est | Γ | ‖w‖² | % Error |
|---|---|---|---|---|---|---|---|
| Linear | M1 | – | 433.36 | 1.0 × 10¹⁵ | 0.0 + 90.74i | 2.4 × 10¹² | 17.4 |
| Linear | M3 | – | 448.28 | 1.1 × 10¹⁵ | 0.0 + 9.53i | 2.6 × 10¹² | 18.6 |
| Polynomial | M4 | 4 | 3917 | 1.2 × 10¹⁶ | 0.0 + 3.0i | 3105.6 | 30 |
| Polynomial | M6 | 4 | 3989 | 7.1 × 10¹⁶ | 0.0 + 8.7i | 1801.79 | 20.2 |
| Sigmoidal | M1 | 1 | 2192.8 | 1096.5 | 0.0 + 1.67i | 500 | 17.6 |
| Sigmoidal | M4 | 7 | 1614.5 | 8072.3 | 0.0 + 9.1i | 500 | 43.4 |
| RBF | M1 | 0.5 | 0.9978 | 460.17 | 4.1048 | 461.18 | 0 |
| RBF | M2 | 0.5 | 0.9977 | 464.95 | 4.1237 | 465.98 | 0 |
| RBF | M4 | 0.5 | 0.9979 | 493.85 | 4.2359 | 494.87 | 0 |
| RBF | M5 | 0.5 | 0.9979 | 487.92 | 4.2132 | 488.94 | 0 |
| RBF | M3 | 1 | 0.9972 | 413.45 | 3.9135 | 414.60 | 0 |
| RBF | M6 | 1 | 0.9974 | 456.07 | 4.0885 | 457.22 | 0 |
Table 6. Support vector machines M1, …, M6 trained on the feature vectors with components of the same color space (G1, …, G5) with linear, polynomial, sigmoidal, and radial basis function kernels for the selection of the SVM.

| Kernel | SVM | p / ω / σ | R² | h_est | Γ | ‖w‖² | % Error |
|---|---|---|---|---|---|---|---|
| Linear | M1 | – | 478.89 | 1.1 × 10¹⁵ | 0.0 + 9.74i | 2.4 × 10¹² | 16.8 |
| Linear | M3 | – | 455.57 | 1.0 × 10¹⁵ | 0.0 + 8.59i | 2.2 × 10¹² | 15 |
| Polynomial | M1 | 6 | 9.95 × 10¹⁸ | 4.97 × 10²¹ | 0.0 + 26i | 500 | 16 |
| Polynomial | M6 | 6 | 9.28 × 10¹⁸ | 1.24 × 10¹⁵ | 0.0 + 98.2i | 0.0001 | 0 |
| Sigmoidal | M1 | 3 | 2137.45 | 1065.1 | 0.0 + 11.8i | 500 | 17.2 |
| Sigmoidal | M2 | 3 | 2104.09 | 1052.5 | 0.0 + 11.1i | 500 | 17.8 |
| RBF | M2 | 1 | 0.9957 | 458.48 | 4.098 | 460.42 | 0 |
| RBF | M3 | 0.5 | 0.9979 | 469.96 | 4.1434 | 470.95 | 0 |
| RBF | M4 | 0.5 | 0.9978 | 491.79 | 4.2280 | 492.79 | 0 |
| RBF | M5 | 0.5 | 0.9979 | 488.64 | 4.2160 | 489.64 | 0 |
| RBF | M1 | 2 | 0.9799 | 34,676.99 | 25.8 | 35,385.5 | 0 |
| RBF | M6 | 1 | 0.9962 | 764.31 | 5.1414 | 767.17 | 0 |
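The kernel comparison in Tables 5 and 6 (where the RBF kernel reaches 0% training error while the linear kernel cannot) can be reproduced in spirit with scikit-learn on a deliberately non-linearly-separable toy problem. The concentric-ring data, degree p = 4, and σ = 0.5 below are illustrative assumptions, not the paper's features.

```python
# Training-error comparison of the four candidate kernels, as in Tables 5-6.
# Two concentric rings are linearly inseparable, so the RBF kernel wins.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(3)
r = np.concatenate([rng.normal(1, 0.1, 100), rng.normal(3, 0.1, 100)])
t = rng.uniform(0, 2 * np.pi, 200)
X = np.c_[r * np.cos(t), r * np.sin(t)]
y = np.repeat([0, 1], 100)                 # inner ring vs. outer ring

errors = {}
for name, clf in {
    "linear": SVC(kernel="linear"),
    "poly": SVC(kernel="poly", degree=4),                # p = 4 as in Table 5
    "sigmoid": SVC(kernel="sigmoid"),
    "rbf": SVC(kernel="rbf", gamma=1 / (2 * 0.5**2)),    # sigma = 0.5
}.items():
    errors[name] = 1 - clf.fit(X, y).score(X, y)         # training error rate

print(errors)
```

As in the tables, the RBF machine should reach (near-)zero training error while the linear kernel retains a substantial error on data that is not linearly separable.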
Table 7. Computed parameters for the performance evaluation of the classified data F1, F2, F3, F4, and F5.

| Vector | ACC (%) | SN | SP | PREC | FPR | F_β (%) |
|---|---|---|---|---|---|---|
| F1 | 93.1 | 0.832 | 0.965 | 0.887 | 0.035 | 85.8 |
| F2 | 88.4 | 0.700 | 0.945 | 0.811 | 0.055 | 75.1 |
| F3 | 88.9 | 0.682 | 0.958 | 0.844 | 0.042 | 75.5 |
| F4 | 90.0 | 0.728 | 0.957 | 0.850 | 0.043 | 78.4 |
| F5 | 91.2 | 0.754 | 0.964 | 0.875 | 0.036 | 81.0 |
Table 8. Computed parameters for the performance evaluation of the classified data G1, G2, G3, G4, and G5.

| Vector | ACC (%) | SN | SP | PREC | FPR | F_β |
|---|---|---|---|---|---|---|
| G1 | 87.3 | 0.625 | 0.956 | 0.824 | 0.044 | 0.711 |
| G2 | 90.8 | 0.776 | 0.952 | 0.843 | 0.048 | 0.808 |
| G3 | 94.4 | 0.877 | 0.967 | 0.898 | 0.033 | 0.887 |
| G4 | 91.4 | 0.752 | 0.968 | 0.887 | 0.032 | 0.814 |
| G5 | 87.3 | 0.678 | 0.938 | 0.786 | 0.062 | 0.728 |
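The metrics in Tables 7 and 8 (accuracy, sensitivity, specificity, precision, false positive rate, F_β) all derive from the per-class counts of a 4×4 confusion matrix. The sketch below computes macro-averaged versions from the SVM-F1 matrix of Table 9, assuming rows are the classified level and columns the true (test) level; the paper's exact averaging convention may differ, so values need not reproduce the tables exactly.

```python
# Macro-averaged evaluation metrics from a multiclass confusion matrix.
# C is the SVM-F1 matrix of Table 9 (rows: classified, columns: test data).
import numpy as np

C = np.array([[71, 6, 0, 2],
              [2, 9, 0, 0],
              [0, 0, 12, 0],
              [2, 0, 0, 2]])

def macro_metrics(C, beta=1.0):
    tp = np.diag(C).astype(float)
    fp = C.sum(axis=1) - tp              # classified as k but actually not k
    fn = C.sum(axis=0) - tp              # true k classified as something else
    tn = C.sum() - tp - fp - fn
    sn = tp / (tp + fn)                  # sensitivity (recall)
    sp = tn / (tn + fp)                  # specificity
    prec = tp / (tp + fp)                # precision
    fpr = fp / (fp + tn)                 # false positive rate
    fb = (1 + beta**2) * prec * sn / (beta**2 * prec + sn)  # F_beta
    acc = tp.sum() / C.sum()             # overall accuracy
    return acc, sn.mean(), sp.mean(), prec.mean(), fpr.mean(), np.nanmean(fb)

acc, sn, sp, prec, fpr, fb = macro_metrics(C)
```

For F1, the overall accuracy 94/106 ≈ 0.887 matches the "% Correct" total of 88.68 in Table 9.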
Table 9. Confusion matrices of the classified test data F1–F5 (rows: SVM-classified level; columns: test data level).

(a) SVM-F1

| SVM-F1 | T1 | T2 | T3 | T4 | Classified | % Correct |
|---|---|---|---|---|---|---|
| T1 | 71 | 6 | 0 | 2 | 79 | 89.87 |
| T2 | 2 | 9 | 0 | 0 | 11 | 81.82 |
| T3 | 0 | 0 | 12 | 0 | 12 | 100.00 |
| T4 | 2 | 0 | 0 | 2 | 4 | 50.00 |
| Test data | 75 | 15 | 12 | 4 | 106 | |
| % Correct | 94.67 | 60.00 | 100.00 | 50.00 | | 88.68 |

(b) SVM-F2

| SVM-F2 | T1 | T2 | T3 | T4 | Classified | % Correct |
|---|---|---|---|---|---|---|
| T1 | 46 | 5 | 2 | 2 | 55 | 83.64 |
| T2 | 4 | 7 | 0 | 0 | 11 | 63.64 |
| T3 | 3 | 0 | 7 | 0 | 10 | 70.00 |
| T4 | 2 | 0 | 0 | 17 | 19 | 89.47 |
| Test data | 55 | 12 | 9 | 19 | 95 | |
| % Correct | 83.64 | 58.33 | 77.78 | 89.47 | | 81.05 |

(c) SVM-F3

| SVM-F3 | T1 | T2 | T3 | T4 | Classified | % Correct |
|---|---|---|---|---|---|---|
| T1 | 92 | 4 | 1 | 1 | 98 | 93.88 |
| T2 | 4 | 5 | 3 | 0 | 12 | 41.67 |
| T3 | 2 | 1 | 0 | 0 | 3 | 0.00 |
| T4 | 3 | 0 | 0 | 6 | 9 | 66.67 |
| Test data | 101 | 10 | 4 | 7 | 122 | |
| % Correct | 91.09 | 50.00 | 0.00 | 85.71 | | 84.43 |

(d) SVM-F4

| SVM-F4 | T1 | T2 | T3 | T4 | Classified | % Correct |
|---|---|---|---|---|---|---|
| T1 | 74 | 4 | 2 | 1 | 81 | 91.36 |
| T2 | 3 | 3 | 1 | 0 | 7 | 42.86 |
| T3 | 1 | 1 | 6 | 0 | 8 | 75.00 |
| T4 | 1 | 0 | 2 | 8 | 11 | 72.73 |
| Test data | 79 | 8 | 11 | 9 | 107 | |
| % Correct | 93.67 | 37.50 | 54.55 | 88.89 | | 85.05 |

(e) SVM-F5

| SVM-F5 | T1 | T2 | T3 | T4 | Classified | % Correct |
|---|---|---|---|---|---|---|
| T1 | 86 | 3 | 2 | 0 | 91 | 94.51 |
| T2 | 0 | 0 | 0 | 0 | 0 | 0.00 |
| T3 | 6 | 0 | 0 | 0 | 6 | 0.00 |
| T4 | 3 | 0 | 0 | 12 | 15 | 80.00 |
| Test data | 95 | 3 | 2 | 12 | 112 | |
| % Correct | 90.53 | 0.00 | 0.00 | 100.00 | | 87.50 |

Classification time: 5–6 ms.
Table 10. Final performance evaluation.

| Vector | Features | ACC | Kappa | % Correct |
|---|---|---|---|---|
| F1 | auto V, dent S, svar V, savg L, sosv V, savg G | 0.931 | 0.7874 | 88.68 |
| F5 | cont A, dvar A, dent A, ener V, ener GG, corr L | 0.912 | 0.7841 | 87.50 |
| G3 | diss L, savg L, idmn A, cont A, dvar A, ener B | 0.944 | 0.7638 | 89.76 |
| G4 | diss Y, homo Y, corr Y, idmn CR, dvar CR, cont CR | 0.914 | 0.7835 | 88.68 |
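Cohen's kappa in Table 10 measures agreement between the classified and true levels beyond chance, and is computed directly from a confusion matrix. A sketch on the SVM-F1 matrix of Table 9 (the single-matrix value may differ somewhat from the published kappa, which can aggregate over several evaluation runs):

```python
# Cohen's kappa from a multiclass confusion matrix
# (rows: classified level, columns: test data level).
import numpy as np

def cohen_kappa(C):
    n = C.sum()
    po = np.trace(C) / n                                    # observed agreement
    pe = (C.sum(axis=0) * C.sum(axis=1)).sum() / n**2       # chance agreement
    return (po - pe) / (1 - pe)

C = np.array([[71, 6, 0, 2],                               # SVM-F1, Table 9
              [2, 9, 0, 0],
              [0, 0, 12, 0],
              [2, 0, 0, 2]])
print(round(cohen_kappa(C), 4))
```

A kappa of 1 indicates perfect agreement, 0 chance-level agreement; values above roughly 0.6, as reported for all four final vectors, indicate substantial agreement.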
Table 11. Confusion matrices of the classified test data G1–G5 for components of the same color space (rows: SVM-classified level; columns: test data level).

(a) SVM-G1

| SVM-G1 | T1 | T2 | T3 | T4 | Classified | % Correct |
|---|---|---|---|---|---|---|
| T1 | 56 | 3 | 2 | 1 | 62 | 90.32 |
| T2 | 0 | 2 | 0 | 0 | 2 | 100.00 |
| T3 | 6 | 0 | 6 | 0 | 12 | 50.00 |
| T4 | 1 | 0 | 3 | 11 | 15 | 73.33 |
| Test data | 63 | 5 | 11 | 12 | 91 | |
| % Correct | 88.89 | 40.00 | 54.55 | 91.67 | | 82.42 |

(b) SVM-G2

| SVM-G2 | T1 | T2 | T3 | T4 | Classified | % Correct |
|---|---|---|---|---|---|---|
| T1 | 82 | 5 | 4 | 2 | 93 | 88.17 |
| T2 | 2 | 9 | 1 | 0 | 12 | 75.00 |
| T3 | 2 | 0 | 1 | 0 | 3 | 33.33 |
| T4 | 2 | 0 | 0 | 5 | 7 | 71.43 |
| Test data | 88 | 14 | 6 | 7 | 115 | |
| % Correct | 93.18 | 64.29 | 16.67 | 71.43 | | 84.35 |

(c) SVM-G3

| SVM-G3 | T1 | T2 | T3 | T4 | Classified | % Correct |
|---|---|---|---|---|---|---|
| T1 | 101 | 0 | 2 | 0 | 103 | 98.06 |
| T2 | 4 | 4 | 4 | 0 | 12 | 33.33 |
| T3 | 0 | 0 | 0 | 0 | 0 | 0.00 |
| T4 | 3 | 0 | 0 | 9 | 12 | 75.00 |
| Test data | 108 | 4 | 6 | 9 | 127 | |
| % Correct | 93.52 | 100.00 | 0.00 | 100.00 | | 89.76 |

(d) SVM-G4

| SVM-G4 | T1 | T2 | T3 | T4 | Classified | % Correct |
|---|---|---|---|---|---|---|
| T1 | 90 | 0 | 0 | 0 | 90 | 100.00 |
| T2 | 6 | 4 | 0 | 0 | 10 | 40.00 |
| T3 | 4 | 0 | 0 | 0 | 4 | 0.00 |
| T4 | 2 | 0 | 0 | 0 | 2 | 0.00 |
| Test data | 102 | 4 | 0 | 0 | 106 | |
| % Correct | 88.24 | 100.00 | 0.00 | 0.00 | | 88.68 |

(e) SVM-G5

| SVM-G5 | T1 | T2 | T3 | T4 | Classified | % Correct |
|---|---|---|---|---|---|---|
| T1 | 67 | 3 | 0 | 2 | 72 | 93.06 |
| T2 | 9 | 7 | 3 | 0 | 19 | 36.84 |
| T3 | 3 | 2 | 5 | 0 | 10 | 50.00 |
| T4 | 3 | 2 | 0 | 20 | 25 | 80.00 |
| Test data | 82 | 14 | 8 | 22 | 126 | |
| % Correct | 81.71 | 50.00 | 62.50 | 90.91 | | 78.57 |

Classification time: 5–6 ms.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Rivera-Romero, C.A.; Palacios-Hernández, E.R.; Vite-Chávez, O.; Reyes-Portillo, I.A. Early-Stage Identification of Powdery Mildew Levels for Cucurbit Plants in Open-Field Conditions Based on Texture Descriptors. Inventions 2024, 9, 8. https://doi.org/10.3390/inventions9010008
