1. Introduction
Peppers are fruits that belong to the nightshade family. They are excellent low-calorie sources of vitamins A and C, antioxidants, potassium, folic acid, and fibre, and their consumption is associated with a reduced risk of certain chronic illnesses, such as cancer and heart disease. A popular ingredient in many dishes, peppers can also be enjoyed raw, making them a highly versatile component of a well-balanced diet. In addition, they contain various bioactive compounds that are of interest for pharmaceutical and cosmetic applications [1,2].
Peppers are warm-season crops, harvested under glass for an average of two months, that require delicately balanced growing conditions. Hence, their cultivation can be viewed from two interrelated perspectives: maintaining high productivity and nutritional value while minimising pesticide use. Although dangerous to human health, pesticides are still indispensable in the fight against viruses, bacteria, fungi, and other harmful organisms known to cause various diseases in pepper plants, affecting multiple parts of the plant [3]. Some of those diseases particularly target the leaves, leading to distinct and noticeable visual signs and symptoms [4]. Common diseases and pests affecting the leaves of greenhouse peppers include mildew, blight, mites, worms, and aphids. Among these, aphids and worms are particularly detrimental.
An early diagnosis is essential for timely treatment to prevent the spread of the disease. Another vital advantage of early diagnosis is the prevention of unnecessary over-spraying. Conventional methods such as visual observation, laboratory kits, and biochemical tests have traditionally been used in agriculture to detect greenhouse pepper diseases. However, diagnosis using these methods is, for various reasons, time-consuming and costly. At this stage, AI, which enables machines to mimic human-like behaviour, appears to be a potential problem-solver for this and similar multi-parametric tasks [5].
In recent years, various DL models have become very popular due to their high effectiveness and practicality. With the development of science and technology, they have outperformed many traditional models in various fields of industry and medicine [6,7,8,9]. In addition, DL algorithms, known for their high performance in object detection and classification, have been widely tested in agricultural diagnostics [10,11,12], and some of these studies focus specifically on healthy pepper leaves and leaves with bacterial spots, with accuracy up to 83.89% [12,13]. In previous studies, DL algorithms such as AlexNet, SqueezeNet, and a modified ResNet50 have been tested for disease diagnosis in various plants with good results; still, few of them address pepper diseases [14]. Both the limited number of datasets addressing this problem and the limited number of images in existing sets are important obstacles. Dataset classification using the VGGNet architecture has also been investigated in several studies, with conspicuous accuracy results achieved [15]. One of those was obtained with an extensive open dataset comprising 48,331 images of healthy and diseased plant leaves [16]. Notable results have been reported using pre-trained AlexNet and VGGNet models. The most widely tested DL algorithms typically consist of input, hidden, and output layers and are primarily applied to images from the PlantVillage public dataset [17].
Unlike the studies mentioned above, this study focuses on diagnosing pepper leaf diseases using a modified VGG16Net algorithm trained on images from a private dataset. VGG16Net was chosen for its high accuracy on complex datasets. To optimise the training process, the hidden layers of VGG16Net were frozen, and three additional layers were introduced for fine-tuning, effectively reducing training time. Unlike previous studies, this study incorporated user-controlled adjustments to evaluate the algorithm’s performance. Several contributions of this study should be highlighted. Firstly, it proposes a modified VGGNet-based model that diagnoses six of the most common pepper diseases with characteristic spots on the leaves. Although spot detection is a more comprehensive approach, it is largely avoided because of hard-to-distinguish colour transitions; for more successful feature extraction, colour enhancement is therefore set to emphasise the characteristic green and yellow tones through an algorithm designed for this purpose. Secondly, this is the first work to diagnose the aphid and caterpillar classes via DL models. Finally, the results are a promising step towards more conscious and effective pesticide use.
The rest of the paper is organised as follows: Section 2 briefly reviews previous studies on the datasets, algorithms, and methodologies used to detect leaf diseases. Section 3 is devoted to the method on which the analysis is based, while Section 4 summarises the results. Section 5 presents the discussion. Finally, Section 6 presents the conclusion and future work.
2. Related Works
The literature survey focuses on prior studies on detecting plant diseases published in the last decade that employ DL algorithms such as AlexNet, DenseNet, VGGNet, MobileNet, and ResNet. Ten studies, selected for their similarity to our work, are summarised below in chronological order.
Since the first CNN was proposed for image recognition by LeCun et al. in 1998, DL has been significantly developed and enriched through various studies and research [18]. However, paving the way for agricultural tasks was a challenge, mainly due to the lack of an adequate database. Until recently, DL studies in agriculture, especially in plant disease diagnosis [19], were almost non-existent except for sporadic practical applications. In the last couple of years, researchers have focused on improving the accuracy of plant disease detection by employing various algorithms and applying different databases to a wide range of plant species.
A crucial milestone in AI applications for agricultural tasks was the release of the PlantVillage database, which comprises 38 classes of 14 plant species and remains one of the most extensive datasets in the field. Sardogan et al. utilised a CNN model to identify tomato leaf diseases using the PlantVillage database [20,21,22]. In their study, tomato leaves were classified into five classes: four disease classes and one healthy class. To further enhance the classification process, they implemented the learning vector quantisation (LVQ) algorithm, which achieved an accuracy rate of 88%. Demonstrating such effectiveness paved the way for the broader use of CNN-based classification techniques for detecting leaf diseases.
The same year, Rangarajan and his team employed pre-trained AlexNet and VGGNet models to categorise tomato leaf images from the same dataset into seven distinct classes [23]. Their experiments demonstrated a notable accuracy rate of 97.4%, primarily attributed to the AlexNet architecture. Despite the methodological similarities, their work achieved a significantly higher accuracy than comparable studies. Another VGGNet-based study was conducted by Ferentinos and colleagues, who evaluated and compared the performance of various architectures trained on an open database comprising approximately 87,000 photographs of healthy and diseased plant leaves from 25 plant species, including peppers [24]. Among the architectures compared, VGGNet achieved an accuracy rate of 99.53%, surpassing the other models on this dataset.
The following year, Kaya et al. proposed a combined VGGNet–AlexNet methodology tested on four databases, including PlantVillage [25]. Different types of plants, including peppers, were classified as healthy or diseased. Rather than per-plant accuracy, accuracy rates were reported separately for each dataset; the best score obtained for binary classification on PlantVillage was 99.80%. Pepper leaves from the same dataset were also used by Wu and colleagues, who worked on detecting one of the most common bacterial diseases, caused by Xanthomonas campestris [13]. Their study utilised a triple-classifier VGG16-based model to classify leaf images into healthy, mildly infected, and strongly infected groups.
Das proposed an alternative classification method for diagnosing infected pepper leaves by detecting bacterial spot patterns [26]. Two VGGNet models, VGG16 and VGG19, were employed separately as binary classifiers on 2475 images to distinguish healthy and infected leaves, with accuracies of around 96% and 97%, respectively. In general, advances in DL in the last decade have led to tremendous progress in classification tasks, and the fertile field of recognising pepper leaf disease is no exception. Over time, basic binary classification gave way to more complex multiclass approaches; several five-class models encompassing one healthy and four diseased categories have recently been reported [27].
Rababa et al. also identified bacteria-infected pepper leaves by evaluating four DL models, including VGGNet [28]. The results were compared according to standard criteria, and the best accuracy found was 58%. Begum et al. obtained much better results for accuracy and other parameters [29]. In their study, featuring a novel parameter-tuning technique, the proposed model was compared with four DL models, including VGG16. All models were applied as binary classifiers on 1855 pepper leaves to distinguish infected from healthy leaves; the AUC values achieved by the proposed model and VGG16 were 0.99 and 0.92, respectively.
It can be inferred from this literature review that most VGGNet-based studies in this field have focused on binary healthy/diseased classification, regardless of plant type. Among the algorithms employed, however, the VGGNet image processing architecture has demonstrated exceptional efficacy on complex and large datasets, consistently achieving superior accuracy rates. This study aims to assess the accuracy of the VGG16 model on a dataset containing seven different classes. The primary objective of the research is to optimise performance and minimise time loss by leveraging the pre-trained VGG16 architecture.
3. Methodology
This section details the proposed CNN model’s training and testing processes, with the schematic diagram shown in Figure 1. The raw dataset consists of 1679 original digital images that underwent a comprehensive pre-processing pipeline before being used for training with a pre-trained VGG16 architecture. The convolutional and max-pooling layers were frozen, and dense and dropout layers were added to expedite the training process and improve the model’s performance. Further details about the dataset, pre-processing, and classification process are given below.
3.1. Dataset
The custom dataset consists of 1679 pepper leaf images captured with an iPhone digital camera at original dimensions of 1200 × 1600 pixels. Sample leaf images for all categories presented in the study are shown in Figure 2. The dataset includes 300 healthy and 1379 infected leaf images, isolated against a uniform background to ensure consistency in visual analysis. The “infected” group comprises 123 aphid-infested, 300 burnt, 145 caterpillar-infested, 300 mildew-infected, 211 mite-infested, and 300 leafworm-infected leaves. Thus, the dataset comprises images categorised into seven labelled classes representing healthy and various diseased conditions. In total, 70% of the images were used for training, while the remaining portion was utilised for testing and validation in a 2:1 ratio.
The images in each class were split according to the recorded ratios using the data adjustment function split_data, which performs a class-by-class (stratified) split across the training, validation, and test sets, so class proportions are preserved. Accordingly, the training, test, and validation sets contain 1174, 335, and 164 images, respectively.
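A minimal sketch of how such a stratified split_data routine could be implemented is shown below; the function name and ratios come from the text, while the folder layout, shuffling seed, and copy logic are illustrative assumptions.

```python
import os
import random
import shutil

def split_data(source_dir, dest_dir, ratios=(0.70, 0.20, 0.10), seed=42):
    """Stratified split: each class folder is divided into train/test/
    validation subsets with the same ratios (70% training, with the
    remainder split 2:1 between test and validation)."""
    random.seed(seed)
    for cls in os.listdir(source_dir):
        images = os.listdir(os.path.join(source_dir, cls))
        random.shuffle(images)
        n = len(images)
        n_train = int(n * ratios[0])
        n_test = int(n * ratios[1])
        subsets = {
            "train": images[:n_train],
            "test": images[n_train:n_train + n_test],
            "validation": images[n_train + n_test:],
        }
        for subset, files in subsets.items():
            target = os.path.join(dest_dir, subset, cls)
            os.makedirs(target, exist_ok=True)
            for f in files:
                shutil.copy(os.path.join(source_dir, cls, f), target)
```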
3.2. Pre-Processing Strategy
Data pre-processing is a set of steps to improve the image quality of the raw dataset for more efficient analysis, improving performance in image recognition and generalisation across different image variations. The pre-processing model proposed in the study consists of four steps: resizing, cropping, colour enhancement, and data augmentation. More details about the pre-processing steps are given below.
3.2.1. Cropping and Resizing
The leaf images were originally captured at a resolution of 1200 × 1600 pixels. To enable more effective analysis and training, the non-leaf parts of the images were manually cropped, reducing the resolution. The VGG16 model works with a standard input size of 224 × 224 pixels, which is suitable for efficient computation and helps shorten training time. To ensure compatibility with the VGG16 architecture, all images were therefore resized to 224 × 224 pixels by setting Img_Width, Img_Height = 224, 224 in the pre-processing code [30].
Figure 3 illustrates two before-and-after examples of cropping and resizing.
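For illustration, the resizing step could be implemented as follows; the Pillow-based routine and the interpolation filter are assumptions, while the 224 × 224 target size and the width/height settings come from the text.

```python
from PIL import Image

# VGG16 standard input size, as set in the pre-processing step.
img_width, img_height = 224, 224

def resize_image(src_path, dst_path):
    """Resize a (manually cropped) leaf image to the VGG16 input size."""
    img = Image.open(src_path).convert("RGB")
    img = img.resize((img_width, img_height), Image.LANCZOS)
    img.save(dst_path)
```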
3.2.2. Augmentation
Data augmentation is a set of techniques for artificially creating new data samples to increase the variety of training data for better model generalisation. In the study, data augmentation was performed through eight techniques with details listed in
Table 1. With nine samples per image, the number of samples in the training dataset increased to 10,080.
Augmentation may leave some empty pixels around the transformed area. These empty pixels are filled with the values of neighbouring pixels using the fill_mode='nearest' option. Augmentation is followed by highlighting disease features through colour enhancement, applied via preprocessing_function=enhance_colors, to streamline the training process.
Figure 4 illustrates various techniques employed in the data augmentation process, along with their corresponding visual examples. These techniques have enhanced data diversity, contributing to improved training performance.
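As a sketch, the augmentation pipeline described above might be configured in Keras as follows; fill_mode='nearest' and preprocessing_function=enhance_colors are taken from the text, while the specific transforms and their ranges are assumptions standing in for the eight techniques listed in Table 1.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# enhance_colors is the colour-enhancement routine of Section 3.2.3.
train_datagen = ImageDataGenerator(
    rotation_range=30,            # random rotations
    width_shift_range=0.1,        # horizontal shifts
    height_shift_range=0.1,       # vertical shifts
    shear_range=0.2,              # shearing
    zoom_range=0.2,               # zooming
    horizontal_flip=True,         # horizontal flips
    vertical_flip=True,           # vertical flips
    brightness_range=(0.8, 1.2),  # brightness jitter
    fill_mode="nearest",          # fill empty pixels from neighbours
    preprocessing_function=enhance_colors,
)
```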
3.2.3. Colour Enhancement
The algorithm, whose flowchart is shown in Figure 5, is designed to intensify the green and yellow colours in the image. It enhances the images’ vibrancy, contrast, and other colour properties, making the colour tones more prominent and helping the model recognise colour variations better. The images are converted from RGB to the HSV (Hue, Saturation, Value) colour space, and lower and upper bounds are defined for green and yellow. The cv2.inRange() function is then used to detect and mask the green and yellow areas in the image. After the saturation of the masked areas is increased, the image is converted back to the RGB colour space.
This process is typically employed for images featuring green and yellow tones, such as those of plants, to enhance the vividness and prominence of these colours. By increasing the saturation of the green and yellow hues, the objective is to assist the model in better distinguishing and recognising these colours. In the present study, various masking techniques were utilised. Given that our research is centred on plant diseases, we specifically implemented masking operations on the green and yellow colour regions.
Figure 6 presents some sample images where the saturation of green and yellow colours has been enhanced.
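A minimal OpenCV sketch of this procedure, consistent with Figure 5, is given below; the cv2.inRange masking of green and yellow regions follows the text, while the HSV bounds and the saturation gain are assumptions.

```python
import cv2
import numpy as np

def enhance_colors(image):
    """Boost the saturation of green and yellow regions via HSV masking."""
    img = np.clip(image, 0, 255).astype(np.uint8)
    hsv = cv2.cvtColor(img, cv2.COLOR_RGB2HSV)

    # Illustrative HSV bounds (OpenCV hue range is 0-179).
    green_mask = cv2.inRange(hsv, (35, 40, 40), (85, 255, 255))
    yellow_mask = cv2.inRange(hsv, (20, 40, 40), (35, 255, 255))
    mask = cv2.bitwise_or(green_mask, yellow_mask)

    # Increase saturation only where the mask is active, then convert back.
    h, s, v = cv2.split(hsv)
    s = np.where(mask > 0, np.clip(s * 1.5, 0, 255), s).astype(np.uint8)
    rgb = cv2.cvtColor(cv2.merge([h, s, v]), cv2.COLOR_HSV2RGB)
    return rgb.astype(image.dtype)
```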
3.3. Classification with Modified VGG16
Proposed by the Visual Geometry Group at the University of Oxford [31], the VGG16 model is a CNN architecture characterised by a depth of 16 layers, comprising 13 convolutional layers and three fully connected layers. VGG16 is renowned for its simplicity, versatility, and strong performance on a range of computer vision tasks, including image classification and object recognition. As a result, it remains a popular choice for many DL applications.
In this study, the VGG16 model was initialised with weights pre-trained on ImageNet, excluding the original fully connected layers by setting the include_top parameter to False. After initialisation, the convolutional and MaxPooling layers of the model were frozen (set to non-trainable), which avoids relearning already-captured generic features. Additionally, to enhance the learning process and accelerate classification, several new layers were introduced into the architecture, ordered as follows:
A dense layer with 256 units and ReLU activation, followed by a 50% dropout layer;
A second dense layer with 128 units and ReLU activation, followed by a 50% dropout layer;
Finally, an output dense layer with softmax activation, whose size equals the number of classes.
The added layers can be characterised as flatten, dense, and dropout layers. The flatten layer is necessary to feed the convolutional output into the fully connected layers, and each dense layer is a fully connected layer using the ReLU activation function. The final dense layer is sized according to the number of classes in the training data, as it is the classification layer. Dropout is used to prevent overfitting. Adding the flatten, dense, and dropout layers achieved the desired level of learning, which in turn shortened the classification time.
The model is customised to tackle the specific task of classifying pepper leaf diseases by leveraging the strengths of the large, complex VGG16 model combined with the newly added layers, as sketched below. This approach also ensures faster training and more efficient performance.
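The modifications described above translate into Keras roughly as follows; this is a sketch consistent with the description, with num_classes = 7 for the seven labelled classes.

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.models import Model

num_classes = 7  # one healthy class plus six disease/pest classes

# Pre-trained convolutional base without the original classifier head.
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze all convolutional and MaxPooling layers

x = Flatten()(base.output)
x = Dense(256, activation="relu")(x)
x = Dropout(0.5)(x)
x = Dense(128, activation="relu")(x)
x = Dropout(0.5)(x)
outputs = Dense(num_classes, activation="softmax")(x)

model = Model(inputs=base.input, outputs=outputs)
```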
The training batch size was 32. During training, the Adam optimiser was used due to its strong performance on large datasets and complex models, stemming from its efficient memory usage and automatic per-parameter adaptation of learning rates. The optimiser was run with an initial learning rate that was subsequently reduced when the validation loss plateaued. Categorical cross-entropy was used as the loss function, while accuracy was used as the performance metric. ReduceLROnPlateau and EarlyStopping callbacks were implemented to increase training stability and prevent overfitting.
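The corresponding training setup might look like the following; the loss, metric, batch size, and callbacks are taken from the text, while the learning rate (left at the Keras default, since no value is reported) and the callback patience settings are assumptions.

```python
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
from tensorflow.keras.optimizers import Adam

model.compile(
    optimizer=Adam(),  # initial learning rate not reported; default assumed
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)

callbacks = [
    # Patience values below are illustrative.
    ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=3),
    EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True),
]

history = model.fit(
    train_generator,  # e.g. train_datagen.flow_from_directory(..., batch_size=32)
    validation_data=val_generator,
    epochs=50,
    callbacks=callbacks,
)
```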
3.4. Theoretical Background Behind the Performance Analysis of the DL Model
A confusion matrix is one of the most prominent methods for summarising a classifier’s quality. It shows the absolute number of correct and false predictions on a set of test data. To provide a clearer picture of their performance, the DL models are also tested using the following criteria: accuracy, loss, precision, recall, and F1 score. For better organisation of the section, all symbols used during the calculations are listed in
Table 2.
The confusion matrix describes the performance of a classification model by showing the true vs. predicted classifications [32]. For M classes, the confusion matrix is represented in Equation (1):

$$\mathbf{C} = \begin{bmatrix} c_{11} & c_{12} & \cdots & c_{1M} \\ c_{21} & c_{22} & \cdots & c_{2M} \\ \vdots & \vdots & \ddots & \vdots \\ c_{M1} & c_{M2} & \cdots & c_{MM} \end{bmatrix}, \tag{1}$$

where $c_{ij}$ denotes the number of test samples whose true class is $i$ and whose predicted class is $j$.

For a predicted class $k$, the parameters $TP_k$, $FP_k$, $FN_k$, and $TN_k$ are calculated through Equations (2)–(5):

$$TP_k = c_{kk}, \tag{2}$$

$$FP_k = \sum_{\substack{i=1 \\ i \neq k}}^{M} c_{ik}, \tag{3}$$

$$FN_k = \sum_{\substack{j=1 \\ j \neq k}}^{M} c_{kj}, \tag{4}$$

$$TN_k = \sum_{\substack{i=1 \\ i \neq k}}^{M} \sum_{\substack{j=1 \\ j \neq k}}^{M} c_{ij}. \tag{5}$$

Accuracy is the ratio of total correct predictions over the total number of predictions, as given in Equation (6):

$$\mathrm{Accuracy} = \frac{\sum_{k=1}^{M} TP_k}{\sum_{i=1}^{M}\sum_{j=1}^{M} c_{ij}}. \tag{6}$$

The loss function measures the error between the model’s predictions and the actual values; the categorical cross-entropy used in this study is represented by Equation (7):

$$\mathcal{L} = -\frac{1}{N}\sum_{n=1}^{N}\sum_{k=1}^{M} y_{n,k}\,\log \hat{y}_{n,k}, \tag{7}$$

where $N$ is the number of samples, $y_{n,k}$ the one-hot ground-truth label, and $\hat{y}_{n,k}$ the predicted probability.

Precision can be defined as the ratio of true positive predictions to the total predicted positive samples. It indicates how many of the predicted positive instances are correct [33,34,35]. The precision of the $k$th class is given by Equation (8):

$$\mathrm{Precision}_k = \frac{TP_k}{TP_k + FP_k}. \tag{8}$$

Recall, also known as sensitivity, is the ratio of true positive predictions $TP_k$ to the total actual positive samples ($TP_k + FN_k$). This metric shows how well the model identifies the positive class; in other words, it indicates how effectively the model captures the actual positives. The recall of the $k$th class is represented by Equation (9):

$$\mathrm{Recall}_k = \frac{TP_k}{TP_k + FN_k}. \tag{9}$$

Finally, the F1 score is the harmonic mean of precision and recall. It balances the two metrics and better reflects overall model performance, especially in cases of class imbalance; a higher F1 score indicates better performance [29,34]. The F1 score of the $k$th class is represented by Equation (10):

$$F1_k = \frac{2 \cdot \mathrm{Precision}_k \cdot \mathrm{Recall}_k}{\mathrm{Precision}_k + \mathrm{Recall}_k}. \tag{10}$$
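In practice, all of these metrics can be computed directly from the model’s predictions on the test set; a brief sketch using scikit-learn (an assumed tool, not named in the paper) is shown below, where y_true, y_prob, and class_names are placeholders.

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

# y_true: integer test labels; y_prob: softmax outputs from model.predict(...).
y_pred = np.argmax(y_prob, axis=1)

print(confusion_matrix(y_true, y_pred))  # the counts c_ij of Equation (1)

# Per-class precision, recall, and F1 (Equations (8)-(10)), plus overall
# accuracy (Equation (6)) and macro/weighted averages.
print(classification_report(y_true, y_pred, target_names=class_names, digits=2))
```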
4. Numerical Results
The study performed the classification task on a PC equipped with an Intel Core i5-9400 processor, an NVIDIA 360 GPU with 4 GB of VRAM, and 8 GB of RAM. Before training, the data is automatically partitioned, and randomly selected images are placed into training, validation, and testing folders based on the specified ratios. After training and validation, the model is tested on images from the test folder. The model’s performance parameters are evaluated using Equations (6)–(10). All results are presented in Table 3.
The model achieved a rather high overall accuracy of 0.92, indicating that it generally performs well across all classes. The macro averages indicate a precision of 0.94, a recall of 0.87, and an F1 score of 0.87. It can be inferred from these figures that precision is generally high, although the model may struggle with recall in certain classes. The weighted average, which weights each class according to its support, indicates that the model’s overall performance is slightly better in terms of precision (0.93), recall (0.92), and F1 score (0.90).
The classification model performs well overall, especially in identifying the “CATERPILLAR”, “HEALTHY”, “MITE”, and “WORM” classes. The “APHID” class shows significant room for improvement, as indicated by its low precision, recall, and F1 score. While the model’s high accuracy is promising, specific classes need targeted adjustments or more training data to enhance their detection capabilities.
Figure 7a,b show the model’s training and validation performance progress over 50 epochs. The model’s training process is represented through (a) accuracy and (b) loss. The blue dotted line represents the model’s training accuracy, and the orange dotted line indicates validation accuracy. Training accuracy starts at 0.2 and increases as the epochs progress, reaching approximately 0.6. The fact that validation accuracy is higher than training accuracy indicates that the model has a strong generalisation ability. However, after the 20th epoch, there is no further improvement in validation accuracy, suggesting that the model has reached the maximum level of learning it can achieve from the dataset and will not show any more progress.
In the second graph, the dotted blue line shows the model’s training loss, while the dotted orange line represents validation loss. Training loss decreases continuously as the epochs progress, and validation loss is stabilised around the 20th epoch. After this point, further training may result in overfitting. Overfitting occurs when the model becomes too closely adapted to the training data, reducing its generalisation ability. Therefore, training should be stopped around the 20th epoch, or early stopping techniques should be applied to prevent the model from overtraining.
The test results are summarised in the confusion matrix presented in Figure 8. Analysis of the images in the test folder revealed that the learning process yielded near-perfect outcomes for some specific classes, such as HEALTHY, CATERPILLAR, MILDEW, and MITE. Minor classification errors are obtained for the BURNT and WORM classes, while misclassifications in the APHID class are more pronounced: APHID predictions are mostly confused with MILDEW, with some minor confusion with the BURNT and HEALTHY classes.
5. Discussion
The classification of the various diseases affecting pepper leaves is based on their distinct visual and colour differences, which serve as an important guide for accurate differentiation. Healthy pepper leaves are predominantly uniformly bright green. Unlike healthy ones, burnt leaves are characterised by brown, ring-shaped spots. Leaves affected by aphids display black dots as their defining feature. Leaves affected by caterpillars can be distinguished by characteristic leaf deficiencies and deep perforations, their most prominent characteristic. Mite-affected peppers exhibit wrinkled leaves with dark green layers. In the worm-infested group, the formation of white lines on the leaf surface is the key distinguishing detail. Finally, leaves infected with mildew are primarily characterised by yellow decay and spots. Comprehensive pre-processing based on this guidance, combined with high-quality photo lighting, built up the distinguishing characteristics of these classes, which, in turn, positively impacted the learning performance.
To the best of our knowledge, this is the first time the aphid and caterpillar classes have been investigated. Although the private dataset specially collected for this study is slightly imbalanced, it contains an average of more than 220 images per class, which is reasonable compared with other datasets in this field and sufficient to draw meaningful conclusions.
The proposed model is compared with recent counterparts in Table 4 to provide a more comprehensive judgment of its accuracy; the studies are examined according to several criteria. The comparison clearly shows that higher accuracies are reported for tasks with fewer classes. In addition, in those studies the number of patterns the models need to learn is significantly smaller, resulting in higher accuracy scores. We obtained similar accuracy for the modified and original VGG16 models, so these comparable results should also be viewed from the aspect of processing speed under the same conditions. It should be highlighted that the modification shortened the processing time from the originally required 7618 s to 6750 s.
Despite the strong performance of the proposed model, several limitations must be acknowledged. The dataset used in this study is private and relatively limited in size, particularly for certain disease classes such as aphids and caterpillars, which may affect the robustness and generalizability of the model. Predictions for the aphid class are frequently confused with images from the burnt and mildew classes. A deeper analysis of the results indicates that variations in the conditions under which the photographs were taken contributed to classification errors during the training process. That is, the performance could be sensitive to background noise and cool and warm colour tones due to variations in lighting conditions, which are unavoidable in real-world environments. Finally, it should also be pointed out that although colour enhancement techniques improved class separability, they may introduce artefacts or distortions if not properly calibrated across diverse image sources.
6. Conclusions
The study presented a modified DL-based model for classifying six pepper leaf diseases and healthy leaves. This is the first time two classes, aphids and caterpillars, have been included in the diagnosis process. The private dataset, specially collected for this study, contains 123 aphid-infested and 145 caterpillar-infested leaves. Contrary to previous research based on enhancing contrast settings, this study mainly focuses on a novel pre-processing model that enhances the distinguishing characteristic colours among classes. For this purpose, the colour settings were designed to be sensitive to yellows and greens. This more comprehensive pre-processing built up distinguishing characteristics for each class to be diagnosed and positively impacted the learning performance. The VGG16-based model, chosen primarily for its strong feature extraction performance, was initialised with weights pre-trained on the ImageNet dataset. These weights were preserved by freezing all convolutional and MaxPooling layers of VGG16, i.e. setting them to non-trainable. Adding new layers to the pre-trained VGG16 model and training only those layers increases training speed and facilitates pattern recognition, which is important for diagnostic performance. Despite thorough results for most classes, the accuracy results show minor classification errors for the burnt and worm classes, while misclassifications in the aphid class are more challenging. The relatively small number of aphid-infested samples is believed to have hindered accuracy. Additionally, lighting conditions during sample collection and the model’s sensitivity to cool and warm colour tones contributed to classification errors during training. These avenues remain open for further exploration. Hence, in addition to expanding the dataset, particularly for underrepresented classes such as aphids, to improve model generalisability and reduce misclassification, our future work will mainly focus on increasing robustness to cool and warm colour tones. Additionally, exploring alternative deep learning architectures and ensemble methods may further enhance diagnostic accuracy.
We hope our study will be a valuable step toward a quick diagnosis from snapshots obtained by real-time greenhouse monitoring systems. This could be followed by appropriate agricultural spraying by drones or unmanned aerial vehicles.