1. Introduction
The El Niño phenomenon (ENP) is an atmospheric event that originates from an increase in sea surface temperature caused by the influx of large masses of warm water from the western Pacific into the equatorial Pacific. During this phenomenon, the ocean transfers energy to the atmosphere, which changes the distribution of precipitation and air temperature. A precedent occurred during the 1997–1998 ENP, when the temperature on the coast of Peru rose 5 °C above normal, affecting the economy and productive activities such as agriculture, fishing, and livestock [1]. On the coast of Peru, the increase in air temperature is well known to have a significant impact on agriculture, and the magnitude of that impact depends on the intensity of the event. Temperature variations in each of its phases positively or negatively affect the various stages of plant development as well as production. Excessive heat causes the plant to form more vegetative branches and abundant leaves but fewer flowers. Precipitation is stored in the soil and then taken up by the plant, so an excess or deficit affects the development and growth of crops, whose water demand differs markedly from one phase to another; when the soil water content is not optimal, photosynthetic activity decreases, affecting production and productivity [2].
Satellite remote sensing is a technique for obtaining information about an object on the Earth from a distant satellite sensor by measuring the reflected, absorbed, and emitted electromagnetic radiation coming from the Earth-atmosphere system [3]. Different objects and materials reflect and absorb different wavelengths of electromagnetic radiation, so the wavelengths detected by a satellite sensor make it possible to determine the type of material that reflected them. The relationship between wavelength and the percentage of reflectance of a material on the Earth’s surface defines a spectral reflectance curve known as a spectral signature, which can be determined from an optical satellite image; these spectral signatures are unique to each material, as they are composed of its own levels of absorption, emission, reflectance, or transmission [4]. Thus, when the coverage or density of vegetation in an agricultural field is studied, healthy vegetation or leaves absorb the blue and red wavelengths of the Sun’s electromagnetic radiation, whereas the green and near-infrared wavelengths are reflected. Because the human eye cannot perceive infrared radiation, healthy vegetation appears green [5]. This means that the spectral signature associated with the vegetation cover of a particular fruit crop will depend on the characteristics of its healthy leaves, for example, their epidermis and mesophyll, and on the spectral signature of the fruit itself.
On the other hand, unlike empirical or numerical approaches, machine learning methods are receiving increasing attention in research on land surface modeling. Convolutional neural networks (CNNs) can learn functional relationships between independent and dependent variables. They provide a hierarchical representation of the input data through multiple convolutional layers and are effective at handling large amounts of data over long time series. Given these advantages, some studies have used CNNs to predict the normalized difference vegetation index (NDVI) [6]. In a related line of work, a novel hybrid deep model has been presented to predict climate events using a combined training algorithm. First, the input satellite image is preprocessed with a median filter. Next, the preprocessed image is segmented with a k-means clustering algorithm based on weighted cubic chaotic maps. Finally, features are extracted from the segmented image, such as the difference vegetation index (DVI), the normalized difference vegetation index (NDVI), the modified transformed vegetation index (MTVI), the green vegetation index (GVI), and the soil-adjusted vegetation index (SAVI) [7].
At the national level, the Piura Region led the harvest with 57.3%, with the provinces of Sullana and Piura being the most notable, contributing 48.3% and 45.7%, respectively, of the total harvested in the department. Regarding yield, a notable decrease was observed in 1998 due to the 1997–1998 El Niño phenomenon, when yield reached only 9.2 tons per hectare and agricultural land was lost. Similarly, the El Niño phenomena of 2004–2005 and 2009–2010 affected production, reducing it to 202 thousand and 197 thousand tons, respectively [8].
The objective of the present study is to analyze the growth and health trend of the lemon crop during the El Niño phenomenon, evaluating spatial–temporal agricultural production, using variables such as the NDVI, the NDWI, and temperature. This, in turn, can help growers make informed decisions about pruning, irrigation, and fertilizer application to improve overall crop productivity.
2. Materials and Methods
2.1. The Proposed Methodology
The proposed methodology involves data preparation, specifically collecting NDVI and NDWI images derived from Sentinel-2. Image preprocessing follows, in which both images are merged to serve as the input to the neural network. Finally, the resulting database is classified based on three criteria (Figure 1).
2.2. Study Area
The Piura Region is one of the main lemon-producing regions, accounting for an average of 55.2% of national production over the last decade. The main producing areas in this region are the San Lorenzo Valley, Sullana, and Chulucanas [9]. Tambo Grande, located in the San Lorenzo Valley, is the leading district in lime production (Figure 2) [10]. However, Piura is among the departments most affected by adverse climatic events, such as the El Niño phenomenon, prolonged periods of drought, cyclones, and landslides. The present study focuses on a small productive area in Tambo Grande, delimited by latitudes 4.960641° S to 4.933791° S and longitudes 80.302248° W to 80.254354° W, and covering an area of 15.88 km².
2.3. Meteorological Data
For this study, meteorological data obtained through Landviewer and Sentinel-2 satellite images were used, with a resolution of 10 m per pixel and less than 10% cloud cover, covering the lemon fields in Tambo Grande. A total of 96 images in PNG format, with dimensions of 2132 × 600 pixels, were used in time series between 1 January 2019 and 15 October 2023. In addition, minimum and maximum temperature data were collected from each satellite image to analyze its correlation with crops. The use of satellite images from Sentinel-2 provides detailed information over large areas of land, offering relevant data on the trend, growth dynamics, and health of crops.
2.4. Remote Sensing Data and Preprocessing
In the present study, the Sentinel-2 reflectance product was used to calculate both the NDVI and the NDWI. These calculations were performed by overlaying the necessary high-quality layers to obtain the desired image. The NDVI is an indicator of the amount of vegetation in an area. It aids in monitoring crops, assessing droughts, predicting production, and identifying susceptible areas. Using data from the near-infrared (NIR) and red (RED) bands, as per Equation (1), we obtain an image depicting the state of the crops.
NDVI = (NIR − RED) / (NIR + RED)    (1)
The NDVI varies between −1 and 1 and is displayed in the images on a color scale from red to dark green; values close to 1, shown in dark green, indicate vegetation in good health and production. The NDWI, in turn, is an indicator that helps identify the presence of water in the area. A high value, shown in light blue in the image, indicates the presence of water, while a low value, shown in dark green, indicates a lack of water. In Tambo Grande, this value can change with the conditions that put crops at risk, such as rain and elevated temperatures.
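The two indices can be illustrated with a minimal sketch. It assumes the usual normalized-difference form, NDVI = (NIR − RED)/(NIR + RED), and, because the text does not state which NDWI variant was used, it assumes the green/near-infrared form NDWI = (GREEN − NIR)/(GREEN + NIR). The function names and the small reflectance grids are hypothetical and are not data from the study.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b) pixel by pixel for two equally sized 2-D grids.

    Pixels where a + b == 0 are set to 0.0 to avoid division by zero.
    """
    result = []
    for row_a, row_b in zip(a, b):
        out_row = []
        for x, y in zip(row_a, row_b):
            total = x + y
            out_row.append((x - y) / total if total != 0 else 0.0)
        result.append(out_row)
    return result


def ndvi(nir, red):
    """NDVI from near-infrared and red reflectance grids."""
    return normalized_difference(nir, red)


def ndwi(green, nir):
    """NDWI, assuming the green/NIR variant (an assumption, see lead-in)."""
    return normalized_difference(green, nir)


# Hypothetical 2 x 2 reflectance grids with values in [0, 1].
nir = [[0.50, 0.40], [0.05, 0.30]]
red = [[0.10, 0.10], [0.05, 0.30]]
green = [[0.10, 0.12], [0.15, 0.30]]

ndvi_map = ndvi(nir, red)     # high values where NIR dominates (vegetation)
ndwi_map = ndwi(green, nir)   # high values where green dominates NIR (water)
```

In this toy grid the top-left pixel behaves like dense vegetation (high NDVI, negative NDWI), while the bottom-left pixel behaves like open water (NDVI of zero, positive NDWI).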
To organize the images, we extracted the images provided by Sentinel-2 for each month using Landviewer. Applying the restrictions described in Section 2.3, and taking into account that some months have more than one image, a total of 96 images were obtained for both the NDWI and the NDVI. The MATLAB R2023b software was then used to join the two images corresponding to the same date, as shown in Figure 3, while Table 1 shows the number of joined images corresponding to each year.
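The pairing step can be sketched as follows. The study performed it in MATLAB R2023b; this Python version is only an illustration, and it assumes that the two images of a date are placed side by side, which the text does not specify. The function names, the date keys, and the tiny pixel grids are hypothetical.

```python
def join_side_by_side(left, right):
    """Concatenate two images (lists of pixel rows) horizontally."""
    if len(left) != len(right):
        raise ValueError("images must have the same number of rows")
    return [row_l + row_r for row_l, row_r in zip(left, right)]


def pair_by_date(ndvi_images, ndwi_images):
    """Join the NDVI and NDWI images that share an acquisition date.

    Both arguments map an ISO date string to an image. Dates present in
    only one of the two collections are skipped.
    """
    common_dates = sorted(set(ndvi_images) & set(ndwi_images))
    return {
        date: join_side_by_side(ndvi_images[date], ndwi_images[date])
        for date in common_dates
    }


# Hypothetical 2 x 2 images keyed by acquisition date.
ndvi_images = {"2023-03-01": [[1, 2], [3, 4]], "2023-04-01": [[9, 9], [9, 9]]}
ndwi_images = {"2023-03-01": [[5, 6], [7, 8]]}

joined = pair_by_date(ndvi_images, ndwi_images)
```

Only the date present in both collections is kept, so every joined image carries both the vegetation and the water information for the same acquisition.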
2.5. Classification Criteria
Once the images have been joined, each one is classified to determine whether lemon development in the study area is at high, medium, or low risk from a meteorological phenomenon that could harm the harvest. The classification uses data provided by Landviewer through its NDVI time series analysis option, as can be seen in Figure 4, where each month is evaluated and classified according to the ranges shown in Table 2 for high, medium, or low risk.
In addition to the aforementioned criterion, news reports of irregularities in agricultural production in Tambo Grande were used, such as droughts [11], intense rains due to the El Niño phenomenon and Cyclone Yaku [12], the appearance of pests [13], and even the loss of the economic capital needed to continue producing due to COVID-19 [14]. The results of the classification process are summarized in Table 3, which shows the number of images corresponding to each risk classification, categorized as high, medium, or low risk.
2.6. Training and Validation Environment
The preprocessing of the images and the implementation of the neural network were carried out in the MATLAB R2023b environment. To train the neural network, the MATLAB Deep Learning Toolbox was used, which provides an interface and tools for the development of deep learning models. This environment was selected because it is widely accepted by the scientific community and supports the implementation of deep learning models and algorithms.
2.7. CNN Model
A convolutional neural network (CNN) is a highly effective deep learning algorithm in computer vision applications. Its main architecture includes a convolutional layer, a pooling layer, a fully connected layer, and an output layer (Figure 5). The convolutional layer, composed of convolutional filters, is convolved with the input image to generate the output feature map. Adding more convolutional layers allows the network to capture more specific patterns and features, giving it greater learning and pattern discrimination capabilities, although this also makes it deeper and more complex.
Furthermore, in this study we established three categories and tested the classification of images according to the criteria discussed above. The AlexNet network was selected because it has shown better performance in detecting meteorological features than the ResNet, LeNet5, LeNet5-Like, and ZFNet networks. AlexNet is 8 layers deep, requires input images of 227 by 227 pixels, and can classify images according to the assigned criteria; examples of classification tasks it can be trained for include the identification of animals, breeds, and sign language. For our classification into high, medium, and low risk, see the Results Section, which shows the model performance, the training and validation confusion matrices, and the Grad-CAM results.
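To make the 227 by 227 input requirement concrete, the sketch below traces how the spatial size of the feature maps shrinks through the convolutional and pooling stages. The kernel, stride, and padding values are those of the commonly published AlexNet configuration; it is an assumption that the network used in this study keeps them unchanged. The function and variable names are hypothetical.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer on a square input."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding) for the spatial stages of the standard AlexNet.
# Assumption: the study's network uses this published configuration as is.
ALEXNET_SPATIAL_LAYERS = [
    ("conv1", 11, 4, 0),
    ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 2),
    ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1),
    ("conv4", 3, 1, 1),
    ("conv5", 3, 1, 1),
    ("pool5", 3, 2, 0),
]


def trace_sizes(input_size=227, layers=ALEXNET_SPATIAL_LAYERS):
    """Return a list of (layer name, output side length) pairs."""
    sizes = []
    size = input_size
    for name, kernel, stride, padding in layers:
        size = conv_output_size(size, kernel, stride, padding)
        sizes.append((name, size))
    return sizes


sizes = trace_sizes()
final_side = sizes[-1][1]
# With 256 channels after conv5, this is the input length of the first
# fully connected layer in the standard configuration.
flattened_features = final_side * final_side * 256
```

Under these assumptions a 227 by 227 image is reduced to a 6 by 6 map before the three fully connected layers, which together with the five convolutional layers give the depth of 8 mentioned above.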
2.8. Validation Method
The validation of the results of the neural network will be carried out by correlating them with the temperature values in the area. This approach seeks to validate the accuracy of the network and evaluate the relationship between temperature and the level of risk in crop dynamics. Since temperature is a crucial climatological component that affects crop development, variability in temperature patterns can indicate climatological phenomena. This validation method will not only support the neural network results but will also establish temperature as an additional indicator to prevent adverse events. In addition, it will provide a complete perspective on how thermal conditions can anticipate the proximity of climatological phenomena in the Piura Region.
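One way to quantify this relationship is a correlation coefficient between the predicted risk level and the temperature recorded for each image. The sketch below assumes a Pearson coefficient and an ordinal encoding of the classes (low = 0, medium = 1, high = 2); the text does not state which statistic or encoding was used, and the values shown are hypothetical rather than data from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical encoding and values, for illustration only.
RISK_CODE = {"low": 0, "medium": 1, "high": 2}
predicted_risk = ["low", "low", "medium", "medium", "high", "high"]
max_temperature_c = [28.0, 29.5, 31.0, 30.5, 33.0, 34.5]

r = pearson([RISK_CODE[label] for label in predicted_risk], max_temperature_c)
```

A coefficient close to 1 would support the use of temperature as an additional indicator of risk, while a value near 0 would suggest that the network relies on information that temperature alone does not capture. Because the risk classes are ordinal, a rank-based coefficient would be a reasonable alternative.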
4. Discussion
According to the learning curve (Figure 6), the AlexNet model produced smooth training and validation curves without pronounced fluctuations and reached a performance of 96.30%, which indicates that there were no issues during the training process. This aligns with the findings of Morelos et al. [16], who demonstrated that AlexNet achieved great performance with 94.1%, while other models like VGG-19 and Inception v3 showed optimal accuracy (100%). In a study conducted by Manataki et al. [17], the authors stated that the metrics used to plot learning curves provide valuable information about the training process and reveal the presence of learning problems such as overfitting. Regarding training loss and validation loss, signs of overfitting appear when the training loss decreases while the validation loss increases or remains constant. Regarding training and validation accuracy, high training accuracy coupled with low validation accuracy can signal overfitting. On the other hand, the confusion matrices (Figure 7) were useful for observing the number of samples of each class that were correctly predicted during training, as well as the misclassified samples, demonstrating that the neural network can make mistakes; in this study, the neural network made mistakes mainly between medium- and low-risk images, so the database should be expanded over time with more low-risk images.
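The quantities read from a confusion matrix can be computed as in the following sketch. The matrix is hypothetical and does not reproduce Figure 7; it only mimics the reported pattern of confusion between the medium- and low-risk classes. Rows are assumed to be true classes and columns predicted classes, and the function names are illustrative.

```python
CLASSES = ["high", "medium", "low"]

# Hypothetical confusion matrix: rows are true classes, columns are predictions.
confusion = [
    [10, 0, 0],   # true high
    [0, 8, 2],    # true medium, two images predicted as low
    [0, 1, 9],    # true low, one image predicted as medium
]


def accuracy(matrix):
    """Fraction of all samples that lie on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class_metrics(matrix, labels):
    """Return {label: (precision, recall)} for each class."""
    metrics = {}
    for i, label in enumerate(labels):
        true_positive = matrix[i][i]
        predicted_as_i = sum(row[i] for row in matrix)   # column sum
        actually_i = sum(matrix[i])                      # row sum
        precision = true_positive / predicted_as_i if predicted_as_i else 0.0
        recall = true_positive / actually_i if actually_i else 0.0
        metrics[label] = (precision, recall)
    return metrics


overall_accuracy = accuracy(confusion)
metrics = per_class_metrics(confusion, CLASSES)
```

In this toy matrix every high-risk image is classified correctly, and all errors fall in the off-diagonal cells shared by the medium and low classes, which is the situation that motivates collecting more low-risk images.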
The Grad-CAM technique provided more information about how the model makes predictions, what it has learned, and whether differences are observed between classes. In Figure 8a, the neural network focuses mainly on the NDVI image during periods without harmful weather events, suggesting that in times of low risk, evaluating the health status of plants is crucial to determine crop performance. In Figure 8b, which represents a medium risk, the network focuses on both the NDVI and NDWI images, indicating that validating the risk requires considering both the health of the crops and the water level. Finally, in Figure 8c, the network highlights the NDVI area and pays attention to the river, suggesting that during periods of high risk, plant health and river level may be key indicators that anticipate the proximity of meteorological phenomena.
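For reference, the core of the Grad-CAM computation can be sketched as follows: each feature map of the last convolutional layer is weighted by the spatial average of the gradient of the class score with respect to that map, the weighted maps are summed, and negative values are discarded. This is a generic sketch of the technique, not the implementation used in the study; the activations and gradients are hypothetical inputs that a deep learning framework would normally supply.

```python
def grad_cam(activations, gradients):
    """Compute a normalized Grad-CAM heatmap.

    activations, gradients: lists of K channels, each an H x W list of lists.
    gradients[k][i][j] is the derivative of the class score with respect to
    activations[k][i][j].
    """
    height = len(activations[0])
    width = len(activations[0][0])
    heatmap = [[0.0] * width for _ in range(height)]

    for act, grad in zip(activations, gradients):
        # Channel weight: global average of the gradient over all positions.
        alpha = sum(sum(row) for row in grad) / (height * width)
        for i in range(height):
            for j in range(width):
                heatmap[i][j] += alpha * act[i][j]

    # ReLU keeps only regions that increase the class score.
    heatmap = [[max(0.0, v) for v in row] for row in heatmap]

    # Scale to [0, 1] when the map is not entirely zero.
    peak = max(max(row) for row in heatmap)
    if peak > 0:
        heatmap = [[v / peak for v in row] for row in heatmap]
    return heatmap


# Hypothetical 2-channel, 2 x 2 example.
activations = [
    [[1.0, 0.0], [0.0, 0.0]],   # channel that supports the class
    [[0.0, 2.0], [0.0, 0.0]],   # channel that opposes the class
]
gradients = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[-1.0, -1.0], [-1.0, -1.0]],
]

heatmap = grad_cam(activations, gradients)
```

In the example only the top-left position remains highlighted, because the second channel has a negative weight and its contribution is removed by the ReLU. In practice the heatmap is upsampled to the input size and overlaid on the joined NDVI and NDWI image.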