Applied Deep Learning-Based Crop Yield Prediction: A Systematic Analysis of Current Developments and Potential Challenges

Meghraoui, Khadija; Sebari, Imane; Pilz, Juergen; Ait El Kadi, Kenza; Bensiali, Saloua

doi:10.3390/technologies12040043

by Khadija Meghraoui ^1,*, Imane Sebari ^1,2,*, Juergen Pilz ^3, Kenza Ait El Kadi ^1,2 and Saloua Bensiali ^4

^1 Research Unit of Geospatial Technologies for a Smart Decision, IAV Hassan II, Rabat 10101, Morocco
^2 Department of Photogrammetry and Cartography, IAV Hassan II, Rabat 10101, Morocco
^3 Department of Statistics, Alpen-Adria-Universität Klagenfurt, 9020 Klagenfurt, Austria
^4 Department of Applied Statistics and Computer Science, IAV Hassan II, Rabat 10101, Morocco
^* Authors to whom correspondence should be addressed.

Technologies 2024, 12(4), 43; https://doi.org/10.3390/technologies12040043

Submission received: 9 February 2024 / Revised: 7 March 2024 / Accepted: 20 March 2024 / Published: 24 March 2024

(This article belongs to the Section Information and Communication Technologies)

Abstract:

Agriculture is essential for global income, poverty reduction, and food security, with crop yield being a crucial measure in this field. Traditional crop yield prediction methods, reliant on subjective assessments such as farmers’ experiences, tend to be error-prone and lack precision across vast farming areas, especially in data-scarce regions. Recent advancements in data collection, notably through high-resolution sensors and the use of deep learning (DL), have significantly increased the accuracy and breadth of agricultural data, providing better support for policymakers and administrators. In our study, we conduct a systematic literature review to explore the application of DL in crop yield forecasting, underscoring its growing significance in enhancing yield predictions. Our approach enabled us to identify 92 relevant studies across four major scientific databases: the Directory of Open Access Journals (DOAJ), the Institute of Electrical and Electronics Engineers (IEEE), the Multidisciplinary Digital Publishing Institute (MDPI), and ScienceDirect. These studies, all empirical research published in the last eight years, met stringent selection criteria, including empirical validity, methodological clarity, and a minimum quality score, ensuring rigorous research standards and relevance. Our in-depth analysis of these papers aimed to synthesize insights on the crops studied, the DL models utilized, the key input data types, and the specific challenges and prerequisites for accurate DL-based yield forecasting. Our findings reveal that convolutional neural networks (CNNs) and Long Short-Term Memory (LSTM) networks are the dominant DL architectures in crop yield prediction, with a focus on cereals like wheat (Triticum aestivum) and corn (Zea mays). Many studies leverage satellite imagery, but there is a growing trend towards using Unmanned Aerial Vehicles (UAVs) for data collection. Our review synthesizes global research, suggests future directions, and highlights key studies, acknowledging that results may vary across different databases and emphasizing the need for continual updates due to the evolving nature of the field.

Keywords:

agriculture; crop yield; deep learning; prediction; systematic literature review

1. Introduction

Food security is a global challenge, and intelligent agriculture plays a crucial role in enhancing social and economic development while ensuring a stable food supply [1]. The Food and Agriculture Organization (FAO) has projected a 60% increase in food demand to meet the needs of the global population, which is expected to reach 9.3 billion by the year 2050 [2]. Hence, accurately predicting crop yields can provide vital information for devising effective strategies to meet these objectives and eliminate hunger. This is accomplished by managing the agricultural production chain effectively and balancing import and export needs to satisfy food requirements [3].

Traditional yield determination methods, which are straightforward [4], often involve expert evaluation of small sample areas. These areas, known as yield squares, are typically no larger than 1 m² [5]. The results from these samples are then extrapolated to determine the total yield. While these methods provide a direct approach to yield prediction, they are time-consuming and labor-intensive, and they lack the precision necessary for applications where accurate yield data are required. Furthermore, the precision of these techniques is fundamentally constrained by their dependence on limited sample areas and the presumption that these areas accurately reflect the conditions across the entire field or farm. Such assumptions may introduce errors stemming from the uneven distribution of soil nutrients, water availability, pest infestations, and various other agricultural variables that influence crop development and yields. Therefore, despite the useful perspectives provided by these conventional methods of yield estimation, particularly in settings with scarce technological support, their practicality is limited by the extensive resources they demand and the likelihood of discrepancies and inaccuracies in their outcomes.

Crop simulation models, also referred to as crop development models, predict yields by simulating the entire growth period of crops. These models incorporate physiological data and environmental factors, including climate and soil conditions [6]. Prominent examples in this category are the AFRCWHEAT2 and CERES models [7]. However, despite their ability to establish accurate simulations of crop growth and make reasonable yield determinations [8], these models are considered expensive since they require extensive data related to climatic conditions, soil characteristics, crop varieties, and farming techniques, which are not always accessible. Moreover, these models also present a level of difficulty that necessitates a thorough comprehension of both the model and the simulated system, which could thus limit their use. Furthermore, it is difficult to project a model developed for a specific field onto larger regions, thus implying special calibrations if the studied area changes or even if the crop under study has changed.

Empirical methods, which establish relationships between independent variables and a dependent variable, often require less data than physical models, providing a simpler alternative [9]. These methods generally use historical data and statistical techniques to forecast future crop yields, with a particular focus on environmental variables. A widely used strategy involves developing simple regression models based on variables obtained from remote sensing, such as the Normalized Difference Vegetation Index (NDVI). Nonetheless, the utility of these models is typically confined to specific crop types and scales. Expanding their application to larger areas or different time periods poses challenges due to issues with model extrapolation. As such, changes in the geographical, environmental, or crop conditions of a study require the creation of a new model. Moreover, agricultural yield results from a multitude of interrelated factors, and the inherent simplicity of statistical models might not adequately capture the complex dynamics between these factors and the expected yield, potentially diminishing their efficacy [7].
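To make the empirical strategy described above concrete, the following minimal sketch computes the NDVI from near-infrared and red reflectance using its standard definition, NDVI = (NIR − Red) / (NIR + Red), and fits a simple linear regression of yield on NDVI. The reflectance and yield values are synthetic and purely illustrative, the variable names are our own, and the sketch is not taken from any of the reviewed studies.

```python
import numpy as np

def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (nir - red) / (nir + red)

# Synthetic per-field reflectances and yields (illustrative values only).
nir = np.array([0.50, 0.55, 0.60, 0.65, 0.70])
red = np.array([0.20, 0.15, 0.12, 0.10, 0.08])
yield_t_ha = np.array([2.1, 3.0, 3.6, 4.1, 4.7])

# Simple regression model: yield ~ slope * NDVI + intercept.
x = ndvi(nir, red)
slope, intercept = np.polyfit(x, yield_t_ha, deg=1)
predicted = slope * x + intercept
```

A model of this kind has only two parameters, which illustrates why it is easy to fit on little data but also why it may not transfer to other crops, regions, or seasons without refitting.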

Given these challenges, the agricultural sector is progressively shifting its focus to more advanced solutions, namely by integrating artificial intelligence (AI), a technology that aims to emulate human cognitive functions and relies on applications and algorithms that operate within computers and adaptive contexts, especially for decision making [10]. With its capacity to analyze extensive datasets and process complex inputs, AI offers a viable path forward [11], especially in agricultural applications [12]. AI techniques, particularly deep learning (DL), represent a major advancement in multiple domains. DL is a specialized sub-branch of machine learning derived from artificial neural network architectures, distinguished by hierarchical learning structures with a variety of interconnected layers. Each layer in this structure is responsible for extracting various pieces of information from the data it analyzes. These layers can directly handle raw or slightly processed inputs, autonomously identifying the features necessary for the task at hand. This ability removes the need for manual feature extraction, a step commonly required in conventional machine learning methods, thus streamlining the data processing workflow and boosting the model’s proficiency in recognizing intricate patterns. DL is capable of modeling complex relationships between inputs and outputs in various fields, including the processing of high-resolution remote sensing data, especially drone imagery, to determine forest attributes [13]; the study of crop water requirements in different environments [14]; the automatic detection of weeds, particularly by using a convolutional neural network to process Unmanned Aerial Vehicle (UAV) imagery [15]; and intelligent agriculture [16], particularly for predicting agricultural yield. This advanced technology has improved the capabilities of statistical techniques for predicting and estimating agricultural yields [17] and can thus surmount numerous limitations inherent in traditional and simple approaches. Through the utilization of DL, we can create predictive models that are not just more precise but also adaptable to the ever-changing agricultural landscape.

The critical need for precise agricultural yield predictions, coupled with the evolving landscape of deep learning applications in this domain, underscores the importance of a comprehensive review of existing research. Despite advancements in deep learning for agricultural purposes, gaps remain in synthesizing findings across diverse crop varieties, understanding the efficacy of various deep learning architectures, and consolidating the types of input data crucial for accurate predictions. Our systematic survey study is designed to fill these gaps by thoroughly examining these facets using a structured methodological approach. We aim not only to provide a state-of-the-art analysis of deep learning techniques in crop yield forecasting but also to highlight the specific contributions of different studies in advancing this field. By identifying the most influential research and addressing critical open questions, our work offers a unique contribution by guiding future research directions and assisting researchers in exploring the recent literature. This study, therefore, acts as a resource on established research in the field, shedding light on successful methodologies and pinpointing areas that are primed for innovative breakthroughs.

This systematic review is thoughtfully structured to explore the field of crop yield prediction using deep learning by centering on a series of pivotal research questions. Firstly, we explore the range of crops that have garnered attention in existing studies. Secondly, we delve into the deep learning architectures that have been evaluated in the literature specifically for crop yield prediction, assessing their effectiveness and application. Thirdly, we examine the main categories of input data employed in prior research to understand the data foundations that underpin successful yield prediction models. Lastly, we address the challenges and prerequisites necessary for accurate crop yield forecasting using deep learning techniques. By articulating these research questions, our study aims to provide a clear, structured investigation into the current state of the field, offering insights into both the achievements and the areas ripe for further exploration.

This manuscript is organized into several distinct sections, beginning with an introduction that outlines existing approaches to crop yield determination, emphasizes the need for precise predictions, and elucidates the purpose of our review. This is followed by a comparison of our study with related works and existing literature reviews. We then detail our systematic methodology, which encompasses the formulation of research questions, the definition of keywords, and criteria for inclusion and exclusion. Afterward, we present our principal findings, address the research questions while highlighting potential solutions to the challenges encountered, and present notable contributions to the field. We also propose directions for future investigation and conclude by summarizing the study’s key points.

2. Comparison of Related Works and Existing Surveys with the Present Review Study

Systematic literature reviews focusing on agricultural yield prediction through deep learning are scarce, prompting us to undertake an in-depth exploration of this application. While numerous authors have conducted survey studies reviewing the literature in this domain, our study distinguishes itself by employing a rigorous and methodological approach. Nevertheless, it aligns with both traditional and narrative literature reviews in terms of its comprehensive scope. The authors of Ref. [18] conducted a survey study on the use of artificial intelligence for predicting agricultural yields, specifically focusing on deep learning and its various architectures. Their research highlighted a range of agricultural tasks that can be addressed using deep learning and identified a growing trend in the application of recurrent neural networks within the agriculture sector, especially for yield prediction. Ref. [19] presented a comprehensive review by studying scientific articles related to yield prediction using machine learning algorithms, while also discussing the various advantages and challenges associated with them. Their results highlighted deep learning as an advanced and essential development in machine learning architectures, particularly deep neural networks and convolutional neural network architectures. The authors of Ref. [3] developed a systematic literature review on the application of machine learning to agricultural yield prediction, finding a significant trend toward the use of deep learning algorithms. These algorithms are particularly noted for their increasing application in agriculture to predict dynamic variables such as crop yields. The authors of Ref. [20] conducted a study on machine learning and deep learning algorithms for crop yield prediction. However, their analysis was based on a limited selection of publications and research questions, focusing on the prediction algorithms without discussing potential future directions or developments. 
Notably, the study treated the NDVI as a performance metric on par with R-squared, although the NDVI is typically an input feature in such models. Furthermore, the emerging trends and algorithms discussed, including CNNs and DNNs, are not particularly new to the field. In their study, the authors of Ref. [21] discussed various deep learning techniques for crop yield prediction. However, they did not present all the papers considered in their review and included a large number of research questions that contained redundant information.

In our study, we establish a set of important research questions within the field and develop a detailed protocol and methodology to address them, as will be outlined in Section 3. Our primary contribution is the methodology we adopt, which we anticipate will provide distinctive insights in response to the specific research questions we have formulated. A particular focus of our contribution will be the comparative analysis of the studies included in our survey. Consequently, we will recommend the most noteworthy papers, helping readers efficiently select relevant articles. Identifying open issues and presenting a future research roadmap constitute additional specific contributions of this systematic literature review.

3. Methodology

3.1. Review Protocol

In our research on crop yield prediction, we selected a systematic review method, aiming for a comprehensive and objective compilation of pertinent academic research in this field. The broad and diverse assortment of studies related to crop yield prediction underscores the suitability of this method for our research objectives. This methodology enables an in-depth review, critique, and synthesis of all studies that align with our specific research questions, encompassing the analysis of different crop types addressed, the various deep learning frameworks used, the key types of data employed, and the challenges and prerequisites essential for precise yield prediction. The adoption of a systematic approach is vital in diminishing bias and ensuring a clear, repeatable framework for analysis. This precision is essential for our research objectives, aiming not only to summarize the current findings within the realm of crop yield prediction but also to uncover gaps and emerging trends that can guide future studies. To accomplish this, we adhered to the methodology proposed by the authors of [22], who highlighted that a systematic literature review (SLR) is an approach initially developed for the medical sector in the United Kingdom, and its goal is to achieve transparency of the methodologies carried out and to obtain collective peer approval of the adopted approach. According to [23], the core features of an SLR include well-defined objectives, predefined eligibility criteria, an explicit and reproducible method, an exhaustive search strategy, a thorough evaluation of the included studies, and a systematic synthesis and presentation of the characteristics and findings of these studies.

The present research study details the results from a rigorously structured scientific approach, as illustrated in Figure 1, comprising a sequence of well-defined procedural phases. These phases range from developing specific research questions pertinent to the topic, establishing detailed inclusion and exclusion criteria, to the careful selection of databases and conducting an in-depth evaluation of relevant studies. The purpose of formulating research questions is to provide a clear direction for the review, guiding the selection of relevant literature while ensuring the review’s focus aligns with its thematic core. This process helps delineate the scope of the review, covering both the breadth and depth of the literature to be included, thereby ensuring the review’s feasibility and adherence to predefined boundaries. The inclusion and exclusion criteria serve as essential guidelines for identifying which studies to include or exclude, ensuring the review incorporates only the literature that directly addresses the specified questions. The study culminates in highlighting the most critical findings for the readers and suggesting future research avenues that could enhance the precision of crop yield predictions using deep learning techniques.

3.2. Practical Approach

In conducting our literature review, our aim was to offer an in-depth analysis of the research carried out in the realm of deep learning and its application to predicting crop yields. To achieve this, we examined studies from various perspectives. In this section, we will consolidate and highlight the essential aspects of the methodology we employed.

3.2.1. Formulation of the Research Questions

The main characteristic of a systematic literature review is that it generally begins with thorough bibliographic research on the topic under consideration, which helps to clarify the research concept [24]. The purpose of this step is to facilitate a better understanding of what has been previously published within the research area. For this literature review, the following research questions were identified:

Q1. What types of crops have been treated by previous studies to predict yield using deep learning?
Q2. Which deep learning architectures have been tested in the literature for crop yield prediction?
Q3. What are the main categories of input data used in previous studies for crop yield prediction using deep learning?
Q4. What are the challenges and the requirements for predicting crop yield using deep learning?

We formulated these research questions considering their importance. The first one aims to accurately identify which crops are under study for precise yield predictions using deep learning, shedding light on the focal areas and key crops that are attracting attention due to their significance. The selection of an appropriate architecture is pivotal, as the success of deep learning methods is significantly influenced by this crucial decision. Regarding input data, the performance and accuracy of deep learning models vary based on the specific data utilized, making it essential to examine the types of input data researchers have used. Moreover, a comprehensive discussion on deep learning must include its limitations and the prevailing challenges, alongside strategies for addressing these issues. Through these detailed points, our goal is to extensively investigate the use of deep learning for crop yield prediction.

3.2.2. Search Strategy for Relevant Studies

In our research methodology, we carefully selected a range of keywords to ensure comprehensive coverage of the literature related to deep learning applications in crop yield prediction. Our chosen keywords included ‘Deep Learning’, ‘Crop Yield Prediction’, ‘Neural Network’, ‘Agriculture’, ‘Computer Vision’, and ‘Artificial Intelligence’. These terms were selected for their direct relevance to the core aspects of our study, encompassing both the technological approaches and the agricultural context of the research. To conduct a thorough search, we utilized these keywords in various combinations across several scientific databases known for their extensive repositories of high-quality academic articles. The databases included the Directory of Open Access Journals (DOAJ), the Institute of Electrical and Electronics Engineers (IEEE), the Multidisciplinary Digital Publishing Institute (MDPI), and ScienceDirect. For each database, we used specific search equations, aiming to balance comprehensiveness with specificity. For instance, some search equations combined ‘Deep Learning’ AND ‘Crop Yield Estimation’ to capture relevant studies at the intersection of these fields. By employing these targeted search strategies, we ensured the inclusion of pertinent studies, thereby laying a solid foundation for our literature review.
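As a rough illustration of how such keyword combinations can be generated, the sketch below pairs technology terms with agricultural terms using a Boolean AND. The split of the listed keywords into two groups and the quoting style are assumptions made for this example; the exact search equations used for each database are not reproduced here.

```python
from itertools import product

# Keywords from the text, grouped by us into two hypothetical categories.
METHOD_TERMS = ["Deep Learning", "Neural Network", "Computer Vision", "Artificial Intelligence"]
DOMAIN_TERMS = ["Crop Yield Prediction", "Agriculture"]

def build_queries(method_terms, domain_terms):
    """Return one quoted 'method AND domain' search string per pair of terms."""
    return [f'"{m}" AND "{d}"' for m, d in product(method_terms, domain_terms)]

queries = build_queries(METHOD_TERMS, DOMAIN_TERMS)
```

Each resulting string would still need to be adapted to the query syntax of the individual database before use.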

3.2.3. Definition of Criteria for Inclusion and Exclusion

Inclusion and exclusion criteria are crucial for ensuring that only studies relevant to the topic are kept. Hence, we defined the inclusion and exclusion parameters as outlined in Table 1.

Certain articles were excluded from the analysis because they are survey studies. However, they were still consulted during the reading phase, and their significant contributions to our topic were used to enrich the points discussed in our paper.

3.2.4. Quality Assessment

To assess the selected articles for our review, we focused on key factors such as the relevance of each study and its research questions, the clarity and comprehensiveness of the methodology, the technical specifications of the deep learning architectures, and the thoroughness of the results discussion. To quantitatively evaluate these aspects, we devised a scoring system with a maximum of 20 points, distributed across the following criteria:

Relevance of Research Questions: Studies were evaluated on how directly they addressed the proposed questions, with scores ranging from 0 (not relevant) to 5 (highly relevant).
Methodological Clarity and Detail: The clarity and depth of the described methodology, including any technical details, were scored from 0 (unclear or insufficiently detailed) to 5 (exceptionally clear and detailed).
Technical Specifications: The detail with which the deep learning architecture was described was scored from 0 (no specifications provided) to 5 (comprehensive specifications).
Results and Discussion: The presentation and discussion of the study’s findings were evaluated, with scores ranging from 0 (lacking discussion) to 5 (comprehensive and insightful discussion).
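The scoring scheme above can be sketched as a small function that validates the four criterion scores and sums them to a total out of 20. The dictionary keys and the cutoff value are hypothetical: the minimum quality score applied in the review is not stated in this section, so the threshold below is a placeholder only.

```python
# Four criteria, each scored from 0 to 5, for a maximum total of 20 points.
CRITERIA = ("relevance", "methodology", "technical_specs", "results_discussion")
MAX_PER_CRITERION = 5

# Placeholder cutoff: the actual minimum quality score is not given in this section.
ASSUMED_MIN_SCORE = 10

def total_score(scores):
    """Validate the per-criterion scores and return their sum (0 to 20)."""
    total = 0
    for name in CRITERIA:
        value = scores[name]
        if not 0 <= value <= MAX_PER_CRITERION:
            raise ValueError(f"{name} must be between 0 and {MAX_PER_CRITERION}, got {value}")
        total += value
    return total

def passes(scores, min_score=ASSUMED_MIN_SCORE):
    """Return True when the study reaches the (assumed) minimum quality score."""
    return total_score(scores) >= min_score

# Hypothetical study scores, for illustration only.
example = {"relevance": 5, "methodology": 4, "technical_specs": 3, "results_discussion": 4}
example_total = total_score(example)
```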

3.2.5. Data Extraction

We utilized Zotero, a bibliographic management software, for data extraction from the selected studies. This free tool streamlined our literature review by enabling the direct import of manuscripts from their respective databases, allowing us to efficiently organize the literature by capturing essential details like publication titles, authors, and other relevant information. We further categorized the imported articles according to their source databases for enhanced organization. Subsequently, we employed Excel to develop a spreadsheet, which facilitated the systematic extraction and analysis of responses to our research questions from these articles. By predefining these questions as columns in the spreadsheet, we established a structured framework that enabled us to directly associate our findings with the specific inquiries driving our review.

4. Results

We followed an analytical process for our critical review, initially focusing on identifying answers to our research questions and conducting a critical evaluation of each article analyzed. Utilizing Zotero, we efficiently organized the extracted articles, which facilitated a streamlined review process. Additionally, we employed Excel to further structure our analysis; this spreadsheet was instrumental in categorizing the data. Specifically, columns were dedicated to each of our research questions, allowing for a direct comparison and synthesis of findings. Moreover, a crucial column was allocated for the critical evaluation of each study, where we noted significant observations and assessments. This combination of tools and methodical analysis ensured a comprehensive and organized review, enabling us to derive insightful results from the literature. This section details our analyses and the answers derived from our studied research articles.

4.1. Selected Studies

The numbers of articles identified and selected for further review are detailed in Table 2. These articles offer valuable insights and findings essential to our research, particularly in applying advanced deep learning techniques for predicting crop yields, which aligns with our first research question about the specific crops targeted in yield forecasting. The recurrence of this theme across the included studies underscores the growing trend of integrating deep learning into agricultural practices. Our analysis also revealed significant findings regarding the variety of data used in deep learning models for agricultural forecasting, emphasizing the need for standardized data collection and processing methods to improve model accuracy and consistency. Furthermore, these studies offer detailed insights into the diverse architectures chosen by researchers and their impact on model outcomes. Additionally, these papers address our fourth research question, shedding light on the challenges and constraints of current deep learning applications in agriculture. The 92 studies that were retained are listed in Table 3. This table provides details including the agricultural crop addressed, the main inputs of the learning model, and the deep learning architecture selected by the authors.

4.2. Statistical Analysis

We conducted a statistical analysis of the selected scientific articles, as summarized in Table 2. The majority of the examined articles were sourced from the DOAJ or the ScienceDirect database, with roughly twice as many coming from these sources as from IEEE and MDPI. Among the selected articles, most were published in journals (81), with only a few from conferences (11), indicating that our literature review primarily focused on more detailed research papers.

Further examination of the annual distribution of papers, as depicted in Figure 2, reveals a pronounced upward trajectory in research within this domain, marked by a significant increase in publications in recent years. This surge highlights the growing interest and significant advancements in the field, largely driven by breakthroughs in deep learning technology and a broader recognition of AI’s potential in addressing key agricultural challenges, such as food security. As a result, there has been a marked increase in the adoption of sophisticated AI tools and computational techniques. Especially in the last few years, the field has seen a rapid rise in the number of publications, reflecting the adoption of innovative deep learning methods that are transforming agricultural practices. The increasing number of academic papers not only illustrates the dynamic nature of research, continuously integrating practical insights, but also underscores the need for review studies to thoroughly evaluate these contributions, identifying areas for future research or innovative strategies to enhance crop yield predictions with these advanced technologies.

4.3. Crops Examined in Studies Using Deep Learning Models

To address the first research question (Q1) and facilitate the interpretation of Table 3, we classified the agricultural crops that were the focus of deep learning models in the selected studies. As depicted in Figure 3, a majority of these studies focused on predicting yields for staple crops such as wheat, maize, and rice. This emphasis not only underscores the pivotal role these crops play in ensuring global food security but also demonstrates the significant potential of deep learning technologies to enhance yield predictions for these vital crops. Such advancements could greatly contribute to improving agricultural sustainability and operational efficiency, potentially having a profound impact on feeding the growing global population. Conversely, our review identified a notable gap in the research landscape concerning other agricultural categories, particularly fruit trees, for which we found, at most, one study dedicated to each type of crop. This scarcity highlights the imperative to extend deep learning applications to a broader array of crops, aiming to optimize diverse agricultural practices, especially in sectors that have not yet fully capitalized on these technological advancements.

4.4. Deep Learning Architectures Used in Crop Yield Prediction

4.4.1. A Brief Overview of Deep Learning

The popularity of deep neural networks surged in 2012 following the victory of the AlexNet architecture in the ImageNet challenge [110]. Subsequently, it became standard practice to assess neural network accuracy using large-scale datasets. As a result, numerous neural networks were released by their developers, complete with the weights acquired during model training using these data [111]. The architecture of a deep neural network draws inspiration from the human brain’s configuration, where artificial neurons are connected in sequence and structured into layers, forming a synthetic framework. The primary elements of a deep learning network consist of neurons, layers, and activation functions [112]. In the context of a basic deep learning model, the output from one layer serves as the input for the next, illustrating the layer-by-layer progression of data processing. The significance of the weight matrices is underscored as they modify and scale the input data throughout the network. Additionally, the incorporation of a bias term at each layer enhances the model’s adaptability, allowing for adjustments to the output that are independent of the weighted sum of the inputs. The introduction of nonlinearity into the system through a nonlinear activation function, applied to the combined result of the linear transformation and bias, is essential. This nonlinearity is what permits the network to learn and represent complex, nonlinear patterns within the dataset. Equation (1) presents the mathematical details of such a network.

$$L^{(l)} = \phi\left(M^{(l)} L^{(l-1)} + {bi}^{(l)}\right) \tag{1}$$

where $L^{(l)}$ represents the output of layer $l$, $M^{(l)}$ and ${bi}^{(l)}$ denote the weight matrix and bias vector for layer $l$, respectively, and $\phi$ is the nonlinear activation function. The initial input $L^{(0)}$ corresponds to the input to the network.
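As an illustration of Equation (1), the following is a minimal NumPy sketch of the layer-by-layer forward pass. The ReLU activation, the layer sizes, and the random weights are assumptions made for the example; they are not taken from any of the reviewed studies.

```python
import numpy as np

def relu(x):
    """Rectified linear unit, one common choice for the activation phi."""
    return np.maximum(x, 0.0)

def forward(x, weights, biases, phi=relu):
    """Apply L(l) = phi(M(l) @ L(l-1) + bi(l)) for each layer, starting from L(0) = x."""
    layer_output = x
    for M, bi in zip(weights, biases):
        layer_output = phi(M @ layer_output + bi)
    return layer_output

# Hypothetical toy network: 3 inputs -> 4 hidden units -> 1 output.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 3)), rng.normal(size=(1, 4))]
biases = [np.zeros(4), np.zeros(1)]

x = np.array([0.2, 0.5, 0.1])  # L(0): the network input
y = forward(x, weights, biases)
print(y.shape)  # (1,)
```

The sketch applies the activation to every layer, exactly as Equation (1) is written. In regression settings such as yield prediction, the final layer is often left linear instead, which would be a small change to the loop.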

4.4.2. Deep Learning Models Applied for Forecasting Agricultural Yields

To provide an overview of the deep learning architectures used for yield prediction in agriculture and to address the second research question (Q2), we developed Figure 4. This illustration shows the number of articles that employed each model, indicating that two-dimensional convolutional neural networks (2D CNNs) and Long Short-Term Memory (LSTM) networks, a type of recurrent neural network, are the predominant techniques. These results align with the findings of Mohammadi et al. [113]. The widespread adoption of 2D CNNs is largely attributed to the architecture's adeptness at processing spatial data, a key factor in understanding agricultural yield and its variability. In contrast, LSTM networks are favored for their proficiency in managing time-series data, reflecting the significant role of temporal dynamics in agricultural yield predictions. The choice of these two architectures highlights their effectiveness in handling the spatial and temporal variability intrinsic to agricultural yields.

Table 3 reveals that the total number of deep learning architectures utilized surpasses 92, exceeding the count of the studies reviewed. This discrepancy in numbers arises from the common practice among researchers in this domain, where it is typical for a single study to simultaneously explore and evaluate various deep learning frameworks. This approach goes beyond mere experimentation with diverse techniques and algorithms; it serves as a comprehensive method to assess and compare the effectiveness of different architectures under the same conditions, using the same datasets. Through this comparative examination, researchers aim to identify and highlight the strengths and weaknesses of each framework, providing essential insights into their suitability and performance for specific challenges in agricultural yield prediction. Consequently, the employment of multiple architectures within individual studies underscores the complexity and diversity of deep learning applications in agriculture, reflecting the ongoing effort to refine predictive accuracy in tackling the spatial and temporal complexities of agricultural indicators.

Through our analysis of the various studies, the well-known deep learning architectures used in crop yield prediction are as follows:

Deep Neural Network (DNN): It comprises several layer types, including an input layer, multiple hidden layers, and an output layer, all interconnected as depicted in Figure 5, where the first layer receives the values of the model's variables. This architecture is characterized by its weights, which allow the model to learn by being updated through a specific learning algorithm [114]. The authors of [17] investigated this architecture. Focusing on the mid-summer period, particularly July to August, they used data from the Moderate Resolution Imaging Spectroradiometer (MODIS) to predict corn and soybean yields. Their research included a comparative analysis of six artificial intelligence (AI) models, including a DNN. Their findings suggested that the DNN was the most accurate, improving yield prediction by 21 to 33% for corn and 17 to 22% for soybean compared to the other AI models tested. It should be noted, however, that the results were expressed in metric tons per hectare, a unit more commonly applied to fruit crops. This choice is somewhat unconventional for studies focused on cereal grains, where other units might be more standard, and it could influence the interpretation and comparison of the results with other studies in the cereal grain domain.

Deep Belief Network (DBN): This specific deep learning architecture has demonstrated considerable performance across various applications. It incorporates both supervised and unsupervised learning processes [115]. Specifically, a DBN is a generative model composed of a stack of Restricted Boltzmann Machines (RBMs) as depicted in Figure 6 followed by a layer that can be trained with supervised methods, often referred to as a Sigmoid Belief Network (SBN) [116]. Ref. [94] evaluated an Adaptive Lemuria algorithm, which was based on a deep belief network, utilizing inputs such as rainfall data, temperature, soil, as well as fertilizer information. Their study encompassed several types of crops, including banana (Musa), apple (Malus), orange (Citrus sinensis), and onion (Allium cepa). The model achieved an accuracy rate of nearly 98.35% and reported an error rate of 0.031.

Convolutional Neural Network (CNN): This algorithm is among the widely used architectures in the field of deep learning. CNNs can be explored in several dimensions, including one-dimensional (1D), two-dimensional (2D), and three-dimensional (3D) configurations, each serving different types of data inputs and applications.

One-dimensional CNN: This architecture is characterized by an input layer with a one-dimensional filter (Figure 7), typically utilized for processing 1D spectral data [118].

Figure 7. Architecture of 1D convolutional neural network [119].

Two-dimensional CNN: This architecture performs convolution operations using specialized filters and possesses multiple layers similar to other CNN models (Figure 8). These include a convolutional layer, nonlinear layer, max-pooling layer, and fully connected layers. Numerous studies have showcased the effectiveness of this architecture, particularly for processing unstructured data such as images.

Figure 8. Architecture of 2D CNN [120].

Three-dimensional CNN: The primary distinction between three-dimensional and two-dimensional CNN is that 3D CNNs apply convolution across both spatial and temporal dimensions [121]. Unlike 1D and 2D CNN, the inclusion of a third dimension in 3D CNN allows for the creation of a complex feature cube as depicted in Figure 9 that captures more detailed information [122].

Figure 9. Comparison between two-dimensional and three-dimensional CNNs [123].
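To make the difference between the 1D, 2D, and 3D variants concrete, the sketch below applies the same sliding-window operation to a spectrum, an image, and an image time series. It is a simplified, single-channel illustration using NumPy only; the array sizes and the averaging kernels are hypothetical, and real CNN layers add multiple channels, learned kernels, padding, and strides.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv_valid(data, kernel):
    """'Valid' cross-correlation (the operation CNN layers compute) in any number of dimensions."""
    # One window per output position; the trailing axes have the kernel's shape.
    windows = sliding_window_view(data, kernel.shape)
    kernel_axes = tuple(range(-kernel.ndim, 0))
    return (windows * kernel).sum(axis=kernel_axes)

# 1D: a hypothetical spectral signature with 10 bands, kernel of width 3.
spectrum = np.arange(10.0)
out_1d = conv_valid(spectrum, np.ones(3) / 3)

# 2D: a hypothetical 6 x 6 single-band image, 3 x 3 kernel.
image = np.ones((6, 6))
out_2d = conv_valid(image, np.ones((3, 3)) / 9)

# 3D: a hypothetical stack of 5 acquisition dates (time, height, width), 2 x 3 x 3 kernel.
cube = np.ones((5, 6, 6))
out_3d = conv_valid(cube, np.ones((2, 3, 3)) / 18)

print(out_1d.shape, out_2d.shape, out_3d.shape)  # (8,) (4, 4) (4, 4, 4)
```

In the 3D case the kernel also slides along the time axis, so each output value mixes information from neighboring dates as well as neighboring pixels.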

The authors of Ref. [25] indicated in their study that CNNs require fewer parameters to train the models. They compared this algorithm with other approaches such as the Multilayer Perceptron (MLP) and ridge regression Best Linear Unbiased Prediction (rrBLUP) when predicting spring wheat yield. Their results showed an increase in accuracy of up to 5% when comparing CNN to rrBLUP. Other researchers applied convolutional neural networks to estimate winter wheat yield using MODIS products. For instance, the authors of [40] demonstrated the effectiveness of CNN in estimating winter wheat yield, achieving a Pearson correlation coefficient of up to 0.82 and a root mean square error of 724.72. Although this study indicated that remote sensing data alone can be sufficient for yield estimation, the accuracy of these predictions could potentially be enhanced by incorporating additional data, such as agrometeorological information. Additionally, the authors of [9] explored the use of 1D CNN for crop yield prediction and compared its performance with that of a DNN. Their findings indicated that the DNN performed better at the field level when estimating winter wheat yield using data available from the Google Earth Engine (GEE) platform, whereas at the county level the performance of the two models was nearly equivalent. It is important to note that yield estimation at the field scale using MODIS satellite imagery could encounter challenges due to atmospheric effects and limitations in spatial resolution. Beyond 1D and 2D CNNs, researchers also explored the capabilities of 3D CNNs in processing the temporal dimension of remote sensing data. For instance, the authors of [57] utilized data spanning from 2003 to 2016 to train their model. Despite the challenges associated with such an extended dataset, which may introduce greater variability or outliers, they successfully predicted soybean yield using MODIS imagery. Their results highlight the critical importance of selecting appropriate input data when applying 3D CNNs to such applications.

Region-based Convolutional Neural Network (R-CNN): It is an advanced deep learning architecture that is used for object detection. It identifies objects and regions of interest through selective search techniques and concludes with a CNN to classify the objects. A notable improvement of R-CNN is Fast R-CNN, which employs Region of Interest (RoI) Pooling to extract features from predetermined regions on feature maps established by convolutional layers [72]. A further advancement is represented by Faster R-CNN, which generates feature maps through a CNN and then feeds them to a Region Proposal Network (RPN) that identifies potential object locations. As a practical example, the authors of [71] created an apple yield map by counting fruits detected in images captured by Unmanned Aerial Vehicles (UAVs) from a flight altitude of 15 m. A buffer area of approximately 1 m around each tree served as the input for the Faster R-CNN model to determine the number of fruits. This tree-level approach yielded an $R^{2}$ of 0.86, with MAE and RMSE values of 10.35 and 13.56, respectively. Despite these promising results, the methodology presents a limitation by focusing on the number of fruits per tree rather than the agricultural yield typically measured in tons per hectare, leading to a divergence in the yield's conventional definition and calculation. The authors of Ref. [72] developed a methodology utilizing region-based convolutional neural networks, specifically Faster R-CNN, to detect wheat spikes from images captured by a land-based vehicle at various growth stages of wheat. Their technique focused on quantifying the number and density of spikes. While this study suggested that wheat yield prediction could be based solely on spike analysis, it underscored the significance of selecting the appropriate altitude for image acquisition to ensure the images are usable for this purpose.
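Because the coefficient of determination, MAE, and RMSE (and, in other studies cited in this review, MAPE) are the main figures used to compare models, a short sketch of how they are usually computed may help when reading the reported values. The observed and predicted counts below are invented for illustration, and individual studies may define $R^{2}$ differently, for example as the squared Pearson correlation.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return R2, MAE, RMSE and MAPE (in percent) for paired observations and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred

    mae = np.mean(np.abs(residuals))
    rmse = np.sqrt(np.mean(residuals ** 2))
    # Coefficient of determination: 1 - (residual sum of squares / total sum of squares).
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    # MAPE is undefined when an observed value is zero.
    mape = 100.0 * np.mean(np.abs(residuals / y_true))
    return {"R2": r2, "MAE": mae, "RMSE": rmse, "MAPE": mape}

# Hypothetical fruit counts per tree: manual counts versus model predictions.
observed = [100, 120, 90, 110]
predicted = [95, 125, 100, 105]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

MAE and RMSE carry the unit of the target variable (fruits per tree, kg/ha, and so on), so they can only be compared across studies that use the same unit, whereas $R^{2}$ and MAPE are unit-free.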

Region-based Fully Convolutional Network (R-FCN): It is a deep learning architecture which also exploits the RPN found in Faster R-CNN (Figure 10). It aims to enhance the efficiency of Faster R-CNN by incorporating a residual part for feature extraction and by increasing the computational efficiency through applying a fully convolutional network for object detection [124]. In their study, the authors of [76] utilized a region-based fully convolutional network to identify rice panicles from UAV-RGB images taken at an altitude of 17 m. Image acquisition was intentionally conducted between 10 a.m. and 12 p.m. to optimize lighting conditions. The researchers indicated that rice yield could be estimated by aggregating various factors, including panicle count per unit area, grains per panicle, setting percentage, and grain weight. However, the latter may not be necessary for direct yield estimation. The developed model achieved a precision value of 0.868, indicating a high level of detection accuracy.
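The aggregation of yield components mentioned above can be sketched as a simple product. The function below follows the conventional agronomic yield-component formula rather than the exact procedure of [76], and all numerical values are hypothetical. A second helper shows how a detection precision such as the one reported is obtained from true and false positives.

```python
def rice_yield_t_per_ha(panicles_per_m2, grains_per_panicle, setting_rate, thousand_grain_weight_g):
    """Estimate grain yield in t/ha from yield components (conventional formula, assumed here)."""
    filled_grains_per_m2 = panicles_per_m2 * grains_per_panicle * setting_rate
    grams_per_m2 = filled_grains_per_m2 * thousand_grain_weight_g / 1000.0
    # 1 g/m2 = 10 kg/ha = 0.01 t/ha
    return grams_per_m2 * 0.01

def precision(true_positives, false_positives):
    """Share of detected panicles that are real panicles."""
    return true_positives / (true_positives + false_positives)

# Hypothetical field: 300 panicles/m2, 100 grains per panicle, 85% setting, 25 g per 1000 grains.
estimated_yield = rice_yield_t_per_ha(300, 100, 0.85, 25.0)
print(round(estimated_yield, 3))  # 6.375

# Hypothetical detection counts that would give a precision of 0.868.
print(precision(868, 132))  # 0.868
```

In such a workflow, only the panicle count per unit area comes from the detector; the remaining components have to be measured or assumed separately.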

Long Short-Term Memory (LSTM): The LSTM architecture is a type of recurrent neural network whose cell contains several distinct units designed to capture temporal dependencies. As depicted in Figure 11, an LSTM cell comprises various components: the memory unit, which retains the state from time t − 1; the input gate, which controls the flow of input information to update the memory cell; the output gate, which determines the output based on information from the previous two units; and the forget gate, which decides which information should be discarded [87]. Ref. [84] noted that machine learning techniques often provide better computational efficiency and simplicity compared to deep learning methods. They employed remote sensing data products, primarily Solar-Induced chlorophyll Fluorescence (SIF) and Enhanced Vegetation Index (EVI), along with environmental data to predict large-scale maize yield. They evaluated the LSTM architecture as a deep learning technique and found that it did not significantly outperform machine learning techniques, specifically the Random Forest (RF) algorithm, especially when considering the required processing time and the limited generalization capabilities. Ref. [88] developed a model to predict rice yield at the county level using MODIS satellite data, climate data, and soil properties as inputs. Their results demonstrated that the LSTM model was highly effective, achieving an $R^{2}$ value of 0.87 and an RMSE value of 724 kg/ha, surpassing the performance of the Least Absolute Shrinkage and Selection Operator (LASSO) model. The study highlighted the utility of satellite data for the direct temporal monitoring of agricultural crops. However, the temporal resolution of satellite imagery does not currently allow users to choose specific times for data acquisition, leading to constraints associated with the fixed operational schedules of the satellites.
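The gate mechanism described above can be written as a single time step. The following NumPy sketch uses the standard LSTM formulation; the stacking order of the gates in the weight matrices, the layer sizes, and the random inputs are assumptions for the example and do not reproduce the models of [84] or [88].

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step. Shapes: W (4H, D), U (4H, H), b (4H,), with H hidden units."""
    H = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    i = sigmoid(z[0:H])          # input gate: how much new information enters the memory
    f = sigmoid(z[H:2 * H])      # forget gate: how much of the previous memory is kept
    o = sigmoid(z[2 * H:3 * H])  # output gate: how much of the memory is exposed
    g = np.tanh(z[3 * H:4 * H])  # candidate memory content
    c_t = f * c_prev + i * g     # updated memory (cell state)
    h_t = o * np.tanh(c_t)       # updated hidden state (the cell output)
    return h_t, c_t

# Hypothetical sequence: 6 time steps with 3 features each (for example, EVI, temperature, rainfall).
rng = np.random.default_rng(0)
D, H, T = 3, 4, 6
W = rng.normal(scale=0.1, size=(4 * H, D))
U = rng.normal(scale=0.1, size=(4 * H, H))
b = np.zeros(4 * H)

sequence = rng.normal(size=(T, D))
h, c = np.zeros(H), np.zeros(H)
for x_t in sequence:
    h, c = lstm_step(x_t, h, c, W, U, b)
print(h.shape)  # (4,)
```

For yield regression, the final hidden state `h` would typically be passed to a small dense layer that outputs the predicted yield.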

Gated Recurrent Units (GRUs): Gated Recurrent Units are a simplified variant of LSTM designed with two gates: an update gate and a reset gate, as depicted in Figure 12. The update gate is used for updating the state of the hidden layers, while the reset gate decides the amount of past information to forget [125]. Ref. [42] explored various recurrent neural network architectures, including GRU, Bidirectional GRU, LSTM, and Bidirectional LSTM, to predict tomato and potato yields by analyzing climate data, soil conditions, and water. Their study reported that the bidirectional LSTM was more efficient than the other models, particularly the GRU, with an $R^{2}$ ranging from 0.97 to 0.99, indicating a very high level of predictive accuracy. Additionally, when compared with other deep learning architectures such as CNN, the bidirectional LSTM again showed enhanced performance. However, the study examined the bidirectional architectures in limited detail and indicated that training a learning model requires a significant number of labeled images, which in turn requires more time. Nevertheless, several techniques now make it possible to remedy this problem, as we will see in the following sections.
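For comparison with the LSTM, the two-gate structure of a GRU can be sketched as follows. This is the common textbook formulation written with NumPy; note that the literature uses two equivalent sign conventions for the update gate, and the one chosen here, as well as the parameter names and sizes, is an assumption for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, p):
    """One GRU time step; p is a dict of weight matrices W*, U* and bias vectors b*."""
    z = sigmoid(p["Wz"] @ x_t + p["Uz"] @ h_prev + p["bz"])  # update gate
    r = sigmoid(p["Wr"] @ x_t + p["Ur"] @ h_prev + p["br"])  # reset gate
    # The reset gate scales how much of the past state feeds the candidate state.
    h_cand = np.tanh(p["Wh"] @ x_t + p["Uh"] @ (r * h_prev) + p["bh"])
    # Convention used here: z close to 1 favors the new candidate state.
    return (1.0 - z) * h_prev + z * h_cand

# Hypothetical sizes: 3 input features (for example, climate, soil, water), 4 hidden units.
rng = np.random.default_rng(0)
D, H = 3, 4
params = {}
for gate in ("z", "r", "h"):
    params["W" + gate] = rng.normal(scale=0.1, size=(H, D))
    params["U" + gate] = rng.normal(scale=0.1, size=(H, H))
    params["b" + gate] = np.zeros(H)

h = np.zeros(H)
for x_t in rng.normal(size=(5, D)):
    h = gru_step(x_t, h, params)
print(h.shape)  # (4,)
```

A bidirectional variant, as evaluated in [42], would run a second recurrent pass over the reversed sequence and concatenate the two final hidden states.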

Autoencoders and Stacked Sparse Autoencoders (SSAEs): Autoencoders are neural network architectures primarily used to reduce dimensionality by compressing input data into a lower-dimensional representation and then reconstructing it to preserve essential information [107]. They consist of two main components: the encoder, which compresses the input, and the decoder, which reconstructs the output using the compressed input. Sparse autoencoders are a variant of autoencoders that incorporate a 'sparse penalty', ensuring that only a limited number of neurons in the layers are activated during data processing. This sparsity constraint reduces the effective complexity of the learned representation, which can improve feature learning capabilities [126]. The authors of Ref. [47] conducted county-level predictions for rice, corn, and soybean yields by utilizing data derived from vegetation indices. They employed an SSAE, which was compared to machine learning algorithms, including Support Vector Machine (SVM). Their findings indicated that SVM outperformed the deep learning algorithms for this particular application. The authors of Ref. [107] estimated the productivity of sugarcane crops using hyperspectral data. Initially, they utilized an autoencoder neural network to reduce the dimensionality of the data; subsequently, they applied a neural network to calculate the productivity of the sugarcane.
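The encoder, decoder, and sparsity penalty can be summarized in a single loss function. The sketch below uses a Kullback-Leibler sparsity term, which is one widely used form of the penalty; the target activation `rho`, the weight `beta`, and the data dimensions are assumptions for the example and are not taken from [47] or [107].

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sparse_autoencoder_loss(X, W_enc, b_enc, W_dec, b_dec, rho=0.05, beta=1.0):
    """Reconstruction error plus a KL sparsity penalty on the mean hidden activations."""
    code = sigmoid(X @ W_enc + b_enc)  # encoder: compress (n, d) inputs to (n, k) codes
    X_hat = code @ W_dec + b_dec       # decoder: reconstruct the inputs from the codes
    reconstruction = np.mean((X - X_hat) ** 2)

    # Average activation of each hidden unit, kept away from 0 and 1 for numerical safety.
    rho_hat = np.clip(code.mean(axis=0), 1e-8, 1.0 - 1e-8)
    sparsity = np.sum(
        rho * np.log(rho / rho_hat) + (1.0 - rho) * np.log((1.0 - rho) / (1.0 - rho_hat))
    )
    return reconstruction + beta * sparsity, code

# Hypothetical data: 50 samples with 20 spectral features, compressed to 5 hidden units.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 20))
W_enc, b_enc = rng.normal(scale=0.1, size=(20, 5)), np.zeros(5)
W_dec, b_dec = rng.normal(scale=0.1, size=(5, 20)), np.zeros(20)

loss, code = sparse_autoencoder_loss(X, W_enc, b_enc, W_dec, b_dec)
print(code.shape)  # (50, 5)
```

After training, the low-dimensional `code` would serve as the input to a downstream regressor; stacking several such encoders yields an SSAE.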

Domain Adversarial Neural Network (DANN): This is a specialized neural network architecture for domain adaptation that uses adversarial training to enhance the model’s ability to generalize. This architecture allows a model trained on a source domain to perform effectively on a different target domain by learning feature representations that predict labels in the source domain while remaining invariant to shifts between the source and target domains [127]. In their study, the authors of [95] indicated that a machine learning model trained with extensive data from one particular area might encounter a decline in effectiveness when applied to a different area. To address this, they developed a corn yield prediction model at the county level using DANN, which they found to generally yield better results in terms of prediction accuracy. However, the study did not explore the impact of factors other than the size of the training data on the performance of the DANN model.
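The adversarial training principle behind DANN is commonly written as a saddle-point problem. The notation below is ours and follows the usual presentation of domain-adversarial training, not necessarily the exact setup of [95]: $G_f$ is the feature extractor, $G_y$ the yield predictor, $G_d$ the domain classifier, $d_i$ the domain label (source or target), and $\lambda$ a trade-off weight.

```latex
E(\theta_f, \theta_y, \theta_d)
  = \frac{1}{n_s} \sum_{i \in \text{source}}
      \mathcal{L}_y\big(G_y(G_f(x_i; \theta_f); \theta_y),\, y_i\big)
  \;-\; \lambda \, \frac{1}{n} \sum_{i \in \text{source} \cup \text{target}}
      \mathcal{L}_d\big(G_d(G_f(x_i; \theta_f); \theta_d),\, d_i\big)

(\hat{\theta}_f, \hat{\theta}_y) = \arg\min_{\theta_f, \theta_y} E(\theta_f, \theta_y, \hat{\theta}_d),
\qquad
\hat{\theta}_d = \arg\max_{\theta_d} E(\hat{\theta}_f, \hat{\theta}_y, \theta_d)
```

The predictor loss $\mathcal{L}_y$ uses only labeled source samples (for yield regression, typically a squared error), while the domain loss $\mathcal{L}_d$ uses samples from both domains. The feature extractor is therefore pushed to produce features that predict yield well while making the two domains hard to tell apart.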

Transformers: They have become a leading model in deep learning [128], with growing application across diverse domains. Ref. [85] proposed an Informer-based model to predict rice yield using satellite imagery and environmental characteristics. They compared this architecture with various ML and well-known DL techniques, especially RF and LSTM. Their results indicated that their model outperformed all the other architectures, achieving an $R^{2}$ value of 0.81, an RMSE of 0.41 tons per hectare, and an MAPE value of 15.47% over the period from 2009 to 2016.

Temporal Convolutional Networks (TCNs): This architecture applies causal convolutions, which respect the sequential nature of temporal data [129]. In their study, the authors of [100] developed a TCN-based prediction model using satellite imagery, focusing on yield prediction for various crops, such as wheat, barley, and oats. Their findings indicate that the TCN-based model outperforms traditional machine learning methods and yields more precise values. They also observed that TCN demonstrates resilience in handling cloudy pixels, suggesting its capability to learn and filter out this type of noise directly from raw data. However, they also noted that integrating the temporal dimension of information did not enhance the predictive accuracy. The study would therefore benefit from a more in-depth exploration of how well TCN adapts to the temporal aspects of data and from an assessment of its potential limitations in this context.
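A causal convolution computes each output from the current and past time steps only, never from future ones. The sketch below implements a single-channel causal convolution with optional dilation in NumPy; the kernel values and the input series are hypothetical, and a full TCN would stack many such layers with residual connections.

```python
import numpy as np

def causal_conv1d(x, kernel, dilation=1):
    """y[t] = sum_k kernel[k] * x[t - k * dilation], treating values before the series start as zero."""
    x = np.asarray(x, dtype=float)
    pad = (len(kernel) - 1) * dilation
    padded = np.concatenate([np.zeros(pad), x])  # left padding only, so no future leakage
    y = np.zeros_like(x)
    for k, w in enumerate(kernel):
        start = pad - k * dilation
        y += w * padded[start:start + len(x)]
    return y

# Hypothetical vegetation-index time series for one pixel.
series = np.array([1.0, 2.0, 3.0, 4.0])
print(causal_conv1d(series, [1.0, 1.0]))              # [1. 3. 5. 7.]
print(causal_conv1d(series, [1.0, 1.0], dilation=2))  # [1. 2. 4. 6.]
```

Increasing the dilation lets deeper layers look further back in the season without increasing the kernel size.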

Hybrid architectures: Recently, scientific studies have increasingly focused on exploiting hybrid models that combine two or more architectures within the same framework. This approach is particularly prevalent in precision agriculture for predicting crop yields, where such models are compared against simpler or non-hybrid architectures. For instance, Ref. [55] highlighted the importance of data from April to September, a period corresponding to the growth cycle of maize and soybean, explored the efficacy of a CNN-LSTM architecture for county-level yield estimation, and found that it outperformed other models with MAPE values of 10.3% for soybean and 13.2% for maize. In contrast, the authors of [44] proposed a hybrid model combining ConvLSTM with 3D CNN. This model was trained using county-level yield data and MODIS products to predict soybean yield. It achieved the best performance when compared with other hybrid models like CNN-LSTM and ConvLSTM, as well as simpler models such as 3D CNN. In their study, the authors of [105] addressed the issue of small datasets by proposing a framework that combines Generative Adversarial Networks (GANs) and CNNs to predict wheat yield in China using remotely sensed and meteorological data. GANs, which represent a sophisticated approach in the realm of semi-supervised and unsupervised learning techniques [130], consist of a generator and a discriminator. Both components are developed through the concept of adversarial training. The primary objective of GANs is to capture the data distribution to create new instances [131]. The GAN architecture in [105] was proposed to augment the original dataset; it was specifically employed to produce high-level samples, which expanded the training and validation datasets. When compared to a CNN model that utilized only the original dataset, the CNN trained on the GAN-augmented dataset performed significantly better, particularly when the available samples were limited. This improvement can be attributed to the convolutional architecture's need for more data to enhance predictions. However, the study mainly focused on the benefits of using GANs for dataset augmentation without discussing the optimal conditions required for utilizing this architecture to address other potential limitations in similar predictive applications.
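The adversarial training of the generator and the discriminator mentioned above is usually formalized as a minimax game. The expression below is the standard GAN objective, given here for orientation only; the specific variant used in [105] may differ. $G$ denotes the generator, $D$ the discriminator, $p_{\text{data}}$ the distribution of real samples, and $p_z$ the noise distribution fed to the generator.

```latex
\min_{G} \max_{D} \; V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator is trained to assign high probability to real samples and low probability to generated ones, while the generator is trained to produce samples that the discriminator cannot distinguish from real data. Once trained, the generator can supply additional synthetic samples for data augmentation.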

4.5. Main Input Data

For the third research question (Q3), which aims to identify the primary input data used to develop regression models, we categorize these data types and list the number of articles utilizing each type in Table 4. Satellite imagery was employed by nearly half of the studies, highlighting its significant role in such applications. The expanding role of remote sensing is also evident in the growing use of UAVs in precision agriculture, as shown by the increasing number of studies exploring their application. For instance, the authors of [68] tested the use of UAVs equipped with RGB cameras for predicting coffee crop yield. Ref. [46] developed yield prediction models for wheat and barley using multispectral UAV data. Ref. [48] used high-resolution RGB and multispectral images from UAVs to estimate rice yield, while [29] explored the use of hyperspectral data from UAV imagery in a convolutional model to estimate corn yield. However, it is important to mention that the latter study chose a relatively high flight altitude of 40 m for image acquisition, which might not be the most suitable for analyzing corn crops.

4.6. Requirements and Challenges in Predicting Crop Yield Using DL

Even though agricultural yield prediction can gain great advantages from deep learning, as discussed in the aforementioned sections, there are several challenges that users and researchers should consider. These challenges can be categorized into various types, depending on the specific component they are related to:

The quality of the training dataset: Obtaining high-quality agricultural datasets presents a significant challenge due to the diverse features that need to be considered and the specific precision required for each trait, including soil, climate, and phenotypic characteristics. As a result, accurately collecting all this information is a complex task.

The size of the training dataset: Typically, deep learning models necessitate large datasets for effective training. Nonetheless, acquiring such extensive data for crop yield prediction can be challenging, especially on a smaller scale. Current studies often prepare hundreds or even thousands of pre-processed data entries to achieve better results. An insufficient quantity of input data can lead to problems, particularly overfitting, where the algorithm merely memorizes the characteristics of the training data. To overcome this limitation, several studies have turned to data augmentation techniques, which include, for example, rotating raw images to provide more data [71]. Other studies have used regularization techniques, such as L1 and L2, to prevent overfitting [17]; the dropout method, another approach, involves randomly disabling links and nodes during training, making the DL model more robust. Some authors prefer using transfer learning to avoid these problems. This method involves taking existing pre-trained models and adjusting their parameters through fine-tuning to enhance efficiency and results [76]. Moreover, when dealing with a larger input dataset, it is beneficial to perform feature selection to eliminate redundancy and reduce processing time. For instance, [68] employed this strategy to predict coffee yield using UAV images. Specific issues arise with well-known architectures like DNN. While they are efficient at prediction tasks, adding more hidden layers to process a larger dataset may affect the quality of post-processing regression and lead to vanishing gradient problems [60]. Finally, Active Learning (AL) techniques might be used to select the most informative training data for learning DNN models effectively, thereby reducing the impacts of overfitting.

The complexity of models: Developing a robust deep learning model requires selecting an architecture that fits the specific data input and the unique characteristics of a particular crop type, which can be a challenging task [132]. This process often demands a high level of expertise in the field. Additionally, this point encompasses the complexity related to the training time and computational demands of certain architectures.

The interpretability issue: DL can provide highly precise predictions, particularly in the context of agricultural yields. However, comprehending exactly how the model arrives at these specific values remains a challenge. This complexity in interpretation, often referred to as the ‘black box’ issue, can make it difficult for smallholders to understand the crucial factors driving a specific prediction. These multiple challenges necessitate extensive coordination among various participants in the agricultural sector, including data scientists and farmers, to guarantee that the deep learning model is sufficiently effective and suitable for its intended use.
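Three of the remedies for small training sets listed above (rotation-based augmentation, L1/L2 regularization, and dropout) are simple enough to sketch in a few lines of NumPy. The function names, the penalty weights, and the dropout rate are illustrative assumptions; deep learning frameworks provide their own, more complete implementations.

```python
import numpy as np

def augment_by_rotation(image):
    """Return the image rotated by 0, 90, 180 and 270 degrees (four training samples from one)."""
    return [np.rot90(image, k) for k in range(4)]

def l1_l2_penalty(weights, l1=0.0, l2=0.0):
    """Regularization term added to the training loss to discourage large weights."""
    return sum(l1 * np.abs(W).sum() + l2 * (W ** 2).sum() for W in weights)

def dropout(activations, rate, rng):
    """Inverted dropout: zero a random fraction `rate` of units and rescale the rest."""
    keep = rng.random(activations.shape) >= rate
    return activations * keep / (1.0 - rate)

rng = np.random.default_rng(0)

# Hypothetical 4 x 4 image patch with 3 bands.
patch = rng.random((4, 4, 3))
augmented = augment_by_rotation(patch)
print(len(augmented))  # 4

# Hypothetical weight matrix: |1| + |-2| = 3 for L1, 1 + 4 = 5 for L2.
print(l1_l2_penalty([np.array([[1.0, -2.0]])], l1=1.0, l2=1.0))  # 8.0

# Dropout is applied during training only; at inference the activations are used unchanged.
hidden = dropout(np.ones(1000), rate=0.5, rng=rng)
```

Rotation is only a valid augmentation when the target does not depend on orientation, which is generally reasonable for nadir UAV or satellite imagery.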

4.7. Bayesian Approaches to the Challenges of Crop Yield Prediction

Regression techniques are often the method of choice for analyzing crop yield dependency on a given number of covariates [133]. The application of regression models becomes straightforward when there is a large number of observations compared to a relatively small number of covariates. Ref. [134] proposed a general method for selecting a sparse number of essential covariates in Bayes linear regression problems. Ref. [135] introduced an efficient Bayesian method for feature selection in life science applications which allows the inclusion of nonlinear interactions between the covariates as well as the incorporation of user-guided domain knowledge via Dirichlet priors. In nonlinear settings, the problem becomes more challenging and the danger of model overfitting increases. Gaussian process regression (GPR), which represents a nonparametric Bayesian regression framework [136], has gained increasing popularity. GPR models can be considered as an alternative to NNs, since a large class of NN-based regression models converges to a GP in the limit (of infinite networks) [137]. GPR models are also widely used in spatial and spatio-temporal statistics and in the design and evaluation of computer experiments [138]. When the number of covariates becomes large, we have to resort to feature selection methods such as LASSO and Random Forests, the latter being random collections of regression trees. From a Bayesian perspective, Bayesian additive regression trees (BART) have become popular. The recent BART R package [139] implements a Bayesian (ML) ensemble predictive modeling method. Scalable GP extensions are provided by so-called Nearest Neighbor Gaussian processes (NNGPs) [140]. Very recently, Ref. [141] introduced a novel combination of NNGPs, robust reference priors, and Bayesian variable selection, which achieves a superior performance over well-established approaches including, e.g., the Adaptive LASSO (ALASSO) [142], BART, and variational inference for Bayesian variable selection (VARBVS) [143]. Finally, Deep GPs, as introduced in [144], combine non-stationary and global modeling; these can be viewed as (deep and wide) limits of Deep NNs [145]. They come, however, at additional computational cost. Deep GP modeling was applied to yield prediction by [146]; since then, however, we have not observed further extensions of this methodology, with the notable exception of [147].
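To illustrate what Gaussian process regression computes, the sketch below evaluates the posterior mean and variance of a GP with a squared-exponential (RBF) kernel using a Cholesky factorization. The kernel choice, its hyperparameters, the noise level, and the toy data are assumptions for the example; they do not correspond to any specific study cited above, and in practice the hyperparameters would be estimated from the data.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between the rows of A and the rows of B."""
    sq_dist = np.sum(A ** 2, axis=1)[:, None] + np.sum(B ** 2, axis=1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale ** 2)

def gp_predict(X_train, y_train, X_test, noise=1e-2, **kernel_args):
    """Posterior mean and variance of a zero-mean GP at the test inputs."""
    K = rbf_kernel(X_train, X_train, **kernel_args) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_test, **kernel_args)
    K_ss = rbf_kernel(X_test, X_test, **kernel_args)

    L = np.linalg.cholesky(K)                                  # K = L @ L.T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))  # K^{-1} y
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss) - np.sum(v ** 2, axis=0)
    return mean, var

# Hypothetical one-covariate example (for instance, a seasonal index versus a yield anomaly).
X_train = np.array([[0.0], [1.0], [2.0], [3.0]])
y_train = np.sin(X_train).ravel()
X_test = np.array([[1.5], [10.0]])

mean, var = gp_predict(X_train, y_train, X_test)
print(mean.shape, var.shape)  # (2,) (2,)
```

The predictive variance grows for inputs far from the training data (here the second test point), which is the built-in uncertainty quantification that distinguishes GPR from a standard neural network regressor.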

5. Discussion

Through our systematic literature review, we highlighted various aspects related to crop yield prediction using deep learning techniques. Our analysis reveals that a commonly used architecture might not be efficient for all studies and cases. This is particularly true for the 2D CNN, which is widely used in precision agriculture but which, as several studies have pointed out, is limited in exploiting a third dimension, a requirement that is especially important for multi-temporal data. Currently, researchers are increasingly adopting hybrid solutions to compensate for the limitations of simpler architectures. In particular, a combination of CNN and LSTM is used to capture both spatial and temporal information. This approach enables more realistic estimates that take into account various dimensions and facets of the dataset. Different studies employing deep learning models have utilized satellite imagery for large-scale coverage, particularly in county-level analysis. Meanwhile, many studies have preferred UAVs to circumvent atmospheric effects and to benefit from their adjustable, high spatial and temporal resolutions, which are particularly useful for field-scale yield determination. Alternatively, some studies have relied solely on environmental data, focusing mainly on climatic and soil characteristics. For example, [42,67] used these inputs to predict the yields of tomato (Solanum lycopersicum), potato (Solanum tuberosum), and cotton (Gossypium), achieving MSE values ranging from 0.017 to 0.039 for tomato and potato and an RMSE value of 48.213 for cotton. Additionally, other researchers have combined meteorological data with aerospace imagery to enhance their yield predictions. Conversely, other authors have chosen to acquire images using manual or hand-held sensors to capture closer images of the plants and the specific crops under study.

Among the agricultural crops covered by the reviewed articles, cereals, particularly wheat and maize, are predominant. This emphasis highlights their importance, as they are fundamental to global food security. In our review, we followed the methodology described above and selected DOAJ, IEEE, MDPI, and ScienceDirect as the scientific sources for our data. However, it is worth noting that other databases could provide additional articles. The keywords we chose were diversified to capture a broad range of relevant articles; nevertheless, other keywords could be selected and might yield different results. Despite this, the number of scientific articles we found sufficiently addresses the purposes of this study and its research questions.

Deep learning has revolutionized various fields including precision agriculture, particularly yield prediction. Our systematic literature review highlights the significant contribution of this technique, especially in enhancing statistical estimation methods. However, there are various issues and challenges that this sub-branch of machine learning must overcome. These include the amount of data required, the risk of overfitting, and the selection of an appropriate architecture to develop an efficient model according to the input data. Moreover, despite the remarkable progress achieved, a significant issue often arises with these algorithms: the 'black box' problem. This term refers to the challenge many researchers face in explaining the detailed workings of the algorithm, leading to concerns about trust in a mechanism that is unknown and inaccessible to its users. Consequently, there is a growing demand for the developers of these techniques to establish clear causal relationships in their models, directly linking input data to the results.

6. Recommended Best Studies for Readers

After addressing the research questions outlined in our methodology and analyzing various aspects of this research topic, we present a comparison based on several criteria, with the aim of recommending the most noteworthy articles to future readers. These criteria are the following:

The study introduces novel and methodologically original approaches to predict agricultural yield.
The study defines agricultural yield in accordance with the established literature, rather than using discontinuous or alternative measures.
The data used as input reflect the exploitation of new geospatial technologies.
The study focuses on yield prediction for major crops, particularly cereals.
Finally, the selected article demonstrates significant quality.

Based on these comparison points, all the studies mentioned in Table 3 were compared, and the best studies are the following:

The study conducted by the authors of [8], who developed a deep CNN-LSTM model for in-season soybean yield prediction using satellite imagery.
The study carried out in [58], which used high-resolution imagery acquired by a UAV to predict cereal crop yields using a spatio-temporal DL model.
The study conducted in [109], where the authors predicted winter wheat yield using a hybrid deep learning model combining static features and remote sensing data.
The study carried out in [29], in which the authors estimated corn yield from high-resolution hyperspectral data by combining CNNs of different dimensions as a DL model.
The study presented in [44], in which the authors developed various models combining different DL architectures, namely LSTM and CNN, and used satellite data to predict soybean yield.
The research study carried out in [49], where the authors predicted winter wheat and corn yields by studying the effects of spatial, spectral, and temporal data and proposed a novel technique called SSTNN for this purpose.
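Several of the studies listed above pair a convolutional stage with a recurrent stage. The sketch below illustrates only the data flow of such a hybrid and does not reproduce the architecture of any cited study: each image in a time series is reduced to one spatial feature (convolution, ReLU, global average pooling), and the resulting sequence is passed through a scalar LSTM cell. The kernel, the weights, and the input patches are hypothetical and untrained.

```python
import math

def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding (cross-correlation)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def spatial_feature(image, kernel):
    """CNN part: convolution, ReLU, then global average pooling to a scalar."""
    feature_map = conv2d_valid(image, kernel)
    activated = [max(0.0, v) for row in feature_map for v in row]
    return sum(activated) / len(activated)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h, c, weights):
    """LSTM part: one step of a scalar cell with input, forget, and output gates."""
    def pre(name):
        w, u, b = weights[name]
        return w * x + u * h + b
    i, f, o = sigmoid(pre("i")), sigmoid(pre("f")), sigmoid(pre("o"))
    g = math.tanh(pre("g"))
    c_new = f * c + i * g
    h_new = o * math.tanh(c_new)
    return h_new, c_new

def cnn_lstm_forward(sequence, kernel, weights):
    """Apply the spatial stage to each image, then run the LSTM over time."""
    h, c = 0.0, 0.0
    for image in sequence:
        h, c = lstm_step(spatial_feature(image, kernel), h, c, weights)
    return h

# Hypothetical, untrained parameters and a three-date series of 4x4 patches.
MEAN_KERNEL = [[1.0 / 9.0] * 3 for _ in range(3)]
WEIGHTS = {name: (1.0, 0.5, 0.0) for name in ("i", "f", "o", "g")}
season = [[[level] * 4 for _ in range(4)] for level in (0.2, 0.5, 0.8)]

score = cnn_lstm_forward(season, MEAN_KERNEL, WEIGHTS)
print(score)
```

Feeding the same patches in reverse order produces a different output, which is the property that lets the recurrent stage encode the order of acquisition dates rather than only their average content.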

7. Open Issues and Future Road Maps

Artificial intelligence, with its diverse subfields, is crucial in various sectors, especially in the prediction of agricultural yields. The wide array of architectures and algorithm parameters poses a challenge in identifying the optimal combination for accurate yield predictions. A pressing concern is the ‘black box’ nature of deep learning models, which conceals the decision-making process and casts doubt on the models’ reliability due to the lack of transparency in their predictions. Although methods have been developed to demystify AI decisions, there is a discernible lack of comprehensive reviews that delve into the fusion of deep learning with Explainable Artificial Intelligence (XAI) specifically in the context of agricultural yield predictions. Addressing the issues identified in our research necessitates a detailed review that combines deep learning with XAI, supported by a well-defined experimental framework. Moreover, integrating Knowledge Representation (KR) with deep learning could offer sophisticated solutions for crop yield forecasting. Additionally, exploring new deep learning architectures might yield further advancements in this field, enhancing our ability to understand and improve yield prediction models, particularly in the following ways:

Using high-performance architectures based on transformers, as highlighted in the study conducted in [148], and on GANs, and examining the benefits of hybrid solutions built on efficient algorithms.
Exploring other domain adaptation methods, as yield behavior differs from one area to another and a model trained in one region may not transfer well to another.
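To make the idea of domain adaptation concrete, the sketch below shows one of the simplest possible baselines: per-feature moment matching, in which a feature measured in a source region is rescaled so that its mean and standard deviation match those observed in the target region. This is a deliberately simplified stand-in chosen for illustration. It is not the adversarial approach (DANN) listed in the abbreviations, nor a method taken from the reviewed studies, and the NDVI values are hypothetical.

```python
import statistics

def align_to_target(source_col, target_col):
    """Rescale a source feature so its mean and std match the target region."""
    mu_s = statistics.fmean(source_col)
    sd_s = statistics.pstdev(source_col)
    mu_t = statistics.fmean(target_col)
    sd_t = statistics.pstdev(target_col)
    if sd_s == 0:
        # A constant source feature carries no spread to rescale.
        return [mu_t for _ in source_col]
    return [(v - mu_s) / sd_s * sd_t + mu_t for v in source_col]

# Hypothetical peak-season NDVI values in two regions.
source_ndvi = [0.30, 0.35, 0.40, 0.45, 0.50]
target_ndvi = [0.55, 0.60, 0.70, 0.75, 0.80]

aligned = align_to_target(source_ndvi, target_ndvi)
print(aligned)
```

Such alignment only corrects marginal shifts in the inputs. When the relationship between inputs and yield itself changes between regions, more expressive methods are needed.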

Moreover, we can investigate the advantages of probabilistic machine learning frameworks [145], specifically the Probabilistic Neural Network (PNN), which enhances deterministic models by applying probability theory to artificial neural networks. These probabilistic forecasting models contribute to more precise decision making: by accounting for uncertainty, they generate more realistic predictions, especially when applied to complex datasets such as those used for crop yield determination. Furthermore, considering that crop yield is a measure spread over multiple seasons, its prediction can be improved by incorporating advanced spatio-temporal modeling, including dynamic spatio-temporal architectures [149]. This involves using probability distributions to model the progression of plots over time, predicting future states based on past information. Such modeling tracks how the configuration and structure of objects change over time, capturing important dependencies of the phenomenon considered. The motivation for dynamic modeling is to describe how a specific measure evolves over time. In other words, these models are built by analyzing historical events and then constructing statistical models that depict the progression from the past to the present. Multiple dynamic models exist; one well-known class is the non-Gaussian, nonlinear data model, which allows much more complex phenomena to be modeled. A perspective in this regard could be to model crop yield over time by leveraging high-performing mathematical and statistical models and comparing the results against those obtained through deep learning. This would make it possible to draw conclusions about how best to handle the dynamic and spatio-temporal aspects of this measure.
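As a minimal illustration of a dynamic probabilistic model, the sketch below filters a yield series with a local-level state-space model (a one-dimensional Kalman filter) and returns a forecast interval rather than a single value. It covers only the linear, Gaussian case, which is the simplest member of the family discussed above; the non-Gaussian, nonlinear models mentioned in the text would require other techniques, such as particle filters. The variances and yield values are hypothetical.

```python
def local_level_filter(observations, process_var, obs_var, init_mean, init_var):
    """Kalman filter for a random-walk state observed with Gaussian noise."""
    mean, var = init_mean, init_var
    filtered = []
    for y in observations:
        # Predict: the latent yield level drifts as a random walk.
        var_pred = var + process_var
        # Update: blend the prediction with the new observation.
        gain = var_pred / (var_pred + obs_var)
        mean = mean + gain * (y - mean)
        var = (1.0 - gain) * var_pred
        filtered.append((mean, var))
    return filtered

def forecast_interval(mean, var, process_var, obs_var, z=1.96):
    """Approximate 95% interval for the next observation (z = 1.96)."""
    pred_var = var + process_var + obs_var
    half_width = z * pred_var ** 0.5
    return mean - half_width, mean + half_width

# Hypothetical yields for five consecutive seasons.
yields = [5.1, 5.4, 5.2, 5.8, 6.0]
states = local_level_filter(yields, process_var=0.05, obs_var=0.2,
                            init_mean=5.0, init_var=1.0)
last_mean, last_var = states[-1]
low, high = forecast_interval(last_mean, last_var, process_var=0.05, obs_var=0.2)
print(last_mean, (low, high))
```

The width of the interval reflects both the uncertainty about the latent yield level and the observation noise, which is the kind of information a deterministic point forecast does not provide.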

8. Conclusions

In this paper, we conducted a literature review to address four significant research questions related to predicting crop yield using deep learning. This involved a systematic methodology for reviewing the literature and extracting pertinent data. Our study reveals that deep learning models, especially in specialized fields like intelligent agriculture, exhibit specific requirements. Firstly, the type of data to be utilized matters, including their format, quantity, and adequacy, which often necessitates advanced processing techniques such as transfer learning or data augmentation. Additionally, the specific nature of the crop and the chosen architecture play a crucial role in extracting comprehensive information to enhance model efficiency. Finally, a notable challenge is the ‘black box’ nature of these models, where the causal relationship between inputs and results remains unclear. Our systematic review highlights that research articles vary in their use of deep learning architectures, ranging from Long Short-Term Memory networks, convolutional neural networks of various dimensions (from 1D to 3D), and the well-known DNN to other recurrent models like Gated Recurrent Units and more advanced hybrid solutions that are increasingly popular. Our review also indicates that the cereal sector is the primary focus in crop yield prediction research, with aerospace imagery from satellites or drones being the most utilized data source. Additionally, this study recommends the best studies in this field and exposes unresolved and open issues, particularly those related to the ‘black box’ problem in DL models. In future work, we will examine in detail the use of high-resolution imagery for predicting cereal yield at very small scales and will employ specific techniques to tackle the ‘black box’ challenge in this context.
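Data augmentation, mentioned above as one response to limited data, can be sketched for image patches as follows. The example is illustrative only and assumes that the yield label of a plot does not depend on the orientation of its image patch, which may not hold for every sensor or acquisition geometry. The function names and the patch values are hypothetical.

```python
def flip_horizontal(patch):
    """Mirror each row of the patch (left-right flip)."""
    return [row[::-1] for row in patch]

def flip_vertical(patch):
    """Reverse the row order of the patch (up-down flip)."""
    return [row[:] for row in patch[::-1]]

def rotate90(patch):
    """Rotate the patch by 90 degrees clockwise."""
    return [list(col) for col in zip(*patch[::-1])]

def augment(patch, yield_value):
    """Return the original patch and three transformed copies, same label."""
    variants = [patch, flip_horizontal(patch), flip_vertical(patch),
                rotate90(patch)]
    return [(variant, yield_value) for variant in variants]

# Hypothetical 2x2 patch of reflectance values and its plot-level yield.
patch = [[1, 2],
         [3, 4]]
samples = augment(patch, yield_value=5.6)
for image, label in samples:
    print(image, label)
```

Each labeled patch thus yields four training samples, which enlarges the training set without any new field measurements.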

Author Contributions

Conceptualization, K.M., I.S. and J.P.; methodology, K.M., I.S. and J.P.; validation, K.M., I.S., J.P., K.A.E.K. and S.B.; formal analysis, K.M., I.S. and J.P.; investigation, K.M., I.S. and J.P.; resources, K.M. and I.S.; writing—original draft preparation, K.M., I.S. and J.P.; writing—review and editing, K.M., I.S., J.P., K.A.E.K. and S.B.; visualization, K.M. and I.S.; supervision, I.S., J.P., K.A.E.K. and S.B.; project administration, I.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

AI	Artificial intelligence
AL	Active Learning
ALASSO	Adaptive LASSO
BART	Bayesian Additive Regression Trees
CNN	Convolutional neural network
DANN	Domain Adversarial Neural Network
DBN	Deep Belief Network
DL	Deep learning
DNN	Deep neural network
DOAJ	Directory of Open Access Journals
EVI	Enhanced Vegetation Index
FAO	Food and Agriculture Organization
GAN	Generative Adversarial Networks
GEE	Google Earth Engine
GPR	Gaussian Process Regression
GRU	Gated Recurrent Unit
IEEE	Institute of Electrical and Electronics Engineers
KR	Knowledge Representation
LASSO	Least Absolute Shrinkage and Selection Operator
LiDAR	Light Detection And Ranging
LSTM	Long Short-Term Memory
MAE	Mean Absolute Error
MAPE	Mean Absolute Percentage Error
MDPI	Multidisciplinary Digital Publishing Institute
ML	Machine Learning
MLP	Multilayer Perceptron
MODIS	Moderate Resolution Imaging Spectroradiometer
NDVI	Normalized Difference Vegetation Index
NNGPs	Nearest Neighbor Gaussian Processes
PNN	Probabilistic Neural Network
RBM	Restricted Boltzmann Machine
R-CNN	Region-based convolutional neural network
RF	Random Forest
R-FCN	Region-based fully convolutional network
RMSE	Root Mean Square Error
RoI	Region of Interest
RPN	Region Proposal Network
rrBLUP	ridge regression Best Linear Unbiased Prediction
SBN	Sigmoid Belief Network
SIF	Solar-Induced chlorophyll Fluorescence
SLR	Systematic literature review
SSAE	Stacked-Sparse Autoencoder
SVM	Support Vector Machine
TCN	Temporal Convolutional Network
UAV	Unmanned Aerial Vehicle
VARBVS	Variational inference for Bayesian variable selection
XAI	Explainable Artificial Intelligence

References

Food and Agriculture Organization of the United Nations. The Future of Food and Agriculture: Trends and Challenges; FAO: Rome, Italy, 2017. [Google Scholar]
United Nations. Feeding the World Sustainably; United Nations Chronicle: New York, NY, USA, 2020. [Google Scholar]
Van Klompenburg, T.; Kassahun, A.; Catal, C. Crop yield prediction using machine learning: A systematic literature review. Comput. Electron. Agric. 2020, 177, 105709. [Google Scholar] [CrossRef]
Zannou, J.G.N.; Houndji, V.R. Sorghum Yield Prediction using Machine Learning. In Proceedings of the 2019 3rd International Conference on Bio-Engineering for Smart Technologies (BioSMART), Paris, France, 24–26 April 2019; pp. 1–4. [Google Scholar] [CrossRef]
Maimaitijiang, M.; Sagan, V.; Sidike, P.; Hartling, S.; Esposito, F.; Fritschi, F.B. Soybean yield prediction from UAV using multimodal data fusion and deep learning. Remote Sens. Environ. 2020, 237, 111599. [Google Scholar] [CrossRef]
Gao, Y.; Wang, S.; Guan, K.; Wolanin, A.; You, L.; Ju, W.; Zhang, Y. The Ability of Sun-Induced Chlorophyll Fluorescence from OCO-2 and MODIS-EVI to Monitor Spatial Variations of Soybean and Maize Yields in the Midwestern USA. Remote Sens. 2020, 12, 1111. [Google Scholar] [CrossRef]
Wang, Y.; Zhang, Z.; Feng, L.; Du, Q.; Runge, T. Combining Multi-Source Data and Machine Learning Approaches to Predict Winter Wheat Yield in the Conterminous United States. Remote Sens. 2020, 12, 1232. [Google Scholar] [CrossRef]
Sun, J.; Di, L.; Sun, Z.; Shen, Y.; Lai, Z. County-Level Soybean Yield Prediction Using Deep CNN-LSTM Model. Sensors 2019, 19, 4363. [Google Scholar] [CrossRef] [PubMed]
Cao, J.; Zhang, Z.; Luo, Y.; Zhang, L.; Zhang, J.; Li, Z.; Tao, F. Wheat yield predictions at a county and field scale with deep learning, machine learning, and Google Earth Engine. Eur. J. Agron. 2021, 123, 126204. [Google Scholar] [CrossRef]
Crawford, K. The Atlas of AI: Power, Politics, and the Planetary Costs of Artificial Intelligence; Yale University Press: New Haven, CT, USA, 2021. [Google Scholar]
Banerjee, G.; Sarkar, U.; Das, S.; Ghosh, I. Artificial intelligence in agriculture: A literature survey. Int. J. Sci. Res. Comput. Sci. Appl. Manag. Stud. 2018, 7, 1–6. [Google Scholar]
Astaoui, G.; Dadaiss, J.E.; Sebari, I.; Benmansour, S.; Mohamed, E. Mapping wheat dry matter and nitrogen content dynamics and estimation of wheat yield using UAV multispectral imagery machine learning and a variety-based approach: Case study of Morocco. AgriEngineering 2021, 3, 29–49. [Google Scholar] [CrossRef]
Haq, M.A.; Rahaman, G.; Baral, P.; Ghosh, A. Deep learning based supervised image classification using UAV images for forest areas classification. J. Indian Soc. Remote Sens. 2021, 49, 601–606. [Google Scholar] [CrossRef]
Haq, M.A.; Khan, M.Y.A. Crop water requirements with changing climate in an arid region of Saudi Arabia. Sustainability 2022, 14, 13554. [Google Scholar] [CrossRef]
Haq, M.A. CNN Based Automated Weed Detection System Using UAV Imagery. Comput. Syst. Sci. Eng. 2022, 42, 837–849. [Google Scholar]
Maheswari, P.; Raja, P.; Apolo-Apolo, O.E.; Perez-Ruiz, M. Intelligent fruit yield estimation for orchards using deep learning based semantic segmentation techniques—A review. Front. Plant Sci. 2021, 12, 684328. [Google Scholar] [CrossRef]
Kim, N.; Ha, K.-J.; Park, N.-W.; Cho, J.; Hong, S.; Lee, Y.-W. A Comparison Between Major Artificial Intelligence Models for Crop Yield Prediction: Case Study of the Midwestern United States, 2006–2015. ISPRS Int. J. Geo-Inf. 2019, 8, 240. [Google Scholar] [CrossRef]
Dharani, M.K.; Thamilselvan, R.; Natesan, P.; Kalaivaani, P.C.D.; Santhoshkumar, S. Review on Crop Prediction Using Deep Learning Techniques. J. Phys. Conf. Ser. 2021, 1767, 012026. [Google Scholar] [CrossRef]
Rashid, M.; Bari, B.S.; Yusup, Y.; Kamaruddin, M.A.; Khan, N. A Comprehensive Review of Crop Yield Prediction Using Machine Learning Approaches with Special Emphasis on Palm Oil Yield Prediction. IEEE Access 2021, 9, 63406–63439. [Google Scholar] [CrossRef]
Dewangan, U.; Talwekar, R.H.; Bera, S. Systematic Literature Review on Crop Yield Prediction using Machine & Deep Learning Algorithm. In Proceedings of the 2022 5th International Conference on Advances in Science and Technology (ICAST), Mumbai, India, 2–3 December 2022; pp. 654–661. [Google Scholar]
Oikonomidis, A.; Catal, C.; Kassahun, A. Deep learning for crop yield prediction: A systematic literature review. N. Z. J. Crop Hortic. Sci. 2023, 51, 1–26. [Google Scholar] [CrossRef]
Sordello, R.; Villemey, A.; Jeusset, A.; Vargac, M.; Bertheau, Y.; Coulon, A.; Deniaud, N.; de Lachapelle, F.F.; Guinard, E.; Jactel, H. Conseils Méthodologiques pour la Réalisation D’une Revue Systématique à Travers L’expérience de COHNECS-IT. Available online: https://hal.sorbonne-universite.fr/hal-01592725/ (accessed on 1 December 2021).
Nambiema, A.; Fouquet, J.; Guilloteau, J.; Descatha, A. La revue systématique et autres types de revue de la littérature: Qu’est-ce que c’est, quand, comment, pourquoi? Arch. Des Mal. Prof. L’Environ. 2021, 82, 539–552. [Google Scholar] [CrossRef]
Siddaway, A.P.; Wood, A.M.; Hedges, L.V. How to do a systematic review: A best practice guide for conducting and reporting narrative reviews, meta-analyses, and meta-syntheses. Annu. Rev. Psychol. 2019, 70, 747–770. [Google Scholar] [CrossRef] [PubMed]
Sandhu, K.S.; Lozada, D.N.; Zhang, Z.; Pumphrey, M.O.; Carter, A.H. Deep Learning for Predicting Complex Traits in Spring Wheat Breeding Program. Front. Plant Sci. 2021, 11, 613325. [Google Scholar] [CrossRef]
Li, Z.; Chen, Z.; Cheng, Q.; Fei, S.; Zhou, X. Deep Learning Models Outperform Generalized Machine Learning Models in Predicting Winter Wheat Yield Based on Multispectral Data from Drones. Drones 2023, 7, 505. [Google Scholar] [CrossRef]
Huang, H.; Huang, J.; Feng, Q.; Liu, J.; Li, X.; Wang, X.; Niu, Q. Developing a dual-stream deep-learning neural network model for improving county-level winter wheat yield estimates in China. Remote Sens. 2022, 14, 5280. [Google Scholar] [CrossRef]
Srivastava, A.K.; Safaei, N.; Khaki, S.; Lopez, G.; Zeng, W.; Ewert, F.; Gaiser, T.; Rahimi, J. Winter wheat yield prediction using convolutional neural networks from environmental and phenological data. Sci. Rep. 2022, 12, 3215. [Google Scholar] [CrossRef]
Yang, W.; Nigon, T.; Hao, Z.; Paiao, G.D.; Fernández, F.G.; Mulla, D.; Yang, C. Estimation of corn yield based on hyperspectral imagery and convolutional neural network. Comput. Electron. Agric. 2021, 184, 106092. [Google Scholar] [CrossRef]
Måløy, H.; Windju, S.; Bergersen, S.; Alsheikh, M.; Downing, K.L. Multimodal performers for genomic selection and crop yield prediction. Smart Agric. Technol. 2021, 1, 100017. [Google Scholar] [CrossRef]
Jeong, S.; Ko, J.; Yeom, J.-M. Predicting rice yield at pixel scale through synthetic use of crop and deep learning models with satellite data in South and North Korea. Sci. Total Environ. 2022, 802, 149726. [Google Scholar] [CrossRef] [PubMed]
Li, D.; Wu, X. Individualized Indicators and Estimation Methods for Tiger Nut (Cyperus esculentus L.) Tubers Yield Using Light Multispectral UAV and Lightweight CNN Structure. Drones 2023, 7, 432. [Google Scholar] [CrossRef]
Xu, X.; Li, H.; Yin, F.; Xi, L.; Qiao, H.; Ma, Z.; Shen, S.; Jiang, B.; Ma, X. Wheat ear counting using K-means clustering segmentation and convolutional neural network. Plant Methods 2020, 16, 106. [Google Scholar] [CrossRef]
Zhang, J.; Zhao, B.; Yang, C.; Shi, Y.; Liao, Q.; Zhou, G.; Wang, C.; Xie, T.; Jiang, Z.; Zhang, D. Rapeseed stand count estimation at leaf development stages with UAV imagery and convolutional neural networks. Front. Plant Sci. 2020, 11, 617. [Google Scholar] [CrossRef]
Tanaka, Y.; Watanabe, T.; Katsura, K.; Tsujimoto, Y.; Takai, T.; Tanaka, T.S.T.; Kawamura, K.; Saito, H.; Homma, K.; Mairoua, S.G.; et al. Deep Learning Enables Instant and Versatile Estimation of Rice Yield Using Ground-Based RGB Images. Plant Phenomics 2023, 5, 0073. [Google Scholar] [CrossRef]
Bellis, E.S.; Hashem, A.A.; Causey, J.L.; Runkle, B.R.; Moreno-García, B.; Burns, B.W.; Green, V.S.; Burcham, T.N.; Reba, M.L.; Huang, X. Detecting intra-field variation in rice yield with unmanned aerial vehicle imagery and deep learning. Front. Plant Sci. 2022, 13, 716506. [Google Scholar] [CrossRef] [PubMed]
Mia, M.S.; Tanabe, R.; Habibi, L.N.; Hashimoto, N.; Homma, K.; Maki, M.; Matsui, T.; Tanaka, T.S. Multimodal Deep Learning for Rice Yield Prediction Using UAV-Based Multispectral Imagery and Weather Data. Remote Sens. 2023, 15, 2511. [Google Scholar] [CrossRef]
Yalcin, H. An approximation for a relative crop yield estimate from field images using deep learning. In Proceedings of the 2019 8th International Conference on Agro-Geoinformatics (Agro-Geoinformatics), Istanbul, Turkey, 16–19 July 2019; pp. 1–6. [Google Scholar]
Yang, Q.; Shi, L.; Lin, L. Plot-scale rice grain yield estimation using UAV-based remotely sensed images via CNN with time-invariant deep features decomposition. In Proceedings of the IGARSS 2019-2019 IEEE International Geoscience and Remote Sensing Symposium, Yokohama, Japan, 28 July–2 August 2019; pp. 7180–7183. [Google Scholar]
Mu, H.; Zhou, L.; Dang, X.; Yuan, B. Winter Wheat Yield Estimation from Multitemporal Remote Sensing Images based on Convolutional Neural Networks. In Proceedings of the 2019 10th International Workshop on the Analysis of Multitemporal Remote Sensing Images (MultiTemp), Shanghai, China, 5–7 August 2019; pp. 1–4. [Google Scholar] [CrossRef]
Lee, S.; Jeong, Y.; Son, S.; Lee, B. A self-predictable crop yield platform (SCYP) based on crop diseases using deep learning. Sustainability 2019, 11, 3637. [Google Scholar] [CrossRef]
Alibabaei, K.; Gaspar, P.D.; Lima, T.M. Crop Yield Estimation Using Deep Learning Based on Climate Big Data and Irrigation Scheduling. Energies 2021, 14, 3004. [Google Scholar] [CrossRef]
Zhou, S.; Xu, L.; Chen, N. Rice Yield Prediction in Hubei Province Based on Deep Learning and the Effect of Spatial Heterogeneity. Remote Sens. 2023, 15, 1361. [Google Scholar] [CrossRef]
Gavahi, K.; Abbaszadeh, P.; Moradkhani, H. DeepYield: A combined convolutional neural network with long short-term memory for crop yield forecasting. Expert Syst. Appl. 2021, 184, 115511. [Google Scholar] [CrossRef]
Sagan, V.; Maimaitijiang, M.; Bhadra, S.; Maimaitiyiming, M.; Brown, D.R.; Sidike, P.; Fritschi, F.B. Field-scale crop yield prediction using multi-temporal WorldView-3 and PlanetScope satellite data and deep learning. ISPRS J. Photogramm. Remote Sens. 2021, 174, 265–281. [Google Scholar] [CrossRef]
Nevavuori, P.; Narra, N.; Lipping, T. Crop yield prediction with deep convolutional neural networks. Comput. Electron. Agric. 2019, 163, 104859. [Google Scholar] [CrossRef]
Ju, S.; Lim, H.; Ma, J.W.; Kim, S.; Lee, K.; Zhao, S.; Heo, J. Optimal county-level crop yield prediction using MODIS-based variables and weather data: A comparative study on machine learning models. Agric. For. Meteorol. 2021, 307, 108530. [Google Scholar] [CrossRef]
Yang, Q.; Shi, L.; Han, J.; Zha, Y.; Zhu, P. Deep convolutional neural networks for rice grain yield estimation at the ripening stage using UAV-based remotely sensed images. Field Crops Res. 2019, 235, 142–153. [Google Scholar] [CrossRef]
Qiao, M.; He, X.; Cheng, X.; Li, P.; Luo, H.; Zhang, L.; Tian, Z. Crop Yield Prediction from Multi-spectral, Multi-temporal Remotely Sensed Imagery Using Recurrent 3D Convolutional Neural Networks. Int. J. Appl. Earth Obs. Geoinf. 2021, 102, 102436. [Google Scholar] [CrossRef]
MacEachern, C.B.; Esau, T.J.; Schumann, A.W.; Hennessy, P.J.; Zaman, Q.U. Detection of fruit maturity stage and yield estimation in wild blueberry using deep learning convolutional neural networks. Smart Agric. Technol. 2023, 3, 100099. [Google Scholar] [CrossRef]
Chakraborty, M.; Pourreza, A.; Zhang, X.; Jafarbiglu, H.; Shackel, K.A.; DeJong, T. Early almond yield forecasting by bloom mapping using aerial imagery and deep learning. Comput. Electron. Agric. 2023, 212, 108063. [Google Scholar] [CrossRef]
Huber, F.; Yushchenko, A.; Stratmann, B.; Steinhage, V. Extreme Gradient Boosting for yield estimation compared with Deep Learning approaches. Comput. Electron. Agric. 2022, 202, 107346. [Google Scholar] [CrossRef]
Han, J.; Shi, L.; Yang, Q.; Chen, Z.; Yu, J.; Zha, Y. Rice yield estimation using a CNN-based image-driven data assimilation framework. Field Crops Res. 2022, 288, 108693. [Google Scholar] [CrossRef]
Tanabe, R.; Matsui, T.; Tanaka, T.S. Winter wheat yield prediction using convolutional neural networks and UAV-based multispectral imagery. Field Crops Res. 2023, 291, 108786. [Google Scholar] [CrossRef]
Ghazaryan, G.; Skakun, S.; König, S.; Rezaei, E.E.; Siebert, S.; Dubovyk, O. Crop Yield Estimation Using Multi-Source Satellite Image Series and Deep Learning. In Proceedings of the IGARSS 2020—2020 IEEE International Geoscience and Remote Sensing Symposium, Waikoloa, HI, USA, 26 September–2 October 2020; pp. 5163–5166. [Google Scholar] [CrossRef]
Qiao, M.; He, X.; Cheng, X.; Li, P.; Luo, H.; Tian, Z.; Guo, H. Exploiting hierarchical features for crop yield prediction based on 3-D convolutional neural networks and multikernel gaussian process. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 4476–4489. [Google Scholar] [CrossRef]
Terliksiz, A.S.; Altılar, D.T. Use of deep neural networks for crop yield prediction: A case study of soybean yield in Lauderdale County, Alabama, USA. In Proceedings of the 2019 8th International Conference on Agro-Geoinformatics (Agro-Geoinformatics), Istanbul, Turkey, 16–19 July 2019; pp. 1–4. [Google Scholar]
Nevavuori, P.; Narra, N.; Linna, P.; Lipping, T. Crop Yield Prediction Using Multitemporal UAV Data and Spatio-Temporal Deep Learning Models. Remote Sens. 2020, 12, 4000. [Google Scholar] [CrossRef]
Abbaszadeh, P.; Gavahi, K.; Alipour, A.; Deb, P.; Moradkhani, H. Bayesian multi-modeling of deep neural nets for probabilistic crop yield prediction. Agric. For. Meteorol. 2022, 314, 108773. [Google Scholar] [CrossRef]
Khaki, S.; Wang, L. Crop Yield Prediction Using Deep Neural Networks. Front. Plant Sci. 2019, 10, 621. [Google Scholar] [CrossRef]
Ramzan, Z.; Asif, H.S.; Yousuf, I.; Shahbaz, M. A Multimodal Data Fusion and Deep Neural Networks Based Technique for Tea Yield Estimation in Pakistan Using Satellite Imagery. IEEE Access 2023, 11, 42578–42594. [Google Scholar] [CrossRef]
Bai, D.; Li, D.; Zhao, C.; Wang, Z.; Shao, M.; Guo, B.; Liu, Y.; Wang, Q.; Li, J.; Guo, S. Estimation of soybean yield parameters under lodging conditions using RGB information from unmanned aerial vehicles. Front. Plant Sci. 2022, 13, 1012293. [Google Scholar] [CrossRef] [PubMed]
Fei, S.; Li, L.; Han, Z.; Chen, Z.; Xiao, Y. Combining novel feature selection strategy and hyperspectral vegetation indices to predict crop yield. Plant Methods 2022, 18, 119. [Google Scholar] [CrossRef] [PubMed]
Kumar, C.; Mubvumba, P.; Huang, Y.; Dhillon, J.; Reddy, K. Multi-Stage Corn Yield Prediction Using High-Resolution UAV Multispectral Data and Machine Learning Models. Agronomy 2023, 13, 1277. [Google Scholar] [CrossRef]
Mokhtar, A.; El-Ssawy, W.; He, H.; Al-Anasari, N.; Sammen, S.S.; Gyasi-Agyei, Y.; Abuarab, M. Using machine learning models to predict hydroponically grown lettuce yield. Front. Plant Sci. 2022, 13, 706042. [Google Scholar] [CrossRef]
Kuwata, K.; Shibasaki, R. Estimating crop yields with deep learning and remotely sensed data. In Proceedings of the 2015 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Milan, Italy, 26–31 July 2015; pp. 858–861. [Google Scholar]
Livieris, I.E.; Dafnis, S.D.; Papadopoulos, G.K.; Kalivas, D.P. A Multiple-Input Neural Network Model for Predicting Cotton Production Quantity: A Case Study. Algorithms 2020, 13, 273. [Google Scholar] [CrossRef]
Barbosa, B.D.S.; Ferraz, G.A.e.S.; Costa, L.; Ampatzidis, Y.; Vijayakumar, V.; dos Santos, L.M. UAV-based coffee yield prediction utilizing feature selection and deep learning. Smart Agric. Technol. 2021, 1, 100010. [Google Scholar] [CrossRef]
Ma, Y.; Zhang, Z.; Yang, H.L.; Yang, Z. An adaptive adversarial domain adaptation approach for corn yield prediction. Comput. Electron. Agric. 2021, 187, 106314. [Google Scholar] [CrossRef]
Priyatikanto, R.; Lu, Y.; Dash, J.; Sheffield, J. Improving generalisability and transferability of machine-learning-based maize yield prediction model through domain adaptation. Agric. For. Meteorol. 2023, 341, 109652. [Google Scholar] [CrossRef]
Apolo-Apolo, O.E.; Pérez-Ruiz, M.; Martínez-Guanter, J.; Valente, J. A Cloud-Based Environment for Generating Yield Estimation Maps from Apple Orchards Using UAV Imagery and a Deep Learning Technique. Front. Plant Sci. 2020, 11, 1086. [Google Scholar] [CrossRef]
Hasan, M.M.; Chopin, J.P.; Laga, H.; Miklavcic, S.J. Detection and analysis of wheat spikes using convolutional neural networks. Plant Methods 2018, 14, 100. [Google Scholar] [CrossRef]
Ariza-Sentís, M.; Valente, J.; Kooistra, L.; Kramer, H.; Mücher, S. Estimation of spinach (Spinacia oleracea) seed yield with 2D UAV data and deep learning. Smart Agric. Technol. 2023, 3, 100129. [Google Scholar] [CrossRef]
Lu, J.; Yang, R.; Yu, C.; Lin, J.; Chen, W.; Wu, H.; Chen, X.; Lan, Y.; Wang, W. Citrus green fruit detection via improved feature network extraction. Front. Plant Sci. 2022, 13, 946154. [Google Scholar] [CrossRef]
Peng, J.; Wang, D.; Zhu, W.; Yang, T.; Liu, Z.; Rezaei, E.E.; Li, J.; Sun, Z.; Xin, X. Combination of UAV and deep learning to estimate wheat yield at ripening stage: The potential of phenotypic features. Int. J. Appl. Earth Obs. Geoinf. 2023, 124, 103494. [Google Scholar] [CrossRef]
Zhou, C.; Ye, H.; Hu, J.; Shi, X.; Hua, S.; Yue, J.; Xu, Z.; Yang, G. Automated Counting of Rice Panicle by Applying Deep Learning Model to Images from Unmanned Aerial Vehicle Platform. Sensors 2019, 19, 3106. [Google Scholar] [CrossRef] [PubMed]
Lang, P.; Zhang, L.; Huang, C.; Chen, J.; Kang, X.; Zhang, Z.; Tong, Q. Integrating environmental and satellite data to estimate county-level cotton yield in Xinjiang Province. Front. Plant Sci. 2023, 13, 1048479. [Google Scholar] [CrossRef] [PubMed]
Cheng, E.; Zhang, B.; Peng, D.; Zhong, L.; Yu, L.; Liu, Y.; Xiao, C.; Li, C.; Li, X.; Chen, Y. Wheat yield estimation using remote sensing data based on machine learning approaches. Front. Plant Sci. 2022, 13, 1090970. [Google Scholar] [CrossRef] [PubMed]
Wang, J.; Si, H.; Gao, Z.; Shi, L. Winter wheat yield prediction using an LSTM model from MODIS LAI products. Agriculture 2022, 12, 1707. [Google Scholar] [CrossRef]
Zhu, Y.; Wu, S.; Qin, M.; Fu, Z.; Gao, Y.; Wang, Y.; Du, Z. A deep learning crop model for adaptive yield estimation in large areas. Int. J. Appl. Earth Obs. Geoinf. 2022, 110, 102828. [Google Scholar] [CrossRef]
Shen, Y.; Mercatoris, B.; Cao, Z.; Kwan, P.; Guo, L.; Yao, H.; Cheng, Q. Improving Wheat Yield Prediction Accuracy Using LSTM-RF Framework Based on UAV Thermal Infrared and Multispectral Imagery. Agriculture 2022, 12, 892. [Google Scholar] [CrossRef]
Di, Y.; Gao, M.; Feng, F.; Li, Q.; Zhang, H. A New Framework for Winter Wheat Yield Prediction Integrating Deep Learning and Bayesian Optimization. Agronomy 2022, 12, 3194. [Google Scholar] [CrossRef]
Shahrin, F.; Zahin, L.; Rahman, R.; Hossain, A.J.; Kaf, A.H.; Azad, A.A.M. Agricultural analysis and crop yield prediction of Habiganj using multispectral bands of satellite imagery with machine learning. In Proceedings of the 2020 11th International Conference on Electrical and Computer Engineering (ICECE), Dhaka, Bangladesh, 17–19 December 2020; pp. 21–24. [Google Scholar]
Zhang, L.; Zhang, Z.; Luo, Y.; Cao, J.; Tao, F. Combining Optical, Fluorescence, Thermal Satellite, and Environmental Data to Predict County-Level Maize Yield in China Using Machine Learning Approaches. Remote Sens. 2020, 12, 21. [Google Scholar] [CrossRef]
Liu, Y.; Wang, S.; Chen, J.; Chen, B.; Wang, X.; Hao, D.; Sun, L. Rice yield prediction and model interpretation based on satellite and climatic indicators using a transformer method. Remote Sens. 2022, 14, 5045. [Google Scholar] [CrossRef]
Tian, H.; Wang, P.; Tansey, K.; Han, D.; Zhang, J.; Zhang, S.; Li, H. A deep learning framework under attention mechanism for wheat yield estimation using remotely sensed indices in the Guanzhong Plain, PR China. Int. J. Appl. Earth Obs. Geoinf. 2021, 102, 102375. [Google Scholar] [CrossRef]
Tian, H.; Wang, P.; Tansey, K.; Zhang, J.; Zhang, S.; Li, H. An LSTM neural network for improving wheat yield estimates by integrating remote sensing data and meteorological data in the Guanzhong Plain, PR China. Agric. For. Meteorol. 2021, 310, 108629. [Google Scholar] [CrossRef]
Cao, J.; Zhang, Z.; Tao, F.; Zhang, L.; Luo, Y.; Zhang, J.; Han, J.; Xie, J. Integrating Multi-Source Data for Rice Yield Prediction across China using Machine Learning and Deep Learning Approaches. Agric. For. Meteorol. 2021, 297, 108275. [Google Scholar] [CrossRef]
Schwalbert, R.A.; Amado, T.; Corassa, G.; Pott, L.P.; Prasad, P.V.; Ciampitti, I.A. Satellite-based soybean yield forecast: Integrating machine learning and weather data for improving crop yield prediction in southern Brazil. Agric. For. Meteorol. 2020, 284, 107886. [Google Scholar] [CrossRef]
Ma, Y.; Zhang, Z.; Kang, Y.; Özdoğan, M. Corn yield prediction and uncertainty analysis based on remotely sensed variables using a Bayesian neural network approach. Remote Sens. Environ. 2021, 259, 112408. [Google Scholar] [CrossRef]
Zhang, L.; Zhang, Z.; Luo, Y.; Cao, J.; Xie, R.; Li, S. Integrating satellite-derived climatic and vegetation indices to predict smallholder maize yield using deep learning. Agric. For. Meteorol. 2021, 311, 108666. [Google Scholar] [CrossRef]
Jhajharia, K.; Mathur, P.; Jain, S.; Nijhawan, S. Crop yield prediction using machine learning and deep learning techniques. Procedia Comput. Sci. 2023, 218, 406–417. [Google Scholar] [CrossRef]
Divakar, M.S.; Elayidom, M.S.; Rajesh, R. Forecasting crop yield with deep learning based ensemble model. Mater. Today Proc. 2022, 58, 256–259. [Google Scholar] [CrossRef]
Jaison, B. Adaptive Lemuria: A progressive future crop prediction algorithm using data mining. Sustain. Comput. Inform. Syst. 2021, 31, 100577. [Google Scholar] [CrossRef]
Ma, Y.; Zhang, Z. A Bayesian domain adversarial neural network for corn yield prediction. IEEE Geosci. Remote Sens. Lett. 2022, 19, 5513705. [Google Scholar] [CrossRef]
Perich, G.; Turkoglu, M.O.; Graf, L.V.; Wegner, J.D.; Aasen, H.; Walter, A.; Liebisch, F. Pixel-based yield mapping and prediction from Sentinel-2 using spectral indices and neural networks. Field Crop. Res. 2023, 292, 108824. [Google Scholar] [CrossRef]
Wang, J.; Wang, P.; Tian, H.; Tansey, K.; Liu, J.; Quan, W. A deep learning framework combining CNN and GRU for improving wheat yield estimates using time series remotely sensed multi-variables. Comput. Electron. Agric. 2023, 206, 107705. [Google Scholar] [CrossRef]
Cunha, R.L.; Silva, B.; Netto, M.A. A scalable machine learning system for pre-season agriculture yield forecast. In Proceedings of the 2018 IEEE 14th International Conference on E-Science (e-Science), Amsterdam, The Netherlands, 29 October–1 November 2018; pp. 423–430. [Google Scholar]
De Freitas Cunha, R.L.; Silva, B. Estimating crop yields with remote sensing and deep learning. In Proceedings of the 2020 IEEE Latin American GRSS & ISPRS Remote Sensing Conference (LAGIRS), Santiago, Chile, 22–26 March 2020; pp. 273–278. [Google Scholar]
Yli-Heikkilä, M.; Wittke, S.; Luotamo, M.; Puttonen, E.; Sulkava, M.; Pellikka, P.; Heiskanen, J.; Klami, A. Scalable crop yield prediction with Sentinel-2 time series and temporal convolutional network. Remote Sens. 2022, 14, 4193. [Google Scholar] [CrossRef]
Olofintuyi, S.S.; Olajubu, E.A.; Olanike, D. An ensemble deep learning approach for predicting cocoa yield. Heliyon 2023, 9, e08351. [Google Scholar] [CrossRef]
Sun, J.; Lai, Z.; Di, L.; Sun, Z.; Tao, J.; Shen, Y. Multilevel deep learning network for county-level corn yield estimation in the US corn belt. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 5048–5060. [Google Scholar] [CrossRef]
Liu, F.; Jiang, X.; Wu, Z. Attention Mechanism-Combined LSTM for Grain Yield Prediction in China Using Multi-Source Satellite Imagery. Sustainability 2023, 15, 9210. [Google Scholar] [CrossRef]
Nasr, I.; Nassar, L.; Karray, F.; Zayed, M.B. Enhanced Deep Learning Satellite-based Model for Yield Forecasting and Quality Assurance Using Metamorphic Testing. In Proceedings of the 2023 International Joint Conference on Neural Networks (IJCNN), Gold Coast, Australia, 18–23 June 2023; pp. 1–6. [Google Scholar]
Zhang, J.; Tian, H.; Wang, P.; Tansey, K.; Zhang, S.; Li, H. Improving wheat yield estimates using data augmentation models and remotely sensed biophysical indices within deep neural networks in the Guanzhong Plain, PR China. Comput. Electron. Agric. 2022, 192, 106616. [Google Scholar] [CrossRef]
Morales, G.; Sheppard, J.W.; Hegedus, P.B.; Maxwell, B.D. Improved Yield Prediction of Winter Wheat Using a Novel Two-Dimensional Deep Regression Neural Network Trained via Remote Sensing. Sensors 2023, 23, 489. [Google Scholar] [CrossRef] [PubMed]
Espinosa, C.E.; Velásquez, S.; Hernández, F.L. Sugarcane Productivity Estimation Through Processing Hyperspectral Signatures Using Artificial Neural Networks. In Proceedings of the 2020 IEEE Latin American GRSS & ISPRS Remote Sensing Conference (LAGIRS), Santiago, Chile, 22–26 March 2020; pp. 290–295. [Google Scholar] [CrossRef]
Zhou, W.; Song, C.; Liu, C.; Fu, Q.; An, T.; Wang, Y.; Sun, X.; Wen, N.; Tang, H.; Wang, Q. A Prediction Model of Maize Field Yield Based on the Fusion of Multitemporal and Multimodal UAV Data: A Case Study in Northeast China. Remote Sens. 2023, 15, 3483. [Google Scholar] [CrossRef]
Wang, X.; Huang, J.; Feng, Q.; Yin, D. Winter Wheat Yield Prediction at County Level and Uncertainty Analysis in Main Wheat-Producing Regions of China with Deep Learning Approaches. Remote Sens. 2020, 12, 1744. [Google Scholar] [CrossRef]
Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. In Proceedings of the Advances in Neural Information Processing Systems: 26th Annual Conference on Neural Information Processing Systems 2012, Lake Tahoe, NV, USA, 3–6 December 2012; Volume 25. [Google Scholar]
Rakhmatulin, I.; Kamilaris, A.; Andreasen, C. Deep neural networks to detect weeds from crops in agricultural environments in real-time: A review. Remote Sens. 2021, 13, 4486. [Google Scholar] [CrossRef]
Grohs, P.; Kutyniok, G. (Eds.) Mathematical Aspects of Deep Learning; Cambridge University Press: Cambridge, UK, 2022. [Google Scholar]
Mohammadi, S.; Belgiu, M.; Stein, A. 3D Fully Convolutional Neural Networks with Intersection Over Union Loss for Crop Mapping from Multi-Temporal Satellite Images. In Proceedings of the 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS, Brussels, Belgium, 11–16 July 2021; pp. 5834–5837. [Google Scholar]
García-Martínez, H.; Flores-Magdaleno, H.; Ascencio-Hernández, R.; Khalil-Gardezi, A.; Tijerina-Chávez, L.; Mancilla-Villa, O.R.; Vázquez-Peña, M.A. Corn Grain Yield Estimation from Vegetation Indices, Canopy Cover, Plant Density, and a Neural Network Using Multispectral and RGB Images Acquired with Unmanned Aerial Vehicles. Agriculture 2020, 10, 277. [Google Scholar] [CrossRef]
Deng, Y.; Chen, R.; Wu, C. Examining the deep belief network for subpixel unmixing with medium spatial resolution multispectral imagery in urban environments. Remote Sens. 2019, 11, 1566. [Google Scholar] [CrossRef]
Hassan, M.M.; Alam, M.G.R.; Uddin, M.Z.; Huda, S.; Almogren, A.; Fortino, G. Human emotion recognition using deep belief network architecture. Inf. Fusion 2019, 51, 10–18. [Google Scholar] [CrossRef]
Zhang, Y.; Liu, F. An improved deep belief network prediction model based on knowledge transfer. Future Internet 2020, 12, 188. [Google Scholar] [CrossRef]
Kawamura, K.; Nishigaki, T.; Andriamananjara, A.; Rakotonindrina, H.; Tsujimoto, Y.; Moritsuka, N.; Rabenarivo, M.; Razafimbelo, T. Using a one-dimensional convolutional neural network on visible and near-infrared spectroscopy to improve soil phosphorus prediction in Madagascar. Remote Sens. 2021, 13, 1519. [Google Scholar] [CrossRef]
Chaerun Nisa, E.; Kuan, Y.-D. Comparative Assessment to Predict and Forecast Water-Cooled Chiller Power Consumption Using Machine Learning and Deep Learning Algorithms. Sustainability 2021, 13, 744. [Google Scholar] [CrossRef]
Albelwi, S.; Mahmood, A. A framework for designing the architectures of deep convolutional neural networks. Entropy 2017, 19, 242. [Google Scholar] [CrossRef]
Hou, R.; Chen, C.; Shah, M. An end-to-end 3D convolutional neural network for action detection and segmentation in videos. arXiv 2017, arXiv:1712.01111. [Google Scholar]
Mäyrä, J.; Keski-Saari, S.; Kivinen, S.; Tanhuanpää, T.; Hurskainen, P.; Kullberg, P.; Poikolainen, L.; Viinikka, A.; Tuominen, S.; Kumpula, T. Tree species classification from airborne hyperspectral and LiDAR data using 3D convolutional neural networks. Remote Sens. Environ. 2021, 256, 112322. [Google Scholar] [CrossRef]
Vrskova, R.; Kamencay, P.; Hudec, R.; Sykora, P. A New Deep-Learning Method for Human Activity Recognition. Sensors 2023, 23, 2816. [Google Scholar] [CrossRef]
Granger, E.; Kiran, M.; Blais-Morin, L.-A. A comparison of CNN-based face and head detectors for real-time video surveillance applications. In Proceedings of the 2017 Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA), Montreal, QC, Canada, 28 November–1 December 2017; pp. 1–7. [Google Scholar]
Zhao, H.; Chen, Z.; Jiang, H.; Jing, W.; Sun, L.; Feng, M. Evaluation of three deep learning models for early crop classification using Sentinel-1A imagery time series—A case study in Zhanjiang, China. Remote Sens. 2019, 11, 2673. [Google Scholar] [CrossRef]
Zhao, W.; Meng, Q.-H.; Zeng, M.; Qi, P.-F. Stacked sparse auto-encoders (SSAE) based electronic nose for Chinese liquors classification. Sensors 2017, 17, 2855. [Google Scholar] [CrossRef] [PubMed]
Ajakan, H.; Germain, P.; Larochelle, H.; Laviolette, F.; Marchand, M. Domain-Adversarial Neural Networks. arXiv 2014, arXiv:1412.4446. [Google Scholar]
Lin, T.; Wang, Y.; Liu, X.; Qiu, X. A survey of transformers. AI Open 2022, 3, 111–132. [Google Scholar] [CrossRef]
He, Y.; Zhao, J. Temporal Convolutional Networks for Anomaly Detection in Time Series. J. Phys. Conf. Ser. 2019, 1213, 042050. [Google Scholar] [CrossRef]
Creswell, A.; White, T.; Dumoulin, V.; Arulkumaran, K.; Sengupta, B.; Bharath, A.A. Generative Adversarial Networks: An Overview. IEEE Signal Process. Mag. 2018, 35, 53–65. [Google Scholar] [CrossRef]
Wang, K.; Gou, C.; Duan, Y.; Lin, Y.; Zheng, X.; Wang, F.-Y. Generative adversarial networks: Introduction and outlook. IEEE/CAA J. Autom. Sin. 2017, 4, 588–598. [Google Scholar] [CrossRef]
Elbasi, E.; Zaki, C.; Topcu, A.E.; Abdelbaki, W.; Zreikat, A.I.; Cina, E.; Shdefat, A.; Saker, L. Crop Prediction Model Using Machine Learning Algorithms. Appl. Sci. 2023, 13, 9288. [Google Scholar] [CrossRef]
Ansarifar, J.; Wang, L.; Archontoulis, S. An interaction regression model for crop yield prediction. Sci. Rep. 2021, 11, 17754. [Google Scholar] [CrossRef] [PubMed]
Posch, K.; Arbeiter, M.; Pilz, J. A novel Bayesian approach for variable selection in linear regression models. Comput. Stat. Data Anal. 2020, 144, 106881. [Google Scholar] [CrossRef]
Jenul, A.; Schrunner, S.; Pilz, J.; Tomic, O. A user-guided Bayesian framework for ensemble feature selection in life science applications (UBayFS). Mach. Learn. 2022, 111, 3897–3923. [Google Scholar] [CrossRef]
Rasmussen, C.E.; Williams, C.K.I. Gaussian Processes for Machine Learning; MIT Press: Cambridge, MA, USA, 2005. [Google Scholar]
Neal, R.M. Bayesian Learning for Neural Networks; Springer: Berlin/Heidelberg, Germany, 1996. [Google Scholar]
Gramacy, R. Surrogates: Gaussian Process Modeling, Design and Optimization for the Applied Sciences; CRC Press: Boca Raton, FL, USA, 2020. [Google Scholar]
Sparapani, R.; Spanbauer, C.; McCulloch, R. Nonparametric Machine Learning and Efficient Computation with Bayesian Additive Regression Trees: The BART R Package. J. Stat. Softw. 2021, 97, 1–66. [Google Scholar] [CrossRef]
Datta, A.; Banerjee, S.; Finley, A.O.; Gelfand, A.E. Hierarchical Nearest-Neighbor Gaussian Process Models for Large Geostatistical Datasets. J. Am. Stat. Assoc. 2016, 111, 800–812. [Google Scholar] [CrossRef] [PubMed]
Posch, K.; Arbeiter, M.; Pleschberger, M.; Pilz, J. Variable Selection Using Nearest Neighbor Gaussian Processes. Bayesian Anal. 2024. to be submitted. [Google Scholar]
Zou, H. The Adaptive Lasso and Its Oracle Properties. J. Am. Stat. Assoc. 2006, 101, 1418–1429. [Google Scholar] [CrossRef]
Carbonetto, P.; Stephens, M. Scalable variational inference for Bayesian variable selection in regression, and its accuracy in genetic association studies. Bayesian Anal. 2012, 7, 73–108. [Google Scholar] [CrossRef]
Damianou, A.; Lawrence, N. Deep Gaussian processes. In Proceedings of the Artificial Intelligence and Statistics, Scottsdale, AZ, USA, 29 April–1 May 2013; pp. 207–215. [Google Scholar]
Murphy, K.P. Probabilistic Machine Learning: Advanced Topics; MIT Press: Cambridge, MA, USA, 2023. [Google Scholar]
You, J.; Li, X.; Low, M.; Lobell, D.; Ermon, S. Deep Gaussian Process for Crop Yield Prediction Based on Remote Sensing Data. Proc. AAAI Conf. Artif. Intell. 2017, 31, 4559–4565. [Google Scholar] [CrossRef]
Sharifi, A. Yield prediction with machine learning algorithms and satellite images. J. Sci. Food Agric. 2021, 101, 891–896. [Google Scholar] [CrossRef] [PubMed]
Parez, S.; Dilshad, N.; Alghamdi, N.S.; Alanazi, T.M.; Lee, J.W. Visual Intelligence in Precision Agriculture: Exploring Plant Disease Detection via Efficient Vision Transformers. Sensors 2023, 23, 6949. [Google Scholar] [CrossRef] [PubMed]
Wikle, C.K.; Zammit-Mangion, A.; Cressie, N.A.C. Spatio-Temporal Statistics with R; Chapman & Hall/CRC the R Series; CRC Press, Taylor and Francis Group: Boca Raton, FL, USA, 2019. [Google Scholar]

Figure 1. The methodological protocol of our systematic literature review.

Figure 2. Distribution of studies per year of publication.

Figure 3. Distribution of agricultural crops used by authors.

Figure 4. Distribution of deep learning architectures used by authors.

Figure 5. Deep neural network architecture.

Figure 6. Deep belief network architecture [117].

Figure 10. Region-based fully convolutional network architecture.

Figure 11. Long Short-Term Memory architecture [87].

Figure 12. Gated Recurrent Units architecture [125].

Table 1. Inclusion and exclusion criteria used.

Inclusion Criteria	Exclusion Criteria
The study must be empirical research that includes a methodology and results section	The paper is a survey or a review article
The study must have been published within the last eight years	Articles published before 2015; the considered paper has already been included from another database search
The paper discusses the application of deep learning to crop yield prediction	The publication does not deal with the application of deep learning to agricultural yield prediction; the article does not meet the quality standards required for inclusion in our systematic review
The article is written in English	The article is written in a language other than English

Table 2. Distribution of articles and databases used.

Database	Number of Articles Remaining after Application of Inclusion and Exclusion Criteria
DOAJ	30
IEEE	15
MDPI	15
ScienceDirect	32
Total	92

Table 3. Details of selected publications.

Deep Learning Architecture	Main Input	Crop	References
1D CNN	Markers, UAV data, satellite data, environmental data	Wheat, corn, barley, rice	[9,25,26,27,28,29,30,31]
2D CNN	Hand-held devices, UAV data, RGB ground imagery, ground agricultural stations data, satellite data, web crawling (Google Web) data, environmental data, in-field images (Canon camera)	Wheat, rapeseed, rice, tiger nuts, sunflower, sorghum, apple, pear, chives, onion, soybean, tomato, potato, corn, barley, wild blueberries, almond	[4,8,29,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54]
3D CNN	UAV data, satellite data	Rice, maize, soybean, wheat, barley, oats	[36,44,45,55,56,57,58,59]
DNN	Environmental data, UAV data, spectrometers, satellite data, web crawling (Google Web) data	Corn, tea, soybean, wheat, lettuce, cotton, apple, pear, chives, onion, coffee, rice	[5,6,7,9,17,28,31,41,45,60,61,62,63,64,65,66,67,68,69,70]
R-CNN	UAV data, land-based vehicle, hand-held camera, Cloud platform	Apple, wheat, spinach, citrus, rice	[71,72,73,74,75,76]
LSTM	Satellite data, environmental data, UAV data	Cotton, wheat, soybean, maize, tomato, potato, rice, rapeseed mustard, barley, bajra, jowar, onion	[8,31,42,47,49,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93]
DBN	Environmental data	Apple, banana, castor oil seed, cherries, chick peas, chili, cocoa, beans, coconuts, coffee, green, ginger, maize, onion, orange, papaya, pepper, potato, rice, sunflower, tea, tomato, wheat, oilcrop	[94]
DANN	Satellite data	Corn	[95]
ConvLSTM	UAV data, satellite data, environmental data	Wheat, barley, oats, rice, soybean	[43,44,58,59,93]
GRU	Environmental data, satellite data	Tomato, potato, cereal, wheat	[42,96,97]
LSTM-DNN	Environmental data, satellite data	Soybean, corn, rice, sugarcane, cotton	[98,99]
Transformer	Satellite data, environmental data	Rice	[85]
TCN	Satellite data	Winter wheat, spring wheat, rye, feed barley, malting barley, oats	[100]
SSAE	Satellite data	Rice, corn, soybean	[47]
CNN-LSTM	Environmental data, satellite data, UAV data	Cocoa, maize, soybean, wheat, barley, oats, rice, beans, potatoes	[8,43,44,52,55,58,93,101,102,103]
1D CNN-2D CNN	UAV data	Corn	[29]
1D CNN-LSTM	Satellite data, environmental data	Rice	[31]
DFNNGRU-DeepGRUs	Satellite data	Strawberry	[104]
2D CNN-GAN	Satellite data, environmental data	Wheat	[105]
3D CNN-2D CNN	Satellite data	Wheat	[106]
3D CNN-LSTM	Satellite data	Wheat, corn	[49]
3D CNN-ConvLSTM	Satellite data	Soybean	[44]
Autoencoders	Spectrometry techniques	Sugarcane	[107]
CNN-attention-LSTM	LiDAR, UAV data, environmental data	Maize	[108]
CNN-GRU	Satellite data	Wheat	[97]
CNN-LSTM-GRU	Satellite data, environmental data	Wheat	[27]
LSTM-1D CNN	Satellite data, environmental data	Rice	[31]
LSTM-CNN	Satellite data	Wheat	[109]
SAE-CNNLSTM	Satellite data	Strawberry	[104]

Table 4. Frequency of main input data.

Data Type	Number of Publications
UAV (RGB)	16
UAV (multispectral)	11
UAV (hyperspectral)	1
UAV (thermal)	2
Satellite	49
Hand-held device	4
Land-based vehicle	1
Ground agricultural stations	2
Markers	2
Environmental data	20
Spectrometry techniques	2
Web crawling	1
Light Detection And Ranging (LiDAR)	1

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
