1. Introduction
Modern agriculture faces diverse challenges that extend beyond traditional farming, encompassing broader social and environmental concerns. For instance, early diagnosis of crop emergencies such as pests, diseases, weeds, and water or nutrient deficiencies is critical, as these issues can lead to severe harvest losses and economic damage, exacerbated by climate-change-induced extreme weather. Early detection enables targeted interventions to mitigate these impacts. Another significant example is precision farming, which involves applying treatments (e.g., pesticides, fertilizers, water) at variable rates based on field heterogeneity. Both challenges share a reliance on chemical products, posing dual drawbacks: (i) high costs for purchase and application and (ii) negative impacts on the environment and human health. Traditionally, diagnosing emergencies involves experts inspecting crops, often in rotation. However, this method is increasingly impractical. For instance, a 1-hectare crop with 2-m inter-row spacing requires inspecting 5 km of linear development. Daily monitoring is infeasible, as the average EU farm has 18 hectares per worker (2020 data, authors’ calculations based on [
1,
2]: this ratio is purely indicative, as the distribution of land among farmers is highly uneven). As for precision farming, while advanced machinery supports such practices, it often remains underutilized, partly due to a lack of the input data required by onboard systems in the form of prescription maps detailing specific treatments. Traditional methods for generating such maps—using satellite images for large-scale grain crops or drones (Unmanned Aerial Vehicles, UAVs) for smaller or detail-intensive cultivation—are subject to technical and practical constraints, including limited image resolution and high service costs.
AI offers a potential approach to these challenges. Deep learning (DL) models, particularly convolutional neural networks (CNNs; see [
3]), excel in automatically classifying images without human input. This theoretically enables the development of automated systems integrated into agricultural machinery and terrestrial rovers (Unmanned Ground Vehicles, UGVs) capable of autonomous operation. These systems could, in principle, continuously monitor crops, identify potential emergencies, and report their nature and location to human experts. Similarly, it is conceivable to equip smart machinery, such as booms, with advanced sensors and electromechanical systems to identify targets (e.g., weeds) on the move and adapt their actions accordingly.
However, these applications, along with many others of significant potential interest, are inherently constrained by a fundamental requirement: real-time processing. Consequently, the traditional cloud-centric IoT approach—where field data are transmitted to remote computers via a local wireless network (and potentially the Internet) for processing before being sent back to the crop—is largely impractical.
Fortunately, recent advancements in AI models and hardware for deep learning acceleration have introduced a significant breakthrough. Image classification and segmentation can now be performed directly on edge devices with limited computational power and low energy consumption, making them ideal for integration into UGVs and agricultural machinery. This synergy of AI and IoT is commonly referred to as edge-based AIoT. Edge-based AIoT may enable the practical implementation of innovative agricultural concepts that were previously unfeasible, with autonomous robotics and real-time precision farming being two prominent examples. Smart UGVs [
4] can mimic human experts by inspecting crops, detecting potential emergencies, providing initial risk classifications, and identifying exact locations within the field. Real-time precision farming, on the other hand, may involve tractors equipped with chemical tanks, hydraulic circuits, and a rear boom featuring RGB cameras and edge computers. These systems leverage GPUs or other hardware accelerators to classify plant images in real time, enabling precise detection of weeds or fungal diseases and triggering targeted, AI-guided spraying. Unlike ideas proposed in the past, which focused on simplified situations manageable with low-complexity methods (suitable for the hardware available at the time), advancements in the performance of next-generation devices now enable true real-time applications, with precision treatments performed during standard field operations.
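As a rough illustration of the real-time constraint such systems must satisfy, the following sketch runs a placeholder classifier over synthetic frames and checks each decision against a per-frame latency budget. The 50 ms budget, the green-channel heuristic, and all function names are our own assumptions for illustration, not taken from any cited system.

```python
# Schematic real-time loop for on-the-go weed detection. A dummy
# classifier stands in for an edge-deployed CNN; frame size, frame
# budget, and the decision rule are illustrative assumptions only.
import time
import numpy as np

def dummy_classifier(frame: np.ndarray) -> bool:
    """Placeholder for an edge-accelerated model: flags 'weed' if the
    mean green-channel intensity exceeds a threshold."""
    return bool(frame[..., 1].mean() > 128)

def process_stream(frames, budget_s: float = 0.05):
    """Classify each frame and verify it fits a 50 ms budget (20 fps),
    as would be needed to trigger a sprayer nozzle on the move."""
    decisions = []
    for frame in frames:
        t0 = time.perf_counter()
        decisions.append(dummy_classifier(frame))
        if time.perf_counter() - t0 > budget_s:
            raise RuntimeError("missed real-time deadline")
    return decisions

rng = np.random.default_rng(0)
frames = [rng.integers(0, 256, (224, 224, 3), dtype=np.uint8) for _ in range(4)]
print(process_stream(frames))
```

The point of the sketch is only the control structure: inference must complete within a fixed per-frame budget, or the actuation opportunity is lost as the boom moves past the plant.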
Figure 1 displays technical schematics of the systems described above.
This technology is not yet fully developed or available as an off-the-shelf solution. The examples in
Figure 1 represent research and development prototypes rather than commercial products. However, the current state of the ICT arsenal—including AI models, datasets, dedicated hardware, and advancements in deep learning theory—has matured sufficiently to enable practical implementations of edge-based AIoT systems in agriculture.
This review examines the key components necessary to develop expert systems for agriculture by leveraging advancements in edge-based AIoT technology. Four priorities have been identified for the implementation of such solutions:
High-Quality Training Datasets: cultivated plants exhibit significant phenotypic variability and produce fruit within narrow timeframes, making it challenging to create high-quality datasets. Innovative approaches are essential to address this issue.
Deep Learning Models: the rapid evolution of DL models for computer vision demands careful selection. Choosing the appropriate model can determine the feasibility of real-time inference and enable effective performance even with limited datasets.
Hardware Awareness: a thorough understanding of state-of-the-art hardware is critical. Manufacturers offer diverse solutions based on varying philosophies, making it essential to evaluate the global landscape and select the most suitable option for each application.
Emerging Deep Learning Methods: recent advancements in DL show game-changing potential. While not yet integrated into standard libraries and less accessible than established models, they warrant close monitoring for future agricultural applications.
We aim to provide a comprehensive review of the latest findings on the topics above, drawn from a broad range of research, including both theoretical works and practical references related to ICT technologies. Our goal is also to offer a useful toolkit for those designing and implementing modern agriculture systems based on cutting-edge AIoT solutions. However, given the rapidly evolving nature of these topics, we note that this review may become outdated within a few years. For this reason, we have focused on the most recent papers, prioritizing those aligned with the latest advancements in the field. We have intentionally limited the analysis of certain well-established aspects of ‘classical’ IoT for agriculture, such as sensor technology and wireless networks in the field. The pace of innovation in these areas is not as rapid as in the four priorities outlined above. Additionally, several recent reviews already address these topics, such as [
4] for sensors related to computer vision, [
5] for general-purpose sensors in precision farming, and [
6] for wireless communication technologies in agriculture. Finally, we note that AIoT can support a wide range of applications relying on diverse data types, such as meteorological information, soil parameters, and chemical-physical measurements from plant leaf sensors. This review mostly focuses on image data and real-time methodologies for its analysis, with a particular emphasis on implementations in autonomous systems: nevertheless, in
Section 7.2, we briefly review new trends in wireless communication and sensor technology.
This work is organized as follows. In
Section 2, we present the background and related literature on AIoT and deep learning classifiers for agriculture. In
Section 3, we describe the process used to search and compile the bibliography for this review.
Section 4 covers the limited public image datasets available in this field. In
Section 5, we highlight the most relevant works on computer vision algorithms.
Section 6 reviews findings on synthetic dataset generation. In
Section 7, we discuss recent advances in IoT and real-time systems.
Section 8 outlines the core concepts behind the most promising emerging neural network architectures, detailing their strengths and weaknesses compared to traditional deep learning philosophy. Finally,
Section 9 and
Section 10 address future directions and provide a discussion and conclusions, respectively.
2. Related Work
For a systematic, general picture of digital agriculture and its transition from level 4.0 to 5.0, we recommend [
7]. They provided a clear representation of the current status of ICT technology applications in modern farming, including IoT and AI models for data and image analysis. Only a few studies specifically address AIoT for agriculture. Ref. [
8] presents a comprehensive review of general IoT and AI applications in disease identification, farm monitoring, and agricultural data analysis. Unlike our approach, their work does not emphasize edge computing for agriculture but instead highlights the importance of 5G and broader network infrastructures, focusing primarily on the cloud-centric AIoT paradigm. Their section on challenges in technology adoption is particularly insightful. Notably, one of the issues they address—privacy and data security—can be significantly mitigated by transitioning from a cloud-centric approach to a hybrid model, where most data are collected and processed locally in real time. Another notable work on AIoT for agriculture is [
9], a research topic book encompassing 12 papers that cover a wide range of applications. Among them, Sun et al. [
10] presented a highly interesting approach for rapid pear detection to facilitate robotic harvesting, tested even under nighttime conditions. Zhou et al. [
11] described a method for detecting robot pathways for navigation between vineyard rows (‘road extraction’) and proposed an algorithm for fruit detection. Their proposed solutions are innovative, although the authors acknowledged certain technical limitations and suggested directions for further development. Wang et al. [
12] addressed one of the major challenges in applying deep-learning-based computer vision models in agriculture—the lack of data—by proposing an effective method for adaptive image augmentation.
AI technology for greenhouse agriculture deserves separate consideration, given the significant differences compared to open field crops. These differences include environmental conditions, types of emergencies, data characteristics, sensors, and more. Akbar et al. [
13] provided an exhaustive review of AI technology for smart greenhouses, covering crop growth monitoring; recognition and classification; pest, disease, and insect detection; yield estimation; and weed management. Notably, they also outlined the major limitations affecting current computer vision technology applied to this field, particularly the lack of high-quality datasets, the challenge of integrating disparate sensors carrying different types of physical information, and, even more importantly, the shortage of specialized, interdisciplinary human experts in the field.
Several reviews are devoted to AI-based image classification for agriculture in open fields: in the following, we provide a selection of remarkable works published in the last five years. Tripathi and Maktedar [
14] analyzed the role of computer vision (CV) models for fruits and vegetables, stressing the importance of effective data preprocessing, which is often underrated. They also paid special attention to the mathematics behind crucial steps like feature extraction, and the computation and assessment of descriptors through similarity measures. In [
15], IoT takes center stage, emphasizing sensor technology for agrometeorological data (temperature, moisture, etc.), soil parameters (pH, water content, etc.), Wireless Sensor Networks, and robotics. Early, autonomous recognition of diseases and pests is a key activity: consequently, several authors focus on this area. Hasan et al. [
16] investigated machine learning (ML) methods, including unsupervised models, concentrating on feature representation in both classical ‘shallow’ classifiers and modern DL classifiers. The issue of real data availability is discussed, along with solutions to manage practical problems like the scarcity of examples through data augmentation. Mishra et al. [
17] analyzed how this general problem is managed using autonomous platforms, mainly through spectral imaging, i.e., the convergence of traditional imaging and spectroscopy. They also provided an interesting explanation of why near-infrared (NIR) imaging is important for detecting plant diseases, which we highly recommend. Ngugi et al. [
18], on the other hand, limited their investigation to methods based on ‘standard’ RGB imagery, mainly for practical and affordability reasons. These are important considerations for methods suitable for a large farming audience. They compared results from both hand-crafted feature extraction and deep learning, concluding that the combination of the two leads to higher performance. Orchi et al. [
19] provided a valuable review on the taxonomy of main plant diseases and related symptoms, with a quantitative survey of 10 deep learning methods applied to detect them. We particularly appreciate their critical examination of unresolved challenges such as insufficient and/or unbalanced data, symptom similarity, image segmentation, and practical issues encountered when taking real images in the field, such as lighting conditions and camera use. They proposed several beneficial strategies to tackle these problems.
In recent years, there has been a trend towards specialized reviews focusing on specific crops, exploring AI models tailored to address their particular challenges. Omaye et al. [
20] explored plant disease detection, focusing on its application to four major species: apple, cassava, cotton, and potato plants. They addressed key points related to these crops, including the main diseases they are subject to, the trends in deep learning and machine learning models for their detection, and opportunities and recommendations for designing future strategies. Similarly, Jafar et al. [
21] studied the major diseases affecting four reference crops: tomato, chilli, potato, and cucumber; they stressed the possible benefits of integrating AI with IoT technology, particularly drones.
Weed detection deserves special consideration due to its significant impact on farm economics, yet recognizing weeds is often challenging due to their phenotypic similarity to cultivated plants (sometimes the result of Vavilovian mimicry). Hasan et al. [
22] provided a comprehensive review of the weeds landscape and detection methods, comparing traditional machine learning with deep learning models. The authors dedicated significant attention to the entire workflow, including in-field data acquisition using UAVs, robots, and other vehicles, as well as remote sensing and the utilization of public data repositories. They emphasized image preprocessing and preparation, highlighting the crucial step of annotation. AI models are also thoroughly reviewed, discussing their training and performance. Murad et al. [
23] presented another comprehensive systematic review on the topic. They clearly and effectively illustrated the widespread issue of weeds and their impact on cultivated plants. The review encompasses both classical machine learning and recent deep learning methods, offering valuable insights into existing datasets. Additionally, they showcased AI methods applied to weed detection, along with their Key Performance Indicators and, where available, the approximate computer time required for training. Intriguingly, they listed each weed along with its main target crops, the best AI model for detection, and the achieved accuracy. The review by Juwono et al. [
24] is noteworthy for its focus on the mathematical aspects of weed detection workflows using AI models, an often neglected area. The authors conducted a thorough analysis of background removal techniques, which are critical for accurate detection, employing various vegetation indices and thresholds. They also delved into the distinctions between biological, spectral, and texture features. Similar to [
23], they provided valuable information on the most effective AI models and their performance in detecting specific weeds. They also detailed publicly available datasets. Qu and Su [
25] adopted a similar approach, with the added value of two sections devoted, respectively, to (i) weed recognition applications, where they consider images from various sources, ranging from smartphones to drones and satellites and (ii) the integration of robots and IoT-enabled agricultural machines. Hu et al. [
26] focused on DL-based weed recognition in large-scale grain fields, appropriately illustrating the differences among image classification, object detection, semantic segmentation, and instance segmentation.
There is a final group of references that we believe are of special interest: those targeting the technological and algorithmic aspects of AI models for agriculture. Yang et al. [
27] addressed the recurrent problem of data scarcity using a few-shot learning (FSL) approach. In addition to the common data augmentation technique, they employed methods based on metric learning, external memory, and parameter optimization. Interestingly, they concluded that current agricultural few-shot learning is mostly theoretical and suggested the use of IoT to bring this technology to a truly operational setting. Ragu and Teo [
28] also dealt with object detection and classification using few-shot models, focusing on meta-learning algorithms. They summarized numerous research works on FSL for agriculture, along with the datasets used and the accuracy achieved. In [
29], a beneficial survey of works on semantic segmentation of agricultural images is presented. This technique allows for pixel-level descriptions of embedded objects. For instance, it is feasible to identify pests by extracting the texture, shape, and size of insects in the image. The authors provided examples of semantic segmentation based on thresholding, clustering, and deep learning, and finally, reviewed challenges and strategies to improve AI for semantic analysis of images in agriculture. Guerri et al. [
30] considered the analysis of hyperspectral images (HSIs) in agriculture, analyzing the issues that these data bring: redundancy of spectral bands, shortage of training samples, and the complex relationship between spatial positions and spectral bands. The results of several machine learning methods are illustrated, not limited to CNNs. HSI can also be used to shed light on soil properties like texture, moisture, nutrient content, carbon presence, and salinity. Regarding crops, HSI can help to obtain information about chlorophyll content, drought stress, and weed and disease detection [
31].
Contributions of This Study
While a few reviews on AIoT in agriculture exist, as previously illustrated, they primarily focused on integrating the traditional cloud-centric paradigm with the AI models available at the time of publication. In contrast, our work emphasizes edge-based AIoT for agriculture, a novel paradigm where most processing occurs directly in the field or nearby, close to the data source. Remote computers are reserved for specific tasks, such as long-term data storage, in-depth data analysis, and activities that do not require real-time processing. In general, we believe that the most effective ICT implementations in near-future agriculture will strongly rely on a well-balanced integration of edge and cloud computing paradigms, optimized to leverage the strengths of each.
In detail, our contributions can be summarized as follows:
A critical challenge in AIoT for agriculture is the scarcity of images and other data to train deep learning models, driven by factors such as the seasonality of crops, significant phenotypic variation across locations, and year-to-year variability within the same region. We provide a detailed review of three practical strategies to address this issue: utilizing open datasets, including the most recent ones, ‘classical’ data augmentation, based on simple geometric transformations of existing images, and generating synthetic images.
The landscape of deep learning models for computer vision is evolving rapidly. We focus on recent advancements that outperform existing methods and enable implementation on edge devices with limited hardware resources.
This progress is closely tied to next-generation computing devices, which surpass their predecessors in performance while maintaining moderate energy consumption. Unlike remote servers used in cloud computing, edge devices often face memory constraints. We review state-of-the-art methods tailored for edge deployment and the latest hardware innovations in this domain.
Recent neural network architectures, diverging from the traditional multi-layer perceptron approach, have emerged in the deep learning community. Although these models are not yet as mature as established methods, they hold significant promise and could soon surpass current technologies. We highlight two major trends in image classification and review the preliminary literature on these developments.
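The ‘classical’ data augmentation mentioned among the contributions above can be illustrated with a minimal sketch, assuming nothing beyond NumPy. The `augment` function and its particular transformation choices (flips and quarter-turn rotations) are our own illustrative simplification, not a method taken from the reviewed works.

```python
# Minimal sketch of 'classical' geometric augmentation: random flips
# and 90-degree rotations, the cheapest label-preserving transforms.
import numpy as np

def augment(image: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Apply a random horizontal/vertical flip and a 90-degree rotation."""
    if rng.random() < 0.5:
        image = image[:, ::-1]          # horizontal flip
    if rng.random() < 0.5:
        image = image[::-1, :]          # vertical flip
    k = int(rng.integers(0, 4))         # 0-3 quarter turns
    return np.rot90(image, k)

rng = np.random.default_rng(0)
img = np.arange(27).reshape(3, 3, 3)    # toy 3x3 RGB 'image'
augmented = [augment(img, rng) for _ in range(8)]
print(len(augmented), augmented[0].shape)  # → 8 (3, 3, 3)
```

In practice such transforms are applied on the fly during training (e.g., via a data-loader pipeline), multiplying the effective dataset size without storing new images.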
4. Public Datasets
Several surveys focus on datasets for precision agriculture. Notably, ref. [
41] reviewed public datasets for computer vision tasks in precision agriculture, including 15 datasets on weed control, 10 on fruit detection, and 9 on miscellaneous applications. They also provided recommendations for creating new image datasets, addressing aspects such as image acquisition, augmentation, annotation, and data sharing.
Table 1 lists the reviewed public datasets for computer vision tasks in precision agriculture, along with their size, the crops/weeds/fruits represented, the main tasks for which they could be used, and the countries where the photos were shot. The number of images shown indicates the number of raw photos and, where available, the number of images after augmentation or subdivision.
Some datasets have several crop species, usually with images of both healthy and unhealthy plants and sometimes with the corresponding weeds. One of the most used is the PlantVillage dataset [
42], which contains 54,309 images of 14 crop species captured in experimental research facilities connected to American Land Grant Universities, divided into 38 classes of healthy and unhealthy crops. The crops include apple, blueberry, cherry, corn, grape, orange, peach, bell pepper, potato, raspberry, soybean, squash, strawberry, and tomato, while the diseases include 17 fungal diseases, 4 bacterial diseases, 2 mold (oomycete) diseases, 2 viral diseases, and 1 disease caused by a mite. The PlantDoc dataset [
43] contains 2598 images of 13 plant species divided into 27 classes. It was created by downloading images related to the 38 PlantVillage classes using their scientific and common names from Google Images and Ecosia. The resulting 20,900 images were then manually selected according to APSNet guidelines. Unlike PlantVillage images, which have homogeneous backgrounds, PlantDoc images present natural environmental conditions. According to the authors, fine-tuning the models on PlantDoc instead of PlantVillage reduces the classification error by up to 31% in the classification or detection of images in real scenarios. The Plant Pathology Databases for Image-Based Detection and Recognition of Diseases (PDDB) [
44] include a dataset with 2326 original images of 171 diseases and other disorders affecting 21 plant species and one with 46,513 images, obtained by subdividing the original images. The Plant Seedlings Dataset [
45] contains 407 images with 960 unique plant seedlings belonging to 12 crop and weed species at several growth stages captured by using a consumer camera. Katra-Twelve [
46] is a dataset provided by an Indian university and contains 4503 images divided into 12 plant species with 22 healthy and diseased leaf types. The Cashew, Cassava, Maize, Tomato (CCMT) dataset [
47] is a dataset for crop diseases that includes 24,881 raw images from local farms in Ghana divided into 22 classes, then augmented to 102,976 through cropping and resizing. Chen et al. [
48] released a dataset captured infield with non-uniform illumination intensities and clutter field background conditions containing 400 images with 4 maize diseases and 500 images with 5 rice diseases.
Other datasets are specific to weeds. The leaf counting dataset [
49] contains 9372 images collected in fields across Denmark using several cameras, featuring 18 weed species or families at different stages of growth and, therefore, with a different number of leaves. The images are divided into nine classes according to the number of leaves in each plant and present several types of soil and light conditions. DeepWeeds [
50] contains 17,509 images of eight weed species collected by a robot from eight locations across northern Australia. The Open Plant Phenotype Database (OPPD) [
51] contains 7590 images with 64,292 plant seedlings of 47 weed species, cultivated under ideal, drought, and natural growth conditions. The authors provide bounding box annotations for 315,038 plant objects. Weed25 [
52] is a dataset of weeds containing 14,035 images related to 25 different weed species acquired from fields and lawns in China at different time points and with different weather conditions (sunny, cloudy, and rainy).
Most datasets are related to a single crop species, often with the corresponding weeds. Rice Leaf Disease Image Samples [
53] is a dataset of 5932 images that includes 4 kinds of rice leaf diseases found in the western region of the Indian state of Odisha: bacterial blight, blast, brown spot, and tungro. Li et al. [
54] made available a dataset of 7220 photos taken by mobile phone at a rice experimental base in China that includes three common rice leaf diseases: rice leaf blast, bacterial blight of rice, and flax leaf spot. CottonWeeds [
55] is a dataset of 7578 images of two cotton weed species cropped from 1737 photos captured with a consumer reflex camera under different weather conditions and at different day periods from an Indian cotton field. CottonWeedID15 [
56], CottonWeedDet3 [
57], and CottonWeedDet12 [
58] are three datasets of weeds that are common in cotton fields in southern USA. The images were acquired by either smartphones or hand-held digital cameras, under natural field light conditions, and at varied stages of weed growth. The size varies from the 848 images with 1532 bounding box annotations of CottonWeedDet3 to 5648 images with 9370 bounding box annotations of CottonWeedDet12. The Global Wheat Head Detection (GWHD) dataset [
59] contains 4700 high-resolution RGB images collected between 2016 and 2019 by nine institutions at ten different locations in various countries by using several cameras, depicting 190,000 labeled wheat heads. An updated version was released in 2021, incorporating 1722 additional images from five more countries, along with 81,553 new wheat head annotations [
60]. The authors also provided guidelines for the development of new wheat head detection datasets, with a discussion about image acquisition, metadata to be associated, and labeling methods. The Sugar Beets 2016 dataset [
61] contains 283 images divided into ten classes (sugar beet and nine types of weeds) and 12,340 images divided into three classes (crop, weed, and background). The images were acquired by a robot on a sugar beet farm over a period of three months under controlled lighting.
Table 1.
Public datasets for precision agriculture.
Reference | Dataset | Images | Plants/Diseases | Main Tasks | Location |
---|---|---|---|---|---|
[42] | PlantVillage | 54,309 | 14 crop species with 26 diseases | Crop disease classification | USA |
[43] | PlantDoc | 2598 | 13 plant species with 17 diseases | Crop disease classification | Worldwide |
[44] | PDDB | 2326/46,513 | 21 plant species with 171 diseases | Crop disease classification | Brazil |
[45] | Plant Seedlings | 407 | 12 crop and weed species | Plant Seedling Classification, weed classification | Denmark |
[46] | Katra-Twelve | 4503 | 12 plant species | Crop disease classification | India |
[47] | CCMT | 24,881/102,976 | Cashew, Cassava, Maize, Tomato | Crop disease classification | Ghana |
[48] | Plant Disease Detect | 900 | Four maize diseases and five rice diseases | Crop disease classification | China |
[49] | Leaf counting dataset | 9372 | 18 weed species or families | Weed classification, leaf counting | Denmark |
[50] | DeepWeeds | 17,509 | Eight weed species | Weed classification | Australia |
[51] | OPPD | 7590 | 47 weed species | Plant Seedling Classification, weed detection | Denmark |
[52] | Weed25 | 14,035 | 25 weed species | Weed detection | China |
[53] | Rice Leaf Disease Image Samples | 5932 | Four rice leaf diseases | Rice disease classification | India |
[54] | Rice disease pictures | 7220 | Three rice leaf diseases | Rice disease classification | China |
[55] | CottonWeeds | 7578 | Two cotton weed species | Cotton weed classification | India |
[56] | CottonWeedID15 | 5187 | 15 cotton weeds | Cotton weed detection | USA |
[57] | CottonWeedDet3 | 848 | Three cotton weeds | Cotton weed detection | USA |
[58] | CottonWeedDet12 | 5648 | 12 cotton weeds | Cotton weed detection | USA |
[59,60] | GWHD | 6422 | Wheat | Wheat head detection | Various countries |
[61] | Sugar Beets 2016 | 12,623 | Nine types of sugar beet weeds | Crop and weed segmentation | Germany |
[62] | Plant Pathology 2021—FGVC8 | About 20,000 | Apple leaf diseases | Apple disease classification | USA |
[63] | AppleLeaf9 | 14,582 | Eight apple leaf diseases | Apple disease classification | China, USA |
[64] | Potato Leaf Disease Dataset in Uncontrolled Environment | 3073 | Six potato diseases | Potato disease classification | Indonesia |
[65] | Aberystwyth Leaf Evaluation Dataset | 1676 | Arabidopsis in various stages of growth | Leaf evaluation | UK |
[66] | CD&S | 2112/4455 | Three corn diseases | Corn disease classification and detection, disease severity classification | USA |
[67,68] | Vegnet | 656 | Three cauliflower diseases | Cauliflower disease detection | Bangladesh |
[69] | Sun Flower Fruits and Leaves dataset | 467/1668 | Three sunflower diseases | Sunflower disease classification | Bangladesh |
[70] | Rice Seedling and Weed Dataset | 28/224 | Rice seedlings and weeds | Crop and weed segmentation | China |
[71] | GrassClover | 39,615 | Two types of clovers and three types of weeds | Clover, grass and weed segmentation | Denmark |
[72] | DeepFruits | 724 | Sweet pepper, rock melon, apple, avocado, mango, orange, and strawberry | Fruit detection | Australia |
[73] | Date fruit | 8079 | Date | Fruit detection, maturity analysis | Saudi Arabia |
Plant Pathology 2021—FGVC8 [
62], constructed by the Cornell Initiative for Digital Agriculture, contains about 20,000 images of several apple foliar diseases, in particular apple scab, cedar apple rust, Alternaria leaf spot and frogeye leaf spot, as well as healthy leaves. Photos were taken using a consumer reflex camera and smartphones under various illumination, angle, surface, and noise conditions. AppleLeaf9 [
63] contains 14,582 images of 8 apple leaf diseases, with 94% of them captured in the wild. The Potato Leaf Disease Dataset in Uncontrolled Environment [
64] contains 3073 images with seven classes, i.e., healthy leaves and leaves attacked by viruses, bacteria, fungi, pests, nematodes, and Phytophthora, captured by smartphone from potato farms located in Central Java, Indonesia. The Aberystwyth Leaf Evaluation Dataset [
65] comprises 1676 top-down, time-lapse visible spectrum images of Arabidopsis, acquired over 35 days with 15-minute intervals. It includes 916 annotated images, capturing 40 plants in various stages of growth. The Corn Disease and Severity dataset (CD&S) [
66] contains 4455 images related to three common foliar corn diseases, i.e., northern leaf blight, gray leaf spot, and northern leaf spot, of which 2112 are field images and 2343 are augmented images obtained by changing the background of half of the field images. These raw images are also annotated with bounding boxes. The other 515 images are categorized according to disease severity in a range from 1 (resistant) to 5 (susceptible). Vegnet [
67] includes 656 images of cauliflowers affected by three diseases, bacterial spot rot, black rot, and downy mildew, captured in Bangladesh using a digital camera. Uddin et al. [
68] released an annotated version with bounding boxes indicating the disease locations. The Sun Flower Fruits and Leaves dataset [
69] contains 467 sunflower images manually captured from a demonstration farm in Bangladesh with 3 types of diseases, then augmented to 1668 through rotation, scaling, shifting, noise addition, blurring, and brightness and contrast change. Ma et al. [
70] presented a dataset with 28 original images of rice seedlings with weeds captured in paddy fields in China, each divided into 8 tiles to obtain 224 smaller images of size 912 × 1024 pixels. The images are segmented into rice seedlings, weeds, and background.
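Tile splitting of this kind is straightforward to reproduce. The sketch below computes tile coordinates; the 2 × 4 grid and the 1824 × 4096 source size are assumptions inferred from the reported counts, not stated in the paper:

```python
def split_into_tiles(height, width, rows, cols):
    """Return (top, left, tile_h, tile_w) for each tile of a rows x cols grid."""
    tile_h, tile_w = height // rows, width // cols
    return [(r * tile_h, c * tile_w, tile_h, tile_w)
            for r in range(rows) for c in range(cols)]

# A hypothetical 1824 x 4096 source image split into a 2 x 4 grid yields
# 8 tiles of 912 x 1024 pixels, matching the reported tile size; 28 such
# images would give the 224 smaller images of the dataset.
tiles = split_into_tiles(1824, 4096, 2, 4)
```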
The GrassClover dataset [
71] contains images acquired with three digital cameras at two experimental sites and in the fields of three dairy farms in Denmark over an 18-month period. It contains dense populations of grass with two types of clovers and three types of weeds characterized by heavy occlusions. The training set contains 8000 synthetic images with pixel-wise class and instance labels, 31,600 unlabeled images, and 152 randomly selected biomass labels and corresponding training images, while the test set contains 15 manually labeled images and 238 biomass pairs. The synthetic images were generated by cropping out several plant species and plant parts from the original photos and adding them to soil background images after random rotation and scaling, together with an artificial shadow created with a Gaussian filter, until a preset leaf area index was reached.
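The synthetic-image generation loop described above can be sketched in simplified form. In this sketch cutouts are plain pixel-offset masks, the leaf area index is reduced to a coverage fraction, and rotation, scaling, and the Gaussian-filter shadow are omitted:

```python
import random

def composite_until_lai(canvas_h, canvas_w, cutouts, target_lai, rng=None):
    """Paste randomly chosen plant cutouts (lists of (dy, dx) foreground
    offsets) at random positions until the fraction of covered canvas
    pixels reaches the target leaf area index (here a coverage fraction)."""
    rng = rng or random.Random(0)
    covered, placements = set(), []
    total = canvas_h * canvas_w
    while len(covered) / total < target_lai:
        cut = rng.choice(cutouts)
        top, left = rng.randrange(canvas_h), rng.randrange(canvas_w)
        for dy, dx in cut:
            y, x = top + dy, left + dx
            if 0 <= y < canvas_h and 0 <= x < canvas_w:
                covered.add((y, x))  # pasted foreground pixel
        placements.append((top, left))
    return placements, len(covered) / total

# One hypothetical 4 x 4 cutout pasted until 30% of a 32 x 32 canvas is covered.
square = [(dy, dx) for dy in range(4) for dx in range(4)]
placements, coverage = composite_until_lai(32, 32, [square], 0.3)
```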
Finally, there are some datasets that are specific to fruit. DeepFruits [
72] contains between 54 and 170 images for each of 7 fruits: sweet pepper, rock melon, apple, avocado, mango, orange, and strawberry. The Date fruit dataset [
73] contains two subsets of images related to dates: the first one includes 8079 images captured in Saudi Arabia by three color cameras divided according to fruit variety, maturity and harvesting decision, while the second one contains the images, videos, and weight measurements of date bunches acquired during the harvesting period.
5. Computer Vision Algorithms
Computer vision algorithms for precision agriculture address several tasks, including disease, weed, and pest recognition, health monitoring, irrigation management, and fruit, plant, or leaf counting. Disease and weed recognition are two of the most analyzed tasks, and they can be further categorized into: (i) classification of images according to their main subject, either binary (healthy vs. unhealthy plants, or crops vs. weeds) or multi-class (identifying the depicted plant species/family or disease); (ii) detection of healthy and unhealthy plants, disease regions, or weeds in the images, usually using rectangular bounding boxes; and (iii) segmentation of the images, assigning each pixel to the class it belongs to (e.g., healthy/unhealthy plant or crop, weed, and background).
Table 2 and
Table 3 list, respectively, multi-crop and single-crop algorithms for computer vision tasks in precision agriculture proposed or analyzed by the reviewed publications, along with the datasets used, the augmentation techniques, and the best results obtained on these datasets.
For classification tasks, CNNs are widely used, sometimes in combination with vision transformers, a technology based on the attention mechanism that has obtained good results in natural language processing [
74]. For detection tasks, one of the most widely used methods is the YOLO one-stage algorithm [
75], a standard in computer vision, particularly among object detection algorithms for real-time applications: several authors have compared its versions or tried to improve them for tasks related to precision agriculture. There are few works related to segmentation, probably due to the difficulty of annotating individual pixels in the images, a fact reflected in the lack of public datasets annotated for this task. The distribution of the works on algorithms mirrors that of the works on datasets also in relation to crops: some publications propose methods for analyzing several crop species, while the majority address a single crop species, often rice or cotton. The following works belong to the first group. Zhao et al. [
76] proposed a method based on YOLOv5, adding two lightweight structures for extracting feature information to the CSP structure of the neck part, the Ghost module [
77] and the inverted residual structure, introduced in the MobileNetV2 architecture [
78], and proposed the CAM module, with an improved channel attention mechanism. They also proposed an improved bounding box prediction method and replaced the Generalized Intersection over Union (GIoU) loss function of the original YOLOv5s with the Distance-IoU (DIoU) loss, which considers both the overlap area and the distance between the box centroids, decreasing convergence time by minimizing the distance between the two target frames. Augmentation includes the mosaic data enhancement method, proposed by Bochkovskiy et al. [
79], in which four images are randomly cropped and merged into a single image: this method is used in the first 40% of training rounds, while in the remaining 60%, normal data augmentation such as flip, scaling, length and width distortion, and color gamut transformation is used. Their model achieved a mean Average Precision (mAP) of 95.92% on a dataset of 1319 images extracted from the PlantVillage dataset with five crops and eight disease types, to which transformations like Gaussian blurring, horizontal flip, random rotation, and random brightness change were applied to obtain 4079 images. Yu et al. [
80] proposed Inception Convolution and Vision Transformer (ICVT), a deep learning model that mixes the Inception architecture [
81] with the vision transformer [
82], obtaining an accuracy of 99.94% on the same PlantVillage dataset.
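The DIoU idea adopted by Zhao et al. adds a normalized centre-distance penalty to the IoU term; a minimal sketch for axis-aligned boxes (an illustration of the loss definition, not the authors' implementation) is:

```python
def diou_loss(box_a, box_b):
    """DIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2):
    loss = 1 - IoU + d^2 / c^2, where d is the distance between the box
    centres and c the diagonal of the smallest enclosing box."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection and union areas for the IoU term
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0
    # Squared centre distance d^2
    d2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
       + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # Squared diagonal c^2 of the smallest enclosing box
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2
    return 1.0 - iou + (d2 / c2 if c2 > 0 else 0.0)
```

For identical boxes the loss is 0; for disjoint boxes it exceeds 1, the distance term pushing the prediction toward the target even when the IoU gradient vanishes.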
Table 2.
Multi-crop computer vision algorithms for precision agriculture.
Reference | Algorithms | Datasets | Augmentation | Best Results |
---|---|---|---|---|
[76] | Improved YOLOv5 with two lightweight structures for extracting feature information added to the neck part and a new channel attention mechanism | A dataset of 1319 images extracted from the PlantVillage dataset with five crops and eight disease types | Mosaic data enhancement, flip, scaling, length and width distortion, color gamut transformation | mAP = 95.92% |
[80] | ICVT, a mix between the Inception architecture and the vision transformer | PlantVillage | | Accuracy = 99.94% |
[83] | VGG-ICNN, a CNN model with four layers from VGG16 and three blocks from GoogleNet InceptionV7 | PlantVillage, a dataset derived from the PDDB dataset with 18 plant species, the Apple Dataset from the plant pathology challenge, and the two subsets of Plant Disease Detect related to maize and rice diseases | | Accuracy = 99.16% on PlantVillage, accuracy = 93.66% on PDDB |
[84] | Lighter MobileNetV3-small model with two layers quantized with lower-precision representations | PlantVillage | | Accuracy = 99.50% |
[85] | A downscaled Inception architecture, a downscaled residual architecture, and a downscaled dense residual architecture | PlantVillage | Cropping, padding, vertical and horizontal flip, translations, scaling, shearing, rotation, image sharpening, dropping of pixel values and color channels, addition of Gaussian noise and Gaussian blur | Accuracy = 96.75% with the dense residual architecture |
[86] | DFN-PSAN, a network with multi-level deep information feature fusion where feature extraction is carried out by an improved version of YOLOv5n with a PSAN classification layer and label smoothing | Katra-Twelve, BARI-Sunflower, containing 467 images of delicate leaves and infected sunflower leaves and flowers from Bangladesh, FGVC8, PlantVillage | A weather augmentation technique that simulates solar flares, the effect of rain, fog, and shadows from leaf shading | Accuracy = 99.89% on PlantVillage |
[52] | YOLOv3, YOLOv5, and Faster-RCNN | Weed25 | | mAP = 92.4% with YOLOv5 |
[87] | Small Inception model applied on square patches with a size of 32 × 32 pixels | PlantVillage version with leaves segmented from the background, 108 images from the PDDB dataset | | Accuracy = 94.04% on PlantVillage, Accuracy = 97.22% on the images from the PDDB dataset |
[88] | DIC-Transformer, a mix between Faster R-CNN and Swin Transformer with image caption generation | Dataset with 3971 images of 10 plant species affected by 18 diseases | | Accuracy = 85.4% |
Table 3.
Single-crop computer vision algorithms for precision agriculture.
Reference | Algorithms | Datasets | Augmentation | Best Results |
---|---|---|---|---|
[89] | DenseNet121, InceptionV3, MobileNetV2, ResNeXt101, ResNet152V, SEResNeXt101, an ensemble stack of DenseNet121, EfficientNetB7 and XceptionNet | Dataset with 900 images of nine rice diseases | Horizontal and vertical flip, distortion, shear transformation, rotation from −15° to 15°, rotations by multiples of 90°, skewing and intensity change | Accuracy = 97.62% with the ensemble of DenseNet121, EfficientNetB7 and XceptionNet |
[54] | Improved YOLOv5s with a reduced workload in the backbone network | Dataset with 7220 images of rice diseases | Mosaic data enhancement, cropping, scaling, flip, translation, rotation | mAP@0.5 = 98.65% |
[90] | MSDB-ResNet, a multi-scale dual-branch model with a GAN and a ConvNeXt residual block incorporated into a ResNet | Rice Leaf Disease Image Samples | Cropping, scaling, mirroring, brightness change, motion blur | Accuracy = 99.34% |
[91] | VGG-16 with an improved generalization in rice leaf detection | Rice Leaf Disease Image Samples | Cropping, tilting, rotation, blurring | Accuracy = 99.7% |
[55] | 11 deep learning models for image classification with cross-entropy and weighted cross-entropy losses, three YOLOv5 models | CottonWeeds | | Accuracy = 95.43% with MobileNet, mAP@0.5 = 87.5% with YOLOv5x |
[92] | 13 single-stage and two-stage object detectors based on deep learning | Reduced version of CottonWeedDet3 | Horizontal flip, shadow, rotation by 90°, brightness and contrast change, HSV and RGB shift, snow and rain, fancy PCA, blur, Gaussian noise | mAP@0.5 = 79.98% and mAP@[0.5:0.95] = 62.97% with RetinaNet R101-FPN |
[58] | 25 YOLO object detectors | CottonWeedDet12 | Horizontal and vertical flip, rotation, compression, fancy PCA, brightness and contrast change, RGB shift, Gaussian and multiplicative noise, blur | mAP@0.50 = 95.63% and mAP@[0.5:0.95] = 90% with YOLOv4 |
[93] | EADD-YOLO, a model based on YOLOv5 with shufflenet inverted residual blocks inserted in the backbone and depthwise convolutions inserted in the neck | Dataset with 26,377 images of apple leaf diseases | | mAP = 95.5% |
[94] | DBCoST, a Dual-branch model with a CNN branch derived from the ResNet-18 model and a Transformer one derived from the Swin Transformer Tiny | Subset of FGVC8 with five disease types, AppleLeaf9 + 756 images from FGVC8 | Horizontal and vertical flip, rotation, color jitter, normalization, Gaussian and salt-and-pepper noise | Accuracy = 97.32% on subset of FGVC8, accuracy = 98.06% on AppleLeaf9 + FGVC8 |
[95] | Enhanced YOLOX-Tiny with hierarchical mixed-scale units and convolutional block attention modules added to the neck part | Dataset with 340 images of tobacco crops with brown spot disease | | AP = 80.56% |
[96] | Lighter YOLOv5s with ghost convolution, an involution operator, an attention mechanism and a Content-Aware ReAssembly of Features | Dataset of 2246 images with seven strawberry diseases, PlantDoc | Mosaic data enhancement, HSV enhancement, brightness change, target occlusion | mAP@0.5 = 94.7% on their dataset, mAP@0.5 = 27.9% on PlantDoc |
[97] | Lesion Proposal CNN | Dataset of 3411 images with strawberry diseases | | Accuracy = 92.56% |
[68] | Cauli-Det, an improved YOLOv8 with three additional convolution blocks and Hard Swish | VegNet | | mAP@0.5 = 90.6%, mAP@[0.5:0.95] = 69.4% |
[98] | Ten deep learning models | Dataset of 656 images with four classes of cauliflower diseases | | Accuracy = 99.90% with EfficientNetB1, F1 = 99.62% with Xception |
[99] | Faster-RCNN improved by employing the ResNet-50 model with the use of spatial channel attention as the underlying network for computing deep keypoints | CD&S | | Accuracy = 97.89%, mAP = 94% |
[100] | Model based on a masked autoencoder with a Vision Transformer as the backbone structure | Dataset of 3256 images with two classes of potato diseases, CCMT | Cropping, horizontal flip, rotation | Accuracy = 99.35% on their dataset, accuracy = 95.61% on CCMT |
[101] | LeafSpotNet, a deep learning framework with a classification model based on MobileNetV3 | Dataset with 2000 images with jasmine plant diseases | Conditional GAN | Accuracy = 97% |
Thakur et al. [
83] introduced a lightweight CNN model for the identification of crop diseases, VGG-ICNN, with 6 million parameters, fewer than most top-performing deep learning models. This model has seven convolution layers: four initial layers from VGG16, mixed with two max pooling layers and pre-trained on ImageNet [
102], and three blocks from GoogleNet InceptionV7, randomly initialized, followed by a Global Average Pooling layer instead of a flattening layer to reduce the number of trainable parameters. Final classification is carried out by a fully connected layer with a SoftMax activation function. Their model is compared with four other crop disease detection and classification algorithms and five standard lightweight models on five different datasets: PlantVillage, one derived from the PDDB dataset with 18 plant species, the Apple Dataset from the plant pathology challenge, and the two subsets of Plant Disease Detect related to maize and rice diseases. The proposed model outperforms all the other methods on all the datasets except MobileNetV2, which does better on two of the five datasets: in particular, it reaches an accuracy of 99.16% on the PlantVillage dataset and of 93.66% on the subset of the PDDB one. Khan et al. [
84] proposed a model specific for edge computing devices. They quantized the "Linear" and "Conv2d" layers of the MobileNetV3-small model to lower-precision 8-bit representations using the PyTorch built-in quantization tool. The resulting model has 0.9 million parameters but, pre-trained on ImageNet data, maintains an accuracy of 99.50% on the PlantVillage dataset.
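PyTorch's quantization tooling handles the mapping internally, but the underlying idea of storing layer weights as 8-bit integers can be sketched as a simple affine (scale and zero-point) quantizer. This is an illustrative stand-in for the concept, not the library's actual code:

```python
def quantize_8bit(weights):
    """Affine 8-bit quantization: map floats to integers in [0, 255]
    using a scale and a zero point, as in standard uint8 post-training
    quantization schemes."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 if hi > lo else 1.0
    zero_point = round(-lo / scale)
    q = [max(0, min(255, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float weights from the 8-bit representation."""
    return [(qi - zero_point) * scale for qi in q]

# Round-trip a few illustrative weights; the reconstruction error stays
# below one quantization step (the scale).
w = [-0.51, -0.2, 0.0, 0.3, 0.49]
q, s, zp = quantize_8bit(w)
w_hat = dequantize(q, s, zp)
```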
Macdonald et al. [
85] introduced three lightweight deep learning architectures for classifying leaf diseases. According to the authors, full-scale state-of-the-art models designed for general purpose image classification tasks have a certain degree of redundancy when applied to objects like plants that exhibit similar shapes and sizes. In particular, they proposed a downscaled Inception architecture that includes Inception blocks from the GoogLeNet architecture [
81], a downscaled residual architecture that includes residual blocks from the ResNet architecture [
103], and a downscaled dense residual architecture that includes dense residual blocks from the DenseNet architecture. Augmentation includes vertical and horizontal flip, translations, scaling, shearing, cropping, padding, rotation, addition of Gaussian noise and Gaussian blur, image sharpening, and dropping of pixel values and color channels. The downscaled dense residual architecture achieves an accuracy of 96.75% on the PlantVillage dataset with an inference runtime of 31.7 ms on an NVIDIA RTX 3080 Laptop GPU, a decrease in accuracy of only 1.25% from the full-scale DenseNet-121 model, which has 31× more parameters. Dai et al. [
86] introduced DFN-PSAN, a network with multi-level deep information feature fusion. Feature extraction is carried out by an improved version of the YOLOv5n algorithm, where the convolutional neural network of the classification layer is replaced by a PSAN classification layer, which uses the PSA module [
104] for multi-scale feature fusion. Label smoothing [
105] was added to the cross-entropy loss function to reduce the risk of model overfitting. Their model obtained average accuracies between 93.24% and 98.37% in a k-fold cross-validation with the datasets Katra-Twelve, BARI-Sunflower, containing 467 images of delicate leaves and infected sunflower leaves and flowers from Bangladesh, and FGVC8. Furthermore, the proposed model achieves an accuracy of 99.89% on the PlantVillage dataset. They also used a weather augmentation technique to simulate solar flares and the effect of rain, fog, and shadows from leaf shading, which according to them causes an improvement in accuracy of between 0.69% and 2.99% on the three datasets. Their work includes an analysis of other attention mechanisms, among which ParNet [
106] reaches the highest accuracy of 94.78% on the FGVC8 dataset, and an interpretability analysis through the SHapley Additive exPlanation (SHAP) method [
107]. Wang et al. [
52] compared YOLOv3, YOLOv5, and Faster-RCNN on their Weed25 dataset, obtaining mAPs, respectively, of 91.8%, 92.4%, and 92.15%.
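The label smoothing that Dai et al. add to the cross-entropy loss replaces the one-hot target with a softened distribution; a minimal sketch of the standard formulation (the epsilon value here is illustrative):

```python
import math

def smooth_labels(num_classes, true_class, eps=0.1):
    """Replace a one-hot target with (1 - eps) extra mass on the true
    class and eps / num_classes spread uniformly over all classes."""
    base = eps / num_classes
    target = [base] * num_classes
    target[true_class] += 1.0 - eps
    return target

def cross_entropy(target, probs):
    """Cross-entropy between a (smoothed) target and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)

# Smoothed target for class 2 of 4: wrong classes keep a small nonzero
# probability, which discourages over-confident predictions.
target = smooth_labels(4, 2, eps=0.1)
loss = cross_entropy(target, [0.05, 0.05, 0.85, 0.05])
```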
Bouacida et al. [
87] introduced a method for generalizing the recognition of plant diseases to all plant and disease types by identifying the disease itself instead of considering only the visual appearance of the diseased leaf. To do so, they split each leaf image into smaller patches that do not retain whole-leaf characteristics; in particular, they use square patches with a size of 32 × 32 pixels, discarding patches with a percentage of black pixels greater than that of the original image, which is estimated to be around 50%. At inference time, the prevalence rate
P of the disease is found by computing the percentage of unhealthy patches over all patches that make up the whole leaf:
P = N_u / (N_u + N_h) × 100,
where N_u is the number of unhealthy patches and N_h is the number of healthy ones. As CNN, they used the small Inception model, a version of GoogLeNet Inception designed for small input sizes, trained from scratch on the version of the PlantVillage dataset created by Mohanty et al. [
108], where the leaves are segmented from the background. Their system achieved an accuracy on the same dataset of 94.04% and an accuracy of 97.22% on 108 images randomly taken from the PDDB dataset.
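At inference time the patch-voting scheme above reduces to a simple ratio over per-patch predictions; a minimal sketch (per-patch classifier outputs are assumed to be booleans):

```python
def prevalence_rate(patch_predictions):
    """Percentage of patches classified as unhealthy over all patches
    making up the leaf; patch_predictions holds True for unhealthy."""
    unhealthy = sum(1 for p in patch_predictions if p)
    return 100.0 * unhealthy / len(patch_predictions)

# e.g. 3 unhealthy patches out of 12 gives a prevalence rate of 25%.
P = prevalence_rate([True, False, False, True, False, False,
                     False, True, False, False, False, False])
```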
Zeng et al. [
88] used image caption generation to generate textual descriptions of plant areas affected by diseases and also used it to improve disease classification. Their two-stage model, called DIC-Transformer, uses Faster R-CNN with Swin Transformer as backbone to detect the diseased area and generate its feature vector, then uses the Transformer to generate image captions and the image feature vector, weighted by text features to improve the disease prediction performance. Swin Transformer has been chosen among 16 different analyzed backbones implemented in two open-source frameworks, Detectron2 [
109] and MMDetection [
110]. According to the authors, thanks to the self-attention mechanism, Transformer-based caption generation handles long-distance dependencies better, offers parallel computing capabilities, extracts abstract features, and captures internal correlations in data or features. They also compiled a dataset containing 3971 images of 10 plant species affected by 18 diseases with descriptive information about their characteristics, ADCG-18. Images were selected in a two-step process: first, deep learning models identified images not related to agriculture; then, agricultural professionals filtered them manually. The authors compared DIC-Transformer to 11 state-of-the-art caption generation methods and 10 state-of-the-art classification models; their system obtained the best performance, with BLEU-1, CIDEr-D, ROUGE, and accuracy values of 75.6%, 450.52, 72.1%, and 85.4%, respectively.
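Among the caption metrics reported, BLEU-1 is essentially a clipped unigram precision; the sketch below omits the brevity penalty of full BLEU, and the example sentences are invented:

```python
from collections import Counter

def bleu1_precision(candidate, reference):
    """Clipped unigram precision used in BLEU-1: each candidate word is
    counted at most as often as it appears in the reference caption."""
    cand, ref = Counter(candidate.split()), Counter(reference.split())
    clipped = sum(min(n, ref[w]) for w, n in cand.items())
    return clipped / max(1, sum(cand.values()))

# 4 of the 5 candidate words appear in the reference -> precision 0.8.
p = bleu1_precision("brown spots on the leaf",
                    "brown spots cover the leaf surface")
```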
Other works concentrate on specific crops. Ahad et al. [
89] compared six CNN architectures (DenseNet121, InceptionV3, MobileNetV2, ResNeXt101, ResNet152V, and SEResNeXt101) using a dataset of 900 images with a white background, evenly divided into nine classes of rice diseases from Bangladesh. The augmentation included random rotation from −15° to 15°, rotations by random multiples of 90°, random distortion, shear transformation, horizontal and vertical flip, skewing, and intensity change, obtaining 10 augmented images for every original image. The DenseNet121 [
111] and InceptionV3 models achieved a maximum accuracy of 97%. They also proposed an ensemble stack of DenseNet121, EfficientNetB7, and XceptionNet based on a weighted voting scheme that reaches an accuracy of 97.62%. According to their findings, transfer learning can increase accuracy by up to 17%. Li et al. [
54] proposed an improved version of YOLOv5s that reduces the workload of the backbone network for the identification of rice diseases, also reducing the weight of the model by a factor of four and increasing prediction speed by a factor of three. In particular, they deleted the Focus layer to avoid multiple slice operations and replaced the C3 module in the backbone with a Shuffle block module, reducing the number of network parameters while capturing long-range spatial information. Their model obtains an mAP@0.5 of 98.65% and an mAP@[0.5:0.95] of 68.53% on their dataset of rice diseases, a decrease of 0.18% and 1.48% from YOLOv5s. The augmentation includes mosaic data enhancement, flip, random translation, random rotation, random scaling, and random cropping, obtaining 18,456 images from the original 7220 photos. They also experimented with another network based on YOLOv5s, incorporating squeeze-and-excitation modules [
112] and elements from the PP-Picodet network [
80], but it failed to produce satisfactory results. Hu et al. [
90] introduced MSDB-ResNet, a multi-scale dual-branch model that uses a GAN, incorporates the ConvNeXt residual block into the ResNet model to optimize the calculation ratio of the residual blocks, and adjusts the size of the convolution kernel of each branch to extract disease features of different sizes. The authors tested this model on a dataset of 20,000 images obtained from the Rice Leaf Disease Image Samples through data augmentation methods such as random brightness, motion blur, mirroring, cropping, and scaling, obtaining an accuracy of 99.34%. Ritharson et al. [
91] introduced a new architecture based on VGG-16 for improving generalization in the classification of diseases that infect rice leaves, substituting the three fully connected layers of the original architecture with five dense layers and two dropout layers with 50% and 60% activation. According to the authors, the new layers make the model more capable of recognizing intricate patterns and abstract features within the images. The VGG-16 model was chosen because it performed better than other pre-trained networks such as Xception, DenseNet121, InceptionResNetV2, InceptionV3, and ResNet50. Their model reaches an accuracy of 99.7% on the Rice Leaf Disease Image Samples dataset, which they subdivided according to the severity of the disease, i.e., according to the spread of infection over the surface of the leaf (mild or severe), and cleaned by removing duplicate, noisy, and blurred images. The augmentation includes tilting, rotation, cropping, and blurring.
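The weighted voting behind the ensemble of Ahad et al. can be sketched as a weighted average of per-model class probabilities; the weights and probability vectors below are illustrative, not taken from the paper:

```python
def weighted_vote(prob_lists, weights):
    """Combine per-model class-probability vectors by weighted averaging
    and return the index of the winning class."""
    num_classes = len(prob_lists[0])
    scores = [sum(w * probs[c] for probs, w in zip(prob_lists, weights))
              for c in range(num_classes)]
    return max(range(num_classes), key=scores.__getitem__)

# Three models (e.g. DenseNet121, EfficientNetB7, XceptionNet) voting over
# three classes; two of the three favour class 1, which wins the vote.
pred = weighted_vote(
    [[0.6, 0.3, 0.1], [0.2, 0.7, 0.1], [0.3, 0.5, 0.2]],
    [0.4, 0.35, 0.25],
)
```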
Saini and Nagesh [
55] compared 11 deep learning models for image classification on their CottonWeeds dataset, trained through transfer learning from ImageNet, among which MobileNet achieves the highest accuracy at 95.43%. They also tried a loss adapted from the Weighted Cross-Entropy introduced by Phan and Yamamoto [
113] to reduce the impact of class imbalance on training, achieving an improvement over the models trained with the standard Cross-Entropy on the minority weed class purple nutsedge. Three YOLOv5 models were then evaluated for object detection on the same dataset, and YOLOv5x achieved the best performance, with an mAP@0.5 of 87.5%. Rahman et al. [
92] compared 13 single-stage and two-stage object detectors based on deep learning for the detection of weeds in cotton on their dataset CottonWeedDet3 (cleaned by discarding annotations smaller than 200 × 200 pixels and out-of-focus images, and by excluding images with more than 10 bounding boxes) and studied the effect of data augmentation on detection accuracy. Training was carried out by fine-tuning the pre-trained weights obtained through the MS COCO Dataset [
114]. RetinaNet R101-FPN [
115] achieves the best performance, with an mAP@0.5 of 79.98% and an mAP@[0.5:0.95] of 62.97%. According to their experiments, RetinaNet and Faster RCNN [
116] models are better than YOLOv5 and EfficientDet [
117] in detecting smaller bounding boxes. On the other hand, the authors recommended the YOLOv5n and YOLOv5s models for deployment on resource-constrained mobile devices thanks to their reduced inference time and number of parameters, while maintaining satisfactory accuracies (mAP@0.5 of 76.58% and 77.47%, respectively). The augmentation, which includes horizontal flip, brightness and contrast change, random HSV and RGB shift, random snow and rain, fancy PCA, random blur, Gaussian noise, random shadow, and random rotation by 90°, increased the mAP@0.50 for the two models chosen for experiments by up to 1.6% when the dataset size was increased 8×. Dang et al. [
58] evaluated 25 YOLO object detectors belonging to seven versions from YOLOv3 to YOLOv7 on their CottonWeedDet12 dataset and found that the best one in terms of mAP@0.5 is YOLOv4 with 95.22% without augmentation, while in terms of mAP@[0.5:0.95] it is Scaled-YOLOv4 with 89.72% without augmentation. Training was carried out by fine-tuning the pre-trained weights obtained through the MS COCO Dataset, while performance was assessed with a Monte Carlo cross-validation, repeating model training and evaluation five times with different random seeds. Augmenting the original training set four times with horizontal and vertical flip, random rotation, Gaussian noise, compression, fancy PCA, change of brightness and contrast, RGB shift, multiplicative noise, and blurring, the mAP@0.50 of YOLOv4 increased to 95.63%, while its mAP@[0.5:0.95] increased to 90%. The authors also claimed that the most suitable models for real-time weed detection are YOLOv5n and YOLOv5s, which have much smaller numbers of parameters and inference times, with only slight decreases in mAP. In this case, the beneficial effect of data augmentation is not clear, perhaps because YOLOv5 already incorporates standard data augmentation approaches.
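A class-weighted cross-entropy of the kind used by Saini and Nagesh to counter class imbalance can be sketched as follows; the inverse-frequency weighting and the class counts are assumptions for illustration, not the exact Phan and Yamamoto formulation:

```python
import math

def class_weights(counts):
    """Inverse-frequency class weights, normalized so they average to 1."""
    total = sum(counts)
    raw = [total / c for c in counts]
    mean = sum(raw) / len(raw)
    return [r / mean for r in raw]

def weighted_cross_entropy(probs, true_class, weights):
    """Cross-entropy scaled by the weight of the true class, so that
    errors on minority classes (e.g. purple nutsedge) cost more."""
    return -weights[true_class] * math.log(probs[true_class])

# Hypothetical class counts: the rare class 2 gets the largest weight,
# so an identical predicted probability yields a larger loss for it.
w = class_weights([900, 80, 20])
loss_minor = weighted_cross_entropy([0.2, 0.1, 0.7], 2, w)
loss_major = weighted_cross_entropy([0.7, 0.1, 0.2], 0, w)
```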
Zhu et al. [
93] proposed EADD-YOLO, a model based on YOLOv5 for the detection of apple leaf diseases by inserting shufflenet inverted residual blocks in the backbone and depthwise convolutions as efficient feature learning modules in the neck. Furthermore, a coordinate attention module was embedded in critical locations to improve the detection of diseases of different sizes, and the SIoU was used instead of CIoU as the bounding box regression loss to improve prediction box localization accuracy. Their model reaches an mAP of 95.5% on a dataset of 26,377 images of apple leaf diseases taken both indoors and outdoors. Si et al. [
94] proposed a dual-branch model called DBCoST for the classification of diseases in apple leaves. Their model combines CNNs, which are good at processing local features but whose limited receptive fields make them less suitable for capturing global information, and Transformers, which conversely are good at capturing global information and establishing long-range dependencies with their self-attention mechanism but are weaker at extracting local features. In particular, the CNN branch derives from the ResNet-18 model, while the Transformer one derives from the Swin Transformer Tiny, introduced by Liu et al. [
118], a hierarchical Transformer based on a shift window design. A feature fusion module composed of two parts, a Concatenation and Residual Block, and an improved channel attention module, integrates the features extracted by the two branches. Their model reached an accuracy of 97.32% on a subset containing five disease types from the FGVC8 dataset, using as augmentation horizontal and vertical flip, random rotation, color jitter, and normalization, and an accuracy of 98.06% on the AppleLeaf9 dataset with 756 images of mixed diseases from FGVC8, using in this case as augmentation also Gaussian noise and salt-and-pepper noise.
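Of the augmentations listed for DBCoST, salt-and-pepper noise is simple to sketch; here pixels are a flat list of grayscale values and the noise fraction is illustrative:

```python
import random

def salt_and_pepper(pixels, amount=0.05, rng=None):
    """Set a random fraction of pixels to pure black (pepper) or pure
    white (salt); pixels is a flat list of 8-bit grayscale values."""
    rng = rng or random.Random(42)
    noisy = list(pixels)
    n = max(1, int(amount * len(noisy)))
    for i in rng.sample(range(len(noisy)), n):  # distinct pixel indices
        noisy[i] = rng.choice((0, 255))
    return noisy

# A uniform mid-gray "image" of 400 pixels with 5% of them corrupted.
img = [128] * 400
noisy = salt_and_pepper(img, amount=0.05)
```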
Lin et al. [
95] introduced an enhanced YOLOX-Tiny network for detecting brown spot disease in images of tobacco crops, introducing into the neck network hierarchical mixed-scale units (HMUs) for information interaction and feature refinement between channels and convolutional block attention modules (CBAMs) to further enhance the ability to extract useful features. Their network achieves an AP of 80.56% in their dataset with 340 images. Chen et al. [
96] proposed a lighter YOLOv5s model for real-time strawberry disease detection. They enhanced the original model with a GhostConv for feature map extraction, an involution operator with SiLU activation for capturing larger-scale contextual information, an attention mechanism for emphasizing relevant image features, and a CARAFE operator for content-aware upsampling. The authors evaluated their model on a dataset of 2246 images with seven strawberry diseases, with augmentations that include the mosaic method, HSV enhancement, variations in illumination, and target occlusion, obtaining a reduction of 45% in the number of parameters, 77.5% in FLOPs, and 42.6% in model size with respect to the original YOLOv5s model, with an mAP@0.5 of 94.7%, which is 4.5% better than that of the original model. The authors also evaluated their model on the PlantDoc dataset, obtaining in this case an mAP@0.5 of 27.9%, an increase of 0.9% over that of the original model. Hu et al. [
97] introduced a Lesion Proposal CNN for the identification of strawberry diseases that first locates the main lesion object and then applies a lesion part proposal module to propose the discriminative lesion details. This system reached an accuracy of 92.56% on their dataset of 3411 images collected from Chinese fields and from the Internet.
Uddin et al. [
68] modified the YOLOv8 object detector to classify cauliflower diseases and to localize the affected areas in the image by adding three additional convolutional blocks with a kernel size of 1, inserted before the output convolutional layer: this improves the processing of the feature maps without significantly increasing the number of parameters. While the base YOLOv8 uses the Swish or Sigmoid-Weighted Linear Unit (SiLU) activation function, which incorporates a smooth sigmoid function, they used Hard Swish [
119], which introduces a clipped linear function and improves efficiency without a decrease in detection performance. The proposed method, called Cauli-Det, reaches an mAP@0.5 of 90.6% and an mAP@[0.5:0.95] of 69.4% on the annotated version of the VegNet cauliflower disease classification dataset. Kanna et al. [
98] compared 10 deep learning models on a dataset containing four classes of cauliflower diseases with 656 original images collected in Bangladesh. Pre-processing included conversion to grayscale; dilation and erosion to add or remove pixels at object boundaries; histogram equalization; and adaptive thresholding to extract the plants from the background. EfficientNetB1 obtained the best validation accuracy (99.90%), while Xception achieved the best F1 score (99.62%).
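The pre-processing chain described by Kanna et al. can be sketched with plain NumPy stand-ins (a toy 3×3 dilation and a mean-based adaptive threshold; the paper's exact kernel sizes and parameters are not reported here, so the values below are assumptions):

```python
import numpy as np

def to_grayscale(rgb):
    # standard luminance-weighted RGB-to-gray conversion
    return (0.299 * rgb[..., 0] + 0.587 * rgb[..., 1]
            + 0.114 * rgb[..., 2]).astype(np.uint8)

def hist_equalize(gray):
    # spread the intensity histogram via the cumulative distribution
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf = (cdf - cdf.min()) * 255 / max(cdf.max() - cdf.min(), 1)
    return cdf.astype(np.uint8)[gray]

def dilate3(gray):
    # morphological dilation with a 3x3 structuring element
    p = np.pad(gray, 1, mode='edge')
    h, w = gray.shape
    return np.max([p[i:i+h, j:j+w] for i in range(3) for j in range(3)], axis=0)

def adaptive_threshold(gray, k=7, offset=5):
    # binarize against the local mean in a k x k neighbourhood
    p = np.pad(gray.astype(float), k // 2, mode='edge')
    h, w = gray.shape
    local_mean = np.mean([p[i:i+h, j:j+w] for i in range(k) for j in range(k)],
                         axis=0)
    return ((gray > local_mean - offset).astype(np.uint8)) * 255
```

In practice a library such as OpenCV would supply optimized versions of each step; the sketch only makes the order and intent of the operations explicit.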
Masood et al. [
99] enhanced Faster-RCNN by employing the ResNet-50 model with the use of spatial channel attention as the underlying network for computing deep keypoints, achieving an accuracy of 97.89% in classification and an mAP of 94% in detection of infections on the Corn Disease and Severity dataset. Wang et al. [
100] proposed a model based on a masked autoencoder [
120], a self-supervised learning algorithm with an encoder-decoder architecture that trains a model by masking parts of the image and trying to reconstruct those, with a Vision Transformer used as the backbone structure, for plant leaf disease recognition. It includes a convolutional block attention module [
121] for enhancing the image features before the image blocks are passed to the encoder and a Gated Recurrent Unit [
122] for capturing the sequential relationship between the diseased image blocks and enhancing the processing of temporal information of the features passed from the encoder. The model, pre-trained on the PlantVillage dataset, is tested on two datasets: a dataset of 3256 images related to potato diseases divided into three categories (late blight, early blight, and healthy), augmented through rotation, random cropping, and horizontal flipping, and the CCMT dataset. It reaches an accuracy of 99.35% on the first dataset and of 95.61% on the second. V et al. [
101] presented LeafSpotNet, a deep learning framework for detecting leaf spot disease in jasmine plants, widely cultivated in Southeast Asia. They used a classification model based on MobileNetV3 [
123], a conditional GAN for data augmentation, and the Particle Swarm Optimization method [
124], which enhances feature selection by eliminating irrelevant features, achieving a classification accuracy of 97% on their dataset of 2000 images. Transfer learning from the ImageNet-21k dataset [
125] is used for training. According to the authors, the system presents a good robustness in various conditions, including extreme camera angles, varying lighting conditions, and grain noise.
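Particle Swarm Optimization for feature selection, as used in LeafSpotNet, can be sketched as a toy binary PSO over feature masks: each particle holds a probability vector that is thresholded into a subset of features and scored by a fitness function. The particle count, inertia, and acceleration coefficients below are illustrative assumptions, not the authors' settings:

```python
import numpy as np

rng = np.random.default_rng(0)

def pso_feature_select(fitness, n_features, n_particles=10, iters=30):
    """Toy binary PSO: positions in [0, 1]^n are thresholded at 0.5 into
    boolean feature masks; `fitness` scores a mask (higher is better)."""
    pos = rng.random((n_particles, n_features))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_f = np.array([fitness(p > 0.5) for p in pos])
    gbest = pbest[pbest_f.argmax()].copy()
    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # inertia + cognitive (personal best) + social (global best) terms
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, 0, 1)
        f = np.array([fitness(p > 0.5) for p in pos])
        improved = f > pbest_f
        pbest[improved], pbest_f[improved] = pos[improved], f[improved]
        gbest = pbest[pbest_f.argmax()].copy()
    return gbest > 0.5
```

In a real pipeline the fitness would be, e.g., validation accuracy of a classifier trained on the selected feature subset, penalized by subset size.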
6. Synthetic Datasets
Although the significance of dataset quality is widely acknowledged, comprehensive evaluations of synthetic data generation as a potential solution remain relatively scarce. Existing scholarship underscores synthetic data’s capacity to enhance privacy, address access constraints, and expand limited datasets, yet often provides only partial insights into its underlying methodologies and evaluative frameworks [
126,
127,
128]. To fill this void, our current investigation scrutinizes both the efficacy and limitations of synthetic data techniques, thereby promoting best practices and encouraging robust adoption. Recent advances in diffusion models substantiate their potential to produce high-fidelity synthetic datasets across diverse domains. In medical imaging, diffusion-based methods yield high-resolution 3D brain images and synthetic MRI/CT scans, reinforcing patient privacy and mitigating data scarcity [
129,
130], ultimately enhancing downstream training [
131], though at considerable computational cost [
132]. Meanwhile, Generative Adversarial Networks (GANs) remain pivotal, particularly in agricultural tasks where environmental variability and limited labeled data hinder model robustness [
72,
133,
134,
135,
136]. Techniques like SMOTE [
137] and ADASYN [
138] further address class imbalance but require careful parameter tuning. Although retrieval-based augmentations can sometimes outperform generative approaches under constrained resources [
139], diffusion frameworks such as DatasetDM continue to excel, generating both synthetic images and detailed annotations [
131]. Collectively, these developments underscore synthetic data’s critical role in mitigating data scarcity, reducing annotation costs, and fostering more resilient and generalizable models across healthcare, agriculture, and beyond.
Table 4 presents an overview of pivotal publications and their contributions to synthetic dataset generation across diverse domains, which will be explored in greater detail later in this section. It highlights the methodologies employed, the primary focus of each study, and their main contributions. The entries encompass diverse approaches, including diffusion models, GANs, and hybrid datasets, showcasing their applications in fields such as medical imaging, agriculture, and generative modeling. This summary emphasizes advancements in synthetic data generation, its effectiveness in addressing domain-specific challenges, and its potential to enhance machine learning and data-driven solutions.
Yang et al. [
140] explored the potential of using AI-generated images as data sources for enhancing visual intelligence. It delves into how generative AI, including Generative Adversarial Networks (GANs) and diffusion models (DMs), can produce synthetic images that closely resemble real-world photographs, offering unmatched abundance, scalability, and the ability to rapidly generate vast datasets. These synthetic images are useful for training machine learning models, simulating scenarios for computational modeling, and performing testing and validation. The paper discusses the technological foundations of this approach, including the utilization of neural rendering and the integration of 3D scene representations. It also addresses ethical, legal, and practical considerations, highlighting the transformative potential of synthetic data in advancing various computer vision tasks and applications such as image classification, segmentation, and object detection. The comprehensive survey underscores the significant impact of synthetic data on improving the efficiency, cost, and performance of AI models in visual intelligence. Burg et al. [
139] investigated the efficacy of using diffusion models versus image retrieval for data augmentation in computer vision tasks. The authors conducted an evaluation comparing various techniques for generating augmented images, focusing in particular on the performance of diffusion models against a simpler nearest-neighbor retrieval method applied to the pre-training dataset. The key finding was that retrieval-based methods not only improved classification accuracy more effectively but also required significantly fewer computational resources than the sophisticated diffusion models. This suggests that for data augmentation purposes, the simplicity and efficiency of retrieval-based methods make them a more practical choice over complex generative approaches, especially in scenarios with limited computational resources. This was evident across different datasets, including a 10% subset of ImageNet [
102] and the Caltech256 dataset [
143]. Wu et al. [
131] generated data using diffusion models, addressing the challenges of data scarcity and annotation costs in deep learning. The authors proposed DatasetDM, a model that leverages pre-trained diffusion models to generate diverse synthetic images with high-quality perception annotations, such as segmentation masks and depth maps. This is achieved through a unified perception decoder (P-Decoder), which decodes the latent code from the diffusion model into perception annotations. The methodology consists of two stages: training the P-Decoder with a minimal set of manually labeled images (less than 1% of the original dataset), and using the trained P-Decoder for infinite data generation guided by text prompts. This paradigm shift from text-to-image to text-to-data generation allows for the creation of large-scale annotated datasets efficiently. Experiments demonstrated that models trained on synthetic data generated by DatasetDM achieve state-of-the-art results in various tasks, including semantic and instance segmentation, depth estimation, and pose estimation. For instance, DatasetDM improves the mIoU on the VOC 2012 dataset [
144] by 13.3% and the AP on the MS COCO 2017 dataset by 12.1%. Moreover, the synthetic data show enhanced robustness in domain generalization and can be flexibly applied to new tasks, such as image editing. The use of diffusion models in generating synthetic datasets for agriculture presents a significant advancement in overcoming data scarcity and enhancing the performance of machine learning models. These methods provide scalable solutions for generating annotated datasets that are crucial for training robust computer vision models in agricultural applications. Sapkota et al. [
135] investigated the use of OpenAI’s DALL·E model for generating synthetic image datasets for agriculture. The study examined both text-to-image and image-to-image generation methods to create realistic agricultural images. It evaluated the generated images against ground truth images using metrics such as Peak Signal-to-Noise Ratio (PSNR) and Feature Similarity Index (FSIM), finding that image-to-image generation yielded better clarity but lower structural similarity. The research highlights the potential of AI-generated imagery to streamline data collection and enhance machine vision applications in agriculture, ultimately improving crop monitoring and yield estimation. Wachter et al. [
141] emphasized the necessity of high-quality, diverse data for training AI systems. They identified the ‘data problem’, highlighting issues like data sparsity and class imbalance prevalent in agricultural data. The authors proposed hybrid datasets, combining real and synthetic data, as a solution to bridge the ‘reality gap’ of synthetic data. A unified taxonomy for data types—real, synthetic, augmented, and hybrid—is presented to clarify terminological inconsistencies in the literature. Real data are defined as information collected from the physical world, while synthetic data are generated through algorithms or manual processes. Augmented data, considered a subset of synthetic data, involve transformations of real data. Hybrid datasets contain both real and synthetic samples, improving model performance, especially in scenarios with limited real data availability. Voetman et al. [
136] challenged the notion that deep object detection models always require extensive real-world training data. They introduced ‘Genfusion’, a framework that leverages pre-trained stable diffusion models to generate synthetic datasets. The key idea is to fine-tune these models on a small set of real-world images using a technique called DreamBooth, enabling the generation of images that closely resemble specific real-world scenarios. The authors demonstrated the effectiveness of Genfusion in the context of apple detection. They fine-tuned the model on a subset of images from the MinneApple dataset and used the generated synthetic data to train YOLO object detectors (YOLOv5 and YOLOv8). The performance of these detectors was then compared to a baseline model trained on real-world data. The results showed that object detectors trained on synthetic data performed comparably to the baseline model, with the average precision (AP) deviation ranging from 0.09 to 0.12. While the baseline models achieved higher AP scores, the results highlight the potential of synthetic data generation as a viable alternative to collecting large amounts of training data. Sehwag et al. [
142] addressed the challenge of sample deficiency in low-density regions of data manifolds in image datasets. Applying diffusion models to generate novel high-fidelity images from these low-density regions can be particularly useful in agricultural datasets where certain conditions or scenarios are underrepresented. Their modified sampling process guides image generation towards low-density regions while preserving fidelity, ensuring the production of unique, high-quality samples without overfitting or memorizing training data. Lu et al. [
133] provided an extensive review of the application of Generative Adversarial Networks (GANs) in agricultural image analysis. They focused on the challenges posed by biological variability and unstructured environments in obtaining large-scale, annotated datasets for training high-performance models. The review details the evolution of various GAN architectures such as DCGAN, CycleGAN, and StyleGAN, and their roles in image augmentation for tasks like plant health detection, weed recognition, and postharvest quality assessment. Olaniyi et al. [
134] investigated the applications of Generative Adversarial Networks as a deep learning approach to data augmentation of agricultural images. They reported significant performance improvements when using GAN-augmented datasets for various tasks, including disease recognition, weed management, fruit detection, and zootechnics. Ref. [
145] used Deep Convolutional Generative Adversarial Networks (DCGAN) to create a synthetic dataset for cotton leaves affected by various diseases. The DCGAN algorithm generates images that mimic the characteristics of the original dataset, thereby providing a larger and balanced set of training samples. This method enhances the performance of machine learning models in detecting diseases in cotton leaves. The study demonstrates that synthetic data generated using DCGAN improves model accuracy and efficiency, validating its potential in agricultural data augmentation.
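The DCGAN generator referenced above upsamples a low-resolution latent tensor through a stack of strided transposed convolutions. A single such layer can be sketched in plain NumPy (a naive toy implementation for clarity, omitting the batch normalization and ReLU that a real DCGAN layer would apply afterwards):

```python
import numpy as np

def transposed_conv2d(x, w, stride=2):
    """One transposed-convolution layer, the upsampling building block of a
    DCGAN generator. Each input pixel 'stamps' a scaled k x k kernel onto an
    enlarged output grid. x: (C_in, H, W); w: (C_in, C_out, k, k)."""
    c_in, H, W = x.shape
    _, c_out, k, _ = w.shape
    out = np.zeros((c_out, (H - 1) * stride + k, (W - 1) * stride + k))
    for i in range(H):
        for j in range(W):
            for c in range(c_in):
                out[:, i * stride:i * stride + k,
                       j * stride:j * stride + k] += x[c, i, j] * w[c]
    return out
```

With the 4×4 kernels and stride 2 typical of DCGAN, a 4×4 map grows to 10×10 ((H−1)·s+k); chaining several such layers expands a small latent tensor to full image resolution.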
7. Edge Computing and Real Time: Optimal Algorithms and Advancements in Hardware
In the context of digital agriculture, recent contributions converge on the utilization of edge computing devices (often Raspberry Pi and NVIDIA Jetson platforms) to achieve low-latency data processing and enable real-time decision making in diverse agricultural tasks.
Table 5 lists key contributions to edge computing for digital agriculture.
By integrating advanced machine learning frameworks (including CNNs, RNNs, and Transformers) on resource-constrained edge devices such as Raspberry Pi and NVIDIA Jetson Nano, these works collectively advance the state of precision agriculture. Low-latency solutions and accurate, near-real-time analytics are attained through architectural layering, sensor fusion, and optimized models, paving the way for sustainable and scalable agricultural practices.
In order to achieve low latency when communicating information from the production environment, part of the data processing can be executed in edge computing devices that are close to the sensors. According to Restrepo-Arias et al. [
146], the Raspberry Pi single-board computer is the most used device in edge computing for precision agriculture. Abioye et al. [
147] presented an IoT-based monitoring framework for measuring soil moisture content and irrigation volume and for computing the reference evapotranspiration. The collected data were transmitted to a Raspberry Pi 3 controller for onward online storage and displayed on the IoT dashboard. Adami et al. [
148] proposed a system for detecting ungulates and for protecting fields from their intrusion through the creation of virtual fences based on ultrasound emission. They evaluated the object detectors YOLOv3 and Tiny-YOLOv3 on two edge computing devices: the Raspberry Pi Model 3 B+ (with or without the Intel Movidius Neural Compute Stick) and the NVIDIA Jetson Nano. As connectivity solutions, they recommend LoRa and LoRaWAN. When the PIR sensor of the Animal Repelling device detects a movement, a message is sent to the edge computing device through an XBee radio interface. The edge computing device then executes the object detector, and if an animal is detected, a message is sent back to the Animal Repelling device, which generates the ultrasound appropriate to the detected animal. The authors collected 1000 images of wild boars and deer in the Tuscany region of Italy under both cloudy and sunny weather conditions, augmented to 10,000 through jitter, image rotation, flip, cropping, multi-scale transformation, hue, saturation, Gaussian noise, and intensity change. YOLOv3 reaches an mAP of 82.5%, while Tiny-YOLOv3 reaches 62.4%. On the NVIDIA Jetson Nano in the 20 W operational mode, YOLOv3 runs at an average of 3 FPS and Tiny-YOLOv3 at 15 FPS; with the Raspberry Pi, the frame rate reaches at most 4 FPS. The authors also noted that the out-of-the-box Jetson keeps an acceptable CPU temperature, while the Raspberry Pi requires a PWM fan.
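Frame-rate figures like those above can be reproduced with a small timing harness wrapped around any inference callable (a generic sketch, not the authors' benchmark code; warm-up iterations are included because the first inferences on accelerators are typically slower):

```python
import time

def measure_fps(infer, frames, warmup=3):
    """Average frames per second of `infer` over a list of frames.
    A few warm-up calls are discarded so one-time initialization costs
    (model loading, accelerator spin-up) do not skew the average."""
    for f in frames[:warmup]:
        infer(f)
    start = time.perf_counter()
    for f in frames:
        infer(f)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed
```

The same harness applies whether `infer` wraps a YOLO model on a Jetson or a lightweight classifier on a Raspberry Pi, which makes the reported FPS numbers directly comparable.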
Prabu et al. [
149] proposed an IoT-based crop field protection system that detects both crop diseases and animal interference. Motion, temperature, ultrasonic, and acoustic sensors mounted on poles and drones sent collected data to a Raspberry Pi 4B module through wired and wireless mediums. Recurrent Convolutional Neural Networks (RCNNs) were used to detect abnormalities in leaf and field images; the observations were then sent to Recurrent Generative Adversarial Networks (RGANs) for detailed analysis and to find the definitive anomalies. The authors tested their system on a dataset containing 1200 leaf images of tomato, potato, spinach, wheat, and corn plants and on a dataset containing 250 images of animal and bird intrusions, obtaining classification accuracies from 98.7% to 99.8%. Gomez et al. [
150] introduced FARMIT, an architecture based on IoT and machine learning/deep learning for continuous assessment of agricultural crop quality. It is divided into three layers: the physical layer, which gathers information from sensors about crops and executes the corrective actions through the actuators according to the commands received from the other layers, the edge layer, whose purpose is to obtain low latency in communication with the sensors and to control and manage tasks on physical devices, and the cloud layer, which analyzes collected data through machine learning/deep learning models and gives the results to desktop, mobile, or web applications through REST services. Corrective actions can be executed by users of applications or automatically when an anomaly is measured by the sensors. The authors deployed FARMIT in a tomato plantation in the south of Spain, using sensors for temperature, wind, rain, electrical conductivity, humidity, radiation, and carbon dioxide and images from RGB cameras converted to the Lab color space. Data analysis was carried out by a Random Forest regressor with 100 estimators. Devi et al. [
151] proposed a combination of IoT sensors and devices, image processing, and machine learning/deep learning for disease detection, weed detection, and process control for the cultivation of beans in India. Their system uses the Normalized Difference Vegetation Index (NDVI), which quantifies the amount of green vegetation in an area of land, to assess the health of bean leaves; K-Means Clustering (KMC), Fuzzy C-Means clustering (FCM), and Region Growing methods to extract and analyze the diseased regions of the bean leaves; and Local Binary Patterns (LBP), the Gray-Level Co-occurrence Matrix (GLCM), and their combination, the Local Binary Gray-Level Co-occurrence Matrix (LBGLCM), to capture physiological attributes of the bean leaves. Temperature, humidity, and soil moisture sensors are directly connected to an Arduino UNO board, while the ThingSpeak platform is used for visualization of the captured data. They tried two different deep learning frameworks: the first based on an EfficientNetB7 with Bidirectional Long Short-Term Memory (BiLSTM), the second on a VGG16 with an attention layer integrated at each stage of the network to enhance feature awareness, both pre-trained on ImageNet. The authors compared the proposed approach with the performance of three humans with up to 5 years of experience: the two models reached accuracies of 95% and 96%, respectively, in the classification between diseased and healthy leaves and in weed detection, while human accuracy did not exceed 80%.
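The NDVI used by Devi et al. is the normalized difference between near-infrared and red reflectance: healthy vegetation reflects strongly in NIR and absorbs red light, pushing the index toward +1, while soil and stressed plants score near zero. A minimal implementation:

```python
import numpy as np

def ndvi(nir, red, eps=1e-9):
    """Normalized Difference Vegetation Index, computed per pixel:
    NDVI = (NIR - Red) / (NIR + Red), bounded in [-1, 1].
    `eps` avoids division by zero on dark pixels."""
    nir = nir.astype(float)
    red = red.astype(float)
    return (nir - red) / (nir + red + eps)
```

Applied to co-registered NIR and red bands, the result is a per-pixel health map that can be thresholded to flag stressed regions of the canopy.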
Rashavand et al. [
152] proposed a novel approach to Automatic Modulation Recognition (AMR) using Transformer networks, a deep learning architecture originally designed for natural language processing. The authors highlighted the potential of Transformers in AMR due to their ability to process sequential data in parallel and capture dependencies across different parts of the input sequence. They introduced four distinct tokenization strategies and evaluated their performance on two datasets: RadioML2016.10b and CSPB.ML.2018+. The results demonstrate that their proposed Transformer-based model, TransIQ, outperformed existing deep learning techniques in AMR accuracy, particularly in low signal-to-noise ratio conditions. This research aligns with the broader trend of applying Transformers to various domains beyond natural language processing, as seen in works like [
74,
154], showcasing the versatility and effectiveness of this architecture. Singh et al. [
153] discussed the development and experimental validation of a smart agricultural drone integrated with IoT technologies and machine learning techniques. Their drone employed TensorFlow Lite with the EfficientDetLite1 model to identify crops, achieving an inference time of 91 ms. The system features two spray modes for optimal pesticide application, operating autonomously using real-time data. The drone, equipped with an X500 development kit, has a payload capacity of 1.5 kg, a flight time of 25 min, and a speed of 7.5 m/s at a height of 2.5 m. The research aims to enhance sustainable farming by improving pesticide use efficiency and crop health monitoring through precise and autonomous agricultural practices.
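The paper does not detail the decision rule behind its two spray modes; one plausible sketch is to switch between spot and broadcast spraying based on how much of the frame the detected targets cover. The threshold, box format, and function name below are hypothetical assumptions for illustration, not the authors' controller:

```python
def choose_spray_mode(detections, broadcast_threshold=0.5):
    """Toy spray-mode selector (hypothetical, not from the paper).
    detections: list of (x, y, w, h, score) boxes with normalized sizes.
    If the detected boxes cover a large fraction of the frame, broadcast-spray
    the whole area; otherwise spot-spray each box individually.
    (Overlap between boxes is ignored in this coverage estimate.)"""
    coverage = sum(w * h for (_x, _y, w, h, _score) in detections)
    if coverage >= broadcast_threshold:
        return "broadcast", None
    return "spot", [(x, y, w, h) for (x, y, w, h, _score) in detections]
```

A rule of this shape keeps the per-frame decision cheap enough to run alongside the 91 ms detector inference reported above.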
7.1. Emerging Edge Devices
In
Table 6, we offer a concise survey of emerging high-performance edge devices—beyond the commonly employed Raspberry Pi and NVIDIA Jetson Nano—that can substantially enhance computational throughput for precision agriculture. These platforms leverage powerful GPUs, specialized AI accelerators, or advanced CPUs to accommodate increasingly demanding deep learning pipelines (e.g., complex CNNs, transformers, and other computationally intensive architectures). Our investigation considers the following selection criteria, which we deem pivotal for successful edge deployment in agricultural contexts:
Computational Performance: Assessment of CPU/GPU capabilities and the presence of hardware accelerators or specialized AI modules.
Memory Capacity: Evaluation of on-board RAM and storage, essential for real-time image/video processing and handling large model footprints.
Power Consumption: Analysis of operational power requirements, thermal management, and energy efficiency in variable environmental conditions.
Form Factor and Portability: Consideration of device size, weight, and suitability for integration with agricultural machinery or field stations.
Cost and Scalability: Balancing initial investment with scalability and ease of upgrades as sensor networks grow.
Ecosystem and Developer Support: Examination of software libraries, community resources, and compatibility with popular ML frameworks (e.g., TensorFlow Lite, PyTorch).
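The criteria above can be combined into a simple weighted ranking to compare candidate platforms. The scores and weights in the example are purely illustrative placeholders, not measured figures for any real device:

```python
def rank_devices(devices, weights):
    """Rank edge devices by a weighted sum of per-criterion scores in [0, 1].
    `devices` maps a device name to {criterion: score}; every criterion is
    oriented so that higher is better (invert raw cost and power draw
    before scoring). Returns device names, best first."""
    totals = {name: sum(weights[c] * scores[c] for c in weights)
              for name, scores in devices.items()}
    return sorted(totals, key=totals.get, reverse=True)
```

Shifting weight between, say, computational performance and power consumption makes the trade-off between a high-end Jetson-class board and a low-power accelerator explicit rather than anecdotal.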
Table 6 highlights devices that exhibit higher computational capabilities than the Raspberry Pi or the NVIDIA Jetson TK1/TX1/TX2, allowing more complex and larger-scale inference tasks at the edge. By adopting a holistic set of assessment metrics (computational performance, memory capacity, power consumption, form factor, cost, and ecosystem support), we aim to capture the practical trade-offs between theoretical efficiency and real-world feasibility in demanding agricultural scenarios. For instance, although the NVIDIA Jetson AGX Orin dominates in raw GPU power, its higher price point and increased power draw might prove challenging for large-scale field deployments that lack stable power infrastructure. Conversely, compact solutions like the Google Coral Dev Board or Intel Movidius NCS2 offer remarkable inference speed at a fraction of the energy cost, albeit with tighter memory constraints and certain limitations on model complexity.
As modern agriculture continues to incorporate more sophisticated deep learning models—e.g., transformers for specialized tasks such as automatic modulation recognition [
152], object detection in aerial imagery [
153], or real-time disease identification [
149]—selecting an edge platform that effectively balances performance and resource usage becomes paramount. Next-generation platforms not only expand the computational envelope but also open avenues for real-time data fusion, multi-modal sensor integration, and advanced interpretability (e.g., through on-device saliency mapping). Consequently, they enable continuous or near-continuous monitoring of crops, weather, and livestock, thereby enhancing the timeliness and specificity of agronomic interventions. Future research should further quantify the lifecycle cost of these edge devices, as well as their environmental footprint, to guide sustainable technology adoption in precision agriculture.
7.2. The Role of Wireless Communication
In this review, we primarily focus on a technological framework centered on edge computing applications in modern agriculture. However, wireless communication and remote computing remain essential for tasks that cannot be performed in real time, such as complex analyses requiring human interaction, long-term data storage, and, more generally, operations that do not benefit from on-the-move execution. Examples include the generation of prescription maps, the acquisition of 3D models from point clouds, the representation of digital twins, and the creation of field datasets for training machine learning models.
Two comprehensive reviews on wireless communication in agriculture can be found in [
5,
6]. These works analyze and evaluate technologies such as Bluetooth, GPRS/3G/4G, Long Range Radio (LoRa), SigFox, WiFi, ZigBee, RFID, and NB-IoT in the context of digital agriculture. Their findings are particularly insightful, as they systematically compare key aspects such as power consumption, communication range, cost, applications, and limitations. We recommend them as practical references for selecting the most suitable protocol and designing a wireless framework tailored to specific agricultural applications. Beyond these references, recent reviews on advancements in the mentioned technologies include [
155], which examined the role of LoRa in smart farming and its integration with IoT, and [
156], which evaluated the performance of 5G in agricultural robotics, comparing it with 4G and WiFi6—an emerging wireless communication standard—particularly in the context of real-time applications.
A specific application of wireless technology in agriculture involves passive (battery-free) sensors that are externally powered by ground rovers or drones, enabling wireless data transmission without the need for an embedded battery or, even more limiting, cabled connections. This approach provides a significant advantage, as powering sensors and other fixed, crop-based devices remains a critical challenge, in addition to the risk of soil contamination from battery chemicals. While small photovoltaic panels can be used as an energy source, they are not an optimal solution due to economic and practical constraints. In [
157], notable examples of passive humidity and temperature sensors are presented, along with an introduction to the concept of energy harvesting. Although this is traditionally associated with UAVs, it is also applicable to ground rovers, where the reduced distance to the sensors provides a clear advantage. An extreme implementation of passive sensors is presented in [
158], where nature-inspired, millimeter-scale devices mimicking dandelion seeds are designed to enhance scalability and flexibility for various sensing and computing applications. These devices, capable of traveling 50–100 m in a gentle to moderate breeze, are powered by lightweight solar cells and energy harvesting, and can efficiently cover large crop areas.
As UGVs and agricultural machines primarily operate in open fields where continuous connectivity is not guaranteed, it is essential to consider solutions based on opportunistic communication. This network paradigm enables data transmission only under favorable conditions, making it well-suited for scenarios where traditional communication infrastructure is unreliable, intermittent, or energy-constrained. In such cases, data are temporarily stored on local devices (e.g., a UGV) and transmitted once a coverage area is reached. Notably, drones can also function as mobile relays, flying over a network of isolated sensors to collect data and subsequently transmit it to a receiving station. Another compelling application is opportunistic inter-peer communication, where, for example, multiple UGVs or UGV-UAV pairs exchange data to coordinate and optimize strategies for handling complex or large-scale tasks [
159,
160].
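The opportunistic, store-and-forward communication described above reduces to a simple pattern: buffer readings locally and transmit only when a link is available, as when a UGV re-enters a coverage area or a relay drone passes overhead. A minimal sketch (the capacity and oldest-first drop policy are illustrative choices):

```python
from collections import deque

class OpportunisticUplink:
    """Store-and-forward buffer for intermittently connected field devices.
    Readings queue locally; `flush` transmits everything only when the link
    is up. With a bounded buffer, the oldest samples are dropped first."""

    def __init__(self, capacity=1000):
        self.buffer = deque(maxlen=capacity)
        self.sent = []  # stand-in for the actual radio transmission

    def record(self, sample):
        self.buffer.append(sample)

    def flush(self, link_up):
        """Return the number of samples transmitted (0 if the link is down)."""
        if not link_up:
            return 0
        n = len(self.buffer)
        while self.buffer:
            self.sent.append(self.buffer.popleft())
        return n
```

The same logic applies symmetrically to a relay drone, which plays the receiving role while airborne over isolated sensors and the transmitting role once back at the base station.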
We conclude this section with a discussion on Global Navigation Satellite System (GNSS), a fundamental technology for precision farming. While not an absolute novelty, GNSS has recently become accessible through low-cost solutions that were unfeasible just a few years ago, when only high-end, next-generation tractors or expensive aftermarket products provided such capabilities. This affordability is primarily due to recent advancements in hardware, such as U-blox chips and compact antennas with enhanced performance. Related technologies, including Precise Point Positioning (PPP) and high-accuracy Real-Time Kinematic (RTK), have direct applications in the generation of prescription maps, variable-rate treatments, precise crop georeferencing, and the autonomous navigation of UGVs, UAVs, and modern agricultural machinery. In [
161], a recent state-of-the-art review on low-cost GNSS receivers, the authors conclude that in open-sky conditions—common in agriculture—the performance of these devices is comparable to that of traditional, high-cost systems.
10. Discussion and Conclusions
Recent advances in ICT, including hardware devices and optimized software libraries for computer vision, alongside the availability of innovative deep learning models for image classification, now enable modern agricultural solutions that were previously unattainable. However, several challenges remain to be addressed before these concepts can be translated into practical tools for everyday farming.
In
Section 2, we reviewed a substantial number of studies addressing various aspects of AI, IoT, and AIoT in agriculture. Some of these studies highlight open issues and challenges, offering roadmaps for future advancements. Notably, Adli et al. [
8] identified several key challenges in technology adoption, including the complexity of interconnected devices, data privacy and security concerns, trust in AIoT technologies, the resilience of complex sensor systems, and the difficulty of developing robust, effective, and cost-efficient wireless networks capable of covering entire crop areas. Interestingly, some of these challenges can be alleviated by adopting edge-based AIoT solutions. These systems reduce reliance on cloud-based analysis, facilitate real-time decision making, eliminate the need for complex wireless communication networks in open fields, enhance data security, and lower the costs of remote analysis services. However, deferred data processing on remote platforms remains essential for tasks not requiring immediate results, such as generating high-precision spatial models for crop digital twins, and for long-term data storage. This research topic [
9] provides valuable insights and critical analyses of the open challenges in AIoT for agriculture. This field is still in its early stages and faces several issues, including data acquisition, the optimization of AI algorithms, and the limited performance of hardware required to operate under harsh environmental conditions. The use of ground rovers, which could mitigate significant labor shortages in manual-intensive practices, also requires the resolution of non-trivial technological challenges.
Several concurrent developments in the current AI landscape support the emergence of a new edge-based AIoT paradigm for digital agriculture. First, concerning the training phase, today's AI models for image classification (see
Section 5) have reached a high level of maturity, with many modern techniques achieving excellent accuracy without the need for excessively powerful computational resources. Moreover, transfer learning from pre-trained models has demonstrated high effectiveness, significantly reducing the need to train large-scale classifiers from scratch, an approach that would otherwise require expensive high-performance computing platforms and considerable processing time. This favorable context facilitates precise model calibration through iterative and trial-and-error methodologies, enabling the creation of highly optimized deep learning classifiers.
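As a toy illustration of this transfer-learning setup, the sketch below freezes a feature extractor and trains only a small classification head. It is a minimal sketch in pure NumPy: the fixed random projection stands in for a pre-trained CNN backbone, and the data, dimensions, and learning rate are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a frozen, pre-trained feature extractor: a fixed random
# projection (in practice, a CNN backbone with its weights frozen).
W_frozen = rng.normal(size=(16, 64))

def extract_features(x):
    return np.tanh(x @ W_frozen)  # frozen weights: never updated

# Toy binary classification data (hypothetical stand-in for crop images).
X = rng.normal(size=(200, 16))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Only the small classification head is trained (the transfer-learning step).
w_head = np.zeros(64)
b_head = 0.0
lr = 0.5

F = extract_features(X)  # features computed once; the backbone is not updated
losses = []
for _ in range(300):
    p = sigmoid(F @ w_head + b_head)
    losses.append(-np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9)))
    grad = p - y  # gradient of the cross-entropy loss w.r.t. the logits
    w_head -= lr * (F.T @ grad) / len(y)
    b_head -= lr * grad.mean()

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

Only the head's few dozen parameters are optimized, which is why calibration by trial and error remains cheap even without high-performance computing platforms.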
Second, the hardware advancements reviewed in
Section 7 are delivering a new generation of devices designed to balance performance and energy efficiency. These include edge GPUs and other accelerators for deep learning inference. Combined with faster and larger memory chips, this marks a departure from the traditional reliance on low-cost, low-power, and low-performance hardware such as the Raspberry Pi. The latest generation of systems-on-chip, while remaining affordable, offers enhanced AI capabilities and more consistent processing throughput. Examples of practical applications made possible by these innovations will be presented shortly.
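One widely used technique for fitting deep learning inference onto such resource-constrained edge devices is post-training weight quantization. The sketch below is a simplification, not any specific toolchain's API: it applies symmetric linear quantization of float32 weights to int8, cutting memory four-fold at a bounded accuracy cost.

```python
import numpy as np

rng = np.random.default_rng(1)

# Float32 weights of a hypothetical trained layer.
w_fp32 = rng.normal(scale=0.2, size=(256, 128)).astype(np.float32)

# Symmetric linear quantization to int8: w is approximated by scale * q.
scale = np.abs(w_fp32).max() / 127.0
q = np.clip(np.round(w_fp32 / scale), -127, 127).astype(np.int8)

# Dequantize to estimate the worst-case approximation error.
w_deq = q.astype(np.float32) * scale
max_err = float(np.abs(w_fp32 - w_deq).max())

print(f"fp32: {w_fp32.nbytes} B, int8: {q.nbytes} B, max error: {max_err:.5f}")
```

The rounding error is bounded by half the quantization step, which is why int8 inference on edge accelerators typically preserves classifier accuracy well.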
This brings attention to a critical issue: training datasets are often the weakest link in the process. Obtaining high-quality plant images in large quantities is challenging due to the seasonality of crops and the variability of phenotypes across space and time. Publicly available high-quality datasets (see
Section 4) are limited and often focus on plant species and cultivars that differ from those of interest to end-users, restricting their utility to the setup and validation of newly developed AI workflows. In this context, the capability to generate realistic, automatically annotated synthetic images could represent a significant breakthrough and is likely to become a key research focus in the near future. The advanced AI models for image-to-image and text-to-image generation discussed in
Section 6 can utilize either powerful data servers or high-end consumer GPUs to produce large-scale or small-to-medium-scale synthetic datasets, respectively. Images in these collections feature realistic dimensions, and an entire dataset can be generated within typical computation times ranging from several hours to a few days. This capability provides a valuable resource for enhancing the quality of training datasets and, consequently, improving classifier performance.
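A key reason synthetic images can be "automatically annotated" is that the generator knows where each object was placed, so labels come for free. The minimal sketch below uses a hypothetical toy compositor (not a generative model) purely to illustrate that principle: each sample is an image plus a bounding-box annotation derived from the generation parameters themselves.

```python
import numpy as np

rng = np.random.default_rng(2)

def make_sample(size=128, patch=24):
    """Composite a bright 'plant' patch onto a dark 'soil' background and
    return the image together with its automatically derived label."""
    img = rng.uniform(0.0, 0.2, size=(size, size))   # synthetic soil background
    x = int(rng.integers(0, size - patch))
    y = int(rng.integers(0, size - patch))
    img[y:y + patch, x:x + patch] += 0.7             # synthetic plant patch
    bbox = (x, y, patch, patch)                      # annotation costs nothing
    return img, bbox

# A small, fully labeled synthetic dataset, generated in milliseconds.
images, labels = zip(*(make_sample() for _ in range(100)))
print(len(images), labels[0])
```

Real pipelines replace the toy compositor with image-to-image or text-to-image models, but the economics are the same: annotation effort no longer scales with dataset size.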
Within this framework, novel AIoT applications can be designed and implemented, with edge computing and real-time inference as key enablers. One example is autonomous systems for continuous agricultural monitoring, where early detection of issues such as pest infestations and disease outbreaks enables cost-effective and environmentally friendly protection strategies. These challenges cannot be addressed reliably by human experts, given the labor shortages in modern agriculture, or by existing remote sensing methods, such as satellites, which face technical limitations. Instead, they can be tackled by low-cost unmanned ground vehicles (UGVs) capable of autonomous operation (potentially in conjunction with drones) in vineyards, orchards, olive groves, and other low-density crops. Another practical example is the use of smart booms attached to tractors for spraying chemical active ingredients to control weeds or pests. Edge-based AIoT enables robust systems capable of localizing crop emergencies on the move, allowing targeted interventions such as precision spraying. These systems can adjust their operational effort based on the presence of the target, enabling precise and efficient treatments.
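The on-the-move control logic of such a smart boom can be sketched as a simple mapping from per-section detection confidences to spray commands. The function name, threshold, and confidence values below are hypothetical, not taken from any real controller:

```python
def nozzle_commands(confidences, threshold=0.6):
    """Map per-section weed-detection confidences (one value per boom
    section, as produced by an edge classifier each frame) to on/off
    spray commands. Hypothetical control logic for illustration."""
    return [conf >= threshold for conf in confidences]

# Example frame: only boom sections 1 and 3 exceed the detection threshold,
# so only their nozzles are activated while the tractor keeps moving.
cmds = nozzle_commands([0.1, 0.8, 0.3, 0.95])
print(cmds)  # [False, True, False, True]
```

In a deployed system this decision must complete within the frame budget imposed by the tractor's speed, which is precisely why on-board (edge) inference, rather than a round trip to a remote server, is the enabling factor.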
While not all desired applications in modern agriculture are currently feasible with existing software and hardware technologies, it is important to note the rapid pace of innovation in these fields. On the one hand, it is essential to continuously monitor advancements from ICT manufacturers to identify new products that could significantly enhance the capabilities of edge-based AIoT systems, such as those exemplified earlier. On the other hand, progress in algorithmic development can greatly facilitate the transition to expert systems, for instance, by moving from quasi-real-time to true real-time performance through improved inference speeds. Two recent methods, Kolmogorov-Arnold networks (KANs) and XNets, have emerged within the past few months; they are reviewed in
Section 8. Although still in their early stages, these approaches have generated high expectations. In the context of this review, they hold significant promise for enabling more efficient models that can be deployed on edge devices for real-time agricultural practices. Monitoring the development of these technologies is critical, as they may lead to substantial advancements in the near future.
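To convey the core idea behind KANs, the sketch below trains a single learnable univariate edge function, the building block that KANs place on network edges instead of fixed weights and activations. As an illustrative assumption, a piecewise-linear function on a fixed grid stands in for the B-spline parameterization used in actual KANs; the grid size, learning rate, and target function are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(3)

# Core KAN idea: each edge carries a *learnable* univariate function.
# Here, one edge function parameterized by its values at fixed knots
# (a crude piecewise-linear stand-in for a B-spline basis).
grid = np.linspace(-1.0, 1.0, 11)
coefs = np.zeros(11)  # learnable function values at the knots

def edge_fn(x, c):
    return np.interp(x, grid, c)

# Fit the edge function to a nonlinear target by gradient descent.
X = rng.uniform(-1.0, 1.0, size=400)
target = np.sin(np.pi * X)
lr = 0.5

for _ in range(200):
    err = edge_fn(X, coefs) - target
    for i in range(len(grid)):
        # The gradient w.r.t. knot i is weighted by its "hat" basis function.
        basis = edge_fn(X, np.eye(len(grid))[i])
        coefs[i] -= lr * np.mean(err * basis)

mse = float(np.mean((edge_fn(X, coefs) - target) ** 2))
print(f"final MSE: {mse:.4f}")
```

A full KAN sums many such learned univariate functions across layers; the appeal for edge deployment is that compact, adaptive basis functions may reach a given accuracy with fewer parameters than a conventional MLP, though this remains to be validated at scale.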