*3.4. Study Selection*

Figure 2 depicts the flowchart of the approach adopted to select the articles according to the PRISMA guidelines [4]. The search in the digital libraries using the search string provided a total of **142** articles. In order to discard studies not relevant to our review, we removed papers according to the following technical criteria: (1) the type of publication, by eliminating materials such as editorials, short papers, posters, theses, dissertations, brief communications, commentaries, and unpublished works; (2) articles partially or wholly not written in English; (3) papers whose full text was unavailable. In this step, a total of 24 papers were removed, leaving 118 publications. To select the appropriate studies for this review, in the first screening task, only the records (title, abstract, and keywords) of each article were analyzed independently by two of the authors. Each researcher evaluated the title and the abstract according to the eligibility criteria to decide whether that paper should be included in the next screening phase. A paper included by one of the researchers resulted in a full-text assessment in the next phase; 50 papers were selected by the reviewers in this phase. In the last phase, all the researchers read the full papers and decided whether to include each work in the review based on the eligibility criteria and on criteria of relevance, rigor, credibility, and quality. Most of the papers were excluded in this phase because EI was used only in the title or abstract (as a buzzword) [36–39] or only in the related works section [40], and therefore did not represent a fundamental element of the solution proposed in the manuscript [41,42]. To guarantee the high quality of the selected studies, the final inclusion of a paper in the review was reached by consensus among the researchers (i.e., only if the majority of the researchers evaluated it as suitable for the review; in case of parity, a discussion between the researchers took place to decide about the inclusion). In this phase, the researchers selected and analyzed a total of 20 papers.

In parallel, we also performed an extensive *snowballing* search to identify other eligible studies (relevant, but not found by our query) according to the reference lists (*back-in-time search*) and citations (*forward-in-time search*) of the included studies. In particular, we repeated our query by including the term "intelligence continuum", which is sometimes used to refer to smart solutions distributed across all the architectural levels, hence including the edge. Furthermore, for the sake of maximum comprehensiveness, we also searched the *gray literature*, thus covering relevant documents unlisted in electronic databases since they are usually provided by government and professional organizations, such as technical reports, Ph.D. theses, patents, company white papers, etc. Lastly, **14** secondary studies specifically related to EI were analyzed in more detail, and they are reported and compared in the framework of Table 2.

## **4. Literature Review**

Already from an initial screening of our search results, we observed that the literature related to EI is not well consolidated. Although the term EI occasionally appears in older EI-related works such as [16,41], only in the last decade has the research established clearer boundaries of the domain, and only in the last four years (2019–2023) have more systematic studies been published, mostly in conference proceedings and journals. From a deep study of all the analyzed works, indeed, we were able to identify a clear classification of the studies under the general umbrella of EI: the majority of them are narrowly focused on specific techniques and domains (e.g., [18,19,43–45]), while few horizontal studies seek to explore the fundamentals, perspectives, and trends of EI (whether with coarse-grained [42,46] or fine-grained [12,14,35] analysis). However, as previously pointed out, the specific aim of this survey was to methodically shed light on the state-of-the-art of EI by performing a systematic analysis (interestingly and somewhat surprisingly, there are only one systematic literature review [7] and one systematic classification [35]) in the form of a tertiary study, thus centering our analysis on comprehensive reviews, surveys, roadmaps, etc. In this direction, at the time of writing, milestone works are [12,35,47], which provide key contributions by discussing core components and concepts, designing theoretical frameworks, analyzing technology drivers, and exploring capabilities, benefits, opportunities, gaps, and use cases for current EI scenarios, as well as for the next decade. These three manuscripts are the most-cited ones (with more than 200 citations in a short period of time), but obviously, many other interesting and relevant findings came from the other works identified by adopting the research methodology discussed in Section 3, whose comparative analysis is summarized in Table 2. Therefore, driven by the Research Questions (RQs) outlined in Section 3.1, we carried out an accurate literature analysis, whose research directions and main findings (enclosed in dotted boxes) are concisely presented in Figure 3.

**Figure 3.** Directions and main findings (enclosed in dotted boxes) of the performed study on EI.



Referring to **RQ1**, although almost twelve years have passed since the first appearance of the term [16] (also referred to as "Edge AI" in [41]), there is still no formal definition of EI. All the surveyed works promote similar definitions of EI in which the terms edge computing and AI naturally appear side-by-side. With a deeper look, however, these definitions can be classified into two groups:


The definitions of the first group aim to stress the achieved independence of edge nodes from the cloud, but, in this way, they definitively narrow down the scope of EI; the EI definitions of the second group, instead, expose a holistic perspective (not centered on the algorithmic capabilities of single edge nodes), reasoning in terms of a seamless edge–cloud ecosystem [35] and promoting a continuum between the two domains and all their actors, technology enablers, etc. [53]. Indeed, for example, in [12], six EI levels are defined, and they form a collaborative hierarchy to be integrated for the design of efficient EI solutions. Far from providing the umpteenth EI definition, we adhered to the latter view, and we believe that the "evolutionary" one limits the potential of EI: indeed, in order to enable novel IoT services or to optimize the overall system performance, all the available system data and resources should be fully and opportunistically exploited. Precisely in the direction of such a full-fledged EI vision, Refs. [47,51] provide an interesting interpretation by identifying two complementary contributions, namely *"AI for Edge"* (or Intelligent Edge) and *"AI on Edge"*, also referred to in [55] as *"AI for Operations"* and *"Operations for AI"*. The former focuses on providing optimal solutions to key problems in edge computing (e.g., data offloading, energy management, node coordination) with the help of popular AI techniques, while the latter studies how (i.e., with which hardware platforms, programming frameworks, methods, and tools) to perform the whole process of AI model building, i.e., training, inference, and optimization, on edge devices despite their intrinsic resource limitations. A minimal sketch of the *"AI for Edge"* idea is given below.
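To make the *"AI for Edge"* side more concrete, the following minimal Python sketch (not taken from the surveyed works; all names and the cost model are hypothetical) shows how a learned model could drive a classic edge computing decision, namely whether to execute a task locally or offload it to an edge server:

```python
import math
from dataclasses import dataclass

@dataclass
class TaskContext:
    """Hypothetical features describing a task and the current network state."""
    input_size_kb: float    # payload to transmit if offloaded
    cpu_cycles_m: float     # estimated millions of CPU cycles required
    uplink_mbps: float      # current uplink throughput
    battery_level: float    # 0.0 (empty) .. 1.0 (full)

def offload_score(ctx: TaskContext, weights: list[float], bias: float) -> float:
    """Logistic 'policy': in a real system, the weights would be learned
    offline (e.g., by regression over past latency/energy measurements)."""
    features = [ctx.input_size_kb, ctx.cpu_cycles_m, ctx.uplink_mbps, ctx.battery_level]
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def should_offload(ctx: TaskContext) -> bool:
    # Illustrative weights: heavy tasks, a good uplink, and a low battery
    # push towards offloading; large payloads push against it.
    weights = [-0.02, 0.05, 0.3, -2.0]
    return offload_score(ctx, weights, bias=-0.5) > 0.5

if __name__ == "__main__":
    ctx = TaskContext(input_size_kb=120, cpu_cycles_m=800, uplink_mbps=20, battery_level=0.3)
    print("offload" if should_offload(ctx) else "run locally")
```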

Referring to **RQ2**, it was found that a reference architecture purposely designed for EI is still missing. Indeed, even if the international community is actively working towards the development of a comprehensive edge computing reference architecture [56–59] with a relevant portion of "intelligence" located on edge devices, the full development of an "Edge-native AI system" is currently far away, being only sketched in [27,35]. With respect to the analyzed works, half of them (7 out of 14 [7,47,48,50,52,53,55]) do not deal with such a point, while the remaining ones discuss a multi-level architecture, which is, implicitly or explicitly, strongly influenced by the IoT and by the ETSI MEC reference architecture [14,60]; indeed, these edge computing architectures appear tailored to conform to and mirror the IoT's layered structure, which generally consists of various closely intertwined layers that manage different system functionalities, such as data collection, processing, and management [61]. Notably, the majority of the surveyed works (5 out of 7 [12,15,35,51,54]) expose *two-layer architectures* (i.e., edge and cloud layers), while only [14,49] include a third, intermediate layer, which is mainly responsible for networking (from LAN to WAN) and interoperability (protocol conversion) tasks. Therefore, it emerges that the fog computing layer is losing attractiveness, being embedded in the so-called "thick Edge" (comprising, precisely, gateways and other special-purpose devices), except for some industrial use cases with particular requirements. This can be attributed to the ever-increasing power and miniaturization, and the decreasing cost, of IoT boards and microcomputers, which, most of the time, can perform typical fog computing duties (caching, pre-processing, etc.). Such a trend is especially noticeable in [51], whose authors define an "Edge Computing network" layer by distinguishing, on one side, devices such as base stations and gateways and, on the other side, tablets, smartphones, smartwatches, etc. Interestingly, only [55] and, primarily, [7] markedly stress the importance of a seamless interaction between the architectures, by presenting ad hoc methods, libraries, and frameworks for machine learning and data analytics on the *edge-to-cloud continuum*, in the spotlight today thanks to the recent initiative "European Cloud, Edge and IoT Continuum" led by the European Commission [62].

Referring to **RQ3**, an examination of the selected works revealed a common focus on EI's general objectives, applications, and use cases (especially [14,50,51,53–55]), while the key technical topics can be grouped primarily into four categories that we purposely defined:


A preponderance of the reviewed literature (8 out of 14 [7,12,35,47,49,51,52,54]) concentrates on the *KDD* category, thereby shedding light on techniques pertaining to data cleaning and preprocessing, feature selection and extraction, and model building and evaluation. Notably, within this category, the subjects of edge training and edge inference have garnered significant interest among researchers, as they pertain to the key methods and techniques that address the challenges of implementing intelligent systems at the edge of the network.

Additionally, over half of the works (8 out of 14 [7,12,48–51,54]) also address topics related to *HW platforms and SW frameworks*, enumerating the mainstream EI software and hardware tools. One of the key findings was the prevalent use of GPU, FPGA, and ASIC hardware chips to support intelligence at the edge of the network. These chips are favored for their ability to provide the necessary computational power and flexibility for real-time data processing and analysis, leading to the development of various hardware platforms based on them that are widely used in current Edge AI applications. Examples include the Nvidia Jetson family (GPU-based), the Google Coral Edge TPU (ASIC-based), and the Horizon Sunrise (FPGA-based), all of which are known for their high performance and energy efficiency. Additionally, the machine learning libraries most frequently referenced in the analyzed works include TensorFlow Lite, Core ML, and PyTorch Mobile. These libraries are widely used for developing and deploying models on edge devices, and some of them, such as TensorFlow Lite, have been specifically optimized to run natively on hardware configurations such as the Google Coral family (a minimal conversion sketch is given at the end of this discussion). An intriguing discovery is the introduction of an open framework for EI, known as OpenEI, presented in [48]. This lightweight software platform imbues the edge with sophisticated processing and data-sharing capabilities. OpenEI comprises a deep learning package that is specifically optimized for resource-constrained edge devices, including a plethora of refined AI models, providing a streamlined solution for the deployment of EI applications.

Then, we found that approximately half of the examined literature (5 out of 14 [35,47,49,51,52]) focuses on the primary techniques that seek to sustain and preserve the added value of IoT *services*. Within this category, edge caching and edge offloading have been the most extensively researched [35,51,52], as they address the critical need for efficient data management and processing at the edge of the network, followed by evergreen (well explored in the past, yet still crucial) topics such as service placement, user mobility, topology management, etc. [47,49].

Finally, an intriguing discovery is that only a minority of the reviewed papers (4 out of 14 [7,14,15,48]) reserve an adequate discussion for the *interoperability* topic, even though its centrality has been widely recognized in the IoT ecosystem: these works agree that a rapid adoption of EI technologies by vendors and industry goes through IoT gateways and unified interfaces for the system life-cycle (e.g., cross-platform software and RESTful APIs for requirements assessment, authentication, resource discovery, system configuration, and deployment), but they additionally focus on different aspects. For example, Ref. [48] delves into the transfer of data between edge nodes and cloud servers, emphasizing the importance of seamless collaboration; Ref. [7], instead, conducts an in-depth analysis of the collaborative aspect of the edge-to-cloud continuum, while, finally, Ref. [15] primarily concentrates on standardization, but from an industry perspective (by shedding light on requirements, potentials, and gaps in multiple use cases and domains, such as manufacturing, smart cities, and smart buildings). Conversely, there is no reference (except as an open point in [12,35]) to semantic technologies.
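As an illustration of the deployment workflow these libraries enable, the following sketch converts a small Keras model to the TensorFlow Lite format with post-training optimization enabled. It is a minimal example using TensorFlow's public API, not a pipeline taken from the surveyed works; additional steps (full-integer quantization with a representative dataset, plus the Edge TPU compiler) would be needed to target Coral hardware:

```python
import tensorflow as tf

# A small stand-in model; in practice, this would be a trained network.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Convert to TensorFlow Lite, letting the converter apply its default
# post-training optimizations (e.g., weight quantization).
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

# The resulting flat buffer can be shipped to an edge device and run
# with the lightweight TFLite interpreter.
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
print(f"TFLite model size: {len(tflite_model) / 1024:.1f} KiB")
```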

Referring to **RQ4**, a noteworthy outcome is that 5 papers [14,15,52,53,55] out of the 14 did not provide insights on any specific enabling techniques for EI, but rather, focused on imparting a general overview of their principal contribution. As for the remaining nine works [7,12,35,47–51,54], a thorough analysis resulted in the classification of EI's key technologies into the following categories:


The categories that the majority of the analyzed works center on are edge inference [7,12,35,47–51] and edge training [7,12,35,49,51,54] (respectively, 8 and 6 out of 9). This is indicative of ongoing research efforts aimed at understanding the most efficient ways to train ML models and provide timely predictions and analyses as close as possible to both end devices and end users. *Edge inference* pertains to the utilization of a pre-trained model or algorithm to make predictions or classify new data on edge devices or servers. The majority of current AI models are optimized for deployment on devices with ample computational resources, making them unsuitable for edge environments. The reviewed literature, however, identifies two main challenges in enabling efficient edge inference [12,35,48,51]: designing models that are suitable for deployment on resource-constrained edge devices or servers and accelerating inference to provide real-time responses. One widely discussed approach to addressing these challenges is model compression (for reducing the size and computational requirements of existing models without significantly affecting their accuracy) and, especially, its techniques of network pruning and parameter quantization. Another prominent approach discussed in the literature is model partitioning, which involves transferring the computationally intensive portions of a model to an edge server or a neighboring mobile device, thus reducing the workload on the endpoint device and significantly enhancing inference performance: in this regard, the technique of model early exit has garnered much attention, as it enables the use of output data from the early layers of a DNN to achieve a classification result, thus allowing the inference process to be completed using only a subset of the full DNN model (a toy sketch of this idea follows).
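As a toy illustration of the early-exit idea (a self-contained sketch, not an implementation from the surveyed works; the architecture and confidence threshold are arbitrary), the following PyTorch model attaches an auxiliary classifier to an intermediate layer and skips the remaining layers whenever that classifier is sufficiently confident:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EarlyExitNet(nn.Module):
    """A tiny DNN with one intermediate ('early') exit head."""

    def __init__(self, num_classes: int = 10, threshold: float = 0.9):
        super().__init__()
        self.threshold = threshold
        self.stage1 = nn.Sequential(nn.Linear(64, 32), nn.ReLU())
        self.early_head = nn.Linear(32, num_classes)  # cheap auxiliary classifier
        self.stage2 = nn.Sequential(nn.Linear(32, 32), nn.ReLU())
        self.final_head = nn.Linear(32, num_classes)

    @torch.no_grad()
    def forward(self, x: torch.Tensor) -> tuple[torch.Tensor, bool]:
        h = self.stage1(x)
        early_probs = F.softmax(self.early_head(h), dim=-1)
        # If the early head is confident enough, stop here and skip
        # the remaining (more expensive) layers.
        if early_probs.max().item() >= self.threshold:
            return early_probs, True
        h = self.stage2(h)
        return F.softmax(self.final_head(h), dim=-1), False

model = EarlyExitNet()
probs, exited_early = model(torch.randn(1, 64))
print("early exit" if exited_early else "full model", probs.argmax(dim=-1).item())
```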

Unlike traditional centralized training methods that are executed on powerful servers or computing clusters, *edge training* is typically performed in a decentralized manner by using a training dataset located on devices with less computational power at the network's edge. This poses several challenges, such as selecting the appropriate training architecture, increasing the training speed, and optimizing performance. The surveyed works propose various techniques to address these issues. The most commonly used architectures in the literature are "solo training" [35,51], where tasks are performed on a single device, and "collaborative training" [35,49,51], where multiple devices work together to train a shared model or algorithm. It is noteworthy that solo training has higher hardware requirements, which edge devices often cannot meet; as a consequence, several works focus on collaborative training architectures and techniques such as Federated Learning (FL) (which has been proposed in several variations, such as communication-efficient FL, resource-optimized FL, security-enhanced FL, and hierarchical FL [35,51]) and knowledge transfer learning (a minimal federated averaging sketch is given at the end of this discussion). The latter method involves training a primary network (referred to as the "teacher network") on a base dataset and then transferring the acquired knowledge, in the form of learned features, to a secondary network (referred to as the "student network") for further training on a target dataset. This technique promises to drastically reduce the energy costs of model training on both end devices and edge servers.

Approximately half of the papers (4 out of 9 [7,35,48,49]) also deal with *modeling* techniques for the design of ML models aimed at fully leveraging the limited resources of edge devices. According to the literature reviewed [7,12,35,47,49,51,54], deep learning outperforms other machine learning methods in a variety of tasks, including image classification, object detection, and face recognition. These deep learning models are commonly referred to as Deep Neural Networks (DNNs) due to their layered architecture. Despite the fact that DNNs can take on a variety of structures [12,49,51], such as Multilayer Perceptrons (MLPs), Convolutional Neural Networks (CNNs), and Recurrent Neural Networks (RNNs), the surveyed works primarily focus on the general DNN architecture. The increasing complexity and computational demands of modern DNN models make it challenging to run them on edge devices with limited resources, such as mobile devices, IoT terminals, and embedded devices. To address this challenge, recent works such as [7,35,48,49] focus on designing lightweight DNN models tailored to resource-constrained devices, making them more suitable for edge environments. According to [35], this approach can significantly improve the performance of training and inference tasks on edge devices.

The categories of management and collaboration received relatively less attention in the analyzed works, with only 3 [35,47,51] and 2 [7,51] papers, respectively, out of 9 addressing these topics. The *management* techniques primarily focus on optimizing data retrieval and processing speed and on minimizing power consumption and thermal stress on the edge device. Edge caching and computation offloading are widely used techniques to achieve these goals [35,51,52]: the former involves storing frequently accessed data on edge devices, which reduces latency and increases data retrieval speed; the latter, on the other hand, distributes the computational workload among a group of edge devices and encompasses various strategies such as Device-to-Cloud (D2C), Device-to-Edge (D2E), Device-to-Device (D2D), and hybrid offloading [35,52]. The *collaboration* category delves into methods for fostering cooperation and coordination among edge devices and other network entities, such as vertical and horizontal collaboration and integral and partial task offloading [51].
It is particularly notable that the survey [7], through its extensive citation and analysis of a plethora of works, makes a significant, albeit indirect, contribution to all the outlined categories, except for "management".
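To ground the collaborative training discussion, the following sketch shows the core aggregation step of federated averaging (FedAvg-style), in which a coordinating node combines locally trained model weights in proportion to each client's dataset size. It is a minimal NumPy illustration under simplifying assumptions (synchronous clients, identical model shapes), not a production FL framework:

```python
import numpy as np

def federated_average(client_weights: list[list[np.ndarray]],
                      client_sizes: list[int]) -> list[np.ndarray]:
    """Aggregate per-client model parameters, weighting each client by
    the number of local training samples it holds (FedAvg-style)."""
    total = sum(client_sizes)
    coeffs = [n / total for n in client_sizes]
    # Weighted sum, layer by layer, across clients.
    return [
        sum(c * layers[i] for c, layers in zip(coeffs, client_weights))
        for i in range(len(client_weights[0]))
    ]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    shapes = [(4, 3), (3,)]  # toy model: one weight matrix and one bias vector
    # Three clients, each with locally updated weights and a dataset size.
    clients = [[rng.normal(size=s) for s in shapes] for _ in range(3)]
    sizes = [100, 400, 500]
    global_weights = federated_average(clients, sizes)
    print([w.shape for w in global_weights])
```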

Finally, answering **RQ5**, the most frequently mentioned application use cases in the reviewed literature pertain to the domains of smart cities [12,15], smart homes [48,51], smart factories [7,35], healthcare [49,50], entertainment [52,54], and automotive [47,51]. Notably, healthcare applications related to disease prediction [63,64], automotive applications exploiting connected and autonomous vehicles [65,66], as well as smart factory applications for the Industrial IoT [49,67] have been receiving significant attention from both industry professionals and researchers. The benefits of EI, such as low-latency communication (crucial in life-or-death situations), reduced bandwidth consumption (essential for energy efficiency in resource-limited devices), and enhanced privacy through the local storage of sensitive information, render these areas particularly appealing. Then, most of the surveyed works report some well-known, yet still unaddressed challenges typical of distributed computing and, hence, of the IoT [61], such as scalability [53,55], security and privacy [12,50,52], ethical issues [7,53], pervasiveness and ubiquity [7,14], resource optimization [48,52,54], heterogeneity [14,15,50,54], data scarcity and consistency [35,47,54], etc. However, these issues generally refer to the edge computing scenario rather than to EI, whose main specific open challenges (and related future directions), instead, focus on:


Although all of them are relevant, some of the identified gaps in the EI literature are particularly challenging. For example, particular emphasis should be given to the preliminary evaluation of EI solutions under development; indeed, while there exist some simulators conceived for IoT and edge computing scenarios, only [70,71] specifically focus on EI and on the many orthogonal issues it raises across the edge–cloud continuum [72]. Then, open (horizontal, vertical, and specialty) standards [12,50], robust platform abstractions [51], and flexible, deployment-transparent programming approaches [73,74] are also key to dealing with the inherent heterogeneity, scalability, and dynamicity of EI scenarios. In particular, even if standardization processes are typically burdensome efforts of indefinite duration and results (as taught by the IoT), commonly accepted practices should be established, possibly integrating the existing de jure and de facto standards and operating frameworks. Finally, themes such as equal accessibility [53] and governance [29,55], trustworthiness, and explainability [27,75], which have already gained attention in conventional AI systems, are carefully observed by institutions, and therefore, they deserve further research efforts from both industry and academia.
