Article

Evaluation and Selection of Hardware and AI Models for Edge Applications: A Method and A Case Study on UAVs

by Müge Canpolat Şahin * and Ayça Kolukısa Tarhan
Department of Computer Engineering, Hacettepe University, 06800 Ankara, Türkiye
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(3), 1026; https://doi.org/10.3390/app15031026
Submission received: 23 October 2024 / Revised: 19 December 2024 / Accepted: 17 January 2025 / Published: 21 January 2025

Abstract

This study proposes a method for selecting suitable edge hardware and Artificial Intelligence (AI) models to be deployed on these edge devices. Edge AI, which enables devices at the network periphery to perform intelligent tasks locally, is rapidly expanding across various domains. However, selecting appropriate edge hardware and AI models is a multi-faceted challenge due to the wide range of available options, diverse application requirements, and the unique constraints of edge environments, such as limited computational power, strict energy constraints, and the need for real-time processing. Ad hoc approaches often lead to sub-optimal solutions and inefficiency problems. Considering these issues, we propose a method based on the ISO/IEC 25010:2011 quality standard, integrating Multi-Criteria Decision Analysis (MCDA) techniques to assess both the hardware and software aspects of Edge AI applications systematically. To validate the proposed method, we conducted an experiment consisting of two stages: In the first stage, to show the applicability of the method across different use cases, we tested it with four scenarios on Unmanned Aerial Vehicles (UAVs), each presenting distinct edge requirements. In the second stage, guided by the method’s recommendations for Scenario I, where the STM32H7 series microcontrollers were identified as the suitable hardware and the object detection model with Single Shot Multi-Box Detector (SSD) architecture and MobileNet backbone as the suitable AI model, we developed a TensorFlow Lite model from scratch to enhance the efficiency and versatility of the model for object detection tasks across various categories. This additional TensorFlow Lite model aims to show how the proposed method can guide the further development of optimized AI models tailored to the constraints and requirements of specific edge hardware.

1. Introduction

The integration of edge computing and Artificial Intelligence (AI) has given rise to the Edge AI computing paradigm. Recent breakthroughs in AI model optimizations, the proliferation of Internet of Things (IoT) devices, and the rise of edge computing have paved the way for the vast potential of Edge AI. Edge AI is transforming the way IoT devices function by enabling real-time data processing directly on devices rather than relying on distant cloud servers. This shift enables devices to make quicker, more efficient decisions by processing data locally at the “edge”, on devices like wearables, smart cameras, and autonomous vehicles. As IoT devices become smarter, Edge AI helps to reduce latency, enhance security, and optimize bandwidth usage, while making intelligent, autonomous decisions without the need for cloud processing [1].
Edge AI enhances numerous applications, such as autonomous vehicles [2,3,4], healthcare [2], and smart homes [2,5], by improving responsiveness, preserving privacy, reducing dependency on centralized infrastructure, and enabling real-time data processing [2]. According to a Fortune Business Insights report, the global Edge AI market was valued at USD 20.45 billion in 2023; it is expected to grow to USD 269.82 billion by 2032, reflecting the increasing adoption of Edge AI across industries such as automotive, healthcare, manufacturing, and cybersecurity.
As Edge AI proliferates across various domains, effective and high-quality Edge AI solutions become essential, as in other computing domains. A critical challenge for Edge AI systems is the selection of suitable edge hardware and AI models that meet the specific needs of edge environments. The choice of hardware plays a critical role in ensuring that the edge device can handle the computational demands of AI models while adhering to strict resource constraints, such as limited processing power, memory, and energy [6,7]. Similarly, the selection of AI models is equally important for the overall performance of the system [7]. AI models used in edge environments must be designed to be lightweight, fast, and resource-conscious, which often requires techniques like weight pruning and quantization to reduce model size and computational complexity while maintaining accuracy [8]. Inefficient AI models can degrade the performance of the device, causing delays, excessive power consumption, or a reduction in the overall quality of AI-driven tasks. Thus, the AI models must be tailored to the specific edge device’s capabilities. Together, the hardware and the AI model work in tandem to create a system that is not only capable of performing real-time, intelligent tasks but also operates reliably and efficiently within the constraints of edge environments [8].
However, selecting appropriate edge hardware from a set of candidates that range from low-power microcontrollers to single-board computer systems, and similarly, an appropriate AI model from a set of candidates varying in computational and memory requirements, is not a straightforward task [9,10]. Determining the optimal solutions requires considering several factors regarding the hardware aspects, such as the device’s processing power, memory capacity, energy consumption, and physical size, and the AI model aspects, such as model size, inference speed, and resource consumption [6]. Most of the time, decisions on the selection of hardware and the AI models for deployment on the selected edge devices are made by senior practitioners using ad hoc approaches, which can cause inconsistency and inefficiency problems and make the decision process unjustifiable [9]. Moreover, these approaches may overlook critical trade-offs (e.g., performance vs. power consumption) and can lead to sub-optimal choices and issues with resource allocation (e.g., more powerful hardware selected for tasks that do not require it, wasting energy and increasing costs; or less capable hardware chosen for tasks that require more computational power, leading to performance bottlenecks). Without a structured method for selection, the effectiveness and efficiency of the solutions can be sacrificed [9,11].
The challenges regarding hardware selection in edge computing closely mirror those encountered in IoT platform selection, as both require navigating a diverse landscape of heterogeneous options while evaluating multiple criteria such as performance, energy efficiency, cost, and integration [10]. The existing IoT literature highlights the challenges of hardware selection: for example, Contreras-Masse et al. [9] emphasized the complexity of IoT platform selection, indicating the challenges faced by companies in selecting the right IoT platform for Industry 4.0 implementation due to the variety of available platforms and their distinct characteristics. Ilieva et al. [11] have also addressed the criticality of choosing suitable platforms considering the technical, social, and organizational aspects.
Upon reviewing the literature, we identified three main approaches used for hardware selection: optimization-based techniques, machine learning-based approaches, and MCDA techniques [11,12,13,14]. Optimization-based techniques are used to optimize specific objectives, such as minimizing cost or energy consumption. However, their limitation lies in their tendency to find local optima rather than global solutions [11]. Machine learning methods, while powerful, rely heavily on large, high-quality datasets, which can be a limitation if the data are limited or noisy. On the other hand, MCDA techniques, such as AHP [15], TOPSIS [16], and VIKOR [17], are widely recognized for their flexibility and their ability to work well even on small datasets; numerous techniques have been proposed to determine rankings of objects, attribute weights, and their combinations, and the results can be easily analyzed [11]. MCDA techniques have also proven to be effective in scenarios involving trade-offs between conflicting criteria [17], making them well suited for decision-making tasks like edge hardware and AI model selection.
Several studies in the literature have applied MCDA techniques to hardware selection tasks: For example, Silva et al. [10] proposed a multi-criteria decision-making framework to assist engineers and technology managers in evaluating and selecting appropriate IoT hardware, assessing parameters like performance, storage, and energy consumption to match the specific requirements of the IoT deployment. Similarly, Contreras-Masse et al.’s study [9] proposed an MCDA approach to guide companies in selecting suitable platforms, evaluating a range of diverse and often conflicting criteria and ensuring that the selection aligns with the specific needs, goals, and constraints of the organization. Ilieva et al. [11] extended this approach to agriculture and developed a framework using a modified Multi-Attribute Border Approximation Area Comparison (MABAC) method in a fuzzy environment.
Regarding the use of MCDA techniques, one key shortcoming highlighted in the literature is the reliance on a narrow set of criteria during the assessment phase. Most studies on selection tasks [18,19,20,21,22] have focused on a limited set of criteria. While existing work demonstrates the utility of MCDA for specific cases, there is a need for broader evaluation frameworks or methodologies to address the multi-faceted nature of hardware and AI model selection.
This study is primarily motivated by the lack of a systematic method for the selection of suitable hardware and AI models for edge applications. In this study, we propose a generic method for the evaluation of edge hardware and AI model candidates. The method relies on the ISO/IEC 25010:2011 quality standard [23], which provides a comprehensive and up-to-date set of quality characteristics and sub-characteristics that support a thorough evaluation of edge hardware and AI models. By integrating MCDA techniques into the method, we aim to enable a structured decision-making process that can handle the complexity and multi-faceted nature of hardware and AI model selection. The proposed method is then tested through an experiment consisting of two stages: In the first stage of the experiment, to show the applicability of the method across various use cases, we tested the method via four scenarios in a UAV context, each presenting distinct requirements. In the second stage of the experiment, guided by the suggestions of the method for Scenario I, we created a TensorFlow Lite model from scratch to improve the efficiency and versatility of the AI model for object detection tasks.
This paper is organized as follows: Section 2 presents the background information for the study. Section 3 presents the related work, reviewing prior studies that have explored the evaluation and selection of edge devices and AI models, and highlights our contributions with this study. Section 4 describes the method, explaining the proposed approach and how it integrates the ISO/IEC 25010:2011 standard and MCDA techniques for the selection and evaluation of edge hardware and AI models. Section 5 presents the experiments, showing the applicability of the proposed method to different use cases and describing the additionally created TensorFlow Lite model. Section 6 discusses the results and implications of the experiments, emphasizing the effectiveness of the proposed method. Section 7 addresses the threats to validity for this study. Finally, Section 8 provides the overall conclusions.

2. Background

This section is organized as follows: It first introduces the Edge AI paradigm in Section 2.1, followed by an exploration of the key components and technologies integral to Edge AI by providing edge device families and hardware in Section 2.2, Edge AI software in Section 2.3, and Edge AI model architectures in Section 2.4. After that, in Section 2.5, the ISO/IEC 25010:2011 standard is explained since the proposed method is structured on this standard. Lastly, in Section 2.6, MCDA techniques are explained with a focus on fuzzy MCDA methods, which are employed in the evaluation stage of the proposed method.

2.1. Edge AI Paradigm

Edge computing has emerged due to the limitations of cloud computing, to address the issues of latency, bandwidth, and quality of experience (QoE) for users [1]. Initially, cloud computing provided significant advantages in terms of storage and processing power, but its centralization led to high latency and bandwidth constraints, which become especially problematic for mobile and IoT devices [1]. To overcome these challenges, the concept of edge computing was introduced. Instead of relying on distant mega-data centers, edge computing distributes computing resources closer to users, at the “edge” of the network. This involves deploying smaller, localized data centers, such as micro data centers and cloudlets, to handle tasks closer to where data are generated and consumed. This setup reduces latency, increases bandwidth, and enhances user QoE by processing data locally rather than in a remote data center.
The evolution of edge computing further includes the integration of AI at the edge, known as Edge AI. This development includes advancements in AI and the proliferation of edge devices to perform intelligent computations closer to the user. Edge AI has been used in a wide range of applications, from autonomous driving to real-time healthcare issues [6], and is enabled by recent improvements in hardware and network technologies such as 5G and beyond [1].

2.2. Edge AI Hardware

There is a diverse range of Edge AI hardware platforms on the market to meet the varied requirements and challenges of different edge computing applications, such as performance requirements, power constraints, size and form factors, and cost considerations. Sipola et al. [24] provided a comprehensive examination of the hardware platforms utilized in Edge AI:
  • AI Acceleration Units are specialized processors designed to accelerate AI and machine learning (ML) tasks, typically optimized for specific computations like neural network operations. Examples include the Intel Neural Compute Engine (Intel Corporation, Santa Clara, CA, USA), MediaTek AI Processing Unit (APU) (MediaTek Inc., Hsinchu City, Taiwan), Google Edge TPU (Google LLC, Mountain View, CA, USA), NVIDIA Deep Learning Accelerator (NVDLA) (NVIDIA Corporation, Santa Clara, CA, USA), Gyrfalcon Matrix Processing Unit (MPE) (Gyrfalcon Technology Inc., Beijing, China), Mythic M1076 Mythic AMP (Mythic Inc., Austin, TX, USA), Syntiant Neural Decision Processors (Syntiant Corp., Irvine, CA, USA), and Hailo AI Processor (Hailo, Tel Aviv, Israel).
  • Field-Programmable Gate Arrays (FPGAs) are flexible processors that can be configured to optimize AI tasks, allowing custom designs for specific applications. Examples include the Intel MAX V CPLD (Intel Corporation, Santa Clara, CA, USA), Intel Cyclone 10 LP (Intel Corporation, Santa Clara, CA, USA), and Intel Cyclone 10 GX (Intel Corporation, Santa Clara, CA, USA).
  • System-on-a-Chip (SoC) and System-on-Module (SoM) devices integrate multiple components, including AI accelerators, into a single chip or module, and are often used in embedded and edge devices. Examples include the Intel Movidius Myriad X (Intel Corporation, Santa Clara, CA, USA), HiSilicon Kirin 970 (HiSilicon Technologies Co., Ltd., Shenzhen, China), Qualcomm Snapdragon 855+/860 (Qualcomm Technologies, Inc., San Diego, CA, USA), and MediaTek Helio P90 (MediaTek Inc., Hsinchu City, Taiwan).
  • Specialized (dedicated) AI modules are AI-specific modules designed to be added to various systems or used for development purposes. Examples in this category include the Google Coral Accelerator Module (Google LLC, Mountain View, CA, USA), Gyrfalcon MPE (Gyrfalcon Technology Inc., Beijing, China), and Mythic M1076 (Mythic Inc., Austin, TX, USA).
  • Development boards, such as the BeagleBone AI (BeagleBoard.org Foundation, Austin, TX, USA), OpenMV Cam (OpenMV, San Francisco, CA, USA), SparkFun Edge Development Board (SparkFun Electronics, Niwot, CO, USA), Syntiant Tiny Machine Learning Development Board (Syntiant Corp., Irvine, CA, USA), and STMicroelectronics STM32 (STMicroelectronics, Geneva, Switzerland), are designed for prototyping, development, and testing of AI applications and solutions.
  • NVIDIA devices (NVIDIA Corporation, Santa Clara, CA, USA) are another family of single-board computer systems used in Edge AI applications; they utilize GPUs for AI and machine learning tasks. Examples include the Jetson Nano, Jetson TX2, and Jetson Xavier NX.

2.3. Edge AI Software

Edge AI software includes frameworks and tools that are designed to optimize the deployment and execution of AI algorithms on edge devices. It includes specialized libraries and platforms such as TensorFlow Lite v2.14.0, PyTorch Mobile v1.13, and ONNX Runtime v1.13, which are designed to run efficiently in resource-constrained environments with high performance [24]. Issues regarding the deployment of AI models on edge hardware platforms also need to be considered. For instance, for the optimization of neural networks for edge devices, TensorFlow Lite and PyTorch provide tools for optimizing models for low-power devices, supporting techniques like quantization, pruning, and weight clustering to compress models for efficiency [24]. There are also Edge AI software tools for microcontrollers. For example, TensorFlow Lite for Microcontrollers v2.14.0 and deepC v1.0 enable running neural network models on microcontrollers, with tools for compiling models to efficient machine code. Other tools include Glow v0.5, ONNC v1.3, TVM v0.11, and OpenVINO v2021.4, which support various platforms and optimization techniques [24]. NNoM v0.4, X-CUBE-AI v10.0, e-AI v1.0, eIQ v5.0, and nncase v1.0 [24] are vendor-specific tools that provide model conversion and optimization for microcontrollers. These tools and frameworks facilitate the deployment of neural network models on resource-constrained devices, optimizing performance across various hardware platforms.
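To make the model optimization workflow concrete, the following minimal sketch shows post-training int8 quantization with the TensorFlow Lite converter, one of the techniques mentioned above. The model (a MobileNetV2 with 96 × 96 inputs) and the random calibration data are illustrative placeholders only; an actual deployment would use the application’s own network and a representative sample of its input data.
```python
import numpy as np
import tensorflow as tf

# Placeholder model; a real workflow would load the trained application model.
model = tf.keras.applications.MobileNetV2(input_shape=(96, 96, 3), weights=None)

def representative_dataset():
    # A few calibration samples let the converter estimate activation ranges.
    for _ in range(100):
        yield [np.random.rand(1, 96, 96, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]          # enable quantization
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8                      # fully integer I/O
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
with open("model_int8.tflite", "wb") as f:
    f.write(tflite_model)
```
The resulting .tflite file can then be deployed with the TensorFlow Lite (or TensorFlow Lite for Microcontrollers) runtime on the target device.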

2.4. Edge AI Model Architectures

Edge AI model architectures are specifically designed to balance the computational demands of AI with the limited resources available on edge devices. These architectures often utilize lightweight neural network models such as MobileNet [25,26], SqueezeNet [27,28], and EfficientNet [29,30], which are optimized for reduced size and lower power consumption while retaining high accuracy. Additionally, techniques like model quantization and pruning are employed to further enhance performance and efficiency by reducing the complexity and size of models. Edge AI also benefits from the deployment of specialized models like YOLO (You Only Look Once) for real-time object detection [31] and LSTM (Long Short-Term Memory) [32,33] networks for sequential data processing. These architectures are designed to operate effectively within the constraints of edge devices, supporting real-time inferencing and decision-making directly on the device without relying on cloud-based processing. The development of these models involves tailoring them to fit within the computational, memory, and power limitations of edge hardware while still delivering robust AI capabilities.
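As an illustration of one such compression technique, the sketch below applies magnitude-based weight pruning with the TensorFlow Model Optimization Toolkit to a small placeholder Keras model; the layer sizes, sparsity targets, and schedule are illustrative assumptions rather than values used in this study.
```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Small placeholder network standing in for an edge-oriented model.
base_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(96, 96, 3)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10),
])

# Gradually raise sparsity from 30% to 80% of the weights during fine-tuning.
pruning_schedule = tfmot.sparsity.keras.PolynomialDecay(
    initial_sparsity=0.30, final_sparsity=0.80, begin_step=0, end_step=1000)
pruned_model = tfmot.sparsity.keras.prune_low_magnitude(
    base_model, pruning_schedule=pruning_schedule)
pruned_model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))

# Fine-tuning would use the tfmot.sparsity.keras.UpdatePruningStep() callback;
# afterwards, the pruning wrappers are stripped before conversion or export.
final_model = tfmot.sparsity.keras.strip_pruning(pruned_model)
```
Pruning is typically combined with quantization (e.g., via TensorFlow Lite conversion) so that the reduced parameter count also translates into a smaller deployed model.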

2.5. ISO/IEC 25010:2011 Standard

ISO/IEC 25010:2011 [23], part of the “Systems and Software Quality Requirements and Evaluation (SQuaRE)” series, is a comprehensive standard designed to specify and evaluate software product quality. It provides a structured framework for assessing software quality across diverse domains, covering essential quality dimensions such as performance, reliability, and security. This holistic approach enables consistent evaluations, clear justifications for decisions, and informed decision-making. The principles and quality characteristics outlined in ISO/IEC 25010:2011 can be effectively applied to evaluate both edge hardware products and AI models for edge applications. In this study, this standard is used in the methodology provided in Section 4 to evaluate edge hardware and AI model candidates.
The use of ISO/IEC 25010:2011 for this study is motivated by both technical and business considerations. From a technical perspective, the standard offers a comprehensive set of quality characteristics that are highly relevant to Edge AI applications. Moreover, since ISO 25010 provides a well-established framework for quality assessment, some studies in the literature have also utilized it for the quality assessment of edge systems [34,35,36]: For instance, Orsini et al. [34] referenced ISO/IEC 25010 for mobile edge computing. White et al. [35] focused on IoT systems layers and investigated quality dimensions, utilizing ISO 25010 characteristics.
As shown in Figure 1, the ISO/IEC 25010 standard identifies 8 key quality characteristics and 31 sub-characteristics. The definitions and the relevance of each characteristic for evaluating edge hardware and AI models follow [37]:
  • Functional Suitability: The standard defines it as the “degree to which a product or system provides functions that meet stated and implied needs when used under specified conditions”. This characteristic ensures that the edge applications perform their intended functions within the constrained environments of edge computing, such as limited processing power, memory, and energy resources.
  • Reliability: The standard defines it as the “degree to which a system, product or component performs specified functions under specified conditions for a specified period”. It is a critical factor for edge systems, as they often operate in real-time and in dynamic environments, such as vehicles, drones, or remote monitoring systems. These systems are expected to function consistently under specified conditions over an extended period without failure.
  • Usability: The standard defines it as the “degree to which a product or system can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction in a specified context of use”. The usability characteristic ensures that the system can be effectively and efficiently used by the intended stakeholders. For example, for end-users, systems like smart home devices should be intuitive, responsive, and easy to interact with.
  • Performance Efficiency: The standard defines it as the “degree of performance relative to the amount of resources used under stated conditions”. Performance efficiency is one of the most critical characteristics for Edge AI applications. Edge devices often have limited resources in terms of processing power, memory, and battery life. Ensuring that the system delivers the expected performance under these limited resources is essential for effective deployment in edge environments.
  • Portability: The standard defines it as the “degree of effectiveness and efficiency with which a system, product or component can be transferred from one hardware, software or other operational or usage environment to another”. This characteristic ensures that AI models and hardware can be easily deployed across various devices, allowing for flexible and scalable Edge AI solutions.
  • Maintainability: The standard defines it as the “degree of effectiveness and efficiency with which a product or system can be modified to improve it, correct it, or adapt it to changes in the environment, and in requirements”. This characteristic ensures that edge hardware and AI models can be easily updated or optimized, which is essential for adapting to evolving requirements and ensuring long-term system viability.
  • Compatibility: The standard defines it as the “degree to which a product, system, or component can exchange information with other products, systems, or components, and perform its required functions while sharing the same hardware or software environment”. Ensuring compatibility ensures that edge devices and AI models can work cohesively within a larger ecosystem, enabling seamless data exchange, integration with other services, and interoperability across different technological stacks.
  • Security: The standard defines it as the “degree to which a product or system protects information and data so that persons or other products or systems have the degree of data access appropriate to their types and levels of authorization”. Edge devices and AI models often process personal, medical, financial, or other confidential data, which makes robust security mechanisms essential. The security characteristic ensures that the system protects data integrity, confidentiality, and availability by preventing unauthorized access or tampering.
From the business perspective, adopting ISO/IEC 25010:2011 provides several advantages: First, it is a globally recognized standard for software product evaluation, ensuring that the proposed methodology aligns with industry best practices. This supports confidence in the evaluation process among stakeholders. Second, the standard provides a flexible framework that can be applied to a wide range of edge hardware (e.g., microcontrollers, single-board computers) and AI model types, and across various domains (e.g., autonomous vehicles, smart homes, MAVs). This flexibility offers significant business value, as the methodology can be generalized for a variety of use cases, enabling adaptable Edge AI solutions.

2.6. Multi-Criteria Decision Analysis (MCDA) Methods

Many daily problems are intricate and require evaluating several parameters to arrive at a final decision. MCDA methods are applied to facilitate the identification of rational solutions [38]; they are employed to develop Decision Support Systems (DSS) [38,39] in various fields such as supplier selection [38,40], location choice [38,41], industry [38,42], and healthcare [38,43]. In many real-world scenarios, we need to make rational decisions from a range of options. However, real-world problems often involve uncertainties that complicate this process. MCDA techniques offer valuable assistance for these issues. Various evaluation techniques, which are based on fuzzy logic and known as fuzzy MCDA techniques [44,45], have been developed to facilitate the efficient and effective assessment of decision alternatives under uncertain conditions.
Fuzzy MCDA techniques are also well suited to the problem of selecting suitable edge hardware and AI models for edge applications for the following reasons:
  • Handling the complexity and uncertainty in decision parameters: Edge hardware selection involves assessing various criteria, many of which are subjective, uncertain, or difficult to quantify accurately. For example, while supported AI frameworks or cost might have clear numerical values, criteria like processing efficiency or power consumption may involve ranges of values. In this regard, fuzzy MCDA techniques allow decision-makers to handle these uncertainties by using fuzzy sets, such as Triangular Fuzzy Numbers (TFN), which represent values as ranges instead of precise numbers [38].
  • Weighting and Prioritizing Criteria: Different applications may prioritize certain criteria over others. For example, in battery-powered drones (MAVs), power consumption may be weighted higher than processing power, or similarly, in an industrial setting, reliability and processing power may take precedence over cost. Fuzzy MCDA techniques enable decision-makers to define weighted importance for each criterion and ensure that the chosen option aligns with the specific goals and the requirements of the application.
  • Multi-dimensional trade-offs: In selecting edge hardware, trade-offs often arise due to conflicting objectives (e.g., maximizing performance while minimizing power consumption and cost). These trade-offs can be difficult to handle with traditional methods since they require balancing multiple dimensions simultaneously. Fuzzy MCDA techniques are well suited for handling such multi-dimensional trade-offs. They provide a structured way to evaluate different alternatives, considering both profit criteria (those that should be maximized, like performance) and cost criteria (those that should be minimized, like power consumption or cost). In this way, decision-makers can ensure that the trade-offs are balanced appropriately based on the context.
In this study, six fuzzy MCDA methods are applied both to address the problem of handling trade-offs and to allow for cross-validation of the results. The employed fuzzy MCDA methods are as follows (an illustrative sketch of the fuzzy evaluation idea is given after the list):
  • TOPSIS (Technique for Order of Preference by Similarity to Ideal Solution) [16]: This method ranks alternatives based on their distance from an ideal solution (the best possible outcome) and a negative-ideal solution (the worst possible outcome). It is useful in the presence of many conflicting criteria to be optimized, such as processing power vs. power consumption.
  • MABAC (Multi-Attributive Border Approximation Area Comparison) [46]: This method is designed to rank alternatives based on their relative distance from the boundary of acceptability, and it is ideal for cases where the decision involves defining thresholds or boundaries for acceptable hardware performance under fuzzy conditions.
  • COMET (Characteristic Objects Method) [47]: This method evaluates decision alternatives against their characteristic objects and ranks them based on a comprehensive comparison of criteria. The fuzzy COMET method is well suited for representing subjective preferences and uncertainties in criteria such as performance stability or long-term reliability.
  • SPOTIS (Stable Preference Ordering Towards Ideal Solution) [48]: This method simplifies the decision process by comparing alternatives to an ideal solution while considering fuzzy criteria. It is advantageous when evaluating hardware which must satisfy certain threshold criteria, such as a minimum acceptable level of power consumption or processing speed, under uncertain conditions.
  • COCOSO (Combined Compromise Solution) [49]: This method combines different compromise solutions to provide a more robust ranking of alternatives, which is beneficial in multi-criteria hardware selection problems where performance and cost must be balanced. Fuzzy COCOSO helps handle the uncertainty in performance and energy efficiency.
  • VIKOR [17]: This method focuses on finding a compromise solution, which is particularly relevant when there are multiple conflicting criteria with trade-offs. It considers both the best and worst performance across the criteria and aims to minimize regret. Fuzzy VIKOR is useful in edge hardware selection because it addresses trade-offs between attributes like cost, power efficiency, and processing power under fuzzy conditions.
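To give a concrete sense of how such fuzzy evaluations proceed, the following is a minimal fuzzy TOPSIS sketch using triangular fuzzy numbers (TFNs) and the commonly used vertex distance. The two hardware candidates, criteria, ratings, and weights are made-up values for demonstration only; the study’s actual evaluations use the full criteria sets and the six methods listed above.
```python
import numpy as np

# Each rating is a TFN (l, m, u); rows = alternatives, columns = criteria.
ratings = np.array([
    [[2, 3, 4], [6, 7, 8]],   # candidate A: (power consumption, performance)
    [[5, 6, 7], [3, 4, 5]],   # candidate B
], dtype=float)
weights = np.array([0.4, 0.6])           # criterion weights (sum to 1)
is_benefit = np.array([False, True])     # power consumption is a cost criterion

# Normalize: benefit -> divide by max upper bound; cost -> min lower bound / value.
norm = np.empty_like(ratings)
for j in range(ratings.shape[1]):
    col = ratings[:, j, :]
    if is_benefit[j]:
        norm[:, j, :] = col / col[:, 2].max()
    else:
        norm[:, j, :] = col[:, 0].min() / col[:, ::-1]   # reverse (l,m,u) for costs
weighted = norm * weights[None, :, None]

# Vertex distance to the fuzzy ideal (1,1,1) and anti-ideal (0,0,0) solutions.
def vertex_dist(a, b):
    return np.sqrt(((a - b) ** 2).mean(axis=-1))

d_pos = vertex_dist(weighted, 1.0).sum(axis=1)
d_neg = vertex_dist(weighted, 0.0).sum(axis=1)
closeness = d_neg / (d_pos + d_neg)       # higher = closer to the ideal
print(np.argsort(-closeness))             # candidate indices, best first
```
The other methods (MABAC, COMET, SPOTIS, COCOSO, VIKOR) follow the same overall pattern of normalization, weighting, and aggregation, but differ in how they define the reference points and compute the final ranking.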

3. Related Work

In this section, we first present the studies that have examined the evaluation and selection of edge devices and AI models, and then emphasize the contribution of this study.
Mandel et al. [50] studied selecting suitable hardware for UAVs. They proposed a decision-making approach using the Analytic Hierarchy Process (AHP) method to assess hardware alternatives, and the ISO 25000 family (SQuaRE) [23] to evaluate software quality, considering computational performance, size, weight, and power constraints. Garcia-Perez et al. [8] analyzed edge computing device behavior with embedded AI models, utilizing devices such as the Raspberry Pi 4 and Google Dev Coral. They measured inference time, energy consumption, RAM, and CPU usage. Sipola et al. [24] presented a review of Edge AI applications, focusing on different categories of hardware and software options that can be useful for the selection process. Regarding hardware evaluation and development boards, Imran et al. [51] provided an analysis of development boards that can run AI algorithms at the edge. Their study provided various hardware options and their capabilities, which can be beneficial for selecting the appropriate hardware considering the performance and application requirements. Regarding machine learning and edge device implementation, Merenda et al. [52] reviewed models, architectures, and requirements for implementing edge machine learning on IoT devices. Hadidi et al. [7] evaluated commercial edge devices, such as the Raspberry Pi 3B and Jetson TX2, using popular frameworks and Convolutional Neural Networks (CNNs), and examined the impact of various frameworks and software optimizations on device performance. Regarding benchmarking and performance evaluation, various benchmarking studies (e.g., EdgeFaasBench [53], the YOLO benchmark [54], the DL models benchmark [55]) have been conducted to evaluate performance metrics such as response times, accuracy, and Frames Per Second (FPS) on various edge devices such as the Google Coral, NVIDIA Jetson Nano, and the Raspberry Pi series.
The following shortcomings in the literature have motivated us to conduct this study:
  • Most of the studies in the literature, e.g., [7,50], have primarily focused on performance metrics and specific hardware and software combinations for Edge AI applications.
  • There are a limited number of studies that rely on standards such as ISO 25010 for the evaluation and selection process of edge hardware and AI models. Most of the studies have conducted benchmarking and focused on specific performance metrics (e.g., inference time, power consumption) for specific hardware, e.g., [53,54,55].
  • In terms of evaluation scope, the studies in the literature, e.g., [53,54,55], are often centered around benchmarking on specific devices (e.g., development boards such as the NVIDIA family).
  • A limited number of studies have utilized MCDA techniques for the selection of edge hardware or AI models. Moreover, these studies have considered a limited number of assessment criteria [9].
  • We have not encountered studies that employ fuzzy MCDA techniques for the evaluation and selection of edge hardware and AI models, assessment criteria of which may require considering uncertainties.
Different from the studies in the literature, and to the best of our knowledge, there is no comprehensive, generic method or framework for evaluating and selecting edge hardware and AI models based on a quality standard. Our study presents an adaptable method for the evaluation and selection of edge hardware and AI models relying on the ISO 25010:2011 quality standard, which provides a set of quality characteristics and sub-characteristics, supporting a thorough evaluation process. Unlike previous works that primarily focused on performance benchmarking for specific edge devices (e.g., Raspberry Pi 4, NVIDIA Jetson Nano), our framework provides an adaptable structure that can integrate quality attributes from ISO 25010:2011 to enable a comprehensive evaluation method.

4. Method

This section elaborates on the proposed method for systematically evaluating edge hardware and AI models. The method aims to translate stakeholder goals into measurable quality criteria to ensure alignment with operational needs and technical specifications. The method has a four-layered structure—Requirements, Specification, Measurement, and Evaluation—which provides a comprehensive framework for assessing relevant quality dimensions and guiding decision-making on edge hardware and AI models. Figure 2 presents the meta-model diagram that serves as the backbone of the method, illustrating the elements and their interactions within each layer. For the overall structure in the figure, we adopted the meta-model structure from our previous study [56], which provides a validated meta-model [57] for open-source software applications. In this current study, we extended that structure in the Requirements and Specification layers as specific to Edge AI quality.
The following subsections provide a detailed explanation of each layer and its contribution to the quality assessment and selection processes for edge hardware and AI models: Section 4.1 explains the structure related to the quality assessment facility of the method. Section 4.2 explains the structure that can be utilized for the selection of hardware and AI models. In addition, Section 4.3 provides the process flow diagram for adapting this framework to Edge AI applications, illustrating the steps in applying the method to various use cases.

4.1. Quality Assessment Framework for Edge AI Applications

4.1.1. Requirements Layer (Stakeholder Goal Identification)

The Requirements Layer forms the foundation of the meta-model by identifying and addressing stakeholders and their expectations from the system of interest. Stakeholders (e.g., end-users) specify their desired outcomes and goals for the system, referred to as Quality Goals. These goals include high-level characteristics like reliability, performance efficiency, functional suitability, or safety. By capturing these goals and associating them with the rest of the framework systematically, this layer ensures that the evaluation process aligns with real-world needs and priorities. For instance, a drone system’s user might prioritize reliability through minimal system downtime or extended flight duration, which translates into measurable entities in later layers.

4.1.2. Specification Layer (Quality Attributes, Sub-Quality Attributes, and Metrics Definition)

The Specification Layer operationalizes the Quality Goals from the Requirements Layer by associating them with the Edge Device Aspect, which acts as a container or aggregator for the Hardware Aspect and the Software Aspect. Both the Hardware Aspect and the Software Aspect are assessed by specialized Quality Models, which represent the structure defining what constitutes quality for these aspects. A Quality Model is characterized by one or more Quality Attributes, which may have Sub-Quality Attributes. While defining the Quality Model and Quality Attributes, the structure defined in ISO 25010 is taken as reference, which defines top-level Quality Attributes such as Performance Efficiency and Reliability, and sub-level quality attributes (Sub-Quality Attributes) such as Time Behavior, Resource Utilization, and Capacity under the Performance Efficiency QA. Each Sub-Quality Attribute is evaluated by one or more Quality Metrics, each of which is a Measurable Entity. Metrics are specific indicators that provide numerical or qualitative data about the performance of the edge device in relation to its Quality Attributes. A Quality Metric can have a positive or a negative Impact on a Quality Attribute. For example, while a modular design may have a positive Impact on the Maintainability QA, it may degrade the Performance Efficiency QA of the system.
Each edge hardware candidate (EdgeHWCandidate) is evaluated against the Hardware Aspect, which is evaluated by the Quality Model specialized for edge hardware. Similarly, each software or AI model candidate (SWCandidate) is evaluated against the Software Aspect, which is evaluated by the Quality Model specialized for AI Models/Software. Each aspect has its own set of Quality Attributes and Quality Metrics. For instance, the Hardware Aspect of an edge device can be associated with the Quality Model characterized by the performance efficiency QA; the performance efficiency QA of the hardware aspect can include metrics such as CPU utilization and memory usage, which are measurable entities. These can be evaluated by the percentage of time the CPU spends executing instructions and by the amount of RAM used by applications, respectively.
Since each Stakeholder of the system may have different Quality Goals, their priorities and perceptions of what constitutes quality can vary. For example, the quality concerns of the system in operation mode can be crucial for an end-user, who may prioritize characteristics such as performance efficiency, reliability, and functional suitability over maintainability or portability. Therefore, a weighting concept is employed to assign weights to Quality Attributes and Sub-Quality Attributes using the Weighting Method (e.g., Analytical Hierarchical Process).
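As a small illustration of the weighting concept, the sketch below derives attribute weights from an AHP pairwise comparison matrix using the principal eigenvector and checks Saaty’s consistency ratio; the three attributes and the pairwise judgments are hypothetical and are not taken from the case study.
```python
import numpy as np

# Hypothetical pairwise comparisons for three Quality Attributes, e.g.,
# performance efficiency vs. reliability vs. maintainability (Saaty's 1-9 scale).
A = np.array([
    [1.0, 3.0, 5.0],
    [1/3, 1.0, 3.0],
    [1/5, 1/3, 1.0],
])

eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)                  # principal eigenvalue
weights = np.abs(eigvecs[:, k].real)
weights /= weights.sum()                     # normalized weights, sum to 1

# Consistency ratio; CR < 0.1 is commonly taken as acceptable.
n = A.shape[0]
ci = (eigvals.real[k] - n) / (n - 1)
ri = {3: 0.58, 4: 0.90, 5: 1.12}[n]          # Saaty's random index values
print(weights, ci / ri)
```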

4.1.3. Measurement Layer (Data Collection and Normalization)

The Measurement Layer focuses on quantifying the Measurable Entities identified in the Specification Layer. This quantification process uses specific Measurement Methods, which can be manual, tool-based, or derived from product specifications. The measurements have Units and Scales. To facilitate comparisons and aggregation across diverse metrics, measures are normalized into Normalized Measures to eliminate inconsistencies that can be caused by differing units or scales. These normalized values are aggregated using Measure Aggregation Methods (e.g., averaging) to produce Aggregated Measures. The outcome of this layer is the Measurement Results, serving as essential inputs for the evaluation stage.
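The following minimal sketch illustrates this normalization and aggregation idea: raw measures expressed in different units are min-max normalized (and inverted for “lower is better” metrics) and then averaged into a single aggregated measure per candidate. The metric names and values are illustrative assumptions only.
```python
import numpy as np

# Raw measures for three hypothetical candidates, in their original units.
raw = {
    "inference_time_ms": np.array([266.4, 52.0, 18.0]),   # lower is better
    "ram_usage_mb":      np.array([0.3, 512.0, 2048.0]),  # lower is better
    "accuracy_map":      np.array([0.28, 0.241, 0.67]),   # higher is better
}
higher_is_better = {"inference_time_ms": False, "ram_usage_mb": False,
                    "accuracy_map": True}

normalized = {}
for name, values in raw.items():
    span = values.max() - values.min()
    scaled = (values - values.min()) / span if span else np.ones_like(values)
    normalized[name] = scaled if higher_is_better[name] else 1.0 - scaled

# Aggregated measure per candidate (equal weights for simplicity).
aggregated = np.mean(np.vstack(list(normalized.values())), axis=0)
print(aggregated)   # one 0..1 score per candidate
```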

4.1.4. Evaluation Layer (Evaluation and Ranking of Hardware and AI Model Candidates)

The Evaluation Layer integrates the outputs from the Measurement Layer to compute an overall assessment of the system. In this stage, the resulting values, impacts, and weights of the sub-characteristics serve as inputs for the evaluation. These inputs are aggregated by an Evaluation Aggregation process using an Evaluation Aggregation Method, like TOPSIS or SPOTIS, and an Evaluation Aggregation Function, such as the weighting aggregation function. After obtaining the measurement results and applying the evaluation concept, an Evaluation Result is calculated (e.g., scaled between 0 and 1).

4.2. Selection of Suitable Edge Hardware and AI Models

The method not only provides a structured approach for assessing the quality of Edge AI applications, but also guides the selection of appropriate edge hardware and AI models based on the outcomes of the evaluation process.

4.2.1. Selection of Edge Hardware (Phase I)

In Phase I of the selection process, in the evaluation layer and using Evaluation Aggregation Methods such as TOPSIS or SPOTIS, the method combines normalized measures and stakeholder-defined weights to rank edge hardware candidates.

4.2.2. Selection of AI Model (Phase II)

Once the edge hardware has been selected, Phase II involves selecting the suitable AI model that can run on the chosen hardware. Similar to the Phase I hardware selection process, Evaluation Aggregation Methods are used to rank the Edge AI software candidates.
Overall, the proposed method presents a structured framework for the quality assessment of edge products and a guide for the selection of edge hardware and Edge AI models across specified quality dimensions.

4.3. Process Flow to Adopt the Method

The process flow provided in Figure 3 outlines the steps to adopt the meta-model to Edge AI applications by mapping the steps to the structure provided by the meta-model.

5. Experiments

This section presents the experiments for the method. It is organized as follows: Section 5.1 provides the details of the case study for evaluating the proposed method across four scenarios in the UAV context, demonstrating the method’s applicability to different requirements. Section 5.2 demonstrates the practical relevance of the method. In Section 5.2.1, we first provide the deployment results of Scenario I. Then, in Section 5.2.2, we provide the details of the TensorFlow Lite model that we created from scratch using the outcomes of the method for Scenario I, which aims to improve the performance and efficiency of the model in object detection tasks.

5.1. Test of the Proposed Method

The proposed method given in Section 4 is tested through four scenarios in a UAV context to demonstrate its applicability to different use cases. For the test of the proposed method, we employed a sequential embedded case study. The high-level structure of the case study is depicted in Figure 4 (case study design). In the first phase of the case study, we evaluate edge hardware candidates and identify the best hardware option for each scenario. For the AI model, the object detection problem is chosen since UAVs often employ object detection for tasks such as obstacle avoidance and navigation. After selecting the best hardware option for each scenario, in the second phase of the case study, we evaluate feasible object detection models with differing architectures for the suggested edge hardware of each scenario and select the best AI model option.
This subsection is organized as follows: Section 5.1.1 provides an overview of the scenarios that are proposed to evaluate the model. Section 5.1.2 provides the details on how the method is applied to each scenario by aligning the stages and the steps of the scenario cases with those of the proposed method.

5.1.1. Overview of the Scenarios

To demonstrate the method’s applicability to different use cases, we propose four scenarios in a UAV context (with Scenario I in an MAV context, where an MAV is a smaller, more lightweight type of UAV), each presenting different levels of computational requirements, power efficiency, and operational complexity. The rest of this part provides an overview of each scenario.

Scenario I: Low-Cost, Low-Power, Lightweight MAV for Basic Object Detection Tasks

This scenario targets the deployment of a cost-effective, lightweight Micro Air Vehicle (MAV) designed to perform basic object detection and person detection. Designed for affordability, agility, reliability, and energy efficiency, the MAV in this scenario is intended for straightforward applications in resource-constrained environments, such as environmental monitoring and basic navigation. The focus of this MAV design is to achieve a balance between low hardware requirements and functional adequacy for simple detection tasks. While lightweight construction and compact size are essential for agility of the system, energy efficiency is also important for extended operational time. Given the simplicity of its tasks, the system does not require high processing power. Instead, it requires moderate computational capacity to perform reliably without excessive energy consumption. Additionally, memory and storage requirements are minimal, as the MAV is not designed for large-scale data handling or complex model processing.

Scenario II: Mid-Level UAV for AI-Based Obstacle Detection

Scenario II focuses on the design of a UAV intended for applications requiring moderate computational performance, suitable for AI-based obstacle detection. In this scenario, the UAV needs to perform more complex tasks than basic detection algorithms, such as real-time obstacle avoidance, but without significantly increasing cost. While power consumption and reliability are still considered, they are not the primary priorities. This suggests that the UAV is intended for environments where occasional trade-offs in energy efficiency and reliability can be tolerated in exchange for the added computational capabilities needed for AI-based processing. Additionally, weight and size are secondary considerations, suggesting that the UAV can accommodate slightly bulkier or heavier components to support the increased processing power required for these AI tasks.

Scenario III: General-Purpose UAV for Outdoor Object Detection

This scenario targets the design of a cost-effective UAV intended for general-purpose object detection in outdoor environments such as forests, agricultural lands, or urban areas. The highest priority is on affordability, with processing power being a secondary concern, sufficient to handle moderate detection tasks. The UAV requires basic resources such as flash memory and RAM to store necessary data and run lightweight detection algorithms effectively. While power consumption is considered, it is not a primary focus, as the UAV is designed for tasks that do not demand extensive power resources. Reliability is also not prioritized, enabling the use of more affordable components that fulfill the functional requirements without high costs. Weight and size are less critical in this design, but should still be managed to ensure adequate operational efficiency in outdoor settings. This UAV is intended for organizations seeking a budget-friendly, straightforward solution for outdoor object detection tasks that do not require extensive computational demands.

Scenario IV: High-Performance UAV for Real-Time Advanced Object Detection and Tracking

In this scenario, the UAV is tasked with performing real-time 3D object detection for autonomous navigation. The UAV operates in a dynamic environment, requiring high computational power for fast and accurate inference to identify objects, obstacles, and terrain in real-time. While cost and power efficiency in this scenario remain a consideration, they are secondary to the performance, allowing for compromises on affordability and power efficiency. Weight and size factors are also less critical in this design, where a bulkier design is acceptable to fulfill the need for the increased processing power.

5.1.2. Application of the Method to the Scenarios

In the subsequent sections, the realization of the method for each scenario is provided by aligning the stages and steps of each scenario case with those of the proposed method.

Requirements Layer/Stakeholder Goal Identification

  • Process Flow—Step 1, Step 2: Stakeholder and Stakeholder Goal Identification
Example stakeholders for each scenario, along with their goals, priorities, and the QA trade-offs are provided in Table 1.

Specification Layer/Hardware and AI Model Quality Specification

  • Process Flow—Step 3: Quality Specification (Define QM, Relevant QAs, and Sub-QAs)
  • Process Flow—Step 4: Measurement Metrics and Measurement Methods Specification (Define Metrics and Measurement Methods for HW and AI Model Aspects)
Specific evaluation indicators for accuracy, computational efficiency, resource utilization, adaptability, and robustness criteria, tailored to different application scenarios and device constraints, are provided in Appendix A, Table A1. In this study, the retrieved evaluation indicators for each scenario are provided in Table 2. Also, a simplified object model diagram for the case study is given in Appendix B to offer visual assistance on the realization of the method.
It is important to note that we also included the “Cost” metric under the Replaceability sub-characteristic of the Portability characteristic, since the cost of edge hardware is often considered and weighed against other criteria such as performance, power consumption, and reliability. In some cases, although higher-cost edge hardware may offer better capabilities, decision-makers evaluate whether those benefits are worth the additional expense based on the specific use case. In addition, we included the Weight and Size metrics under the Usability characteristic since they are important for the UAV’s agility and flight performance [58]. Lighter and more compact hardware helps to increase flight time and maneuverability.

Measurement Layer/HW Model Metrics Measurement

  • Process Flow—Step 5: Measurement Execution for HW Criteria (Collect Data for Hardware Model Criteria/Metrics)
In this step, the relevant metrics of the edge hardware candidates are provided. A list of widely used edge hardware products suitable for UAVs is provided in Table 3, along with the relevant quality metrics (namely, MTBF, processing power, supported AI frameworks, power consumption, weight, size, flash memory, RAM, and cost).

Evaluation Layer/Evaluation and Ranking of Hardware Options

  • Process Flow—Step 6: Evaluation & Ranking (Apply MCDA methods to evaluate and rank HW candidates based on weighted scores)
  • Process Flow—Step 7: Selection of Edge HW
In the evaluation step, six fuzzy MCDA techniques are employed. The rationale for the use of fuzzy MCDA techniques for the selection of edge hardware and AI models, and the descriptions of the employed fuzzy methods, namely the TOPSIS, MABAC, COMET, SPOTIS, COCOSO, and VIKOR methods, are given in Section 2.6.
A significant issue considered in the evaluation process is the assignment of weights to the relevant edge hardware criteria/metrics. The following process was employed to determine the weights of both the edge hardware and AI model criteria (a small illustrative sketch is given after the list):
  • Focus Session Setup: A focus session was conducted with a group of six expert practitioners in the field working for a civil aviation company. Initially, the experts were given a detailed overview of the goals for each scenario given in Section 5.1.1, such as low-cost or high-performance objectives. The experts then engaged in a discussion to evaluate the relative importance of each criterion in achieving these goals.
  • Expert Scoring: Following the focus session, each expert was asked to assign scores to each criterion, typically on a scale of 1 to 10, where 1 represents the least importance and 10 represents the highest importance. Experts provided their individual opinions on how critical each criterion is for the success of the UAV design in the specific scenario.
  • Aggregation of Expert Scores: After the individual expert scores were collected, the scores were aggregated to obtain a consensus weight for each criterion. The aggregation process involved calculating the mean and standard deviation (SD) of the expert scores for each criterion.
  • Statistical Analysis: The average score for each criterion was calculated from the expert scores. To understand the level of agreement or disagreement among the experts, the SD was calculated to capture how the scores varied around the mean. A low SD (<1) indicated strong agreement, while a high SD (>1) showed variability in expert opinions. The population standard deviation was used to find the SD: $\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2}$, where N is the total number of experts, $x_i$ is an individual expert score, and $\mu$ is the mean (average) of all the scores.
  • Final Weight Assignment: After analyzing the statistical data, assigning final weights based on consensus was crucial. Since criteria with low SD reflected a strong consensus, their weights were directly based on the mean. In contrast, criteria with high SD (>1) required further discussion and adjustments of scores to ensure that the final weight reflected a more balanced perspective. The weight assignment session for edge hardware was finalized in two rounds of discussion.
  • Weight Derivation: Finally, weight derivation was performed through normalization, where the mean scores were divided by the maximum score to produce normalized weights, and the weights were then re-adjusted so that they sum to 1, ensuring that they were correctly proportional. The weights of the HW criteria for each scenario were finalized as given in Appendix D, Table A6.
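A minimal sketch of these weight-derivation steps is given below, assuming made-up expert scores (on the 1 to 10 scale) for three hardware criteria; the criterion names, scores, and handling of the SD threshold are illustrative and do not reproduce the actual session data.
```python
import numpy as np

# Made-up scores from six experts for three criteria in a low-power scenario.
scores = {
    "power_consumption": np.array([9, 9, 10, 8, 9, 9]),
    "cost":              np.array([8, 9, 8, 9, 7, 8]),
    "processing_power":  np.array([5, 6, 4, 7, 5, 3]),
}

means = {c: s.mean() for c, s in scores.items()}
# Population standard deviation (ddof=0); SD > 1 would flag a criterion for
# another round of discussion before the weights are finalized.
sds = {c: s.std(ddof=0) for c, s in scores.items()}

max_mean = max(means.values())
normalized = {c: m / max_mean for c, m in means.items()}   # divide by max score
total = sum(normalized.values())
weights = {c: v / total for c, v in normalized.items()}    # re-adjust to sum to 1

for c in scores:
    print(f"{c}: mean={means[c]:.2f}, SD={sds[c]:.2f}, weight={weights[c]:.3f}")
```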
For each scenario, hardware candidates are assessed based on the rankings calculated by the six fuzzy MCDA methods given in Section 2.6. The results of the evaluations for each scenario are provided in Appendix G (Fuzzy MCDA Software, Evaluation Results for Edge Hardware and AI Models). The best hardware options suggested by the majority of the methods for each scenario are provided in Table 4.
According to the results in Table 4, the STM32H7 series microcontrollers (EdgeHW_11) rank first across all the fuzzy MCDA methods for Scenario I, the NVIDIA Jetson Nano (EdgeHW_1) for Scenario II, and the Raspberry Pi 4 for Scenario III. For Scenario IV, two hardware options are suggested: the NVIDIA Jetson Xavier NX (EdgeHW_2) received the first ranking for the TOPSIS, COMET, and VIKOR methods, and the NVIDIA Jetson AGX Orin (EdgeHW_3) obtained the first ranking for the MABAC and SPOTIS methods.
In addition, to cross-validate the results of the case study and enable the comparison of the edge hardware products listed in Table 3, each hardware criterion is rated based on a relative comparison. For rating, a 1 to 5 scale is used, where 5 indicates the best performance or suitability for the feature, and 1 indicates the least performance or suitability. The ratings of each metric of the hardware candidates are provided in Appendix E, Edge Hardware—Metrics Rating.

Measurement Layer/Measurement for AI Model

  • Process Flow—Step 8: Measurement Execution for AI Model Criteria (Collect Data for AI Model Metrics/Criteria)
After determining the best edge hardware options for each scenario, the AI model candidates for the suggested hardware options were determined. The list of AI model candidates, along with the specific evaluation indicators for each scenario, is as follows:
  • Scenario I, AI Model Candidates: Object detection models built with different frameworks (e.g., Single Shot Multi-Box Detector—SSD [67]), with different architectures (e.g., MobileNet v1 or MobileNet v2), and trained on the COCO Person Dataset [68] were evaluated to select the most suitable object detection model for the selected edge hardware. The metrics (input resolution, inference time, accuracy) of these int8 quantized object detection models on STM32H7 series microcontrollers are given in Table A2 and were obtained from the vendor's GitHub page [69].
  • Scenario II, AI Model Candidates: For this scenario, the metrics of object detection models with varying architectures, measured on the NVIDIA Jetson Nano, are given in Table A3. The experiments were conducted on the Microsoft COCO Dataset, which includes 80 object categories. The metrics were obtained from the GitHub page of Q-Engineering [70].
  • Scenario III, AI Model Candidates: In this scenario, object detection models with differing frameworks and architectures, given in Table A4, were tested on the Raspberry Pi 4. All experiments were conducted on the Microsoft COCO Dataset, which includes 80 object categories. The metrics were retrieved from the GitHub page of Q-Engineering [70].
  • Scenario IV, AI Model Candidates: In this scenario, deep learning-based 3D object detection models were tested on the KITTI Dataset on the NVIDIA Jetson Xavier NX. The metrics given in Table A5 were obtained from the study by Choe et al., which conducted a benchmark analysis of 3D object detectors on NVIDIA Jetson platforms [71].

Evaluation Layer/Evaluation for AI Models

  • Process Flow—Step 9: Evaluation and Ranking (apply MCDA methods to evaluate and rank AI model candidates based on weighted scores)
  • Process Flow—Step 10: Selection of Edge AI Model
In the evaluation step for AI models, weights were assigned in a focus group conducted with the same experts. The same process used for determining the weights of the hardware criteria was followed to derive the AI model criteria weights, and the weight assignment session for AI models was again finalized after two rounds of discussion. The final weights of the AI model criteria for each scenario are provided in Appendix D, Table A7. Afterward, AI model rankings were calculated with the six fuzzy MCDA methods given in Section 2.6. The suggested AI model options to be deployed on the suggested hardware for each scenario are given in Table 5.
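Since each of the six fuzzy MCDA methods may produce a slightly different ordering, the final recommendation in Tables 4 and 5 is the alternative preferred by the majority of methods. A minimal sketch of such an aggregation, using a first-rank vote with a Borda-count tie-breaker over hypothetical rank lists, is given below; the alternative IDs and orderings are illustrative only.

```python
from collections import Counter

# Hypothetical best-to-worst rank lists produced by six MCDA methods (placeholder data).
rankings = [
    ["EdgeSW_3", "EdgeSW_1", "EdgeSW_2"],   # e.g., fuzzy TOPSIS
    ["EdgeSW_3", "EdgeSW_2", "EdgeSW_1"],   # e.g., fuzzy VIKOR
    ["EdgeSW_1", "EdgeSW_3", "EdgeSW_2"],
    ["EdgeSW_3", "EdgeSW_1", "EdgeSW_2"],
    ["EdgeSW_3", "EdgeSW_2", "EdgeSW_1"],
    ["EdgeSW_1", "EdgeSW_3", "EdgeSW_2"],
]

# Majority vote on the top-ranked alternative of each method.
first_place_votes = Counter(r[0] for r in rankings)

# Borda count as a tie-breaker: an alternative ranked k-th (0-based) among n gets n-1-k points.
borda = Counter()
for r in rankings:
    n = len(r)
    for k, alt in enumerate(r):
        borda[alt] += n - 1 - k

winner = max(first_place_votes, key=lambda a: (first_place_votes[a], borda[a]))
print("Votes:", dict(first_place_votes), "| Borda:", dict(borda), "| Suggested:", winner)
```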

5.1.3. Deployment Phase and Results (Deployment of Selected AI Models on Selected Edge HWs)

In Section 5.1.2, we evaluated the proposed method across four distinct scenarios, each designed to address different use cases and operational constraints in the UAV context. The results of these experiments highlight the trade-offs among performance, power consumption, cost, and accuracy, which are critical to consider when forming edge hardware and AI model combinations.
Scenario I demonstrates the suitability of the STM32H7 series microcontrollers when paired with the SSD Mobilenet v1 0.25 model (256 × 256 × 3) for basic object detection in controlled environments. The STM32H7's affordability, compact size, and low power consumption make it an ideal option for lightweight, resource-constrained applications. The system achieves a relatively low accuracy (lower than 30% mAP) and an average inference time of 266.4 ms. While the accuracy is not as high as that of more powerful systems, it is still adequate for basic detection tasks in controlled environments, such as detecting large or stationary objects or general scene understanding. Scenario II pairs the NVIDIA Jetson Nano with the YOLOFastestV2 model to address real-time obstacle detection for mid-level UAVs, delivering a high frame rate (38.4 FPS) at a lower mAP (24.1%), making it suitable for dynamic environments where speed is critical. For general-purpose outdoor detection, Scenario III combines the Raspberry Pi 4 with the YOLOFastestV2 model, offering 18.8 FPS and 24.1% mAP; it provides a more cost-effective solution than the Jetson Nano and generally better performance than the STM32 series. Finally, Scenario IV utilizes the NVIDIA Jetson Xavier NX with the Complex-YOLOv3 (Tiny) model for high-performance tasks such as search and rescue or advanced surveillance, achieving a higher mAP (67%) and a moderate frame rate (17.93 FPS). While this scenario prioritizes accuracy and speed for critical missions, it comes with higher cost and power requirements than the other combinations.
Overall, this multi-scenario evaluation highlights the adaptability of the proposed method to varying operational goals and constraints. This approach can be extended to other Edge AI domains, such as autonomous vehicles or smart city systems, by tailoring criteria and metrics to their specific requirements.

5.2. A TensorFlow Lite Model for Object Detection

In this part, to show the practical relevance of the method, we created a TensorFlow Lite model following the method's suggestions for Scenario I. Scenario I was chosen due to its strict requirements for agility, endurance, lightweight design, and real-time processing, which make it an ideal case for testing the efficacy of the proposed method. In this context, we first deployed the suggested AI model (SSD Mobilenet v1 0.25) on the suggested edge hardware (STM32H7 series) and evaluated the model's performance on the COCO Person Dataset; the results are explained in Section 5.2.1. Afterward, to enhance the model's efficiency and versatility for object detection tasks, we created the TensorFlow Lite model and evaluated its performance across various object categories. The evaluation results of this additional model are explained in Section 5.2.2.

5.2.1. Deployment Results of Scenario I

The suggested object detection model for Scenario I (SSD Mobilenet v1 0.25) is tested on the STM32H7 series microcontroller with both the validation and test data of the COCO Person Dataset. The graphs provided in Figure 5 illustrate the precision–recall (PR) curves for the “person” class on the validation and test data.
According to the first PR curve, the model achieves an average precision (AP) of 28.32 on validation data, which indicates a slightly low performance in detecting the target class. Precision begins at 1.0 for low recall values, meaning that the model is highly accurate when detecting a small number of high-confidence instances. However, as recall increases and more instances are detected, precision gradually declines, falling below 0.94 at a recall value of 0.30. This indicates that the model struggles to maintain accuracy as it detects more objects, likely due to false positives or difficulties in distinguishing objects in more complex scenes. Such behavior is typical of lightweight models like SSD MobileNet deployed on resource-constrained devices like the STM32H747I-DISCO board. Lightweight models are specifically designed to prioritize efficiency, low power consumption, and compact size, which are critical for embedded systems with limited computational resources. However, these benefits come at the cost of reduced capacity for complex feature extraction and lower robustness in challenging detection scenarios. The declining precision at higher recall levels underscores the model’s inability to maintain high confidence in its predictions as it attempts to detect a greater number of objects. This suggests that while the model performs well in identifying easily distinguishable objects, it struggles with overlapping objects, cluttered scenes, or small objects, which require more sophisticated feature representations and higher computational capacity.
The second PR curve represents the model’s performance on the test dataset, showing a slightly lower AP of 27.29 compared to the validation dataset. As with the validation data, precision starts at 1.0 for low recall values and decreases steadily, dropping below 0.92 at a recall value of 0.30. This indicates a similar pattern of performance, with the model excelling at detecting high-confidence instances but struggling to maintain precision as it attempts to detect a greater number of objects. The slightly lower AP on the test data reflects a minor decrease in generalization performance when evaluated on unseen data. The model’s consistent behavior across both datasets suggests that it is robust within the scope of its design but remains constrained by the trade-offs inherent to lightweight, low-power hardware. These results reinforce the suitability of this configuration for tasks with limited computational demands, such as detecting a few large or stationary objects in controlled environments.
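The AP values reported above summarize the area under the corresponding precision–recall curves. A minimal sketch of how such a curve and its AP can be computed from detection confidences and ground-truth match flags (all-point interpolation, single class) is shown below; the arrays are illustrative placeholders, not the reported evaluation data.

```python
import numpy as np

def average_precision(scores, is_true_positive, num_ground_truth):
    """Single-class AP with all-point interpolation.
    scores: confidence of each detection; is_true_positive: 1/0 match flags."""
    order = np.argsort(-np.asarray(scores))
    tp = np.asarray(is_true_positive, dtype=float)[order]
    fp = 1.0 - tp
    cum_tp, cum_fp = np.cumsum(tp), np.cumsum(fp)
    recall = cum_tp / num_ground_truth
    precision = cum_tp / (cum_tp + cum_fp)
    # Make the precision envelope monotonically non-increasing, then integrate over recall.
    precision = np.maximum.accumulate(precision[::-1])[::-1]
    recall = np.concatenate(([0.0], recall))
    precision = np.concatenate(([precision[0]], precision))
    return np.sum(np.diff(recall) * precision[1:])

# Illustrative detections: confidences and whether each one matched a ground-truth box (e.g., IoU >= 0.5).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40]
matched = [1, 1, 0, 1, 0, 0]
print("AP:", round(average_precision(scores, matched, num_ground_truth=5), 4))
```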

5.2.2. Additional TensorFlow Lite Model for Object Detection

Guided by the method's suggestions for Scenario I, we created a TensorFlow Lite model from scratch with the SSD MobileNet_v1 architecture, using weights pre-trained on ImageNet [72], aimed at improving the efficiency and versatility of the model for object detection tasks. For training, we used the training data provided in the PASCAL VOC 2012 Dataset, since this dataset is widely used for object detection tasks [73]. After training, we quantized the model to int8 to fit on the selected edge hardware. We then evaluated the model's performance on the validation set of the PASCAL VOC Dataset across various object categories, such as airplane and bicycle. The results of the int8 quantized model for the top three performing categories are provided in Figure 6; the results for all the investigated categories are provided in Appendix F, Evaluation Results for PASCAL VOC-Trained ST SSD Model.
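The int8 quantization step mentioned above can be performed with TensorFlow Lite's post-training full-integer quantization. The sketch below illustrates this for an exported model; the export directory, input size, and calibration-image folder are assumptions for illustration and do not reproduce the exact training pipeline used in the study.

```python
import tensorflow as tf

SAVED_MODEL_DIR = "ssd_mobilenet_v1_025_voc"   # hypothetical SavedModel export directory
INPUT_SIZE = (256, 256)                        # matches the 256 x 256 x 3 input resolution

def representative_dataset():
    """Yield ~100 preprocessed calibration images so the converter can
    estimate activation ranges for full-integer quantization."""
    for path in sorted(tf.io.gfile.glob("voc_calib/*.jpg"))[:100]:   # hypothetical folder
        img = tf.io.decode_jpeg(tf.io.read_file(path), channels=3)
        img = tf.image.resize(img, INPUT_SIZE) / 255.0
        yield [tf.expand_dims(tf.cast(img, tf.float32), 0)]

converter = tf.lite.TFLiteConverter.from_saved_model(SAVED_MODEL_DIR)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Force full-integer (int8) kernels, weights, and I/O so the model fits the MCU runtime.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
with open("ssd_mobilenet_v1_025_voc_int8.tflite", "wb") as f:
    f.write(tflite_model)
print("Quantized model size: %.1f KiB" % (len(tflite_model) / 1024))
```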
Figure 6 depicts the PR curves for the top three performing object detection classes: “bus”, “person”, and “train”. The PR curve for the “bus” class (AP = 46.84%) shows that the model is more capable of handling relatively simple and large objects, such as buses. The curve shows high precision at the beginning, which remains relatively stable, but as recall increases, precision drops. This behavior is a typical trade-off observed in object detection models, and the initial stability followed by a gradual decline suggests that the model handles this trade-off reasonably well for the “bus” class. Given the resource constraints of the STM32H7, which limit its ability to process more complex data, the model performs well on larger, more easily identifiable objects, suggesting that the STM32H7 is suitable for tasks where high accuracy and speed for easily detectable objects are prioritized.
The PR curve for the “train” class follows a somewhat similar pattern to the “bus” class but with a slightly lower precision. The curve starts with high precision and then gradually decreases as recall increases. The lower AP of 43.14% suggests that the model struggles more with the detection of trains compared to buses, especially when recall exceeds 0.3. This indicates that the model may have more difficulty identifying trains, possibly due to the complexity of the class or variability in the appearance of trains in the dataset.
The PR curve for the “person” class exhibits a steep initial drop in precision when recall is very low (around 0). This suggests that the model initially struggles to correctly identify persons, which may be due to difficulty in distinguishing persons from other objects. However, the curve flattens out, and precision stabilizes at a higher value, suggesting that the model becomes more reliable as more candidates are considered, although with lower accuracy (AP = 35.89%) compared to the other classes.

6. Discussion

In this study, we presented a systematic and adaptable method for evaluating and selecting edge hardware and AI models for deployment in edge environments, guided by the ISO/IEC 25010:2011 quality standard. To validate the proposed method, a four-scenario case study focused on UAV operations was conducted, incorporating six fuzzy Multi-Criteria Decision Analysis (MCDA) methods. These methods were used to rank edge hardware alternatives and object detection models based on use case-specific requirements, emphasizing the importance of aligning hardware and software capabilities with operational goals.
The quality assessment framework and selection process presented in this study are designed to be flexible and adaptable to different edge devices and AI models, beyond the specific cases considered in the study. The core methodology—comprising Requirements, Specification, Measurement, and Evaluation Layers—is independent of specific hardware or software configurations. Therefore, the process can be applied to a broad range of devices and models by adjusting the quality attributes, metrics, and evaluation methods according to the specific requirements of scenarios.
The results of the case study underscore the role of tailoring hardware and AI model selections to specific scenarios: Scenario I demonstrated that low-cost, low-power solutions such as the STM32H7 series are ideal for basic tasks with minimal computational demands. Their affordability, lightweight design, and efficiency make them suitable for resource-constrained UAV applications requiring agility and endurance. Scenario II highlighted the Jetson Nano’s capability to deliver high-speed performance in dynamic scenarios, such as real-time obstacle detection, where responsiveness and frame rate are prioritized over high accuracy. Scenario III showcased the Raspberry Pi 4 as a versatile, cost-effective alternative for general-purpose tasks such as environmental monitoring or outdoor inspections, striking a balance between performance, power consumption, and cost. Scenario IV demonstrated that the Jetson Xavier NX, paired with robust AI models like Complex-YOLOv3, excels in high-stakes applications requiring advanced AI capabilities, high accuracy, and moderate inference speed, making it suitable for tasks such as search and rescue or advanced surveillance.
These findings illustrate the value of systematically balancing competing trade-offs—such as cost, accuracy, speed, and power consumption—while ensuring compatibility between hardware and software. The deployment of object detection models like YOLOFastestV2 and Complex-YOLOv3 in these scenarios further reinforces the importance of selecting models that align with the computational capabilities of the hardware and the operational requirements of the use case.
To demonstrate the practical relevance of the proposed method, a TensorFlow Lite model was developed and deployed based on the recommendations of Scenario I. The deployment results validated the proposed process, highlighting its ability to guide real-world decisions by systematically addressing stakeholder-defined priorities.
Regarding the broader implications of the method, the structured approach proposed in this study offers potential for Edge AI applications beyond the UAV context. Its adaptable structure makes it applicable to a wide range of domains, such as smart home systems, robotics, autonomous vehicles, and IoT environments. By following the process flow and leveraging fuzzy MCDA methods, the method helps decision-makers evaluate trade-offs and achieve an optimal balance across multiple competing priorities, such as cost-efficiency, performance, and accuracy. The process flow provided in Section 4 and illustrated in Figure 3 serves as a framework for adopting the method in future case studies.
However, it is also important to emphasize the limitations of this analysis. Although the fuzzy MCDA methods provide good assistance for the selection processes in the contexts of interest, their outcomes depend on the selected criteria and weights. For instance, the STM32H7's performance may not be sufficient for scenarios where a criterion such as high processing power is critical.

7. Threats to Validity

In this part, we address the factors that could potentially impact the outcomes of this study. The main threats to validity may arise from the design of the proposed method given in Section 4 and from the conduct of the case study given in Section 5.1. These factors are discussed under the External Validity, Internal Validity, Construct Validity, and Reliability categories, along with the measures taken to mitigate them.
External Validity: In this study, the proposed method was tested through four scenarios in the UAV context to support its generalizability and adaptability to different requirements. The best hardware options for each scenario were found to be the STM32H7 series microcontrollers, NVIDIA Jetson Nano, Raspberry Pi 4, and NVIDIA Jetson Xavier NX, respectively. The best object detection models were SSD Mobilenet v1 0.25 for the STM32H7, YOLOFastestV2 for the NVIDIA Jetson Nano and Raspberry Pi 4, and Complex-YOLOv3 (Tiny) for the Jetson Xavier NX. We cross-validated the results of the case study by forming a rating table (Appendix E, Edge Hardware—Metrics Rating), in which each criterion/metric of the hardware candidates is rated on a relative 1-to-5 scale. The proposed method's applicability should also be tested across a broader range of Edge AI applications, such as smart home and healthcare systems, which may have different requirements and operational goals, to show its practical usage in different contexts and settings. Another issue concerns the validity of the proposed method itself; to mitigate this risk, we followed a validated meta-model structure and extended it to the quality of Edge AI applications.
Internal Validity: Internal validity concerns the study's ability to establish a relationship between the proposed method and the observed outcomes. In the evaluation stage, we applied six fuzzy MCDA methods to find the best edge hardware and AI model options for each of the four scenarios. A threat to internal validity arises from the reliance on fuzzy MCDA techniques, which may introduce bias through the subjective weighting of criteria and thereby affect the results. To mitigate this risk, we conducted a focus session involving six experienced practitioners from the field, who collaboratively participated in the criteria weighting process. Moreover, statistical analysis of the expert scores was employed to identify and address discrepancies, ensuring a more balanced and objective weighting approach.
Construct Validity: Construct validity concerns the degree to which the method accurately measures the concepts it aims to evaluate. To ensure this, we adopted and extended an already validated quality meta-model [74], which enabled us to use and apply correct terms and relations with regard to quality specification and evaluation. In this study, construct validity may be threatened by the appropriateness of the criteria and metrics chosen to evaluate the edge hardware and AI model candidates. There is a risk that the selected quality characteristics and metrics may not fully capture all relevant aspects of UAV operational quality across the proposed scenarios. To mitigate this risk, we used the ISO/IEC 25010:2011 standard to guide the evaluation and adopted characteristics and metrics that are commonly referenced in the literature for the MAV and UAV context. In addition, we collected senior practitioners' feedback on the relevant quality characteristics and metrics. We believe that we have covered the most significant quality characteristics and metrics.
Reliability: Reliability concerns the consistency and repeatability of the results. In our study, owing to the systematic approach we defined and followed (the process flow given in Figure 3), the case study is expected to produce the same outputs across the four scenarios when repeated with the same fuzzy MCDA methods under the same goals, edge hardware candidates, AI model candidates, and criteria weights.

8. Conclusions

The fusion of edge computing and Artificial Intelligence (AI) has given rise to the Edge AI paradigm, which has seen significant uptake across various application domains, from autonomous vehicles to smart home systems. Despite its growing adoption, the process of selecting suitable edge hardware and AI models for deployment remains challenging due to the availability of diverse options and the trade-offs involved. This study addresses this challenge by proposing a structured method for evaluating and selecting hardware and AI models tailored to edge environments, guided by the ISO/IEC 25010:2011 quality standard. Through a systematic four-layered framework (Requirements, Specification, Measurement, and Evaluation), the method ensures alignment with stakeholder priorities and operational needs, making it applicable to diverse scenarios. The application of the method in four UAV use cases demonstrated its utility in balancing trade-offs such as cost, power efficiency, speed, and accuracy, and highlighted the importance of tailored hardware-AI model combinations for specific tasks. For example, lightweight setups such as the STM32H7 with SSD Mobilenet were found suitable for resource-constrained applications, while powerful configurations such as the Jetson Xavier NX paired with Complex-YOLOv3 excelled in high-stakes scenarios requiring high accuracy and robustness.
We also acknowledge certain limitations of this study, such as the reliance on subjective criteria weighting and the potential biases introduced by fuzzy MCDA techniques. To address these issues, expert consultations and statistical analyses were integrated into the weighting process to support a reliable evaluation.
To demonstrate the practical guidance provided by the method, we carried out an additional experiment based on the outcomes of Scenario I. In this experiment, we deployed the ST SSD Mobilenet v1 0.25 model on the STM32H747I-DISCO board and evaluated its performance on the test data of the COCO Person Dataset. Furthermore, we created a TensorFlow Lite model from scratch, trained on the PASCAL VOC 2012 Dataset, which is widely recognized for object detection tasks. We then evaluated this model's performance on the validation set of the PASCAL VOC Dataset and measured its average precision across various categories; it performed better than the model pre-trained on the COCO Dataset.
We believe that this study contributes to the field by presenting a systematic approach to a previously ad hoc process, helping to reduce inefficiencies and supporting decisions on optimal hardware and AI model combinations for edge applications. For future work, we plan to expand the application of the proposed method to other edge domains, such as healthcare and smart cities, to show its versatility and relevance in real-world contexts and to support its generalizability.

Author Contributions

Conceptualization, M.C.Ş.; Methodology, M.C.Ş. and A.K.T.; Software, M.C.Ş.; Validation, M.C.Ş.; Investigation, A.K.T.; Data curation, M.C.Ş.; Writing—original draft, M.C.Ş.; Writing—review & editing, A.K.T.; Supervision, A.K.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data used in this study are put to an accessible research repository: https://zenodo.org/records/14698719 (accessed on 17 December 2024).

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Evaluation Indicators Across Application Scenarios

Table A1. Specific evaluation indicators *.

| Metric Category | Evaluation Indicator | Description | Application Scenarios | Device Constraints Addressed |
|---|---|---|---|---|
| Accuracy [75,76,77,78] | Model Accuracy (e.g., Top-1, Top-5, mAP, Prediction Accuracy, Precision/Recall/F1-Score, IoU, MAE, Latency-Adjusted Accuracy) [75,76] | Measures how well the AI model performs on a given task, such as object detection or classification. | Autonomous vehicles, Surveillance | Limited precision or inference errors |
| | Accuracy under Resource Constraints [75,76] | Tests accuracy with reduced precision (e.g., INT8 quantization) or limited memory. | Low-power IoT devices | Constrained computation and memory |
| | Accuracy under Noise/Adversarial Conditions [77,78] | Evaluates model performance with added input noise or adversarial attacks. | Drones in dynamic environments | Robustness to real-world interference |
| Computational Efficiency [79,80] | Frames Per Second (FPS) | Measures real-time processing speed of inference tasks. | Real-time video analytics | Processing under low latency |
| | Latency (End-to-End Inference Time) | Total time for data input to output prediction. | Healthcare AI systems, Robotics | Low-latency or time-sensitive tasks |
| | Model Throughput (Inference per Second) | Number of inferences the model can handle per unit time. | Batch processing, industrial use | Scaling with computational constraints |
| Resource Utilization [79,80] | CPU/GPU/TPU Utilization | Tracks hardware usage during inference. | Resource monitoring in IoT | Avoids overloading hardware |
| | Power Consumption (Watts/Inferences per Watt) [71] | Measures energy efficiency of the device. | Battery-powered drones, Edge AI | Power-efficient operations |
| | Memory Utilization (RAM/Flash) | Evaluates memory footprint for models and inputs. | Embedded AI, wearable devices | Limited memory capacity |
| Adaptability | Transfer Learning Capability [81] | Assesses the ability to fine-tune models for new tasks with limited data. | Personalized recommendations | Adaptation to new environments or datasets |
| | Scalability [65] | Tests how models and hardware perform with varying workloads. | Distributed Edge AI systems | Dynamic workload adjustment |
| | Multi-Model Support [80] | Ability to run multiple models concurrently or in parallel. | Smart city applications | Support for diverse use cases |
| Robustness | Mean Time Between Failures (MTBF) [82] | Evaluates hardware reliability under prolonged usage. | Mission-critical drones, IoT devices | Reliability for long-term deployments |
| | Environmental Resilience (e.g., temperature, humidity, vibration) [83] | Performance under harsh environmental conditions. | Outdoor drones, industrial IoT | Operates in extreme conditions |
| | Fault Tolerance [84,85] | Ability to recover from hardware or software failures. | Autonomous vehicles | High reliability under unexpected failures |
* Specific evaluation indicators across domains such as healthcare and autonomous vehicles can be retrieved from device vendors’ or general benchmarking web pages e.g., [79,80].

Appendix B. Simplified Object Model Diagram

Figure A1. Simplified object model diagram.

Appendix C. Scenario AI Model Options

Table A2. Scenario I, AI model options.

| Model ID | Model | Resolution | Total RAM (KiB) | Total Flash (KiB) | Inference Time (ms) | Accuracy (AP) (%) |
|---|---|---|---|---|---|---|
| EdgeSW_Scenario1_1 | SSD Mobilenet v2 0.35 FPN-lite | 192 × 192 × 3 | 781.75 | 1174.51 | 512.33 | 40.73 |
| EdgeSW_Scenario1_2 | ST SSD Mobilenet v1 0.25 | 192 × 192 × 3 | 296.23 | 534.67 | 149.22 | 33.70 |
| EdgeSW_Scenario1_3 | ST SSD Mobilenet v1 0.25 | 224 × 224 × 3 | 413.93 | 702.81 | 218.68 | 44.45 |
| EdgeSW_Scenario1_4 | ST SSD Mobilenet v1 0.25 | 256 × 256 × 3 | 489.85 | 701.97 | 266.4 | 46.26 |
| EdgeSW_Scenario1_5 | st_yolo_lc_v1 | 192 × 192 × 3 | 174.38 | 330.23 | 179.35 | 31.61 |
| EdgeSW_Scenario1_6 | st_yolo_lc_v1 | 224 × 224 × 3 | 225.38 | 330.24 | 244.7 | 36.80 |
| EdgeSW_Scenario1_7 | st_yolo_lc_v1 | 256 × 256 × 3 | 286.38 | 330.23 | 321.23 | 40.58 |
Table A3. Scenario II, AI model options.

| Model ID | Model | Resolution | Objects | mAP (%) | FPS |
|---|---|---|---|---|---|
| EdgeSW_Scenario2_1 | NanoDet | 320 × 320 | 80 | 20.6 | 26.2 |
| EdgeSW_Scenario2_2 | NanoDet Plus | 416 × 416 | 80 | 30.4 | 18.5 |
| EdgeSW_Scenario2_3 | YOLOFastestV2 | 352 × 352 | 80 | 24.1 | 38.4 |
| EdgeSW_Scenario2_4 | YOLOv2 | 416 × 416 | 20 | 19.2 | 10.1 |
| EdgeSW_Scenario2_5 | YOLOv3 | 352 × 352 (tiny) | 20 | 16.6 | 17.7 |
| EdgeSW_Scenario2_6 | YOLOv4 Darknet | 416 × 416 (tiny) | 80 | 21.7 | 16.5 |
| EdgeSW_Scenario2_7 | YOLOv4 | 608 × 608 (full) | 80 | 45.3 | 1.3 |
| EdgeSW_Scenario2_8 | YOLOv5 | 640 × 640 (small) | 80 | 22.5 | 5.0 |
| EdgeSW_Scenario2_9 | YOLOv6 | 640 × 640 (nano) | 80 | 35.0 | 10.5 |
| EdgeSW_Scenario2_10 | YOLOv7 | 640 × 640 (tiny) | 80 | 38.7 | 8.5 |
| EdgeSW_Scenario2_11 | YOLOx | 416 × 416 (nano) | 80 | 25.8 | 22.6 |
| EdgeSW_Scenario2_12 | YOLOx | 416 × 416 (tiny) | 80 | 32.8 | 11.35 |
| EdgeSW_Scenario2_13 | YOLOx | 640 × 640 (small) | 80 | 40.5 | 3.65 |
Table A4. Scenario III, AI model options.

| Model ID | Model | Resolution | Objects | mAP | FPS (RPi 4 64-OS 1950 MHz) |
|---|---|---|---|---|---|
| EdgeSW_Scenario3_1 | NanoDet | 320 × 320 | 80 | 20.6 | 13.0 |
| EdgeSW_Scenario3_2 | NanoDet Plus | 416 × 416 | 80 | 30.4 | 5.0 |
| EdgeSW_Scenario3_3 | YOLOFastestV2 | 352 × 352 | 80 | 24.1 | 18.8 |
| EdgeSW_Scenario3_4 | YOLOv2 | 416 × 416 | 20 | 19.2 | 3.0 |
| EdgeSW_Scenario3_5 | YOLOv3 | 352 × 352 tiny | 20 | 16.6 | 4.4 |
| EdgeSW_Scenario3_6 | YOLOv4 Darknet | 416 × 416 tiny | 80 | 21.7 | 3.4 |
| EdgeSW_Scenario3_7 | YOLOv4 | 608 × 608 full | 80 | 45.3 | 0.2 |
| EdgeSW_Scenario3_8 | YOLOv5 | 640 × 640 small | 80 | 22.5 | 1.6 |
| EdgeSW_Scenario3_9 | YOLOv6 | 640 × 640 nano | 80 | 35.0 | 2.7 |
| EdgeSW_Scenario3_10 | YOLOv7 | 640 × 640 tiny | 80 | 38.7 | 2.1 |
| EdgeSW_Scenario3_11 | YOLOx | 416 × 416 nano | 80 | 25.8 | 7.0 |
| EdgeSW_Scenario3_12 | YOLOx | 416 × 416 tiny | 80 | 32.8 | 2.8 |
| EdgeSW_Scenario3_13 | YOLOx | 640 × 640 small | 80 | 40.5 | 0.9 |
Table A5. Scenario IV, AI model options.
Model IDModelmAP
(Bird Eye View)
Average
FPS
Average Power
Consumption
(W)
EdgeSW_Scenario4_1PointRCNN881.212
EdgeSW_Scenario4_2Part-A2891.8212.5
EdgeSW_Scenario4_3PV-RCNN881.4312.1
EdgeSW_Scenario4_4Complex-YOLOv3822.9513.2
EdgeSW_Scenario4_5Complex-YOLOv3
(Tiny)
6717.938.9
EdgeSW_Scenario4_6Complex-YOLOv40.8332.8213.8
EdgeSW_Scenario4_7Complex-YOLOv4
(Tiny)
0.6816.49.3
EdgeSW_Scenario4_8SECOND0.8372.613.4
EdgeSW_Scenario4_9PointPillar0.8715.7313.9
EdgeSW_Scenario4_10CIA-SSD0.8673.1213.1
EdgeSW_Scenario4_11SE-SSD0.8833.1713.1

Appendix D. Edge HW and AI Model Criteria Weights for Scenarios

Please note that the weighting process was also supported via Krippendorff's Alpha method (a minimal sketch is given below).
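The sketch below illustrates how inter-rater agreement can be checked with Krippendorff's Alpha using the open-source `krippendorff` Python package (assumed available via pip); the expert scores are placeholder values, not the data collected in the focus sessions.

```python
import numpy as np
import krippendorff  # pip install krippendorff

# Rows = experts, columns = criteria; placeholder 1-10 scores (use np.nan for missing ratings).
ratings = np.array([
    [9, 8, 6, 7],
    [8, 8, 5, 7],
    [9, 7, 7, 8],
    [10, 8, 6, 7],
    [9, 8, 8, 6],
    [8, 9, 6, 7],
], dtype=float)

alpha = krippendorff.alpha(reliability_data=ratings, level_of_measurement="interval")
print("Krippendorff's alpha:", round(alpha, 3))  # values close to 1 indicate strong agreement
```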
Table A6. Edge HW criteria weights.

| Criterion | Scenario I | Scenario II | Scenario III | Scenario IV |
|---|---|---|---|---|
| MTBF | 0.114 | 0.047 | 0.051 | 0.052 |
| Processing Power | 0.104 | 0.358 | 0.143 | 0.439 |
| Supported AI Frameworks | 0.056 | 0.052 | 0.048 | 0.054 |
| Power Consumption | 0.143 | 0.097 | 0.096 | 0.101 |
| Weight | 0.147 | 0.053 | 0.026 | 0.049 |
| Size | 0.138 | 0.057 | 0.024 | 0.051 |
| Flash Memory | 0.056 | 0.046 | 0.108 | 0.059 |
| RAM | 0.077 | 0.048 | 0.098 | 0.057 |
| Cost | 0.165 | 0.242 | 0.406 | 0.138 |
Table A7. AI Model criteria weights.

| Scenario | AI Model Criteria | AI Model Criteria Weights |
|---|---|---|
| Scenario I | [Resolution, Inference Time, Accuracy] | [0.18, 0.22, 0.60] |
| Scenario II | [Resolution, Objects, mAP, FPS] | [0.12, 0.08, 0.33, 0.47] |
| Scenario III | [Resolution, Objects, mAP, FPS] | [0.1, 0.15, 0.35, 0.40] |
| Scenario IV | [mAP, FPS, Power consumption] | [0.4, 0.4, 0.2] |

Appendix E. Edge Hardware—Metrics Rating

Edge Hardware metrics rating (“EdgeHW_candidates_rate.csv”) and the rating details (“EdgeHW_MetricRating.txt”) are provided under “Fuzzy_MCDA_EdgeAI.zip” folder in the following address: https://zenodo.org/records/14698719 (accessed on 16 January 2025).

Appendix F. Object Detection Models, Datasets, Results

“SSD Mobilenet v1 0.25 (256 × 256)” TensorFlow Lite model (pre-trained ST model); “SSD Mobilenet v1 0.25 (256 × 256)” TensorFlow Lite float-model (trained with PASCAL VOC 2012 Dataset from scratch);
“SSD Mobilenet v1 0.25 (256 × 256)” TensorFlow Lite int8 quantized model (trained with PASCAL VOC 2012 Dataset from scratch) are provided under “Object Detection Models.zip”;
PASCAL VOC 2012 Dataset—Train and Valid Dataset is provided under “PASCAL_VOC_Dataset.zip”:
COCO-2017 Person Dataset is provided under “COCO-Dataset.zip”;
Evaluation Results for PASCAL VOC trained ST SSD Model is provided under “TestResults_TFLiteModel.zip”;
Settings for Models are provided under “settings.zip” in the following address: https://zenodo.org/records/14698719 (accessed on 16 January 2025).

Appendix G. Fuzzy MCDA Software, Evaluation Results for Edge Hardware and AI Models

Evaluation Results for Edge Hardware; Evaluation Results for AI Models; fuzzy MCDA python scripts are provided under “Fuzzy_MCDA_EdgeAI.zip” folder in the following address: https://zenodo.org/records/14698719 (accessed on 16 January 2025).

References

  1. Singh, R.; Gill, S.S. Edge AI: Internet of Things and Cyber-Physical Systems. Internet Things Cyber-Phys. Syst. 2023, 3, 71–92. [Google Scholar] [CrossRef]
  2. Shi, W.; Cao, J.; Zhang, Q.; Li, Y.; Xu, L. Edge Computing: Vision and Challenges. IEEE Internet Things J. 2016, 3, 637–646. [Google Scholar] [CrossRef]
  3. Liu, S.; Liu, L.; Tang, J.; Yu, B.; Wang, Y.; Shi, W. Edge Computing for Autonomous Driving: Opportunities and Challenges. Proc. IEEE 2019, 107, 1697–1716. [Google Scholar] [CrossRef]
  4. Ke, R.; Zhuang, Y.; Pu, Z.; Wang, Y. A Smart, Efficient, and Reliable Parking Surveillance System with Edge Artificial Intelligence on IoT Devices. IEEE Trans. Intell. Transport. Syst. 2020, 22, 4962–4974. [Google Scholar] [CrossRef]
  5. Sharkov, G.; Asif, W.; Rehman, I. Securing Smart Home Environment Using Edge Computing. In Proceedings of the 2022 IEEE International Smart Cities Conference (ISC2), Pafos, Cyprus, 26–29 September 2022; pp. 1–7. [Google Scholar]
  6. Shankar, V. Edge AI: A Comprehensive Survey of Technologies, Applications, and Challenges. In Proceedings of the 2024 1st International Conference on Advanced Computing and Emerging Technologies (ACET), Ghaziabad, India, 23–24 August 2024; pp. 1–6. [Google Scholar]
  7. Hadidi, R.; Cao, J.; Xie, Y.; Asgari, B.; Krishna, T.; Kim, H. Characterizing the Deployment of Deep Neural Networks on Commercial Edge Devices. In Proceedings of the 2019 IEEE International Symposium on Workload Characterization (IISWC), Orlando, FL, USA, 3–5 November 2019; pp. 35–48. [Google Scholar]
  8. Garcia-Perez, A.; Miñón, R.; Torre-Bastida, A.I.; Zulueta-Guerrero, E. Analysing Edge Computing Devices for the Deployment of Embedded AI. Sensors 2023, 23, 9495. [Google Scholar] [CrossRef] [PubMed]
  9. Contreras-Masse, R.; Ochoa-Zezzatti, A.; García, V.; Elizondo, M. Selection of IoT Platform with Multi-Criteria Analysis: Defining Criteria and Experts to Interview. Res. Comput. Sci. 2019, 148, 9–19. [Google Scholar] [CrossRef]
  10. Silva, E.M.; Agostinho, C.; Jardim-Goncalves, R. A multi-criteria decision model for the selection of a more suitable Internet-of-Things device. In Proceedings of the 2017 International Conference on Engineering, Technology and Innovation (ICE/ITMC), Madeira, Portugal, 27–29 June 2017; pp. 1268–1276. [Google Scholar]
  11. Ilieva, G.; Yankova, T. IoT System Selection as a Fuzzy Multi-Criteria Problem. Sensors 2022, 22, 4110. [Google Scholar] [CrossRef]
  12. Gan, G.-Y.; Lee, H.-S.; Liu, J.-Y. A DEA Approach towards to the Evaluation of IoT Applications in Intelligent Ports. J. Mar. Sci. Technol. 2021, 29, 257–267. [Google Scholar] [CrossRef]
  13. Park, S.; Lee, K. Improved Mitigation of Cyber Threats in IIoT for Smart Cities: A New-Era Approach and Scheme. Sensors 2021, 21, 1976. [Google Scholar] [CrossRef]
  14. Zhang, X.; Geng, J.; Ma, J.; Liu, H.; Niu, S.; Mao, W. A hybrid service selection optimization algorithm in internet of things. EURASIP J. Wirel. Commun. Netw. 2021, 2021, 4. [Google Scholar] [CrossRef]
  15. Saaty, T.L. How to Make a Decision: The Analytic Hierarchy Process. Aestimum 1994, 24, 19–43. [Google Scholar] [CrossRef]
  16. Chen, C.T. Extensions of the TOPSIS for Group Decision-making under Fuzzy Environment. Fuzzy Sets Syst. 2000, 114, 1–9. [Google Scholar] [CrossRef]
  17. Opricovic, S. A Fuzzy Compromise Solution for Multicriteria Problems. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 2007, 15, 363–380. [Google Scholar] [CrossRef]
  18. Uslu, B.; Eren, T.; Gur, S.; Ozcan, E. Evaluation of the Difficulties in the Internet of Things (IoT) with Multi-Criteria Decision-Making. Processes 2019, 7, 164. [Google Scholar] [CrossRef]
  19. Soltani, S.; Martin, P.; Elgazzar, K. A hybrid approach to automatic IaaS service selection. J. Cloud Comput. 2018, 7, 12. [Google Scholar] [CrossRef]
  20. Alelaiwi, A. Evaluating distributed IoT databases for edge/cloud platforms using the analytic hierarchy process. J. Parallel Distrib. Comput. 2019, 124, 4146. [Google Scholar] [CrossRef]
  21. Silva, E.M.; Jardim-Goncalves, R. Multi-criteria analysis and decision methodology for the selection of internet-of-things hardware platforms. In Doctoral Conference on Computing, Electrical and Industrial Systems; Springer: Berlin/Heidelberg, Germany, 2017; pp. 111–121. [Google Scholar]
  22. Garg, S.K.; Versteeg, S.; Buyya, R. A framework for ranking of cloud computing services. Future Gener. Comput. Syst. 2013, 29, 1012–1023. [Google Scholar] [CrossRef]
  23. ISO/IEC 25010:2011; Systems and software engineering-Systems and Software Quality Requirements and Evaluation (SQuaRE)-System and software quality models. International Organization for Standardization: Geneva, Switzerland, 2011.
  24. Sipola, T.; Alatalo, J.; Kokkonen, T.; Rantonen, M. Artificial Intelligence in the IoT Era: A Review of Edge AI Hardware and Software. In Proceedings of the 31th Conference of Open Innovations Association FRUCT, Helsinki, Finland, 27–29 April 2022; Volume 31, pp. 320–331. [Google Scholar]
  25. Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv 2017, arXiv:1704.04861. [Google Scholar]
  26. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.C. MobileNetV2: Inverted Residuals and Linear Bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018. [Google Scholar]
  27. Iandola, F.N.; Han, S.; Moskewicz, M.W.; Ashraf, K.; Dally, W.J.; Keutzer, K. SqueezeNet: AlexNet-level Accuracy with 50x Fewer Parameters and <0.5MB Model Size. arXiv 2016, arXiv:1602.07360. [Google Scholar]
  28. Han, S.; Pool, J.; Tran, J.; Dally, W. Learning both Weights and Connections for Efficient Neural Networks. In Proceedings of the 28th International Conference on Neural Information Processing Systems, Montreal, QC, Canada, 7–12 December 2015. [Google Scholar]
  29. Tan, M.; Le, Q.V. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. In Proceedings of the International Conference on Machine Learning (ICML), Long Beach, CA, USA, 10–15 June 2019; pp. 6105–6114. [Google Scholar]
  30. Tan, M.; Le, Q.V. EfficientNetV2: Smaller Models and Faster Training. arXiv 2020, arXiv:2104.00298. [Google Scholar]
  31. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 26 June–1 July 2016. [Google Scholar]
  32. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef] [PubMed]
  33. Greff, K.; Srivastava, R.K.; Koutník, J.; Steunebrink, B.R.; Schmidhuber, J. LSTM: A Search Space Odyssey. IEEE Trans. Neural Netw. Learn. Syst. 2017, 28, 2222–2232. [Google Scholar] [CrossRef] [PubMed]
  34. Orsini, G.; Bade, D.; Lamersdorf, W. CloudAware: Empowering context-aware self-adaptation for mobile applications. Trans. Emerg. Telecommun. Technol. 2018, 29, e3210. [Google Scholar] [CrossRef]
  35. White, G.; Nallur, V.; Clarke, S. Quality of service approaches in IoT: A systematic mapping. J. Syst. Softw. 2017, 132, 186–203. [Google Scholar] [CrossRef]
  36. Ashouri, M.; Davidsson, P.; Spalazzese, R. Quality attributes in edge computing for the Internet of Things: A systematic mapping study. Internet Things 2021, 13, 100346. [Google Scholar] [CrossRef]
  37. ISO/IEC 25010:2011. Available online: https://www.iso.org/standard/35733.html (accessed on 17 December 2024).
  38. Więckowski, J.; Kizielewicz, B.; Sałabun, W. pyFDM: A Python Library for Uncertainty Decision Analysis Methods. SoftwareX 2022, 20, 101271. [Google Scholar] [CrossRef]
  39. De Souza Melaré, A.V.; González, S.M.; Faceli, K.; Casadei, V. Technologies and Decision Support Systems to Aid Solid-waste Management: A Systematic Review. Waste Manag. 2017, 59, 567–584. [Google Scholar] [CrossRef]
  40. Maghsoodi, A.I.; Kavian, A.; Khalilzadeh, M.; Brauers, W.K. CLUS-MCDA: A Novel Framework based on Cluster Analysis and Multiple Criteria Decision Theory in a Supplier Selection Problem. Comput. Ind. Eng. 2018, 118, 409–422. [Google Scholar] [CrossRef]
  41. Ulutaş, A.; Balo, F.; Sua, L.; Demir, E.; Topal, A.; Jakovljević, V. A New Integrated Grey MCDM Model: Case of Warehouse Location Selection. Facta Univ. Ser. Mech. Eng. 2021, 19, 515–535. [Google Scholar] [CrossRef]
  42. Youssef, M.I.; Webster, B. A Multi-criteria Decision-making Approach to the New Product Development Process in Industry. Rep. Mech. Eng. 2022, 3, 83–93. [Google Scholar] [CrossRef]
  43. Dell’Ovo, M.; Capolongo, S.; Oppio, A. Combining Spatial Analysis with MCDA for the Siting of Healthcare Facilities. Land Use Policy 2018, 76, 634–644. [Google Scholar] [CrossRef]
  44. Chen, S.J.; Hwang, C.L. Fuzzy Multiple Attribute Decision Making: Methods and Applications; Lecture Notes in Economics and Mathematical Systems; Springer: Berlin/Heidelberg, Germany, 1992; Volume 375. [Google Scholar]
  45. Zimmermann, H.J. Fuzzy Set Theory—And Its Applications; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2001. [Google Scholar]
  46. Bozanic, D.; Tešić, D.; Milićević, J. A Hybrid Fuzzy AHP-MABAC Model: Application in the Serbian Army–The selection of the location for deep wading as a technique of crossing the river by tanks. Decis. Mak. Appl. Manag. Eng. 2018, 1, 143–164. [Google Scholar] [CrossRef]
  47. Sałabun, W.; Karczmarczyk, A.; Wątróbski, J.; Jankowski, J. Handling Data Uncertainty in Decision Making with COMET. In Proceedings of the 2018 IEEE Symposium Series on Computational Intelligence (SSCI), Bangalore, India, 18–21 November 2018; pp. 1478–1484. [Google Scholar]
  48. Shekhovtsov, A.; Paradowski, B.; Więckowski, J.; Kizielewicz, B.; Sałabun, W. Extension of the SPOTIS Method for the Rank Reversal Free Decision-making under Fuzzy Environment. In Proceedings of the 2022 IEEE 61st Conference on Decision and Control (CDC), Cancun, Mexico, 6–9 December 2022; pp. 5595–5600. [Google Scholar]
  49. Ulutaş, A.; Popovic, G.; Radanov, P.; Stanujkic, D.; Karabasevic, D. A New Hybrid Fuzzy PSI-PIPRECIA-CoCoSo MCDM based Approach to Solving the Transportation Company Selection Problem. Technol. Econ. Dev. Econ. 2021, 27, 1227–1249. [Google Scholar] [CrossRef]
  50. Mandel, N.; Milford, M.; Gonzalez, F. A Method for Evaluating and Selecting Suitable Hardware for Deployment of Embedded System on UAVs. Sensors 2020, 20, 4420. [Google Scholar] [CrossRef]
  51. Imran, H.; Mujahid, U.; Wazir, S.; Latif, U.; Mehmood, K. Embedded Development Boards for Edge-AI: A Comprehensive Report. arXiv 2020, arXiv:2009.00803. [Google Scholar]
  52. Merenda, M.; Porcaro, C.; Iero, D. Edge Machine Learning for AI-enabled IoT devices: A review. Sensors 2020, 20, 2533. [Google Scholar] [CrossRef]
  53. Rajput, K.R.; Kulkarni, C.D.; Cho, B.; Wang, W.; Kim, I.K. Edgefaasbench: Benchmarking Edge Devices using Serverless Computing. In Proceedings of the 2022 IEEE International Conference on Edge Computing and Communications (EDGE), Barcelona, Spain, 11–15 July 2022; pp. 93–103. [Google Scholar]
  54. Feng, H.; Mu, G.; Zhong, S.; Zhang, P.; Yuan, T. Benchmark Analysis of Yolo Performance on Edge Intelligence Devices. Cryptography 2022, 6, 16. [Google Scholar] [CrossRef]
  55. Cantero, D.; Esnaola-Gonzalez, I.; Miguel-Alonso, J.; Jauregi, E. Benchmarking Object Detection Deep Learning Models in Embedded Devices. Sensors 2022, 22, 4205. [Google Scholar] [CrossRef]
  56. Yılmaz, N.; Tarhan, A.K. Matching terms of quality models and meta-models: Toward a unified meta-model of OSS quality. Softw. Qual. J. 2023, 31, 721–773. [Google Scholar] [CrossRef]
  57. Yılmaz, N.; Tarhan, A.K. Quality Evaluation Meta-model for Open-source Software: Multi-method Validation Study. Softw. Qual. J. 2024, 32, 487–541. [Google Scholar] [CrossRef]
  58. Boroujerdian, B.; Genc, H.; Krishnan, S.; Cui, W.; Faust, A.; Reddi, V. MAVBench: Micro Aerial Vehicle Benchmarking. In Proceedings of the 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), Fukuoka, Japan, 20–24 October 2018; pp. 894–907. [Google Scholar]
  59. NVIDIA Series Products. Available online: https://developer.nvidia.com/ (accessed on 17 December 2024).
  60. Intel Atom 6000E Product. Available online: https://www.intel.com/content/www/us/en/content-details/635255/intel-atom-x6000e-series-processor-and-intel-pentium-and-celeron-n-and-j-series-processors-for-internet-of-things-iot-applications-datasheet-volume-2-book-1-of-3.html (accessed on 17 December 2024).
  61. Google Coral Products. Available online: https://coral.ai/products/ (accessed on 17 December 2024).
  62. Raspberry Products. Available online: https://www.raspberrypi.com/ (accessed on 17 December 2024).
  63. STM32 Microcontroller Products. Available online: https://www.st.com/content/st_com/en.html (accessed on 17 December 2024).
  64. Xilinx Boards. Available online: https://www.xilinx.com/products/boards-and-kits.html (accessed on 17 December 2024).
  65. Wang, N.; Matthaiou, M.; Nikolopoulos, D.S.; Varghese, B. DYVERSE: DYnamic VERtical Scaling in multi-tenant Edge environments. Future Gener. Comput. Syst. 2020, 108, 598–612. [Google Scholar] [CrossRef]
  66. Beagle Board products. Available online: https://www.beagleboard.org/ (accessed on 17 December 2024).
  67. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single Shot Multibox Detector. In Proceedings of the European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands, 11–14 October 2016. [Google Scholar]
  68. COCO Person Dataset. 2017. Available online: https://cocodataset.org/#home (accessed on 17 December 2024).
  69. STMicroelectronics–Object Detection, STM32 Model Zoo. Available online: https://github.com/STMicroelectronics/stm32ai-modelzoo (accessed on 17 December 2024).
  70. QEngineering, Jetson Nano and Raspberry Pi 4 Metrics. Available online: https://github.com/Qengineering (accessed on 17 December 2024).
  71. Choe, C.; Choe, M.; Jung, S. Run Your 3D Object Detector on NVIDIA Jetson Platforms: A Benchmark Analysis. Sensors 2023, 23, 4005. [Google Scholar] [CrossRef] [PubMed]
  72. ImageNet. Available online: https://www.image-net.org/ (accessed on 17 December 2024).
  73. PASCAL VOC Dataset. 2012. Available online: https://datasets.activeloop.ai/docs/ml/datasets/pascal-voc-2012-dataset/ (accessed on 17 December 2024).
  74. Rausand, M.; Hoyland, A. System Reliability Theory: Models, Statistical Methods, and Applications. Wiley-Interscience; John Wiley & Sons Inc: New York, NY, USA, 2004. [Google Scholar]
  75. Fei-Fei, L.; Deng, J.; Russakovksy, O.; Berg, A.; Li, K. Common Evaluation Metrics Used in Computer Vision Tasks, Documented in ImageNet Benchmark Datasets. 2009. Available online: https://image-net.org/about.php (accessed on 17 December 2024).
  76. Lin, T.; Patterson, G.; Ronchi, M.; Maire, M.; Belongie, S.; Bourdev, L.; Girschik, R.; Hays, J.; Perona, P.; Ramanan, D.; et al. Common Evaluation Metrics Used in Computer Vision Tasks, Documented in COCO. 2014. Available online: https://www.picsellia.com/post/coco-evaluation-metrics-explained (accessed on 17 December 2024).
  77. Goodfellow, I.; Shlens, J.; Szegedy, C. Explaining and Harnessing Adversarial Examples. arXiv 2014, arXiv:1412.6572. [Google Scholar]
  78. Mandry, A.; Makelov, A.; Schmidt, L.; Tsipras, D.; Vladu, A. Towards Deep Learning Models Resistant to Adversarial Attacks. Stat 2017, 1050, 1–27. Available online: https://arxiv.org/abs/1706.06083 (accessed on 17 December 2024).
  79. Jetsons Benchmark. 2024. Available online: https://developer.nvidia.com/embedded/jetson-benchmarks (accessed on 17 December 2024).
  80. MLPerf Inference: Edge. 2024. Available online: https://mlcommons.org/benchmarks/inference-edge/ (accessed on 17 December 2024).
  81. Pan, S.J.; Yang, Q. A Survey on Transfer Learning. IEEE Trans. Knowl. Data Eng. 2010, 22, 1345–1359. [Google Scholar] [CrossRef]
  82. Reliability Prediction of Electronic Equipment. MIL-HDBK-217F. 1991. Available online: https://www.navsea.navy.mil/Portals/103/Documents/NSWC_Crane/SD-18/Test%20Methods/MILHDBK217.pdf (accessed on 17 December 2024).
  83. International Electrotechnical Commission. Environmental Testing-Part 2: Tests (IEC 60068-2 Series); IEC: Geneva, Switzerland, 2013. [Google Scholar]
  84. Laprie, J.C. Dependability: Basic Concepts and Terminology in English, French, German, Italian and Japanese; Dependable Computing and Fault-Tolerant Systems; Springer: Vienna, Austria, 1991; Volume 5. [Google Scholar]
  85. Bloomfield, R.; Rushby, J. Assurance of AI Systems from a Dependability Perspective. arXiv 2024, arXiv:2407.13948. [Google Scholar]
Figure 1. ISO/IEC 25010 quality attributes and sub-attributes.
Figure 2. Meta-model diagram for Edge AI operationalized quality (associated with and extended from [56]).
Figure 3. Process flow to adopt the Edge AI operationalized quality model.
Figure 4. Case study design.
Figure 5. Validation and test results for the pre-trained ST SSD MobileNet v1 0.25 model.
Figure 6. Evaluation results for the PASCAL VOC-trained ST SSD model.
Table 1. Stakeholders and goal specification.
Scenario ID, NameStakeholderGoalPriority, ImpactTrade-Offs
Scenario I:
Low-Cost, Low-Power MAV for Simple Object Detection
Small-scale Users: Organizations or individuals using MAVs in controlled environments (e.g., farms, industrial facilities).Build a lightweight, low-cost, reliable MAV capable of performing simple object detection tasks, such as detecting a person or large static obstacles (e.g., cars, buses) in controlled environments.Lightweight design, cost-efficiency, and low power consumption are critical to enhance agility and extend operational time in constrained environments.The MAV does not need advanced AI processing capabilities, but must be agile, power-efficient, reliable, and operate in environments where only basic detection algorithms are sufficient.
Scenario II:
Mid-Level UAV for AI-Based Obstacle Detection
Urban Surveillance Operators: Organizations deploying UAVs for traffic monitoring or public safety, requiring detection of moving obstacles.Build a UAV with moderate performance capable of real-time AI-based obstacle detection for dynamic objects like pedestrians, cars, and moving obstacles.Balanced performance, energy efficiency, and cost. The UAV must perform moderately complex tasks like running machine learning models (YOLO, SSD) while keeping power consumption in check to maintain flight time.The UAV should handle real-time object detection without requiring high-end AI processing. Balancing power consumption with AI performance is critical for efficiency in tasks like monitoring or surveillance.
Scenario III:
Cost-Effective, General-Purpose UAV for Outdoor Object Detection
Environmental Monitoring Operators: Organizations using UAVs to inspect areas such as forests, agricultural land, or urban landscapes for environmental changes, wildlife monitoring, or security.A cost-efficient UAV that performs general-purpose outdoor object detection.Cost-efficiency is the highest priority. Flash memory and RAM are next in importance to handle basic processing and data storage needs.
The system should reflect a balance between computational power and cost, being capable of processing moderate amounts of data from standard-resolution cameras and sensors while maintaining low power consumption.
Scenario IV:
High-Performance UAV for Real-Time Advanced Object Detection and Tracking
Search and Rescue Teams: Organizations using UAVs for lifesaving, where precise object tracking (e.g., people, vehicles) is critical.A high-performance UAV for real-time advanced object detection and tracking under challenging conditions.High performance and precision are paramount. The UAV must support advanced AI models (e.g., Faster R-CNN, Complex-YOLOV3V) for real-time, accurate detection and tracking in complex scenarios.Processing power takes priority, as the UAV needs to perform heavy-duty AI processing for real-time tracking. Cost and power consumption is secondary as to allow for moderate-duration operations in critical applications.
Table 2. Quality specification for hardware and AI model aspects.
Columns: ISO 25010 Quality Attribute | ISO 25010 Quality Sub-Attribute | Related Evaluation Indicators/Metrics | Measurement Method of the Indicator/Metric

Scenario I (quality models: MAV_Hardware QM_1; MAV_Software QM_1)
Functional Suitability | Functional Correctness | AI Model: Accuracy (mean average precision, recall vs. precision across specified object categories) | Accuracy: STM32 firmware (STM32Cube.AI)
Performance Efficiency | Time Behavior | HW: Processing power; AI Model: Inference time (ms) | Processing power: product specification; Inference time: STM32 firmware
Performance Efficiency | Resource Utilization | HW: Average power consumption (W) | Average power consumption: product specification
Performance Efficiency | Capacity | HW: Flash memory, RAM capacity; AI Model: Required flash and RAM for the model | HW flash memory, RAM capacity: product specification; AI model's required flash and RAM: STM32CubeMX firmware
Reliability | Availability | HW: MTBF | MTBF: product specification
Portability | Adaptability | HW: Number of supported AI environments | Number of supported AI environments: product specification
Portability | Replaceability | HW: Cost | Cost: product specification
Usability | Operability | HW: Weight, size | Weight, size: product specification

Scenario II (quality models: UAV_Hardware QM_2; UAV_Software QM_2)
Functional Suitability | Functional Correctness | AI Model: Object detection precision, mean average precision, localization accuracy, precision, recall, Intersection over Union (IoU) | AI model metrics: benchmarking specifications
Performance Efficiency | Time Behavior | HW: Processing power; AI Model: Inference time (ms) | Processing power: product specification; Inference time: benchmarking specifications
Performance Efficiency | Resource Utilization | HW: Average power consumption (W) | Average power consumption: product specification
Performance Efficiency | Capacity | HW: Flash memory, RAM capacity | Flash memory, RAM capacity: product specification
Reliability | Availability | HW: MTBF | MTBF: product specification
Portability | Adaptability | HW: Number of supported AI environments | Number of supported AI environments: product specification
Portability | Replaceability | HW: Cost | Cost: product specification
Usability | Operability | HW: Weight, size | Weight, size: product specification

Scenario III (quality models: UAV_Hardware QM_3; UAV_Software QM_3)
Functional Suitability | Functional Correctness | AI Model: Object detection accuracy (mean average precision), recall vs. precision across specified categories | AI model metrics: benchmarking specifications
Performance Efficiency | Time Behavior | HW: Processing power; AI Model: Inference time (ms) | Processing power: product specification; Inference time: benchmarking specifications
Performance Efficiency | Resource Utilization | HW: Power consumption (W) | Power consumption: product specification
Performance Efficiency | Capacity | HW: Flash memory, RAM | Flash memory, RAM: product specification
Reliability | Availability | HW: MTBF | MTBF: product specification
Portability | Adaptability | HW: Number of supported AI frameworks | Number of supported AI frameworks: product specification
Portability | Replaceability | HW: Cost | Cost: product specification
Usability | Operability | HW: Weight, size | Weight, size: product specification

Scenario IV (quality models: UAV_Hardware QM_4; UAV_Software QM_4)
Functional Suitability | Functional Correctness | AI Model: Mean average precision | Mean average precision: benchmarking specifications
Performance Efficiency | Time Behavior | AI Model: FPS | FPS: benchmarking specifications
Performance Efficiency | Resource Utilization | AI Model: Average power consumption | Average power consumption: benchmarking specifications
Performance Efficiency | Capacity | HW: Flash memory, RAM capacity | Flash memory, RAM capacity: product specification
Reliability | Availability | HW: MTBF | MTBF: product specification
Portability | Adaptability | HW: Number of supported AI environments | Number of supported AI environments: product specification
Portability | Replaceability | HW: Cost | Cost: product specification
Usability | Operability | HW: Weight, size | Weight, size: product specification
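For readers who want to reproduce the AI-model indicators in Table 2 before committing to target hardware, the sketch below shows one way to approximate two of them (inference time and flash footprint) with the TensorFlow Lite Python interpreter on a host machine. This is an illustrative approximation only: in the study these values come from the STM32Cube.AI/STM32CubeMX firmware reports, on-target figures will differ, and the model file name used here is hypothetical.

```python
# Illustrative sketch (not the paper's measurement procedure): approximating the
# AI-model indicators of Table 2 -- inference time and flash footprint -- with the
# TensorFlow Lite Python interpreter on a host machine. The model path is hypothetical.
import os
import time
import numpy as np
import tensorflow as tf

MODEL_PATH = "ssd_mobilenet_v1_025_256.tflite"  # hypothetical file name

interpreter = tf.lite.Interpreter(model_path=MODEL_PATH)
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()[0]
dummy_input = np.zeros(input_details["shape"], dtype=input_details["dtype"])

# Flash proxy: size of the .tflite file that would be placed in on-chip flash.
flash_bytes = os.path.getsize(MODEL_PATH)

# Inference-time indicator: average latency over repeated invocations.
interpreter.set_tensor(input_details["index"], dummy_input)
interpreter.invoke()  # warm-up run
runs = 50
start = time.perf_counter()
for _ in range(runs):
    interpreter.set_tensor(input_details["index"], dummy_input)
    interpreter.invoke()
latency_ms = (time.perf_counter() - start) / runs * 1000.0

print(f"Model flash footprint: {flash_bytes / 1024:.1f} KB")
print(f"Average inference time (host): {latency_ms:.2f} ms")
```

The RAM indicator (the activation/tensor-arena size) is best taken from the target-side tooling report rather than estimated on a host.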
Table 3. Candidate edge hardware products—specification *.
Product ID | Product Name | MTBF (Hours) | Processing Power | Supported AI Frameworks | Power Consumption (W) | Weight (g) | Size (mm) | Flash Memory | RAM | Cost (~)
EdgeHW_1 | NVIDIA Jetson Nano [59] | 1836 K+ | 472 GFLOPS, Quad-core ARM Cortex-A57 1.43 GHz | TensorFlow, PyTorch, Caffe, Darknet, MXNet | 5–10 | 174 | 100 × 80 × 29 | 16 GB eMMC | 4 GB | $100
EdgeHW_2 | NVIDIA Jetson Xavier NX [59] | 1634 K+ | 21 TOPS, 6-core NVIDIA Carmel ARM v8.2 | TensorFlow, PyTorch, MXNet, Caffe/Caffe2, Keras, Darknet, ONNXRuntime | 10–20 | 174 | 100 × 80 × 29 | 16 GB eMMC | 8 GB | $479
EdgeHW_3 | NVIDIA Jetson AGX Orin [59] | 1381 K+ | Up to 275 TOPS (Ampere GPU + ARM) | PyTorch, MXNet, Caffe/Caffe2, Keras, Darknet, ONNXRuntime | 15–75 | 306 | 105 × 87 × 40 | 64 GB eMMC | 64 GB | $2349
EdgeHW_4 | Intel Atom x6000E Series [60] | 400 K+ | 3.0 GHz (Cortex-A76) | TensorFlow, OpenVINO | 4.5–12 | 700 | 25 × 25 (system-on-module) | Up to 64 GB eMMC | Up to 8 GB | $460+
EdgeHW_5 | Google Coral Dev Board [61] | 50 K+ | 4 TOPS (Edge TPU) + Quad Cortex-A53 | TensorFlow Lite | 4 | 139 | 88 × 60 × 22 | 8 GB eMMC | 4 GB | $129
EdgeHW_6 | Raspberry Pi 4 [62] | 50 K+ | Quad-core Cortex-A72 1.8 GHz | TensorFlow, Keras, PyTorch, DarkNet (YOLO), Edge Impulse, MLKit, Scikit-learn | ~15 | 46 | 85.6 × 56.5 × 19.5 | MicroSD (ext.) | 8 GB | $35–$75
EdgeHW_7 | Raspberry Pi Zero 2 W [62] | 50 K+ | Quad-core Cortex-A53 1.0 GHz | TensorFlow, Keras, PyTorch, DarkNet (YOLO), Edge Impulse, MLKit, Scikit-learn | ~12.5 | 16 | 65 × 30 × 5 | MicroSD (ext.) | 512 MB | $15
EdgeHW_8 | Raspberry Pi 3 Model B+ [62] | 50 K+ | Quad-core Cortex-A53 1.4 GHz | TensorFlow, Keras, PyTorch, DarkNet (YOLO), Edge Impulse, MLKit, Scikit-learn | 12.5 | 100 | 85.6 × 56.5 × 17 | MicroSD (ext.) | 1 GB | $40
EdgeHW_9 | STM32F4 Series (STM32F429ZI) [63] | 1000 K+ | ARM Cortex-M4 @ 180 MHz | TensorFlow, Keras, PyTorch, and Scikit-learn via ONNX | ~0.1 | <10 | 20 × 20 (LQFP 144) | Up to 2 MB | 256 KB | $5–$15
EdgeHW_10 | STM32F7 Series (STM32F746ZG) [63] | 1000 K+ | ARM Cortex-M7 @ 216 MHz | TensorFlow, Keras, PyTorch, and Scikit-learn via ONNX | ~0.2 | <10 | 14 × 14 (LQFP 100) | Up to 1 MB | 320 KB | $10–$20
EdgeHW_11 | STM32H7 Series (STM32H747XI) [63] | 1000 K+ | ARM Cortex-M7 @ 480 MHz | TensorFlow, Keras, PyTorch, and Scikit-learn via ONNX | ~0.3 | <10 | 24 × 24 (LQFP 176) | 2 MB | 1 MB | $15
EdgeHW_12 | STM32MP1 Series (STM32MP153C) [63] | 1000 K+ | Dual-core Cortex-A7 + Cortex-M4 | TensorFlow, Keras, PyTorch, and Scikit-learn via ONNX | 2 | <10 | 14 × 14 (LQFP 100) | 512 MB | 8 GB | $15
EdgeHW_13 | Xilinx Zynq ZCU102 [64] | 1000 K+ | Quad-core Cortex-A53 + FPGA | Caffe, PyTorch, TensorFlow [65] | ~20 | 300 | 200 × 160 × 20 | 512 MB Flash | 4 GB | $3250
EdgeHW_14 | Xilinx Zynq ZCU104 [64] | 1000 K+ | Quad-core Cortex-A53 + FPGA | Caffe, PyTorch, TensorFlow [65] | ~25 | 400 | 200 × 160 × 20 | 512 MB Flash | 4 GB | $1700
EdgeHW_15 | BeagleBone Black [66] | 50 K+ | ARM Cortex-A8 1 GHz | TensorFlow Lite, PyTorch, OpenVINO, Darknet (YOLO), Keras, Scikit-learn, OpenCV | 2–3 | 40 | 90 × 60 × 22 | 4 GB eMMC | 512 MB | $55
EdgeHW_16 | BeagleBone AI-64 [66] | 50 K+ | Quad-core Cortex-A72 2.0 GHz | TensorFlow Lite, PyTorch, OpenVINO, Darknet (YOLO), Keras, Scikit-learn, OpenCV, TIDL, Edge Impulse | 7–10 | 60 | 100 × 60 × 20 | 16 GB eMMC | 4 GB | $145
* Values for the criteria were retrieved from the products’ datasheets and official web pages; metrics that could not be obtained there were retrieved from various manufacturers’ pages.
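Once the candidate specifications in Table 3 are collected, they can be scored against a scenario’s quality model with an MCDA technique. The snippet below is a minimal weighted-sum sketch, not the paper’s exact MCDA procedure: the criteria subset, the weights, and the simplified values are illustrative assumptions chosen to mimic a Scenario I-like profile in which power consumption, cost, and weight dominate.

```python
# Illustrative MCDA sketch (assumed weights, simple weighted-sum model): ranking a few
# Table 3 candidates. "benefit" criteria are maximized, "cost" criteria minimized,
# using min-max normalization. Values are simplified from Table 3.

# criterion: (weight, "benefit" or "cost")
criteria = {
    "mtbf_khours": (0.15, "benefit"),
    "power_w":     (0.30, "cost"),
    "cost_usd":    (0.25, "cost"),
    "weight_g":    (0.20, "cost"),
    "frameworks":  (0.10, "benefit"),
}

candidates = {
    "EdgeHW_1 (Jetson Nano)":    {"mtbf_khours": 1836, "power_w": 7.5, "cost_usd": 100, "weight_g": 174, "frameworks": 5},
    "EdgeHW_6 (Raspberry Pi 4)": {"mtbf_khours": 50,   "power_w": 15,  "cost_usd": 55,  "weight_g": 46,  "frameworks": 7},
    "EdgeHW_11 (STM32H7)":       {"mtbf_khours": 1000, "power_w": 0.3, "cost_usd": 15,  "weight_g": 10,  "frameworks": 4},
}

def score(cand):
    total = 0.0
    for crit, (weight, kind) in criteria.items():
        values = [c[crit] for c in candidates.values()]
        lo, hi = min(values), max(values)
        norm = 0.5 if hi == lo else (cand[crit] - lo) / (hi - lo)
        if kind == "cost":
            norm = 1.0 - norm  # lower raw values are better for cost-type criteria
        total += weight * norm
    return total

for name, cand in sorted(candidates.items(), key=lambda kv: score(kv[1]), reverse=True):
    print(f"{name}: {score(cand):.3f}")
```

With this particular (assumed) weighting, the low-power, low-cost STM32H7 candidate ranks first, which is consistent with the Scenario I recommendation in Table 4; different weights would of course favor the higher-performance boards.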
Table 4. Suggested edge HW option/s for scenarios.
Scenario | Edge Hardware ID | Suggested HW Option
Scenario I | EdgeHW_11 | STM32H7 Series Microcontrollers
Scenario II | EdgeHW_1 | NVIDIA Jetson Nano
Scenario III | EdgeHW_6 | Raspberry Pi 4
Scenario IV | EdgeHW_2, EdgeHW_3 | NVIDIA Jetson Xavier NX, NVIDIA Jetson AGX Orin
Table 5. Suggested AI model options.
Scenario | Selected HW | Suggested AI Model Option for Deployment
Scenario I | STM32H7 Series Microcontrollers | EdgeSW_Scenario1_4, SSD MobileNet v1 0.25 (256 × 256)
Scenario II | NVIDIA Jetson Nano | EdgeSW_Scenario2_3, YOLOFastestV2
Scenario III | Raspberry Pi 4 | EdgeSW_Scenario3_3, YOLOFastestV2
Scenario IV | NVIDIA Jetson Xavier NX | EdgeSW_Scenario4_5, Complex-YOLOv3 (Tiny)
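For Scenario I, the suggested model must fit within the STM32H747XI’s 2 MB flash and 1 MB RAM budget (Table 3), which in practice means converting the network to TensorFlow Lite with full-integer post-training quantization before importing it into the STM32 toolchain. The sketch below shows that conversion step under stated assumptions: the SavedModel directory name and the random representative dataset are placeholders; a real workflow would feed a few hundred preprocessed 256 × 256 training images instead.

```python
# Illustrative sketch (the SavedModel path and representative-data generator are
# hypothetical): full-integer post-training quantization with the TensorFlow Lite
# converter, the usual step for fitting a detection model into microcontroller-class
# budgets such as the STM32H747XI's 2 MB flash / 1 MB RAM.
import numpy as np
import tensorflow as tf

def representative_dataset():
    # In practice, yield a few hundred preprocessed training images (1 x 256 x 256 x 3 here).
    for _ in range(100):
        yield [np.random.rand(1, 256, 256, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("ssd_mobilenet_v1_025_saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
with open("ssd_mobilenet_v1_025_int8.tflite", "wb") as f:
    f.write(tflite_model)

print(f"Quantized model size: {len(tflite_model) / 1024:.1f} KB")
```

The resulting .tflite file size gives a first check against the flash budget; the on-target RAM requirement is then confirmed with the STM32 tooling, as specified in Table 2.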