Article

DNN Adaptive Partitioning Strategy for Heterogeneous Online Inspection Systems of Substations

School of Electrical and Automation Engineering, East China Jiaotong University, Nanchang 330013, China
*
Author to whom correspondence should be addressed.
Electronics 2024, 13(17), 3383; https://doi.org/10.3390/electronics13173383
Submission received: 4 July 2024 / Revised: 15 August 2024 / Accepted: 21 August 2024 / Published: 26 August 2024

Abstract

With the explosive growth of power edge equipment and the continuous improvement in power inspection performance, resource-limited substations and terminal equipment such as drones cannot meet strict delay and energy consumption requirements. This paper proposes an adaptive partitioning strategy for heterogeneous substation inspection systems. First, a layer delay prediction model and a layer energy consumption prediction model are established on each heterogeneous node, and the nonlinear characteristics related to delay and energy consumption are learned. On this basis, a deep neural network (DNN) hybrid partitioning strategy is proposed: the DNN task is divided for synchronous cooperative inference between terminal devices and multiple heterogeneous edge nodes. The experimental results show that the mean absolute percentage error (MAPE) of the delay model was reduced by 31.49% on average. On drones and mobile edge nodes, the MAPE of the energy consumption model was reduced by 21.92% on average, and the DNN end-to-end latency was reduced by 31.48%. The total cost of the system was reduced and the efficiency of UAV inspection was improved.

1. Introduction

With the rapid development of the power industry, power inspection is particularly important in ensuring the safe and healthy operation of local transmission and distribution networks. However, due to the large scale of transmission lines, the rapid growth of data volumes, the performance limitations of drones, and other factors, power inspection is becoming increasingly difficult [1,2].
The emergence of edge computing technology has promoted the development of online inspection technology. The target detection models used in power inspection are usually deep neural networks, which are storage- and computation-intensive [3,4]. Li et al. [5,6] proposed a non-destructive, non-contact, and automated visual inspection system using UAVs and edge-cloud computing to detect defects in large-scale photovoltaic power plants. Xu et al. [7] developed an unmanned aerial system with an advanced embedded processor and a binocular vision sensor for the automatic inspection of transmission lines. Zhou et al. [8] proposed a joint optimization of smart grid tasks and buffers based on edge computing and used a gradient descent algorithm to allocate computing resources, reducing the time complexity of task processing. Wu et al. [9] proposed an end-to-end framework of cloud-based enhancement and edge-based detection, which utilizes cloud-edge collaboration to segment object detection tasks, significantly improving drone inspection performance and reducing task processing latency. Cheng et al. [10] proposed a drone object detection algorithm, the Fast-YOLOv4 model, and improved it by introducing the lightweight network MobileNetV3. The improved lightweight model was deployed on the drone side, where image feature extraction was performed first and the feature maps were then uploaded to the cloud, reducing the amount of uploaded data and saving drone power while meeting real-time detection requirements. The above methods improve the real-time processing of detection tasks, thereby enhancing the efficiency of power inspection and reducing the computational pressure on unmanned aerial vehicles. However, reasonably allocating UAV inspection and detection tasks under complex network states, limited UAV battery power, and other complex conditions remains a difficulty and a focus of current power inspection research.
Cloud-edge collaboration realizes the centralized processing and analysis of power inspection data by combining cloud computing and edge computing [11]. It makes full use of the powerful computing and storage capabilities of the cloud while, through edge computing, placing computing resources and data processing capabilities as close to the inspection site as possible, improving the real-time performance and response speed of data processing. Shuang et al. [12] proposed a cloud-edge collaborative intelligence method for object detection, which deploys a low-cost insulator string detection model on the unmanned aerial vehicle (UAV) end, reducing the computational load of UAV intelligent computing. According to Ren et al. [13], cloud-edge collaboration realizes the centralized processing and analysis of power inspection data by combining cloud computing and edge computing, making full use of the computing and storage capabilities of the cloud. Sharma et al. [14,15] proposed a novel cloud-edge collaboration framework that uses regression models to predict and select the optimal partition point for edge-cloud collaboration, balancing energy consumption against latency and accelerating data analysis. Chen et al. [16,17,18] proposed a distributed real-time object detection framework based on edge-cloud collaboration for intelligent video surveillance, built on the optimal split between cloud computing tasks and edge computing tasks. In summary, although existing collaborative technologies improve the real-time performance and response speed of data processing, the computing power of edge devices is limited: complex computing tasks may not be processed in time, increasing latency and affecting the real-time performance and accuracy of inspections.
DNN partitioning is one of the key technologies in collaborative inference; it refers to reasonably partitioning object detection tasks and offloading them to edge servers or the cloud [19,20]. Wei et al. [21,22] proposed a load balancing algorithm based on a weighted bipartite graph for edge computing (LBA-EC), which evaluates the energy consumption of edge devices and reduces user delay by making full use of network edge resources. Sun et al. [23] proposed an optimal joint offloading scheme based on cloud-edge collaborative resource occupancy prediction, using a Gated Recurrent Unit (GRU) to predict edge resource occupancy, and proposed an algorithm with linear time complexity to reduce total consumption time. Nayyer et al. [24,25,26] proposed an effective adaptive-learning offloading strategy based on distributed reinforcement learning and multi-classification for resource-optimizing load balancing; it can solve the load balancing problem in edge computing according to user preferences so as to reduce the task processing delay and energy consumption of intelligent inspection swarms. Xu et al. [27] observed that most existing collaborative acceleration schemes target the partitioning of a single DNN inference task and cannot quickly make partition decisions for a group of concurrent inference tasks; they proposed a collaborative inference acceleration scheme that integrates DNN partitioning and task offloading to achieve the efficient offloading of partitioned tasks, with high convergence performance and task success rates. Liang et al. [28] considered the impact of complex and variable network states on DNN partitioning performance and designed a dynamic adaptive DNN surgery (DADS) scheme, which limits the size of data transmission while searching for the optimal DNN partitioning strategy, optimizing latency and throughput. In summary, most existing models use linear regression to predict per-layer computation delay and energy consumption in DNNs, with insufficient consideration of nonlinear features, which easily leads to unreasonable DNN partitioning and reduced inspection efficiency.
In summary, this article proposes a DNN adaptive partitioning strategy for substation inspection. Its main contributions are as follows:
(1)
In response to the limited computing power of edge devices, which cannot handle complex computing tasks in a timely manner and thus increases latency and delays data processing and analysis, we constructed layered task computation latency and energy consumption prediction models. When training the models, we considered characteristics such as drone battery level, network status, and task priority. While reducing the model size and prediction time, we ensured that the models maintained high prediction accuracy and improved the real-time performance of data processing.
(2)
In response to the insufficient consideration of nonlinear features in existing partitioning methods, which leads to unreasonable model partitioning, the optimal partitioning method is selected based on the resource utilization, communication status, and task delay and energy consumption weights of each edge node in the substation. Then, based on the delay and energy consumption predictions, the optimal partitioning nodes of the DNN are inferred, and the DNN is adaptively partitioned across terminal devices and multiple edge nodes for collaborative processing, fully utilizing the computing power of heterogeneous nodes and accelerating task processing.

2. DNN Adaptive Hybrid Partitioning Architecture for a Heterogeneous System of Unmanned Aerial Vehicle Inspection in Substations

The DNN adaptive partitioning architecture is shown in Figure 1 and includes three stages: the real-time delay and energy consumption prediction stage, the partition node determination stage, and the collaborative inference stage.
In the prediction stage, the delay prediction model and energy consumption prediction model are trained to estimate the per-layer inference delay and energy consumption of the DNN on the various heterogeneous nodes. These estimates are the key reference for DNN adaptive hybrid partitioning.
In the partition node determination stage, the communication, computing, and cache (3C) resource utilization of the heterogeneous edge nodes, the dynamic network conditions, and the DNN delay and energy consumption weights are comprehensively considered to adaptively form the partition strategy. The DAG-type DNN computation task is divided for synchronous collaborative inference across the terminal and multiple edge nodes.
In the collaborative inference stage, data-intensive blocks are retained for execution on the drone terminal to reduce data transmission latency. The subsequent computation-intensive blocks are offloaded to the various heterogeneous edge nodes for synchronous collaborative inference, accelerating inference.

3. Prediction Stage

Measuring the delay of each layer in isolation and summing the results deviates significantly from the actual end-to-end delay, which is not conducive to establishing and training the prediction models. Therefore, this article obtained more accurate detection task computation delay and energy consumption data on the different edge nodes through continuous inference and instrumented (buried-point) measurement and used them as the data and input parameters for model establishment and training. This stage comprises the following tasks: calculating computation latency and energy consumption, analyzing data transmission latency and energy consumption, and training the inference latency and energy consumption prediction models. The specific steps are as follows:
(1)
Calculate latency and energy consumption: this task aims to predict the computation latency and energy consumption of DNN layers on heterogeneous nodes. This article considers four types of deep neural network layers: the activation layer (ACT), convolutional layer (CONV), pooling layer (POOL), and fully connected layer (FC). The computation latency and energy consumption are analyzed against the DNN layer type, the heterogeneous node's computing capability, and other factors.
(2)
Analyze data transmission latency and energy consumption: this task aims to predict the data transmission latency and energy consumption of the neural network tasks allocated among heterogeneous nodes. This article considers the input data size, output data size, and network transmission rate of the different types of DNN layers, as well as the allocation scheme for synchronous collaborative processing, and additionally calculates the data exchange latency and energy consumption between heterogeneous nodes.
(3)
Train the inference delay and energy consumption prediction models: this task integrates the DNN computation delay and energy consumption with the data transmission delay and energy consumption and constructs prediction models with neural networks to better learn the nonlinear features related to delay and energy consumption, such as task queuing, network status, data transmission distance, drone propulsion energy consumption, and task priority (a training sketch follows this list).
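To make the prediction stage concrete, the following is a minimal sketch, not the authors' implementation: a per-layer latency regressor trained on buried-point measurements, with the feature layout and the scikit-learn MLP stack chosen here as assumptions that illustrate one plausible way to capture the nonlinear features listed above. An analogous model would be trained for energy consumption.

```python
# Hedged sketch of a layer latency prediction model (assumed stack: scikit-learn).
# Feature layout per measured layer execution (illustrative, not from the paper):
# [ACT, CONV, POOL, FC one-hot (4), input_MB, output_MB, MFLOPs,
#  node_GHz, link_Mbps, queue_len, task_priority]
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.random((2000, 11))        # placeholder for buried-point measurements
y_latency = rng.random(2000)      # measured per-layer latency (ms), placeholder

latency_model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000, random_state=0),
)
latency_model.fit(X, y_latency)   # one such model is trained per node type
print(latency_model.predict(X[:3]))
```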

3.1. Modeling Inference Delay

This article categorizes the nodes of the substation heterogeneous inspection system into three classes by computing power: terminals such as unmanned aerial vehicles, heterogeneous edge nodes, and local servers. V = {V0, V1, V2, ..., VN} represents the topology of a DAG-style DNN; each vertex consists of a series of layers, and a V0 layer is added at the beginning. When the partitioning node lies between V0 and V1, the drone only collects and transmits data, and all computing tasks are processed by the edge server. G = {G1, G2, G3} divides the DNN into three parts, and S = {S1, S2, S3} represents the collaboration strategy, where Si ∈ {1, 2, 3}: Si = 1 represents execution on the drone, Si = 2 represents offloading the block to the local server, and Si = 3 represents offloading the block to an edge node.
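This notation can be pictured with a small data structure; the sketch below is illustrative only (the vertex fields and constants are assumptions, not the paper's code):

```python
# DAG-style DNN notation from Section 3.1: vertices V0..VN, each a series of
# layers, plus a per-vertex placement S_i in {1: drone, 2: local server, 3: edge}.
from dataclasses import dataclass, field

@dataclass
class Vertex:
    layers: list = field(default_factory=list)   # DNN layers grouped in this vertex
    succ: list = field(default_factory=list)     # indices of successor vertices

UAV, SERVER, EDGE = 1, 2, 3

V = [Vertex() for _ in range(8)]   # V[0] is the added source layer V0
S = [UAV] * len(V)                 # collaboration strategy, one S_i per vertex
S[4:] = [SERVER] * (len(V) - 4)    # e.g., offload everything from V4 onward
```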
The inference delay under adaptive DNN hybrid partitioning is composed of the drone terminal's local inference delay, the local server's collaborative inference delay, and the edge nodes' collaborative inference delay. However, because the edge nodes and the local server infer synchronously and the edge nodes' inference delay is kept smaller than the local server's during partitioning, the total system delay can be expressed as follows:
$t_{\text{inference}} = t_{\text{inference}}^{\text{end}} + t_{\text{inference}}^{\text{server}}$
Drone local computation delay: when the i-th DNN vertex is assigned to execute on the terminal device (i.e., Si = 1), the local computation delay refers to the local execution time of all layers in that vertex, defined as follows:
$t_{\text{inference}}^{\text{end}} = \sum_{i=1, S_i=1}^{N} t_{\text{end}}(i)$
$t_{\text{end}}(i) = \sum_{j=1}^{n} t_{\text{end}}(i,j)$
In the formula, N represents the number of DAG-type DNN vertices in the set of vertices allocated by $S_i$, $t_{\text{end}}(i)$ represents the terminal execution time of the i-th vertex, and $t_{\text{end}}(i,j)$ represents the execution time of the j-th DNN layer within the i-th vertex.
Local server collaborative inference delay: when the i-th DNN vertex is offloaded to the local server for execution (i.e., Si = 2), the local server inference delay consists of two parts, the local server computation delay and the end-to-server data transmission delay, defined as follows:
$t_{\text{inference}}^{\text{server}} = \sum_{i=0, S_i=2}^{N} t_{\text{server}}(i) + \dfrac{G_{i,j}^{\text{input}}}{B_{\text{end-server}}^{u}}$
$t_{\text{server}}(i) = \sum_{j=1}^{n} t_{\text{server}}(i,j)$
In the formula, $t_{\text{server}}(i)$ represents the local server's execution time for the i-th vertex, $B_{\text{end-server}}^{u}$ represents the uplink bandwidth (Mbps) between terminals such as drones and the local server, and $G_{i,j}^{\text{input}}$ represents the input data size of the j-th layer within the i-th vertex.
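Putting these delay terms together, a minimal sketch of the total-delay computation for a given strategy S might look as follows; the per-layer delay tables are assumed to come from the prediction models, and the 1/2 placement codes follow the S_i convention above:

```python
# Sketch of the delay model above: end-to-end delay = drone-local delay
# + server delay + uplink transmission of the cut-point tensor.
def total_delay(S, t_end, t_server, g_input_mb, b_up_mbps):
    """S[v] in {1: drone, 2: server}; t_end[v][j]/t_server[v][j]: predicted
    per-layer delays (s); g_input_mb: data at the cut; b_up_mbps: uplink."""
    t_local = sum(sum(layers) for v, layers in enumerate(t_end) if S[v] == 1)
    t_remote = sum(sum(layers) for v, layers in enumerate(t_server) if S[v] == 2)
    t_uplink = g_input_mb * 8.0 / b_up_mbps   # MB -> Mbit, then divide by Mbps
    return t_local + t_remote + t_uplink

# Toy usage: first two vertices on the drone, last two on the local server.
S = [1, 1, 2, 2]
t_end = [[0.010, 0.020], [0.030], [0.050], [0.040]]
t_server = [[0.002, 0.004], [0.006], [0.010], [0.008]]
print(total_delay(S, t_end, t_server, g_input_mb=1.5, b_up_mbps=50.0))
```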

3.2. Modeling Inference Energy Consumption

Because the amount of data exchanged after the DNN task is partitioned is small and power resources in the substation are abundant, the energy consumption of data transmission between edge nodes is ignored. The inference energy consumption of the DNN blocks includes two parts: the drone's energy consumption (computation, flight, and data transmission) and the edge nodes' collaborative computation energy consumption [29,30]. Therefore, the total system energy consumption can be expressed as follows:
$e_{\text{inference}} = E_{\text{inference}}^{\text{UAV}} + e_{\text{inference}}^{\text{edge}}$
$E_{\text{inference}}^{\text{UAV}} = e_{\text{inference}}^{\text{UAV}} + e_{\text{fly}} + e_{\text{trans}}$
(1)
Drone local computation energy consumption: when the i-th DNN vertex is assigned to the drone for execution (i.e., Si = 1), the energy the drone consumes for all layers in that vertex is defined as follows:
$e_{\text{inference}}^{\text{UAV}} = \sum_{i=0, S_i=1}^{N} e_{\text{UAV}}(i)$
$e_{\text{UAV}}(i) = \sum_{j=1}^{n} e_{\text{UAV}}(i,j)$
In the formula, N represents the number of DAG-type DNN vertices in the set of vertices assigned by $S_i$, $e_{\text{UAV}}(i)$ represents the drone's computation energy consumption for the i-th vertex, and $e_{\text{UAV}}(i,j)$ represents the computation energy consumption of the j-th DNN layer within the i-th vertex.
(2)
Drone data transmission energy consumption: when the computation after the i-th vertex is allocated to the local server, energy is consumed transmitting the task to the designated node. The channel power gain from the drone to the local server follows a free-space path loss model [30].
$h(n) = \beta_0 d^{-2}(n) = \dfrac{\beta_0}{X^2(n) + Y^2(n) + H^2(n)}$
In the formula, β0 represents the channel power gain at the reference distance d = 1 m, d(n) is the distance between the drone and the local server, and X(n), Y(n), and H(n) denote the drone's horizontal coordinates and flight height relative to the server.
Therefore, the expression for the energy consumption model of drone data transmission is as follows:
$e_{\text{trans}} = \tau \left( 2^{G_i^{\text{input}} / (\tau B_{\text{end-server}}^{u})} - 1 \right) \dfrac{\sigma^2 + \sum_{i=1}^{I} P_{\text{UAV}} h(n)}{h(n)}$
In the formula, $G_i^{\text{input}}$ represents the amount of uploaded data, $\tau$ represents the time slot length, $\sigma^2$ represents the power of the additive white Gaussian noise at the receiver, $P_{\text{UAV}}$ represents the transmission power of the drone, and $h(n)$ represents the channel power gain.
(3)
Energy consumption for drone linear flight [31]:
$e_{\text{fly}} = \dfrac{d}{v} \left( \dfrac{c_1}{v^2} + c_2 v^3 \right)$
$c_1 = \dfrac{2(mg)^2}{(\pi e_1 A) \rho S}$
$c_2 = \dfrac{1}{2} \rho C_{df} S$
In the formula, d represents the drone's straight-line flight distance, v represents the drone's horizontal flight speed, and c1 and c2 are fixed coefficients, where g is the local gravitational acceleration, m is the drone's mass, e1 is the drone's wingspan efficiency factor, A is the wing aspect ratio, S is the wing area, $C_{df}$ is the zero-lift drag coefficient, and ρ is the air density.
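For concreteness, the flight and transmission energy terms can be coded directly from the equations above; all numerical parameter values in this sketch are assumptions for illustration, not measured drone parameters:

```python
import math

def flight_energy(d, v, m=2.0, g=9.8, e1=0.9, A=6.0, S_wing=0.5,
                  rho=1.225, C_df=0.02):
    """e_fly = (d/v)(c1/v^2 + c2 v^3), with c1 and c2 as defined above."""
    c1 = 2.0 * (m * g) ** 2 / ((math.pi * e1 * A) * rho * S_wing)
    c2 = 0.5 * rho * C_df * S_wing
    return (d / v) * (c1 / v ** 2 + c2 * v ** 3)

def transmission_energy(g_bits, tau, bandwidth_hz, h_gain, sigma2,
                        interference=0.0):
    """Inverse-Shannon form of e_trans: transmit power needed to push
    g_bits within slot tau, multiplied by the slot length."""
    rate = g_bits / tau                         # required bit rate (bit/s)
    p_tx = (2.0 ** (rate / bandwidth_hz) - 1.0) * (sigma2 + interference) / h_gain
    return tau * p_tx

print(flight_energy(d=100.0, v=10.0))                     # J, toy numbers
print(transmission_energy(8e6, 0.5, 1e6, 1e-6, 1e-10))    # J, toy numbers
```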

3.3. Problem Description

This article defines a total system cost comprising a latency cost and an energy consumption cost. Because drone terminals are constrained by battery power, they are assigned different energy consumption weights from heterogeneous edge nodes, which are not power-constrained. The partition point search problem is then transformed into an optimization problem whose purpose is to minimize the end-to-end system cost of the deep neural network in a heterogeneous substation inspection system where terminal power, network resources, and edge node computing resources change dynamically. Specifically, the optimization problem is defined as follows:
$s_c(i,t) = \arg\min_{\{x\}} \left\{ \alpha t_{\text{inference}} + \beta E_{\text{inference}}^{\text{UAV}} + \theta e_{\text{inference}}^{\text{edge}} \right\}$
$\text{s.t.} \quad C1{:}\ 0 \le \alpha \le 1,\ 0 \le \beta \le 1; \quad C2{:}\ t_{\text{inference}} \le t_{\max}; \quad C3{:}\ e_{\text{inference}} \le e_{\max}$
In the formula, C1 indicates that the preference for task computation delay is α and the preference for task computation energy consumption is 1 − α; when α is large, the task is more sensitive to latency. The preference for drone mission energy consumption is β, and the preference for heterogeneous edge node energy consumption is 1 − β. When 1 − α and β are large, the task is more sensitive to energy consumption, as is typical for power-constrained terminals such as drones, thereby extending the drone's usage time. The weight coefficients can be selected according to the drone's specific situation. C2 indicates that the total latency must not exceed the maximum latency accepted by the user, and C3 indicates that the total energy consumption must not exceed the maximum energy consumption accepted by the user.
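The objective and constraints translate directly into a small cost routine; the sketch below is an assumed encoding (infeasible partitions are simply priced at infinity so a search discards them):

```python
def system_cost(t_inf, e_uav, e_edge, alpha, beta, theta, t_max, e_max):
    """Weighted total cost of Section 3.3 with constraints C1-C3."""
    if not (0.0 <= alpha <= 1.0 and 0.0 <= beta <= 1.0):   # C1
        raise ValueError("alpha and beta must lie in [0, 1]")
    if t_inf > t_max:                                       # C2: latency cap
        return float("inf")
    if e_uav + e_edge > e_max:                              # C3: energy cap
        return float("inf")
    return alpha * t_inf + beta * e_uav + theta * e_edge

# Latency-sensitive task: large alpha; drone with ample battery: small beta.
print(system_cost(0.12, 5.0, 2.0, alpha=0.8, beta=0.2, theta=0.1,
                  t_max=0.5, e_max=50.0))
```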

4. Adaptive Partitioning Stage

The main tasks of this stage include analyzing task latency and energy consumption weights, determining partition methods and collaborative nodes, and searching for partition points.
  • Analyze task delay and energy consumption weights: this task aims to determine the task inference delay weight and energy consumption weight for the substation's different terminal devices, such as drones, surveillance cameras, and inspection robots, in their different states. Because cameras and similar devices are not constrained by power, their energy consumption weight for task processing is set relatively low. Devices such as drones and inspection robots are easily constrained by battery power and equipment temperature, so strictly setting their task inference delay and energy consumption weights is particularly important.
  • Determine the division method and collaborative edge nodes: This task aims to determine the DNN division method and collaborative reasoning nodes by monitoring the resource utilization and communication status between heterogeneous edge nodes.
  • Search for partition points: based on the predictions of the DNN inference delay and inference energy consumption models, search for the optimal partition points on the substation's heterogeneous inspection computing platform so as to minimize the weighted end-to-end inference delay and energy consumption of the DNN, adaptively dividing the DNN computation across terminal devices and heterogeneous nodes for sequential collaborative processing.

A Hybrid Partition Strategy Based on DAG-Type DNN

This article constructs a corresponding DAG model from the DNN model and, exploiting the parallelism and directed acyclic characteristics of the DAG, proposes a new partitioning strategy that determines the optimal partitioning point by enumeration.
For an existing DNN model, a corresponding DAG-style DNN is constructed from the dependencies between layers, and a V0 layer is added at the beginning. When the partition node lies between V0 and V1, the terminal only collects and transmits data, and all computing tasks are processed by the edge server. A new labeling scheme marks the vertices: starting from the source, vertices are labeled sequentially in the order of the links and of the vertices on each link, as shown in Figure 2.
The deep neural network hybrid partitioning method consists of two steps. The first step assumes no participating collaborative edge nodes and divides the inference into terminal and local server collaborative inference. The drone's cost in the system is $s_{c,1} = \alpha t_{\text{inference}}^{\text{UAV}} + \beta E_{\text{inference}}^{\text{UAV}}$, and the local server's cost is $s_{c,2} = \alpha t_{\text{inference}}^{\text{server}}$; therefore, the total system cost is $s_{\text{total}} = s_{c,1} + s_{c,2}$. Finding the optimal first partition point of the DNN amounts to solving $\min s_{\text{total}}$. For battery-limited terminals such as unmanned aerial vehicles, the first partition node can be searched within a vertex; that is, the neural network layers of a vertex can themselves be divided.
The problem of solving $\min s_{\text{total}}$ is expressed as an enumeration optimization problem, using (i, j) to denote the optimal first partition point as the j-th layer in the i-th vertex. Through continuous iterative calculation, the first partition point minimizing the total system cost is found. After multiple iterations, the observed range of the first partition node is recorded, the value ranges of i and j are reset, and the global enumeration is replaced by a local enumeration search for the optimal partition point, reducing the time complexity of the enumeration method.
Step 2: first, determine whether mobile edge nodes can participate in collaborative computing; the DNN portion offloaded to the local server is then further divided for synchronous collaborative inference between the server and the mobile edge nodes. At this point, the server's cost in the system is $s_{c,3} = \alpha t_{\text{inference},2}^{\text{server}}$, and the mobile edge nodes' cost is $s_{c,4} = \beta e_{\text{inference},2}^{\text{edge}}$, so the total system cost is $s_{\text{total},2} = s_{c,3} + s_{c,4}$. To prevent the synchronous inference from stalling on computation, $\min s_{\text{total},2}$ is solved under the constraint $t_{\text{inference},2}^{\text{server}} < t_{\text{inference}}^{\text{edge}}$.
To reduce the search space and time complexity, heterogeneous-edge-assisted computation is performed on a per-link basis, using (k, i, j) to denote the j-th neural network layer of the i-th vertex in the k-th link of the DAG-type DNN. The processing latency of the mobile edge nodes then includes the current local task processing latency, the link processing latency, and the data uplink and downlink latencies.
$t_{\text{inference},2}^{\text{edge}} = \sum_{k=1}^{K} t_{\text{edge}}(k) + \dfrac{G_{i}^{\text{input}}(\max)}{B_{\text{server-edge}}^{u}} + \dfrac{G_{i}^{\text{output}}(\max)}{B_{\text{edge-server}}^{u}} + t_{\text{inference},1}^{\text{edge}}$
$t_{\text{edge}}(k) = \sum_{i=1}^{N} \sum_{j=1, S_i=3}^{n} t_{\text{edge}}(i,j)$
Among them, $t_{\text{inference},1}^{\text{edge}}$ represents the currently accumulated task computation time, $t_{\text{edge}}(k)$ represents the edge node's execution time for the k-th link, $B_{\text{server-edge}}^{u}$ represents the uplink bandwidth (Mbps) from the server to the edge node, and $B_{\text{edge-server}}^{u}$ represents the uplink bandwidth (Mbps) from the edge node to the local server.
The processing energy consumption of the mobile edge nodes is as follows:
$e_{\text{inference}}^{\text{edge}} = \sum_{k=1}^{K} e_{\text{edge}}(k)$
$e_{\text{edge}}(k) = \sum_{i=1}^{N} \sum_{j=1, S_i=3}^{n} e_{\text{edge}}(i,j)$
Among them, $e_{\text{edge}}(k)$ represents the computation energy consumption of the edge node for the k-th link.
The processing latency of the local server is as follows:
$t_{\text{inference}}^{\text{server}} = \sum_{k=1}^{K} t_{\text{server}}(k) - \sum_{i=0}^{I} \sum_{j=1}^{J} t_{\text{server}}(i,j)$
The problem of solving $\min s_{\text{total},2}$ is expressed as an enumeration optimization problem, where M denotes the set of links allocated to the local server and N denotes the set of links allocated to the edge nodes; through continuous iterative calculation, the optimal collaborative inference sets M and N are found so that the total system cost $s_{\text{total},2}$ is minimized. After multiple iterations, the typical numbers of links in M and N are recorded and their value ranges reset, and the global enumeration is replaced by a local enumeration search, reducing the time complexity of the enumeration method.
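As a concrete rendering of this second step, the sketch below enumerates assignments of the K links to the server set M and the edge set N; the link cost arrays are assumed to come from the prediction models, and the local-enumeration refinement described above would shrink the 2^K candidate space before this loop:

```python
from itertools import product

def best_link_assignment(cost_server, cost_edge, alpha, beta):
    """Enumerate link placements: 0 -> server set M, 1 -> edge set N.
    cost_server[k]/cost_edge[k]: predicted cost of running link k there."""
    K = len(cost_server)
    best_cost, best_M, best_N = float("inf"), set(), set()
    for assign in product((0, 1), repeat=K):       # exhaustive for small K
        s_c3 = alpha * sum(cost_server[k] for k in range(K) if assign[k] == 0)
        s_c4 = beta * sum(cost_edge[k] for k in range(K) if assign[k] == 1)
        if s_c3 + s_c4 < best_cost:
            best_cost = s_c3 + s_c4
            best_M = {k for k in range(K) if assign[k] == 0}
            best_N = {k for k in range(K) if assign[k] == 1}
    return best_cost, best_M, best_N

print(best_link_assignment([0.4, 0.3, 0.5], [0.2, 0.6, 0.1],
                           alpha=0.6, beta=0.4))
```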

5. Collaborative Reasoning Phase

Terminal equipment: the terminal equipment in this article comprises the various intelligent devices participating in substation inspection and monitoring, including devices whose working time is easily limited by battery power, such as drones and inspection robots, as well as devices that are not, such as video surveillance cameras. Terminal devices may generate various computationally intensive tasks based on different DNNs. Because terminal devices have a certain computing power and can communicate with other edge nodes over wired or wireless networks, lightweight DNN layers can be computed on the terminal devices while the remaining DNN layers are processed collaboratively by edge nodes.
Heterogeneous edge layer: the heterogeneous edge layer is composed of edge nodes with high computing power that are close to the terminal devices. Collaborative computing between terminal devices and heterogeneous edge nodes helps achieve a good balance between inference latency and inference energy consumption, so DNN layers are offloaded to the edge nodes for execution. However, different heterogeneous nodes have different computing capabilities; to fully utilize the heterogeneous edge layer's computing power, edge nodes can also collaborate on a given task.
This framework uses the obtained optimal partitioning nodes to divide the DNN into several blocks and collaborates with heterogeneous edge nodes to accelerate DNN inference and save terminal energy. Specifically, if there is a single optimal partitioning node, the DNN blocks before the partitioning point are executed locally at the terminal, and the DNN blocks after the partitioning point are offloaded to the edge node for execution. If there are two optimal partition nodes, the blocks before the first partition point are executed locally at the terminal, the blocks between the two partition points are offloaded to edge nodes with medium computing power, and the rest are offloaded to edge nodes with high computing power, forming a sequential collaborative inference architecture. The adaptive DAG DNN computation partitioning algorithm is presented in Algorithm 1.
Algorithm 1: Collaborative Reasoning Framework—Adaptive DAG DNN Computing Partitioning Algorithm
Input:           current network status
Output:          first partition point $p_1$
Set the end-to-end latency $t_{\text{total}} = 0$
Set the first partition point $p_1 = 0$
Set $s_{\min} = +\infty$
for i = 0:N do
  for j = 1:n do
    Set $t_{\text{inference}}^{\text{UAV}} = 0$, $e_{\text{inference}}^{\text{UAV}} = 0$, $t_{\text{inference}}^{\text{server}} = 0$
    for I = 0:i, J = 1:j do
      $t_{\text{inference}}^{\text{UAV}} = t_{\text{inference}}^{\text{UAV}} + t_{\text{UAV}}(I, J)$
      $e_{\text{inference}}^{\text{UAV}} = e_{\text{inference}}^{\text{UAV}} + e_{\text{UAV}}(I, J)$
    end
    $E_{\text{inference}}^{\text{UAV}} = e_{\text{inference}}^{\text{UAV}} + e_{\text{trans}} + e_{\text{fly}}$
    $s_{c,1} = \alpha t_{\text{inference}}^{\text{UAV}} + \beta E_{\text{inference}}^{\text{UAV}}$
    for I = i:N, J = j + 1:n do
      $t_{\text{inference}}^{\text{server}} = t_{\text{inference}}^{\text{server}} + t_{\text{server}}(I, J)$
    end
    $t_{\text{inference}}^{\text{server}} = t_{\text{inference}}^{\text{server}} + G_i^{\text{input}} / B_{\text{end-server}}^{u}$
    $s_{c,2} = \alpha t_{\text{inference}}^{\text{server}}$
    $s_{\text{total}} = s_{c,1} + s_{c,2}$
    if $s_{\text{total}} < s_{\min}$ then
      $s_{\min} = s_{\text{total}}$; $p_1 = (i, j)$
    end
  end
end
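A direct Python rendering of Algorithm 1 is sketched below; the per-layer delay and energy tables are assumed to come from the Section 3 prediction models, and a cut point (i, j) means the layers up to the j-th layer of vertex i run on the drone:

```python
def first_partition_point(t_uav, e_uav, t_server, e_trans, e_fly,
                          g_input_mb, b_up_mbps, alpha, beta):
    """Enumerate cut points (i, j) and return the one minimizing s_total."""
    s_min, p1 = float("inf"), (0, 0)
    n_vertices = len(t_uav)
    for i in range(n_vertices):
        for j in range(len(t_uav[i]) + 1):   # j = 0: vertex i fully offloaded
            # Drone-side cost: all layers before the cut.
            t_u = sum(sum(t_uav[a]) for a in range(i)) + sum(t_uav[i][:j])
            e_u = sum(sum(e_uav[a]) for a in range(i)) + sum(e_uav[i][:j])
            e_total_uav = e_u + e_trans + e_fly
            # Server-side cost: remaining layers plus the uplink at the cut.
            t_s = sum(t_server[i][j:])
            t_s += sum(sum(t_server[a]) for a in range(i + 1, n_vertices))
            t_s += g_input_mb * 8.0 / b_up_mbps
            s_total = alpha * t_u + beta * e_total_uav + alpha * t_s
            if s_total < s_min:
                s_min, p1 = s_total, (i, j)
    return p1, s_min
```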

6. Experimental and Simulation Verification

6.1. Experimental Platform and Simulation Parameters

To verify the effectiveness of the proposed method, the results of unmanned aerial vehicle inspections in the substation area of a certain province's transmission lines were collected and simulated using Python 3.7 and MATLAB 2018a. The drone was equipped with a Raspberry Pi 4B as its onboard processor. The simulation environment is shown in Table 1, and the specific UAV parameters can be found in references [32,33]. The simulation experiments ran on Windows 10; Huawei phones served as the heterogeneous edge nodes, with detailed specifications in Table 2, and a server served as the local server node, as shown in Table 3.

6.2. Prediction Model Experimental Results and Analysis

This research trained a total of 24 deep learning neural networks for layer delay prediction and layer energy consumption prediction, covering four types of DNN layers, three computing platforms, and the two targets of delay and energy consumption. To evaluate the predictive performance of the proposed inference acceleration framework, the deep-learning-based delay and energy consumption prediction models were compared with three prediction methods, linear regression (LR), k-nearest neighbors (KNN), and random forest (RF), using the mean absolute percentage error (MAPE) as the performance metric. In the experiment, ResNet-34, a neural network architecture with high complexity and good performance in image classification, was selected, trained on the ImageNet dataset, and implemented using Python 3.8.
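For reference, the MAPE metric used throughout this section follows the standard definition, shown here as a short helper:

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)

print(mape([10.0, 20.0, 30.0], [9.0, 22.0, 27.0]))  # -> 10.0
```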

Prediction Model Accuracy Evaluation

As shown in Figure 3, on the local server node, the neural network delay prediction model established in this article achieved average MAPE reductions of 36.68%, 34.55%, and 20.41% across all neural network layers compared with the three benchmark methods LR, KNN, and RF, respectively.
As shown in Figure 4, on mobile heterogeneous edge nodes, the average MAPE of the DL delay prediction model decreased by 38.27%, 31.57%, and 16.67%, respectively.
As shown in Figure 5, on the edge nodes of the drone terminal, the average MAPE of the DL delay prediction model decreased by 35.50%, 39.67%, and 30.10%, respectively.
The experimental results showed that the three baseline methods could not effectively learn the nonlinear features related to inference delay, inference energy consumption, task priority, current task volume, etc., whereas the DL-based prediction model proposed in this paper learned these complex nonlinear features well.

6.3. Reasonability Analysis of Adaptive Hybrid Partitioning

To verify the rationality of the proposed partition strategy, this article partitions the ResNet-34 network under varying network transmission rates, current drone battery levels, and weight coefficients α and β.
As shown in Figure 6, as the transmission rate increases, the first partition point of the hybrid partitioning strategy keeps moving forward, because a higher transmission rate allows the neural network layers with larger data volumes to be offloaded to the edge nodes for computation sooner, reducing the computation delay cost while keeping the communication cost essentially unchanged.
Figure 7 verifies that as the battery level of terminal devices such as drones gradually decreases, the first partition point continuously moves forward to reduce the drone's computation energy consumption and maintain normal operation. This article sets weight coefficients α and β: α is the weight coefficient of the total delay, and β is the weight coefficient of the total energy consumption.
From Figure 8, it can be seen that when β is fixed at 0.5, the partition point moves forward as α increases. When α is too small or too large, either the energy consumption cost or the latency cost accounts for too much of the total cost, resulting in a higher partition point.
From Figure 9, it can be seen that when α is fixed at 0.5, the partition point moves forward as β increases: the energy consumption cost of terminals such as drones rises significantly, so to reduce the total cost, the terminal offloads earlier and more computation is executed on the edge nodes.

6.4. Comparative Analysis of Total Delay in Multi-Partition Systems

To verify the performance of DAG-based DNN adaptive partitioning in reducing task processing latency, the two partition methods proposed in this article are compared with six existing partition strategies: Only-end, Only-edge, Only-cloud, End-edge, End-cloud, and End-edge-cloud. Only-end means that all detection computation is executed locally on terminals such as UAVs. Only-edge and Only-cloud offload all detection computation to edge servers or the cloud, respectively, while drones and other terminals only collect data. End-edge and End-cloud divide the detection task into two parts executed by the terminal in collaboration with the edge server or cloud, respectively. End-edge-cloud divides the detection task into three parts executed by the terminal, edge server, and cloud in sequence.
As shown in Figure 10, the partitioning strategy proposed in this article is always superior to the other six partitioning methods in terms of latency. The reason is that the local terminal has limited computing power, which easily leads to high computation latency, while the edge servers are far from the cloud, so blind offloading can significantly increase communication latency. The proposed partitioning strategy balances computation delay and communication delay, fully utilizing the computing power and proximity of the heterogeneous edge nodes near the terminal and parallelizing the data with the model, further reducing both computation and communication delay.

6.5. Analysis of the Impact of Image Size on the Total Cost of the System

From Figure 11, it can be seen that the hybrid partitioning method proposed in this article was compared with four other commonly used collaborative inference methods in terms of total system cost under the same input image sizes. The proposed method was always superior to the existing collaborative methods, with an average reduction in total system cost of 9.56% (8.39%, 9.47%, 8.28%, and 12.09% for 224 × 224, 460 × 460, 640 × 640, and 760 × 760 inputs, respectively). This indicates that the proposed method effectively balances communication delay and execution delay by dividing the ResNet-34 computation across drones, local servers, and mobile edge nodes with synchronous collaborative inference, thereby reducing the total system cost.

7. Conclusions

Aiming at the high delay and low efficiency of unmanned aerial vehicle (UAV) inspection in substations, this paper proposes an adaptive partitioning strategy for heterogeneous inspection systems. First, computation delay and energy consumption prediction models for drone inspection were established on the various heterogeneous edge nodes. Based on these models, the total cost model of the substation heterogeneous inspection system was established. The experimental results showed that the adaptive partition strategy effectively saves system cost. This research considered only a single-UAV scenario, which has certain limitations; therefore, for future high-frequency inference tasks, combining multi-UAV resource scheduling to improve the collaborative inference framework is a problem requiring further research.

Author Contributions

Conceptualization, F.D.; data curation, B.W.; formal analysis, Q.F.; investigation, Q.F., F.D. and X.X.; methodology, J.Z. and B.W.; project administration, F.D.; resources, J.Z.; software, X.X.; validation, X.X.; writing—original draft, Q.F. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grants 52362052, 52167008, and 52377103 and in part by the Natural Science Foundation of Jiangxi Province under Grant 20232BAB204065.

Data Availability Statement

The data used to support the results of this study have not been provided because the authors' college does not allow the data to be published.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Xia, C.; Zhao, J.; Cui, H.; Feng, X.; Xue, J. DNNTune: Automatic benchmarking DNN models for mobile-cloud computing. ACM Trans. Archit. Code Optim. 2019, 16, 49. [Google Scholar] [CrossRef]
  2. Chen, Z.; Hu, J.; Chen, X.; Hu, J.; Zheng, X.; Min, G. Computation offloading and task scheduling for DNN-based applications in cloud-edge computing. IEEE Access 2020, 8, 115537–115547. [Google Scholar] [CrossRef]
  3. Zhang, J.; Ma, S.; Yan, Z.; Huang, J. Joint DNN Partitioning and Task Offloading in Mobile Edge Computing via Deep Reinforcement Learning. J. Cloud Comput. 2023, 12, 116. [Google Scholar] [CrossRef]
  4. Miao, W.; Zeng, Z.; Wei, L.; Li, S.; Jiang, C.; Zhang, Z. Adaptive DNN partition in edge computing environments. In Proceedings of the IEEE 26th International Conference on Parallel and Distributed Systems (ICPADS), Hong Kong, China, 2–4 December 2020; pp. 685–690. [Google Scholar]
  5. Li, X.; Li, W.; Yang, Q.; Yan, W.; Zomaya, A.Y. Edge-computing-enabled unmanned module defect detection and diagnosis system for large-scale photovoltaic plants. IEEE Internet Things J. 2020, 7, 9651–9663. [Google Scholar] [CrossRef]
  6. Tang, W.; Yang, Q.; Hu, X.; Yan, W. Deep learning-based linear defects detection system for large-scale photovoltaic plants based on an edge-cloud computing infrastructure. Sol. Energy 2022, 231, 527–535. [Google Scholar] [CrossRef]
  7. Xu, C.; Li, Q.; Zhou, Q.; Zhang, S.; Yu, D.; Ma, Y. Power line-guided automatic electric transmission line inspection system. IEEE Trans. Instrum. Meas. 2022, 71, 3512118. [Google Scholar] [CrossRef]
  8. Zhou, H.; Zhang, Z.; Li, D.; Su, Z. Joint optimization of computing offloading and service caching in edge computing-based smart grid. IEEE Trans. Cloud Comput. 2022, 11, 1122–1132. [Google Scholar] [CrossRef]
  9. Wu, Y.; Guo, H.; Chakraborty, C.; Khosravi, M.R.; Berretti, S.; Wan, S. Edge computing driven low-light image dynamic enhancement for object detection. IEEE Trans. Netw. Sci. Eng. 2022, 10, 3086–3098. [Google Scholar] [CrossRef]
  10. Cheng, Q.; Wang, H.; Zhu, B.; Shi, Y.; Xie, B. A Real-Time UAV Target Detection Algorithm Based on Edge Computing. Drones 2023, 7, 95. [Google Scholar] [CrossRef]
  11. Song, C.; Xu, W.; Han, G.; Zeng, P.; Wang, Z.; Yu, S. A cloud edge collaborative intelligence method of insulator string defect detection for power IIoT. IEEE Internet Things J. 2020, 8, 7510–7520. [Google Scholar] [CrossRef]
  12. Shuang, F.; Chen, X.; Li, Y.; Wang, Y.; Miao, N.; Zhou, Z. PLE: Power Line Extraction Algorithm for UAV-Based Power Inspection. IEEE Sens. J. 2022, 22, 19941–19952. [Google Scholar] [CrossRef]
  13. Ren, P.; Qiao, X.; Huang, Y.; Liu, L.; Pu, C.; Dustdar, S. Fine-grained elastic partitioning for distributed DNN towards mobile web AR services in the 5G era. IEEE Trans. Serv. Comput. 2021, 15, 3260–3274. [Google Scholar]
  14. Sharma, S.K.; Wang, X. Live data analytics with collaborative edge and cloud processing in wireless IoT networks. IEEE Access 2017, 5, 4621–4635. [Google Scholar] [CrossRef]
  15. Kang, Y.; Hauswald, J.; Gao, C.; Rovinski, A.; Mudge, T.; Mars, J.; Tang, L. Neurosurgeon: Collaborative intelligence between the cloud and mobile edge. ACM SIGARCH Comput. Archit. News 2017, 45, 615–629. [Google Scholar] [CrossRef]
  16. Chen, Y.Y.; Lin, Y.H.; Hu, Y.C.; Hsia, C.H.; Lian, Y.A.; Jhong, S.Y. Distributed real-time object detection based on edge-cloud collaboration for smart video surveillance applications. IEEE Access 2022, 10, 93745–93759. [Google Scholar] [CrossRef]
  17. Ding, C.; Zhou, A.; Liu, Y.; Chang, R.N.; Hsu, C.-H.; Wang, S. A cloud-edge collaboration framework for cognitive service. IEEE Trans. Cloud Comput. 2020, 10, 1489–1499. [Google Scholar] [CrossRef]
  18. Liu, G.; Dai, F.; Xu, X.; Fu, X.; Dou, W.; Kumar, N.; Bilal, M. An adaptive DNN inference acceleration framework with end–edge–cloud collaborative computing. Future Gener. Comput. Syst. 2023, 140, 422–435. [Google Scholar] [CrossRef]
  19. Li, E.; Zeng, L.; Zhou, Z.; Chen, X. Edge AI: On-demand accelerating deep neural network inference via edge computing. IEEE Trans. Wireless Commun. 2019, 19, 447–457. [Google Scholar] [CrossRef]
  20. Li, J.; Liang, W.; Li, Y.; Xu, Z.; Jia, X.; Guo, S. Throughput maximization of delay-aware DNN inference in edge computing by exploring DNN model partitioning and inference parallelism. IEEE Trans. Mob. Comput. 2021, 22, 3017–3030. [Google Scholar] [CrossRef]
  21. Wei, Z.; Yu, X.; Zou, L. Multi-Resource Computing Offload Strategy for Energy Consumption Optimization in Mobile Edge Computing. Processes 2022, 10, 1762. [Google Scholar] [CrossRef]
  22. Shao, S.; Liu, S.; Li, K.; You, S.; Qiu, H.; Yao, X.; Ji, Y. LBA-EC: Load Balancing Algorithm Based on Weighted Bipartite Graph for Edge Computing. Chin. J. Electron. 2023, 32, 313–324. [Google Scholar] [CrossRef]
  23. Sun, Z.; Yang, H.; Li, C.; Yao, Q.; Wang, D.; Zhang, J.; Vasilakos, A.V. Cloud-edge collaboration in industrial internet of things: A joint offloading scheme based on resource prediction. IEEE Internet Things J. 2021, 9, 17014–17025. [Google Scholar] [CrossRef]
  24. Nayyer, M.Z.; Raza, I.; Hussain, S.A.; Jamal, M.H.; Gillani, Z.; Hur, S.; Ashraf, I. LBRO: Load Balancing for Resource Optimization in Edge Computing. IEEE Access 2022, 10, 97439–97449. [Google Scholar] [CrossRef]
  25. Deng, Y.; Wu, T.; Chen, X.; Ashrafzadeh, A.H. Multi-Classification and Distributed Reinforcement Learning-Based Inspection Swarm Offloading Strategy. Intell. Autom. Soft Comput. 2022, 34, 1157–1174. [Google Scholar] [CrossRef]
  26. Yuan, H.; Zhou, M.C. Profit-maximized collaborative computation offloading and resource allocation in distributed cloud and edge computing systems. IEEE Trans. Autom. Sci. Eng. 2020, 18, 1277–1287. [Google Scholar] [CrossRef]
  27. Xu, W.; Yin, Y.; Chen, N.; Tu, H. Collaborative inference acceleration integrating DNN partitioning and task offloading in mobile edge computing. Int. J. Softw. Eng. Knowl. Eng. 2023, 33, 1835–1863. [Google Scholar] [CrossRef]
  28. Liang, H.; Sang, Q.; Hu, C.; Cheng, D.; Zhou, X.; Wang, D.; Bao, W.; Wang, Y. DNN Surgery: Accelerating DNN Inference on the Edge through Layer Partitioning. IEEE Trans. Cloud Comput. 2023, 11, 3111–3125. [Google Scholar] [CrossRef]
  29. Liu, J.; Zhang, Q. Offloading schemes in mobile edge computing for ultra-reliable low latency communications. IEEE Access 2018, 6, 12825–12837. [Google Scholar] [CrossRef]
  30. Shi, L.; Xu, Z.; Sun, Y.; Shi, Y.; Fan, Y.; Ding, X. A DNN inference acceleration algorithm combining model partition and task allocation in heterogeneous edge computing system. Peer Peer Netw. Appl. 2021, 14, 4031–4045. [Google Scholar] [CrossRef]
  31. Ren, W.Q.; Qu, Y.B.; Dong, C.; Jing, Y.Q.; Sun, H.; Wu, Q.H.; Guo, S. A survey on collaborative DNN inference for edge intelligence. Mach. Intell. Res. 2023, 20, 370–395. [Google Scholar] [CrossRef]
  32. Xu, J.; Ota, K.; Dong, M. Aerial edge computing: Flying attitude-aware collaboration for multi-UAV. IEEE Trans. Mob. Comput. 2022, 22, 5706–5718. [Google Scholar] [CrossRef]
  33. Cheng, K.; Fang, X.; Wang, X. Energy efficient edge computing and data compression collaboration scheme for UAV-assisted network. IEEE Trans. Veh. Technol. 2023, 72, 16395–16408. [Google Scholar] [CrossRef]
Figure 1. Adaptive segmentation framework diagram.
Figure 2. Building corresponding DAG model block diagram based on DNN model.
Figure 3. Prediction error of DNN layers on local servers.
Figure 4. Prediction error of DNN layers on moving edge nodes.
Figure 5. Prediction error of DNN layers at drone terminals.
Figure 6. Network transmission rate variation curve.
Figure 7. Drone battery level change curve.
Figure 8. Weight coefficient α variation curve.
Figure 9. Weight coefficient β variation curve.
Figure 10. Experimental diagram of partition strategy delay comparison.
Figure 11. Comparison of total collaborative costs under the same input image size.
Table 1. Experimental related parameters.
Symbol | Description | Value
c1 | Drone wing area constant | 0.002
c2 | Drone load factor constant | 70.698
η | Drone chip structure constant | 10^−11
F_UAV | Drone computing power | 1 GHz
K_UAV | CPU cycles required by the drone to process one bit | 100 cycles/bit
Table 2. Heterogeneous edge node specifications.
Hardware | Specifications
System | HarmonyOS 2.0.0
CPU | HUAWEI Kirin 985
Memory | 8 GB
Storage | 128 GB
Table 3. Local server specifications.
Hardware | Specifications
System | Windows 10
CPU | 12 × Intel(R) Xeon(R) CPU E5-2678 @ 2.50 GHz
Memory | Samsung 64 GB 2400 MHz
HDD | 4 × WDC PC SN730 512 GB
GPU | 3 × NVIDIA GeForce GTX 1080 Ti
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
