**Real-Time Data Delivery for Mobile Sink Groups in Mobile Cyber-Physical Systems**

**Seungmin Oh <sup>1</sup>, Yoonsoo Choi <sup>2</sup>, Sangdae Kim <sup>1</sup>, Cheonyong Kim <sup>3</sup>, Kwansoo Jung <sup>4</sup> and Seok-Hun Kim <sup>5,\*</sup>**


Received: 4 August 2020; Accepted: 26 August 2020; Published: 27 August 2020

**Abstract:** Mobile Cyber-Physical Systems (MCPS) have extended the application domains of Cyber-Physical Systems (CPS) by exploiting the advantages of CPS through mobile devices. The cooperation of various mobile equipment and workers based on MCPS further improves efficiency and productivity in industry. To support this cooperation among groups of workers (hereafter referred to as Mobile Sink Groups (MSG)), data should be delivered to the appropriate groups of workers in a timely manner. Traditionally, data dissemination for an MSG relies on flooding-based geocasting into the movable area of the group because each group member moves frequently. However, flooding-based data dissemination cannot be directly applied to real-time data delivery, which demands a time deadline and an end-to-end delivery distance, because flooding defines neither the end-to-end distance nor the progress toward each member of a group. This paper proposes a real-time data delivery mechanism that supports MSG in time-critical applications. In our mechanism, ring-based modeling of group mobility and a data transfer scheme over a virtual grid in the ring provide the end-to-end distance and the progress needed to forward real-time data to each member. Simulation results show that our mechanism is superior to existing ones in terms of real-time communication for MSG.

**Keywords:** Mobile Cyber-Physical Systems (MCPS); industry; Mobile Sink Groups (MSG); group mobility; real-time data delivery

#### **1. Introduction**

Cyber-Physical Systems (CPS) are collectively a technology for managing systems that interlink real-world assets, such as various sensors and actuators, with computing power in the information world [1]. CPS have recently become a key research area in industry and are utilized in various applications such as smart factories, digital manufacturing, and digital twins [2]. For example, CPS exploit abundant computing resources to process information that could not be addressed in the physical world in the past, promoting economic benefits such as improved energy efficiency and productivity in industry [3]. In addition, by transferring the experimental environment of the physical world to the virtual world, various prototypes can be tested in the virtual environment in advance, so that error diagnosis, predictive maintenance, and product performance measurement can be performed very efficiently [4].

Nowadays, with the proliferation of pervasive mobile devices, Mobile Cyber-Physical Systems (MCPS) have attracted more and more attention. MCPS have extended the application domains by exploiting the advantages of CPS through mobile devices [5]. In other words, MCPS embrace not only CPS, which mainly deal with static equipment and stable networks, but also networks consisting of many mobile devices, such as vehicular networks. As networks with mobile devices are unstable, unlike the networks assumed by CPS, and the computing power of each mobile device varies greatly, many studies have approached their efficient cooperation from various aspects. In particular, the timeliness of data is very important because delays and failures caused by bottlenecks and other effects of variable network environments adversely affect the entire system [6,7]. In addition, the various mobile equipment and groups of workers (Mobile Sink Groups (MSG)) performing the collaboration should be able to receive data within a valid time because they must operate in a mutually collaborative manner.

In the past, the spatiotemporal approach was exploited for real-time data transmission [8,9]. This approach forwards data at a required delivery speed; the end-to-end delay is then proportional to the distance between the devices. By maintaining the delivery speed across the network, the approach provides a predictable real-time service according to distance. For multihop communication, each node selects, among its neighbor nodes, one whose delivery speed exceeds the required delivery speed. To apply the spatiotemporal approach, per-hop data forwarding requires the delivery speed, the coordinates of the specific destination, and the progress toward the destination made by each 1-hop neighbor node.
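The per-hop rule above can be sketched in Python. This is an illustrative sketch of generic spatiotemporal (SPEED-style) next-hop selection, not the exact algorithm of any cited protocol; the function names and the per-neighbor hop-delay estimates are assumptions:

```python
import math

def required_speed(src, dst, deadline):
    """End-to-end requirement: Euclidean distance divided by the time deadline."""
    return math.dist(src, dst) / deadline

def relay_speed(cur, neighbor, dst, hop_delay):
    """Speed offered by a neighbor: progress toward dst per unit of hop delay."""
    progress = math.dist(cur, dst) - math.dist(neighbor, dst)
    return progress / hop_delay

def select_next_hop(cur, neighbors, dst, req_speed):
    """Choose the neighbor with the highest relay speed among those meeting req_speed.

    neighbors: list of ((x, y), estimated_hop_delay) pairs.
    Returns the chosen coordinate, or None if no neighbor is fast enough.
    """
    fast = [(pos, delay) for pos, delay in neighbors
            if relay_speed(cur, pos, dst, delay) >= req_speed]
    if not fast:
        return None
    return max(fast, key=lambda nd: relay_speed(cur, nd[0], dst, nd[1]))[0]
```

For example, with a 100 m path and a 1 s deadline the required speed is 100 m/s; a neighbor that is 20 m closer to the destination with a 0.1 s hop delay offers 200 m/s and therefore qualifies.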

This traditional spatiotemporal approach has successfully transmitted data to individual mobile sinks through virtual infrastructure; however, data dissemination to a group of mobile sinks causes duplicated location management of all sinks and duplicated data delivery to all sinks. Therefore, group mobility support schemes divided into two steps have been proposed: member information gathering and data forwarding to group members [10–12]. Typically, a leader gathers the locations of the member sinks and reports the representative location of the group (i.e., a center point and a radius), as the location of each member might change frequently. After gathering, M-Geocasting [10] floods data packets into the region (a circle) for active data delivery. In [10], a source node sends its data toward the center point of the movable area of a group. Once one of the boundary nodes gets the data, it starts flooding them only within the area. Flooding data reduces the cost of trivial movement of sinks within the region. The authors of [11] exploit the internal movement of each member sink: they place data in a virtual rectangular area passing through the center point of the group so that member sinks passively get the data when they encounter the area. However, flooding-based dissemination still cannot be combined with the spatiotemporal approach for real-time data, because data flooding does not define a final destination for a mobile sink group. Without the final destination, the source node cannot calculate the delivery distance and the required delivery speed, and each node on the delivery path cannot calculate the progress toward the destination made via its neighbor nodes. The passive data dissemination likewise defines neither the end-to-end distance nor the progress. VTS [12] exploits a virtual tube storage to deliver data to a mobile sink group; as this scheme requires storing data and acquiring it through queries issued by the sink, it is difficult to achieve real-time data transmission.

To overcome this problem, we propose a real-time data delivery mechanism supporting a mobile sink group. First, the proposed scheme calculates the movable area of the mobile sink group based on a virtual grid structure. Based on this structure, the maximum (farthest) delivery distance for real-time communication is calculated to find the minimum speed that must be satisfied during data transmission. Finally, the proposed scheme transfers data to all sinks in the group within a valid time by performing main forwarding and branch forwarding along nodes that meet the previously calculated minimum speed. The simulation results verify that the proposed mechanism achieves better performance than existing ones in supporting real-time communication for mobile sink groups.

The remainder of this paper is organized as follows. In Section 2, we explain the real-time data delivery for mobile sink groups. The performance evaluation results are provided in Section 3. Finally, the proposed scheme and simulation results are summarized in Section 4.

#### **2. The Proposed Mechanism**

#### *2.1. Overview*

In this section, we give an overview of our proposed scheme through Figure 1. As the group moves collectively, our mechanism places data in a ring-based movable area of the group. To define the delivery distance, we construct a grid-based virtual structure in the movable area. Based on this virtual structure, our scheme is divided into two forwarding steps: main forwarding and branch forwarding. In the main forwarding process, data are forwarded along a straight line between the source node and the center point of the group sinks (the solid line in Figure 1). Branch forwarding is performed at each branch point during main forwarding: when a branch point receives data through main forwarding, it forwards the data along a line toward the previously anticipated boundary of the movable area of the mobile sinks (the dotted line in Figure 1). In addition, the proposed mechanism calculates the maximum (farthest) distance for real-time communication based on the virtual structure. Then, each destination and the progress toward that destination can be provided for both main forwarding and branch forwarding.

**Figure 1.** Overview of the proposed scheme.

#### *2.2. Group Sink Modeling*

In the MCPS environment, a group of mobile sinks usually has a common goal, such as maintenance or production work in a restricted region; however, the individuals have different roles. Thus, we assume that the group of mobile sinks moves collectively, but each member sink moves independently within the restricted region. Since the mobile sinks in the group exist on an infrastructureless field, each member sink selects one of the sensor nodes, typically the nearest one, as an agent to access the network and receives data from it. To forward data to a sink, the coordinates of the sink are needed; however, in the case of a mobile sink group, per-sink movement management causes excessive energy consumption. M-Geocasting [10] offers effective data delivery to mobile sink groups: a leader sink gathers the geographical information of the member sinks and periodically advertises the group information, namely, the coordinates of the center point and the radius of the group area. With this information, a source node sends its data toward the center point using geographic routing. Once the data enter the area, they are flooded within the area, so the protocol does not need to manage the location of each member independently. The group thus has a ring-based movable area defined by the center point and the radius.

#### *2.3. Calculation of the Delivery Speed of a Mobile Sink Group*

In the spatiotemporal approach to real-time communication, the concept of delivery speed is applied. The delivery speed is maintained so that all relay nodes evenly share the real-time requirement of applications over a dynamic topology with error-prone nodes. To meet the requirement, the selected next-hop node must provide a relay speed faster than the required delivery speed. The speed concept combines the spatial and temporal requirements of data delivery. The temporal requirement is given by the application, whereas the spatial requirement can be calculated as the Euclidean distance between the source node and the destination node, since we assume each sensor node can obtain its own coordinates from GPS or a localization algorithm. In a mobile sink group, however, the end-to-end distance between a source node and each sink node cannot be defined, because data delivery toward each sink is based on flooding within the area.

A source node defines the distance and calculates the delivery speed after getting the location information of a mobile sink group from the sink location server [10,11]. From the server, the source gets the center point *PC* and the radius *RC* of the movable area. *PC* is the central coordinate calculated from the positions of all sinks in the group; when a circle is drawn around *PC*, *RC* is the radius of the smallest circle that contains all member sinks. *RC* may vary depending on the requirements of the application.

Equation (1) is a formula for calculating *PC*, and Equation (2) is a formula for obtaining *RC*.

$$P_C(x, y) = \frac{1}{n(P)} \sum_{i=1}^{n(P)} P_i, \quad P = \{(x, y) \mid (x, y) = \text{coordinate of a member sink}\} \tag{1}$$

$$R_C = \max(D), \quad D = \{d \mid d = \text{distance between } P_C \text{ and a member sink}\} \tag{2}$$
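Equations (1) and (2) can be computed directly from the member coordinates; the following Python sketch (function name assumed) mirrors them:

```python
import math

def group_center_and_radius(sinks):
    """P_C: mean of the member sink coordinates (Eq. (1));
    R_C: maximum distance from P_C to any member sink (Eq. (2))."""
    n = len(sinks)
    p_c = (sum(x for x, _ in sinks) / n, sum(y for _, y in sinks) / n)
    r_c = max(math.dist(p_c, s) for s in sinks)
    return p_c, r_c
```

For four sinks at the corners of a 2 m × 2 m square, the center is (1, 1) and the radius is √2.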

To define the distance and calculate the delivery speed, main forwarding and branch forwarding are applied. In the main forwarding, we follow the longest straight line of the area, passing the center point. Branching the longest straight line could reduce the total length of the grid structure and increase the probability of path merging.

The total transmission distance for each sink is the sum of the main forwarding and branch forwarding distances. Main forwarding operates on the straight line between a source (*xs*, *ys*) and the center point (*xc*, *yc*). The branch forwarding distance for each sink *i* at (*xsi*, *ysi*) is the distance between the sink and that straight line:

$$\frac{|k x_{si} - y_{si} + y_c - k x_c|}{\sqrt{k^2 + 1}} \tag{3}$$

where *k* = (*ys* − *yc*)/(*xs* − *xc*), assuming *xs* ≠ *xc*.
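Equation (3) is the standard point-to-line distance; a small Python sketch (function name assumed):

```python
import math

def branch_distance(sink, src, center):
    """Perpendicular distance from a sink (x_si, y_si) to the main-forwarding
    line through src (x_s, y_s) and center (x_c, y_c); Eq. (3), x_s != x_c."""
    (x_si, y_si), (x_s, y_s), (x_c, y_c) = sink, src, center
    k = (y_s - y_c) / (x_s - x_c)  # slope of the main-forwarding line
    return abs(k * x_si - y_si + y_c - k * x_c) / math.sqrt(k * k + 1)
```

With the source at (10, 0) and the center at (0, 0), the line is the x-axis, so a sink at (3, 4) lies 4 units from the main-forwarding path.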

In order to derive the delivery speed for every member in a mobile sink group, we consider the maximum (farthest) distance in the movable area. In the area, as the farthest point from the entry point might be located on the ring, we calculate the distance between two points on the ring. In Figure 2, we assume that the center point and the source node are on the coordinate (0,0) and (D,0), respectively. Each point on the circle can be represented with the angle *θ*: (Rcos *θ*, Rsin *θ*). The distance to each point is presented as follows.

$$f(\theta) = (D - R\cos\theta) + |R\sin\theta|.\tag{4}$$

**Figure 2.** Longest distance in the movable area.

By differentiating Equation (4), we can find the farthest points on the circle: *θ* = 3*π*/4 or 5*π*/4, as *f*'(3*π*/4) = *f*'(5*π*/4) = 0 and *f*''(3*π*/4) < 0, *f*''(5*π*/4) < 0. Therefore, the maximum distance is *D* + √2*R*. With this maximum distance, the source node determines the delivery speed to be maintained during data delivery.
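The claim that *θ* = 3*π*/4 and 5*π*/4 maximize Equation (4) can also be checked numerically; in this sketch (function names assumed), f(θ) is the travel distance via main plus branch forwarding to the ring point at angle θ:

```python
import math

def f(theta, D, R):
    """Travel distance (main forwarding plus branch) from the source at (D, 0)
    to the ring point at angle theta, per Eq. (4)."""
    return (D - R * math.cos(theta)) + abs(R * math.sin(theta))

def max_delivery_distance(D, R):
    """Farthest distance in the movable area, attained at theta = 3*pi/4 and 5*pi/4."""
    return D + math.sqrt(2) * R

# numerical check: no sampled ring point exceeds the analytical maximum
D, R = 100.0, 10.0
samples = [f(2 * math.pi * i / 10000, D, R) for i in range(10000)]
assert max(samples) <= max_delivery_distance(D, R) + 1e-9
```

The source node would then divide this maximum distance by the application deadline to obtain the required delivery speed.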

#### *2.4. Real-Time Data Transfer via Branch Points*

From a source node to the exit point of the movable area (via the entry point of the movable area), data are transferred by the main forwarding. During the main forwarding, the destination of the data packets for geographic routing is the coordinate of the exit point of the movable area.

Branch forwarding is repeated at every radio-range interval. For each branch forwarding, multiple branch zones are virtually constructed. The branch zones reduce energy consumption by confining the data flow to a zone so that not every node participates in communication. There are 2*R*/*r* branch points along the main forwarding path in a movable zone, where *R* is the radius and *r* is the radio range of the sensor nodes. The set of branch points is represented as follows.

$$BP = \left\{ b_i = (x_i, y_i) \;\middle|\; x_i = x_c + i(x_s - x_c)r/D,\;\; y_i = y_c + i(y_s - y_c)r/D,\;\; i \in [-\lceil R/r \rceil, \lfloor R/r \rfloor] \right\} \tag{5}$$

Each branch point is on the straight line between the entry point and the exit point. The entry point P*EN* and the exit point P*EX* can be represented as follows.

$$\begin{aligned} P_{EN} &= \left(x_c + R(x_s - x_c)/D,\; y_c + R(y_s - y_c)/D\right) \\ P_{EX} &= \left(x_c + R(x_c - x_s)/D,\; y_c + R(y_c - y_s)/D\right) \end{aligned} \tag{6}$$
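Equations (5) and (6) generate the branch points and the entry/exit points; a Python sketch (function names assumed):

```python
import math

def branch_points(src, center, R, r):
    """Branch points spaced one radio range r apart along the main-forwarding
    line, for i in [-ceil(R/r), floor(R/r)] (Eq. (5))."""
    (x_s, y_s), (x_c, y_c) = src, center
    D = math.dist(src, center)
    return [(x_c + i * (x_s - x_c) * r / D, y_c + i * (y_s - y_c) * r / D)
            for i in range(-math.ceil(R / r), math.floor(R / r) + 1)]

def entry_exit_points(src, center, R):
    """Points where the main-forwarding line crosses the ring (Eq. (6))."""
    (x_s, y_s), (x_c, y_c) = src, center
    D = math.dist(src, center)
    p_en = (x_c + R * (x_s - x_c) / D, y_c + R * (y_s - y_c) / D)
    p_ex = (x_c + R * (x_c - x_s) / D, y_c + R * (y_c - y_s) / D)
    return p_en, p_ex
```

For a source at (100, 0), center (0, 0), *R* = 20 and *r* = 10, this yields five branch points from (−20, 0) to (20, 0), with entry point (20, 0) and exit point (−20, 0).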

At each branch point, at most three nodes can be selected as next-hop nodes, as shown in Figure 3: one for main forwarding and the other two for branching in the orthogonal directions. To transfer data to multiple next-hop nodes, we exploit the broadcast nature of wireless transmission. However, there might be interference and concurrent transmissions among the selected nodes within the radio range of the node holding a data packet. To avoid this problem, we apply time-slot-based transmission for branching, which divides the time into slots and assigns a node to each slot. The slot-based transmission is needed to share the relaying opportunity among the branch nodes. Each node can relay its data packet during the interval from its hop delay to the time deadline of the real-time packet; we call this temporal duration the real-time tolerable time. As the tolerable times of the selected next-hop nodes might overlap, controlled relaying is needed.

**Figure 3.** Branching point and termination of branching.

The nodes are assumed to be time-synchronized by existing time synchronization schemes [13,14]. Each next-hop node has its own available time until the deadline of the real-time data transmission, which we call the tolerable time. In a branch node, however, the multiple next-hop nodes could relay their data packets independently, causing serious packet collisions. To avoid this, an additional scheduling procedure is performed by the branch node: it divides the real-time tolerable time into multiple time slots and assigns the slots to its next-hop candidates. As the maximum number of next-hop candidates is three, the tolerable time is divided into three slots, and each next-hop node forwards its data packet only in its assigned slot. The duration of each time slot should be longer than the minimum relaying time.
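The slot assignment can be sketched as follows. The equal split into three slots and the rejection of too-short slots follow the description above, while the function name and return convention are assumptions:

```python
def assign_time_slots(now, deadline, min_relay_time, n_candidates=3):
    """Split the tolerable time [now, deadline) into n_candidates equal slots,
    one per next-hop candidate. Returns None if a slot would be shorter than
    the minimum relaying time (i.e., scheduling is infeasible)."""
    slot = (deadline - now) / n_candidates
    if slot < min_relay_time:
        return None
    return [(now + i * slot, now + (i + 1) * slot) for i in range(n_candidates)]
```

For instance, a 0.3 s tolerable time splits into three 0.1 s slots, whereas a 0.1 s tolerable time with a 0.05 s minimum relaying time is infeasible.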


After branching, the destination of the data packet is changed toward the orthogonal direction. The destination can be calculated from the coordinates of the branch point and the source node, together with the information of the movable area. The destinations of the branch points must be calculated for the two orthogonal directions shown in Figure 4. As the destination points are located on the circle of the movable area, a destination can be represented as follows:

$$(x_c + R\cos(a),\; y_c + R\sin(a)), \tag{7}$$

where *a* = *π* + *θE* + *θB*. The angle *θE* of the straight line through the source node and the center point is given by

$$\theta_E = \arctan\left((y_c - y_s)/(x_c - x_s)\right). \tag{8}$$

The angle *θB* is the angle between the two lines from the branch point and from the destination point to the center point. Data packets are transferred to this point via geographic routing. Finally, the destination of each branch zone is presented as follows:

$$(x\_c - R\cos(\theta\_E + \theta\_B), y\_c - R\sin(\theta\_E + \theta\_B)). \tag{9}$$
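Equations (7)–(9) place the branch destinations on the ring. The sketch below (function name assumed) computes both of them, under the assumption that the two orthogonal directions correspond to offsets ±*θB* from the source–center angle:

```python
import math

def branch_destinations(src, center, R, theta_B):
    """Destinations of the two branch directions on the ring (Eqs. (7)-(9))."""
    (x_s, y_s), (x_c, y_c) = src, center
    theta_E = math.atan2(y_c - y_s, x_c - x_s)  # angle of the source-center line
    return tuple((x_c - R * math.cos(theta_E + s * theta_B),
                  y_c - R * math.sin(theta_E + s * theta_B)) for s in (+1, -1))
```

For a source at (100, 0), center (0, 0), *R* = 20 and *θB* = *π*/2, the two destinations are the top and bottom of the ring, (0, 20) and (0, −20).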

**Figure 4.** Real-time scheduling in branch points.

#### *2.5. Management of Mobile Sinks by Sensor Nodes*

Typically, a mobile sink selects a sensor node as its communication agent and gets data from that agent node. The data transfer procedure starts at the source node, proceeds via main forwarding, branch forwarding, and the agent, and ends at the mobile sink. For real-time data transmission, the agent should relay the held data with the highest priority (almost no delay).

Sometimes, an individual mobile sink might temporarily leave the movable area, for example, due to obstacles on its path; that is, deviations caused by environmental factors could occur. When a mobile sink leaves the movable area, it selects the node closest to the edge of the movable area as its inner agent to report its location. In addition, if the sink is out of the communication range of the inner agent, a node within the communication range of the inner agent is selected as the outer agent. This ensures that the sink retains connectivity to receive data even outside the movable area. When the inner agent receives a data packet, it forwards the packet to the outer agent using geographic routing. Relaying over an additional distance is possible because there might be remaining distance up to the maximum distance (*D* + √2*R*). The remaining distance can be represented as follows:

$$f\_{\mathbf{f}}(\theta) = \sqrt{2}R + R\cos\theta - |R\sin\theta|.\tag{10}$$

Even when the remaining distance is positive, it may not be enough to support the out-of-range mobile sink. To support the mobile sink with a larger remaining distance and higher probability, the location of the out-of-range mobile sink is shared with the neighbor boundary nodes, and the data packet is relayed via one of multiple boundary nodes using multipath routing, as shown in Figure 5.

**Figure 5.** Mobility support for sinks out of the region.
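The slack in Equation (10) can be evaluated directly; a small sketch (function name assumed):

```python
import math

def remaining_distance(theta, R):
    """Slack between the maximum distance D + sqrt(2)*R and the actual travel
    distance f(theta) to the ring point at angle theta (Eq. (10))."""
    return math.sqrt(2) * R + R * math.cos(theta) - abs(R * math.sin(theta))
```

At the farthest point (*θ* = 3*π*/4) the slack is zero, while at the entry side (*θ* = 0) it is √2*R* + *R*, which is the budget available for relaying to an outer agent.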

#### **3. Performance Evaluation**

#### *3.1. Analysis*

We analyze the energy consumption of the flooding-based scheme and of the proposed protocol. To analyze the energy efficiency of the proposed scheme, we focus on the worst-case communication overhead of each protocol. We consider a square area *A* in which *N* sensor nodes are uniformly distributed, and a mobile sink group with *k* mobile sinks. The group moves at an average speed while receiving *d* data packets from a source. The communication overhead of flooding an area is proportional to the number of sensor nodes in the area, and that of sending a message along a path by greedy geographic forwarding is proportional to the number of sensor nodes on the path. In this analysis, the mobile sink group has radius *R*, so there are *n* = *NπR*<sup>2</sup>/*A* sensor nodes in the group region.

A sink group is assumed to update its location *m* times and to receive *d*/*m* data packets between two consecutive location updates. The radius of the expected group region is 2*R*. The overhead for group information calculation and advertisement is 5*n* + *kR* + √(2*A*), where *n* is the number of sensor nodes in the group region, *kR* is the update cost from each sink to the leader sink, and √(2*A*) is the update cost from the leader sink to the location server in the sensor field. The total communication overhead for location updates is therefore *m*(5*n* + *kR* + √(2*A*)).

To deliver a data packet, there are two communication modes: unicasting from the source to the entry point of the sink group area, and delivery within the group area. The length of the unicast path is (*D* − *R*), where *D* is the average distance between a source and the center of the group area and *R* is the radius of the area; thus, the energy consumption of unicasting is (*D* − *R*)/*r*. For delivery within the group area, the flooding-based scheme's energy consumption equals the number of sensor nodes in the area, *NπR*<sup>2</sup>/*A*, which grows quadratically with the radius *R*. Our protocol divides the delivery into the phases before and after branching. There are *b* = 2*R*/*r* branch points on the straight line of length 2*R*. After branching, each branch point relays data over at most 2*R*/*r* hops in the two opposite directions, so the branching cost is on the order of *b*<sup>2</sup>. In total, the energy consumption is presented as follows:

$$\begin{aligned} CO_{\mathrm{Flooding}} &= m\left(5(N\pi R^2/A) + kR + \sqrt{2A}\right) + d\left((D-R)/r + N\pi R^2/A\right) \\ CO_{\mathrm{Proposed}} &= m\left(5(N\pi R^2/A) + kR + \sqrt{2A}\right) + d\left((D-R)/r + \lceil 2R/r \rceil + \lceil 2R/r \rceil^2\right) \end{aligned}$$

As a result, the energy consumption of the flooding based scheme depends on the density and the radius of the area, whereas that of the proposed scheme is affected only by the radius of the area.
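The two overhead expressions can be compared numerically. The sketch below plugs in the simulation parameters from Section 3.2; the exact parenthesization of the reconstructed formulas and the value of *D* are assumptions:

```python
import math

def overhead(N, A, R, r, k, m, d, D, flooding):
    """Worst-case communication overhead: m location updates plus d data deliveries."""
    n = N * math.pi * R ** 2 / A                 # nodes in the group region
    update = m * (5 * n + k * R + math.sqrt(2 * A))
    if flooding:
        per_packet = (D - R) / r + n             # unicast to the ring, then flood
    else:
        b = math.ceil(2 * R / r)                 # branch points on the 2R line
        per_packet = (D - R) / r + b + b ** 2    # main plus branch forwarding
    return update + d * per_packet

# 1000 nodes in a 500 m x 500 m field, R = 70 m, r = 20 m, 15 sinks,
# 10 updates, 100 packets; D = 200 m is an assumed example distance
args = dict(N=1000, A=500 * 500, R=70, r=20, k=15, m=10, d=100, D=200)
assert overhead(flooding=False, **args) < overhead(flooding=True, **args)
```

With these numbers, about 62 nodes lie in the group region, so each flooded packet costs roughly 68 transmissions versus roughly 62 for the branch-based delivery, and the gap widens as the node density increases.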

#### *3.2. Simulation Environment and Results*

We implemented the proposed mechanism in the NS-3 network simulator. For applicability to industrial areas, the sensor nodes follow the WirelessHART reference [15], one of the well-known IWSN standards, which employs an IEEE 802.15.4-based radio, frequency hopping, and retry mechanisms. We compare against M-Geocasting [10] and VLDD [11], which are group mobility support protocols. The simulated network consists of 1000 sensor nodes uniformly deployed in a 500 m × 500 m square area. Fifteen mobile sinks form a group, and the radio range of each sensor node is 20 m. The source node generates 30-byte data packets at an interval of 400 ms, and the time deadline for each packet is 400 ms. The simulation time is 50 s. We measure the in-time data delivery ratio, i.e., the fraction of data packets received by the mobile sinks within the time deadline. Each result in the figures is the average of 100 simulation runs.

Figure 6 shows the in-time data delivery ratio according to the end-to-end (E2E) distance, where the E2E distance is the Euclidean distance between a source and the center of a movable area. In M-Geocasting, many packets are lost because its flooding causes a broadcast storm. As VLDD supports group mobility by passive forwarding based on the internal movement of each member sink, it yields temporally useless data packets. In addition, we conducted a simulation with SPEED applied to multiple destinations; as SPEED constructs multiple paths, one per sink, interference between the paths occurs frequently. In our scheme, more than 80% of the packets are received by the sinks of the mobile sink group via unicasting in the movable area.

**Figure 6.** Comparison of in-time data delivery ratio according to end-to-end distance.

Figure 7 presents the in-time data delivery ratio according to the range of the movable area of a mobile sink group. In M-Geocasting, as the movable area expands, more sensor nodes must participate in communication and interference occurs more frequently. When the range of the area is 70 m, approximately 61 sensor nodes participate in M-Geocasting's flooding area, whereas only about 39 nodes participate in our scheme. In VLDD, the wider the area, the lower the possibility of receiving an in-time data packet due to its passive data forwarding. These results show that our proposed scheme can cover a wider movable area through multiple unicast forwarding.

**Figure 7.** Comparison of in-time data delivery ratio according to movable area range.

Figure 8 shows how the in-time data delivery ratio is affected by the radio range of the sensor nodes, which we vary from 15 m to 30 m. Our mechanism and VLDD show almost constant performance over this range because neither exploits flooding. In M-Geocasting, as the range widens, the number of one-hop neighbor nodes increases; with more neighbor nodes, the number of branches in flooding increases, but the possibility of interference among the neighbors also rises dramatically. Therefore, its data delivery ratio degrades rapidly.

**Figure 8.** Comparison of total in-time data delivery ratio according to radio range.

#### **4. Conclusions**

Nowadays, Mobile Cyber-Physical Systems (MCPS) are widely exploited in various domains. In this environment, Mobile Sink Groups (MSG) perform collaborative work with a common goal, so data should be delivered to all mobile sinks in the group within a valid time. Data delivery schemes for MSG have been proposed; however, as the existing flooding-based schemes cannot define an end-to-end distance to each mobile sink, they struggle to satisfy the real-time requirement. To solve this problem, we proposed a scheme that models the MSG in a circular form and satisfies the real-time requirement for each member sink through data delivery over a virtual grid. First, the proposed scheme models the MSG by a center point and a radius and defines the end-to-end distance based on the member sink farthest from the source node. Through this definition, the source node can calculate the delivery speed to be maintained during the data delivery. The data delivery process is divided into two phases: the main forwarding phase, which passes from the source node through the center of the mobile sinks, and the branch forwarding phase, which starts at the branch points that received the data during main forwarding. In addition, even if some mobile sinks deviate from the initially calculated radius due to various environmental factors of MCPS, their connectivity is ensured through the inner/outer agent concept. Through this process, the proposed scheme delivers data to all member sinks in a timely manner. The performance evaluation results show that the proposed scheme is superior to the existing schemes in terms of real-time communication for MSG.

The proposed scheme achieves real-time data delivery for a single MSG; however, in an actual MCPS environment there could be two or more MSG, independent single mobile sinks, and static sinks, depending on the application. Therefore, further studies on methods such as multicasting are required to handle the numerous applications of MCPS environments.

**Author Contributions:** S.O., Y.C., S.K., C.K., K.J., and S.-H.K. contributed to protocol design and detailed algorithms. They also conducted the performance analyses. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was supported by the research grant of Pai Chai University in 2020.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **AUTOSAR Runnable Periods Optimization for DAG-Based Complex Automobile Applications**

**Daeho Choi 1,†, Tae-Wook Kim 2,† and Jong-Chan Kim 3,\***


Received: 13 July 2020; Accepted: 20 August 2020; Published: 23 August 2020

**Abstract:** When developing an automobile control application, its scheduling parameters as well as the control algorithm itself should be carefully optimized to achieve the best control performance from given computing resources. Moreover, since the wide acceptance of the AUTOSAR standard, where finer-granular scheduling entities (called runnables) rather than the traditional real-time tasks are used, the number of scheduling parameters to be optimized is far greater than in traditional task-based control systems. Hence, due to the vast problem space, it is not feasible to reuse existing time-consuming search-based optimization methods. With this motivation, this paper presents an analytical codesign method for deciding runnable periods that minimize given control cost functions. Our solution approach, based on the Lagrange multiplier method, can find optimized runnable periods in polynomial time due to its analytical nature. Moreover, our evaluation results for synthesized applications with varying complexities show that our method performs significantly better (12% to 59% control cost reductions) than a state-of-the-art evolutionary algorithm. To the best of our knowledge, this study is one of the first attempts to find runnable periods that maximize a given system's control performance.

**Keywords:** AUTOSAR; DAG; runnable scheduling; control-scheduling codesign; Lagrange multiplier

#### **1. Introduction**

AUTOSAR is the de facto standard software architecture for automobile control systems, covering a wide range of applications such as engine management, motor-driven power steering, and advanced driver assistance systems [1–3]. In the AUTOSAR standard, a control system is designed as a set of *software components*, which are the units of software packaging and deployment. Usually, multiple software components are connected and communicate through the AUTOSAR runtime environment (RTE). Each software component is also composed of a set of *runnables*, which are the smallest unit functions for software development and scheduling. Runnables communicate with each other within each software component and across different software components, using asynchronous message passing interfaces provided by the RTE. As a result, a system can be modeled as a directed acyclic graph (DAG) of runnables where data flow from sensors to actuators through the runnables in the DAG.
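To make the model concrete, a DAG of runnables like the one described above might be represented as in the minimal sketch below. This is illustrative Python, not AUTOSAR code; the `Runnable` class and the example periods and execution times are assumptions made for illustration:

```python
from dataclasses import dataclass

# Illustrative sketch of a DAG of runnables (not AUTOSAR code).
@dataclass
class Runnable:
    name: str
    period_ms: float   # p_i: activation period (a controllable parameter)
    wcet_ms: float     # e_i: worst-case execution time (a given property)

# Directed edges (r_j, r_k): r_k has a data dependency on r_j.
runnables = {
    "r1": Runnable("r1", 10.0, 1.0),   # sensor runnable (source)
    "r2": Runnable("r2", 20.0, 2.0),
    "r3": Runnable("r3", 10.0, 1.5),   # actuator runnable (sink)
}
edges = [("r1", "r2"), ("r2", "r3")]

# As described above, runnables with equal periods are grouped into
# one periodic task for scheduling on the RTOS.
tasks = {}
for r in runnables.values():
    tasks.setdefault(r.period_ms, []).append(r.name)
```

Here the two 10 ms runnables `r1` and `r3` would be consolidated into a single periodic task, while `r2` forms its own 20 ms task.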

For runnable executions, each runnable is associated with an event source, which is usually a periodic timer, and runnables with the same periods are grouped into periodic tasks for scheduling on the AUTOSAR real-time operating system (RTOS). Runnable periods should be carefully optimized since they are control knobs for balancing the trade-off between a system's load and control performance. For example, imagine a system with extremely short runnable periods. The system will then be much too heavily loaded, hence not schedulable since the runnables should execute with extremely high frequencies. On the other hand, the short runnable periods, if realized, can produce a high control performance due to fast data flows and high control frequencies. At the opposite extreme, i.e., a system with extremely long runnable periods, it will be lightly loaded and hence easily schedulable; however, its control performance will be significantly degraded due to slow data flows and low control frequencies. In that sense, we need a method to find optimal runnable periods between those two extreme cases.

However, in the automotive industry, runnable periods are usually decided in an ad-hoc manner through time-consuming trial and error [4], making it difficult to extract the optimal control performance out of given hardware resources. To cope with this problem, this paper formulates a runnable periods optimization problem for maximizing the control performance of a given system. Our previous work's initial approach was to use a simple combinatorial search method to find the real optimal runnable periods [5]. However, our preliminary experiment revealed that even for a small system with a dozen runnables, the optimization process cannot find solutions in polynomial time and thus takes too long, making it impractical for complex industry applications.

To deal with this scalability problem, our approach is to develop an analytical method that can find near-optimal solutions without time-consuming searches. For that, the first step is to pick an appropriate control performance model as the optimization objective. Among various models in the literature, we chose the linear control cost model from Bini and Cervin [6], which represents a control system's performance as an approximate linear cost function of its control period and delay. This model has been used as a standard tool by many control-scheduling codesign studies [7–10]. The second step is to define the optimization constraint, that is, the schedulability constraint in our problem. Since the AUTOSAR standard assumes a priority-driven scheduling algorithm, we use the Liu and Layland (L&L) utilization bound method [11], which can be used for both the rate monotonic (RM) and the earliest deadline first (EDF) scheduling algorithms. For ease of explanation, the EDF scheduling algorithm is assumed throughout most of this paper; our method is later extended to the RM scheduling algorithm.

Based on the control cost function and the schedulability constraint, our specific problem is to find the runnable periods that minimize the control cost while guaranteeing the schedulability constraint. Since the control cost function is a function of control period and delay, it should be transformed into a function of runnable periods. For that, we carefully investigate how runnable periods affect the temporal behavior, i.e., control period and delay, of a control system, and develop a generalized method for the transformation.

After the transformation, the Lagrange multiplier method is used to find the optimal runnable periods. Note that the Lagrange multiplier method is a well-known technique for constrained optimization problems. As it provides an analytical method without any problem space search, our method can find the optimal runnable periods regardless of the size and the complexity of a target system. For ease of explanation, the detailed optimization process is presented in three steps, beginning with the simplest application model and ending with the general DAG model. Although our solution cannot find the real optimal solutions due to a heuristic applied during the optimization, our evaluation results for small systems show that the performance loss is marginal compared with the real optimal solutions. Moreover, even for large systems, our method performs better than a state-of-the-art optimization method.

This paper's contributions can be summarized as follows:


The rest of this paper is organized as follows: The next section reviews related work. In Section 3, the background is given and the problem is described. Sections 4 and 5 introduce our preliminary works for limited application models. Then, Section 6 describes our analytical solution for the general DAG model. Section 7 evaluates our method. Finally, Section 8 concludes this paper.

#### **2. Related Work**

*Periods selection problem.* Control-scheduling codesign methods have been developed in the literature to improve a control system's performance through optimizing its scheduling parameters. In this regard, Seto et al.'s seminal work [12] first presented a periods selection problem assuming that the control performance can be expressed as an exponential decay function of sampling periods and that the tasks are scheduled under a dynamic-priority scheduling algorithm. The periods selection problem was extended to fixed-priority systems by finding the finite set of feasible period ranges using a branch and bound-based integer programming method [13]. Later, Bini and Di Natale [14] proposed a faster algorithm that finds a sub-optimal periods assignment, which can be used for task sets of practical size that are not solvable by previous methods due to high computing demands. Du et al. [15] presented an analytical solution using the Lagrange multiplier method and an online algorithm for overloaded situations. Fu et al. [9] developed a heuristic algorithm for multicore processors.

*Delay-aware approaches.* A common assumption of the above studies regarding the periods selection problem is that the control performance is only affected by sampling rates, i.e., task periods, of a control system. However, delays between sensing and actuation also have significant effects on control performance. With this motivation, Bini and Cervin [6] incorporated each task's delay into their optimization cost function. In their work, to find the optimal periods assignment, cost functions are approximated as linear functions of control period and delay, and the delay is also approximated assuming the fluid model scheduler. Through the approximations, they proposed an analytical solution. Xu et al. [16] extended this approach for systems with harmonic periods.

*Periods and deadlines selection problem.* Wu et al. [8] formulated an optimization problem for selecting both task periods and deadlines simultaneously for EDF-scheduled systems. They showed that we can upper bound the amount of delays and jitters each task can experience by regulating relative deadlines of tasks. The cost function is assumed to be a nonlinear function of period and deadline of each task. Based on that, a two-step approach was proposed, which first fixes periods and later tries to minimize deadlines exploiting unused resources. Tan et al. [10] proposed an algorithm that simultaneously adjusts periods and deadlines assuming EDF-scheduled linear–quadratic–Gaussian (LQG) controllers. They showed that their algorithm is more robust with various workloads than the previous method. Cha et al. [17] proposed a heuristic algorithm for the periods and deadlines selection problem with arbitrary nonlinear control cost functions for systems scheduled by the RM scheduling algorithm.

*Cause-effect chain analysis.* The above studies commonly assume independent real-time tasks, where there is no data dependency among tasks and each task is responsible for its dedicated control target plant. To deal with practical automobile control applications composed of tasks with complex dependencies, DAG-based control applications had been studied in the context of cause-effect chain analyses of real-time tasks [18–21]. Even though they are using tasks instead of runnables, their system model is similar to ours. However, they address the opposite direction of our optimization problem, which is to analyze end-to-end delays for a DAG of tasks with given periods. Besides, [22] analyzed end-to-end delays of an engine management system, which is given as a DAG of runnables.

*AUTOSAR system optimization.* In automobile control systems based on the AUTOSAR standard, each control application is designed as a DAG of fine-granular runnables with more complex data dependencies compared with traditional real-time task-based systems. In this context, Long et al. [23] developed a runnable placement and scheduling method considering the inter-runnable communication overhead in an electronic control unit (ECU). Monot et al. [24] proposed an algorithm for sequencing and scheduling runnables for multicore ECUs. Saidi et al. [25] studied the runnable-to-core mapping problem using the integer linear programming (ILP) technique. Kehr et al. [26] developed a method for migrating a legacy AUTOSAR application to a multicore processor while minimizing energy consumption.

*AUTOSAR runnable scheduling.* However, the above studies about AUTOSAR applications commonly assume that runnable periods are given a priori, which does not hold in industry practice. With this motivation, the runnable periods optimization problem was first formulated in our previous research paper by Kim et al. [5], which proposed a combinatorial search method that is useful only for small systems due to its high computing demands. Choi et al. [27] partly solved the scalability problem using an analytical method, but only for limited application structures. These two papers are precursors to this paper; to make this paper self-contained, they are explained in greater depth and detail in Sections 4 and 5. This paper then further extends our previous works by presenting a more general solution that applies to arbitrarily-shaped complex DAG-based AUTOSAR applications.

#### **3. Background and Problem Description**

#### *3.1. System Model*

This paper assumes an automobile control application based on the AUTOSAR standard. Figure 1 shows an example system where the application is composed of *N* software components

$$\{C\_1, C\_2, \cdots, C\_N\}.\tag{1}$$

Each software component *Ci* is also composed of |*Ci*| runnables where |*Ci*| denotes the number of runnables in *Ci*. Note that a runnable is the smallest unit function in the AUTOSAR standard. As shown in the figure, runnables, denoted by *ri*s, are connected with directed edges representing data dependencies among them. Thus, the whole system can be thought of as a DAG of runnables without explicitly specifying which software component each runnable belongs to. A DAG *G* is formally defined as

$$G = (V, E \subset V \times V) \tag{2}$$

where the set of vertices *V* is a set of *n* nodes or runnables {*r*1, *r*2, ··· , *rn*} with $n = \sum\_{i=1}^{N} |C\_i|$, and *E* represents a set of directed edges or links among them. There exists a directed edge (*rj*, *rk*) ∈ *E* if and only if the runnable *rk* has a data dependency on the runnable *rj*. Then, each *i*-th runnable *ri* is defined by a tuple

$$r\_i = (p\_i, e\_i) \tag{3}$$

where *pi* is its period and *ei* is the worst-case execution time. Among the runnables, we assume that *r*1, the *sensor runnable*, plays a special role of collecting data from sensors, and *rn*, the *actuator runnable*, is responsible for controlling actuators. Thus, *G* has only one source node *r*<sup>1</sup> and one sink node *rn*. Our system model assumes that, in an ECU, there is only one CPU running a single control application described by *G*, where the ECU handles only a single control target plant, which is common in the automotive industry's federated architecture [28,29]. Note that *ei*s are given properties of the system, whereas *pi*s are controllable parameters. Thus, the runnable periods

$$(p\_1, p\_2, \dots, p\_n) \tag{4}$$

should be decided before integrating the runnables on the AUTOSAR platform. Once *pi*s are decided, runnables with the same *pi*s are grouped and consolidated into RTOS tasks, which are scheduled following the scheduling strategy of the RTOS.

For communications between runnables, an asynchronous sender-receiver communication is used [30]. In this communication method, a sender runnable periodically writes its output into a shared memory buffer with its own period; a receiver runnable then asynchronously reads the data in the memory buffer with its own period. When multiple writes occur to the same memory location without any intervening read by the receiver, the most recent data always overwrite the older data in the buffer. For further discussions, we formally introduce the following definitions.
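The last-write-wins buffer semantics just described can be sketched as follows; the class and method names are illustrative assumptions, not actual AUTOSAR RTE APIs:

```python
# Minimal sketch of asynchronous sender-receiver communication with
# overwrite semantics (illustrative only; not the AUTOSAR RTE API).
class SenderReceiverBuffer:
    def __init__(self, initial=None):
        self._slot = initial   # single shared memory slot

    def write(self, value):
        # The sender overwrites unconditionally: if the receiver has not
        # read yet, the older value is silently lost.
        self._slot = value

    def read(self):
        # The receiver asynchronously samples the most recent value.
        return self._slot

buf = SenderReceiverBuffer(initial=0)
buf.write(1)
buf.write(2)            # overwrites 1 before any read occurs
latest = buf.read()     # the receiver only ever sees the newest data
```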

**Figure 1.** Our directed acyclic graph (DAG)-based system model with *N* software components and *n* runnables where each runnable *ri* is annotated with its period *pi* and worst-case execution time *ei*.

**Definition 1. (Paths)** *For a DAG G* = (*V*, *E*)*, there is a finite number of directed paths from the source node r*1 *to the sink node rn. We assume that there are m paths in G, which is denoted by*

$$\mathbb{P}(G) = \{P\_1, P\_2, \dots, P\_m\}.\tag{5}$$

*Then, a path is formally defined as an ordered set of runnables beginning with r*1 *and ending with rn in which all runnables are distinct and every pair of adjacent runnables is joined by a directed edge in E. From now on, for notational convenience, when we refer to a path P* ∈ P(*G*)*, it can denote the ordered set of runnable indexes* (1 ≤ *i* ≤ *n*) *as well as the runnables themselves, depending on the context.*

**Definition 2. (Length)** *For a path P* ∈ P(*G*)*, its length is defined as*

$$\sum\_{i \in P} p\_i, \tag{6}$$

*which is the sum of runnable periods following a specific path P. When data flow through several paths in parallel, the speed of a data flow is collectively determined by the runnable periods in each path through which the data are flowing, considering our inter-runnable communication method.*

**Definition 3. (Weight)** *For a path P* ∈ P(*G*)*, its weight is defined as*

$$\sum\_{i \in P} e\_i, \tag{7}$$

*which is the sum of runnable execution times following a specific path P. Thus, a path's weight is a representative metric for the amount of computing resource demand of the path.*

**Definition 4. (Critical Path)** *Given a DAG G with its paths* P(*G*)*, its critical path is defined by the path found in*

$$\underset{P \in \mathbb{P}(G)}{\text{argmax}} \sum\_{i \in P} p\_i, \tag{8}$$

*which is the path with the longest length. Without loss of generality, we assume that there is only one critical path in each DAG.*

**Definition 5. (Heaviest Path)** *Given a DAG G with its paths* P(*G*)*, its heaviest path is defined by the path found in*

$$\underset{P \in \mathbb{P}(G)}{\text{argmax}} \sum\_{i \in P} e\_i, \tag{9}$$

*which is the path maximizing the sum of the ei values of its runnables. The intuition behind the heaviest path is that it is the path that consumes the maximum amount of CPU time for a single end-to-end execution. Without loss of generality, we assume that there is only one heaviest path in each DAG.*
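Definitions 1–5 can be made concrete with a short sketch that enumerates all source-to-sink paths of a small DAG and selects its critical and heaviest paths; the example graph, periods, and execution times below are illustrative assumptions:

```python
# Illustrative sketch of Definitions 1-5: enumerate the paths P(G) from the
# source r1 to the sink, then pick the critical path (maximum sum of periods)
# and the heaviest path (maximum sum of execution times).
def all_paths(succ, src, dst):
    """Enumerate all simple directed paths from src to dst."""
    stack = [(src, [src])]
    while stack:
        node, path = stack.pop()
        if node == dst:
            yield path
        else:
            for nxt in succ.get(node, []):
                if nxt not in path:
                    stack.append((nxt, path + [nxt]))

succ   = {"r1": ["r2", "r3"], "r2": ["r4"], "r3": ["r4"]}  # edges of G
period = {"r1": 5, "r2": 20, "r3": 10, "r4": 5}            # p_i
wcet   = {"r1": 1, "r2": 1,  "r3": 4,  "r4": 1}            # e_i

paths    = list(all_paths(succ, "r1", "r4"))
critical = max(paths, key=lambda P: sum(period[r] for r in P))  # longest
heaviest = max(paths, key=lambda P: sum(wcet[r] for r in P))    # most CPU
```

In this example the path through `r2` is the critical path (length 30 versus 20), while the path through `r3` is the heaviest (weight 6 versus 3), showing that the two notions need not coincide.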

#### *3.2. Control Performance Model*

A control system's performance can be defined in many different aspects. For example, its robustness to external disturbances, control stability, and control error can be such performance metrics. In general, a control system's performance is affected by its timing behavior as well as the control algorithm itself [31]. In this paper, we assume that the control algorithm is given as a fixed system property. Thus, our control performance model is about how the system's temporal properties affect the control performance. More specifically, we consider two distinct temporal properties: control period and end-to-end delay of the target control system.

In the AUTOSAR timing extensions, two latency constraints are defined: (i) data age timing constraint and (ii) reaction time constraint [20,32,33]. The specification states that when an actuator command is periodically produced, its source input (sensing) value's age should be maintained within a specified timing constraint. The reaction time constraint is also considered when an external event, such as pressing a button, should be reacted within a specified timing constraint. In this paper, since we are considering periodic workloads, the data age timing constraint is considered.

Based on the timing model, there are several ways to build a control performance model. One is to measure the resulting performance of the system by artificially controlling the temporal parameters. Simulation tools [34,35] can also be used to predict the control performance when we cannot directly measure the system under investigation. To provide a more general model, Bini and Cervin [6] introduced a linear control cost function as

$$J(T, \Delta) = \alpha T + \beta \Delta \tag{10}$$

where *T* is the control period, and Δ is the end-to-end delay from the sensors to the actuators. Note that *α* and *β* are constants that define the characteristics of the control target plant. Figure 2 shows an example control cost function. The intuition behind it is that if we give control commands to the actuator more often (frequently), its control cost gets smaller, and in the same manner, if we decide the control command with more fresh (recent) sensor data with shorter delays, the cost gets smaller, again. In general, the cost function *J* can be a nonlinear function of *T* and Δ, however, it can be approximated as a linear function as in [6–10,16]. In this paper, we use this linear approximate control cost function as the optimization objective.

**Figure 2.** Control cost function, which is a linear function of the system's control period and delay [5].

#### *3.3. Schedulability Constraint*

The runnables {*r*1,*r*2, ··· ,*rn*} are implemented as RTOS tasks, where runnables with the same period are grouped together and sequentially executed inside a task body when the task is scheduled on a CPU. As most RTOSes only support implicit deadline tasks, their relative deadlines (= periods) should be guaranteed to satisfy runnable-level periodic timing requirements, i.e., *pi*s. For that, in this paper, we use the L&L utilization bound method, which guarantees the schedulability of a given system if the system utilization is less than or equal to a specific threshold value (i.e., utilization bound) for each scheduling algorithm. For example, the RM scheduling algorithm's utilization bound is roughly 69.3%, and the EDF scheduling algorithm's utilization bound is 100% [11]. We chose to use EDF for its simplicity, where its schedulability condition can be formally expressed as follows:

$$\mathcal{U}(p\_1, p\_2, \dots, p\_n) = \sum\_{i=1}^n \frac{e\_i}{p\_i} \le 1. \tag{11}$$

Although we mainly use the EDF scheduling algorithm throughout this paper, since most scheduling algorithms support the utilization bound method for the schedulability test, we can easily apply our optimization method to other scheduling algorithms like the RM scheduling algorithm. Section 6.5 will deal with this issue in more detail.
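The utilization-bound schedulability tests described above can be sketched as follows; the example periods and execution times are illustrative assumptions:

```python
# Sketch of the L&L utilization-bound test from Equation (11): EDF is
# schedulable if U <= 1; under RM, U <= n*(2^(1/n) - 1) is a sufficient
# condition (roughly 69.3% for large n).
def utilization(periods, wcets):
    return sum(e / p for p, e in zip(periods, wcets))

def edf_schedulable(periods, wcets):
    return utilization(periods, wcets) <= 1.0

def rm_schedulable(periods, wcets):
    n = len(periods)
    return utilization(periods, wcets) <= n * (2 ** (1.0 / n) - 1)

p = [10.0, 20.0, 40.0]   # task periods
e = [2.0, 4.0, 8.0]      # worst-case execution times: U = 0.6
```

With these values the task set passes both tests, since U = 0.6 is below the RM bound for three tasks (about 0.78) and below the EDF bound of 1.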

#### *3.4. Problem Description*

With the system and control performance models and the schedulability constraint presented above, our problem can be defined as follows: With a given AUTOSAR control system composed of DAG-structured runnables {*r*1,*r*2, ··· ,*rn*} and a linear control cost function *J*(*T*, Δ) regarding the control target plant, find the optimal runnable periods (*p*1, *p*2, ··· , *pn*) that minimize the control cost while satisfying the system's schedulability constraint. More formally, our problem is as follows:

$$\begin{array}{ll}\underset{p\_1, p\_2, \dots, p\_n}{\text{minimize}} & J(T, \Delta) \\\\ \text{subject to} & \mathcal{U}(p\_1, p\_2, \dots, p\_n) \le 1. \end{array} \tag{12}$$

In this paper, we try to find the theoretically optimal real-numbered runnable periods, without explicitly considering either the grouping of runnables into predefined periodic tasks or the scheduling granularity (e.g., integer constraints) of a specific RTOS. However, our solution can be used as a baseline foundation for further practical applications after considering the implementation details imposed by a specific RTOS.

#### **4. Analytical Solution for Linear Path Graphs**

#### *4.1. LPG Model*

Instead of directly going for a general solution, let us begin by solving our optimization problem for a subset of the DAG model, and later extend the solution step by step towards a generalized one. This section specifically deals with the *linear path graph* (LPG) model, which is for graphs with runnables {*r*1, *r*2, ··· , *rn*} such that the edges are given by *E* = {(*ri*,*ri*+1)|1 ≤ *i* ≤ *n* − 1}. Figure 3 shows an example LPG with *n* runnables and *n* − 1 edges between them.


**Figure 3.** Linear path graph (LPG) model.

#### *4.2. Transformation of Control Cost Function*

When solving the optimization problem in the LPG model's scope, the first step is to redefine the control cost function as a function of the free variables of the optimization problem, i.e., runnable periods (*p*1, *p*2, ··· , *pn*). For the transformation of *J*(*T*, Δ) in Equation (10) into a function of runnable periods, our strategy is to define both *T* and Δ using only runnable periods considering the LPG model's runnable execution and data flow patterns.

Control period *T* can be formally defined, from a plant's perspective, as a regular time interval between consecutive actuation instances. However, due to the jitter caused by preemption delays among concurrent runnables, the intervals may vary for each actuation instance. Thus, we consider the longest time interval as the worst-case period *T*. For LPG-based applications, *T* can be defined as double the actuator runnable *rn*'s period *pn*, which is

$$T = 2p\_n.\tag{13}$$

The worst-case scenario happens when a certain instance of *rn* is scheduled at the beginning of its period, whereas, in the next instance, *rn* is scheduled at the very end of its period. Assuming that actuation commands are emitted at each completion of *rn* instances, the time interval between the actuation commands becomes as long as possible in that particular scenario, which is 2*pn*. It can be argued that *en* should be considered, and the exact worst-case time interval should be 2*pn* − *en*. However, note that since *en* is the worst-case execution time, real execution times can be much smaller than *en*; thus, for simplicity's sake, we do not take *en* into consideration when defining *T*.

The end-to-end delay Δ is defined as the time taken for new sensor data to go through runnables until arriving at the actuator. According to the data flow architecture of our system model, the sensor runnable *r*<sup>1</sup> sends out its output to its neighboring runnables with its own period *p*1. Then, the neighboring runnables also send out their outputs with their own periods. With these repeated transmissions, new sensor data originating from the source node *r*<sup>1</sup> gradually propagate through the runnables toward the sink node, i.e., the actuator runnable *rn*. After *rn* finally receives the updates, it can decide its actuation commands based on the new sensor data. In an LPG-based application where data flow through only a single path from *r*<sup>1</sup> to *rn*, the worst-case end-to-end delay Δ can be calculated as

$$
\Delta = 2p\_1 + 2p\_2 + \dots + 2p\_n = 2\sum\_{i=1}^n p\_i. \tag{14}
$$

The worst case happens as follows: a runnable *ri*−1 emits its output for *ri* at a certain time *t*. Unfortunately, however, *ri* begins just before *t*, reading the previous (old) output of *ri*−1. Let us assume that *ri* is scheduled at the very beginning of its period at that time. Then, unfortunately again, the next instance of *ri* is scheduled at the very end of its period, reading the new data and emitting its output at the end of the period (i.e., *t* + 2*pi*). In this scenario, the time taken for the data to go through *ri* is double *ri*'s period, 2*pi*. Assuming this scenario happens for every runnable in the path, the end-to-end delay Δ becomes double the sum of all the runnable periods, as in Equation (14). By combining Equations (13) and (14), the control cost function in Equation (10) is transformed as follows:

$$J(p\_1, p\_2, \dots, p\_n) = 2\alpha p\_n + 2\beta \sum\_{i=1}^n p\_i. \tag{15}$$
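The transformed cost above is straightforward to evaluate; the sketch below (with illustrative plant constants, not values from the paper) shows how shortening the runnable periods lowers the cost, which is exactly the tendency that the schedulability constraint bounds from below:

```python
# Sketch of the transformed LPG control cost as a function of runnable
# periods: J = 2*alpha*p_n + 2*beta*sum(p_i).
# alpha and beta are illustrative plant constants, not values from the paper.
def lpg_cost(periods, alpha, beta):
    return 2 * alpha * periods[-1] + 2 * beta * sum(periods)

cost_fast = lpg_cost([5.0, 5.0, 5.0], alpha=1.0, beta=0.5)     # 10 + 15 = 25
cost_slow = lpg_cost([10.0, 10.0, 10.0], alpha=1.0, beta=0.5)  # 20 + 30 = 50
```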

#### *4.3. Finding the Optimal Runnable Periods*

For visual understanding of the optimization process, let us pick an example system with only two runnables {*r*1, *r*2}. Then, the transformation of the control cost function is illustrated in Figure 4. On the left-hand side, the original control cost function is depicted, which is transformed into a function of *p*1 and *p*2, as on the right-hand side, by Equation (15). Then, Figure 5 illustrates the optimization process, where the schedulability constraint and the transformed control cost function are shown upon the two-dimensional problem space of *p*1 and *p*2. In the figure, our optimization objective is to find the lowest point in the control cost plane that is inside the green schedulable area. This concept can be generally extended to *n*-runnable systems in *n*-dimensional problem spaces. In general, our original optimization problem in Equation (12) can be transformed into the following using the transformed control cost function:

$$\begin{aligned} \text{minimize} & \quad J(p\_1, p\_2, \dots, p\_n) = 2\alpha p\_n + 2\beta \sum\_{i=1}^n p\_i\\ \text{subject to} & \quad \mathcal{U}(p\_1, p\_2, \dots, p\_n) = \sum\_{i=1}^n \frac{e\_i}{p\_i} \le 1. \end{aligned} \tag{16}$$

**Figure 4.** Visually illustrated transformation of control cost function [5].

**Figure 5.** Visually illustrated constrained optimization process [5].

To analytically solve the transformed optimization problem, the Lagrange multiplier method is applied. For the first step, a Lagrange function is formulated as follows:

$$\mathcal{L} = 2\alpha p\_n + 2\beta \sum\_{i=1}^n p\_i - \lambda \left( \sum\_{i=1}^n \frac{e\_i}{p\_i} - 1 \right). \tag{17}$$

Then, we take the partial derivatives of $\mathcal{L}$ with respect to *p*1, *p*2, ··· , *pn*, and *λ*, respectively, and set them to zero as follows:

$$
\nabla \mathcal{L} = \left( \frac{\partial \mathcal{L}}{\partial p\_1}, \frac{\partial \mathcal{L}}{\partial p\_2}, \dots, \frac{\partial \mathcal{L}}{\partial p\_n}, \frac{\partial \mathcal{L}}{\partial \lambda} \right) = 0,\tag{18}
$$

which in turn is expanded to the followings:

$$\begin{aligned} \frac{\partial \mathcal{L}}{\partial p\_1} &= 2\beta + \frac{e\_1}{p\_1^2}\lambda = 0, \\ \frac{\partial \mathcal{L}}{\partial p\_2} &= 2\beta + \frac{e\_2}{p\_2^2}\lambda = 0, \\ & \vdots \\ \frac{\partial \mathcal{L}}{\partial p\_n} &= 2(\alpha + \beta) + \frac{e\_n}{p\_n^2}\lambda = 0, \\ \frac{\partial \mathcal{L}}{\partial \lambda} &= -\left(\sum\_{i=1}^n \frac{e\_i}{p\_i} - 1\right) = 0. \end{aligned} \tag{19}$$

Then, by isolating *λ* in the first line of Equation (19), we have

$$
\lambda = -\frac{2\beta p\_1^2}{e\_1},
\tag{20}
$$

which can be substituted into the remaining lines of Equation (19) except the last one. As a result, *p*2, *p*3, ··· , and *pn* are given in terms of *p*1, as in the second through last lines of the following:

$$\begin{aligned} p\_1 &= \sum\_{i=1}^{n-1} \sqrt{e\_1 e\_i} + \sqrt{\frac{(\alpha + \beta)e\_1 e\_n}{\beta}}, \\ p\_2 &= p\_1 \sqrt{\frac{e\_2}{e\_1}}, \\ & \vdots \\ p\_{n-1} &= p\_1 \sqrt{\frac{e\_{n-1}}{e\_1}}, \\ p\_n &= p\_1 \sqrt{\frac{\beta e\_n}{(\alpha + \beta)e\_1}}. \end{aligned} \tag{21}$$

Additionally, *p*1 itself is obtained, as in the first line of the above, by substituting the expressions for *p*2, *p*3, ··· , and *pn* into the last line of Equation (19). Then, by Equation (21), we can find the real optimal runnable periods for arbitrary LPG-based applications.
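The closed-form solution of Equation (21) can be evaluated directly, as in the sketch below (the WCETs and plant constants are illustrative, not values from the paper); note that at the optimum the schedulability constraint of Equation (11) becomes tight:

```python
import math

# Sketch of Equation (21): closed-form optimal runnable periods for an LPG,
# given WCETs e_1..e_n and illustrative plant constants alpha and beta.
def lpg_optimal_periods(e, alpha, beta):
    n = len(e)
    p1 = sum(math.sqrt(e[0] * e[i]) for i in range(n - 1)) \
         + math.sqrt((alpha + beta) * e[0] * e[n - 1] / beta)
    periods = [p1]
    for i in range(1, n - 1):
        periods.append(p1 * math.sqrt(e[i] / e[0]))   # p_i for 2 <= i <= n-1
    periods.append(p1 * math.sqrt(beta * e[n - 1] / ((alpha + beta) * e[0])))
    return periods

e = [1.0, 4.0, 1.0]
p = lpg_optimal_periods(e, alpha=1.0, beta=1.0)
util = sum(ei / pi for ei, pi in zip(e, p))   # tight at the optimum: U = 1
```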

#### **5. Analytical Solution for Linear Multipath Graphs**

Based on the method for the LPG model explained in Section 4, this section goes one step further to a more complex application model having multiple independent data flows from sensors to actuators.

#### *5.1. LMG Model*

When designing automobile control applications, there are cases where a simpler data flow model is preferred instead of using the complex DAG model. The most common such case is when there are several independent parallel data flows from sensors to actuators. In Figure 6, *r*<sup>1</sup> is the sensor runnable and *rn* is the actuator runnable. Between them, there are *m* paths where each runnable in the middle part {*r*2,*r*3, ··· ,*rn*−1} belongs to only one specific path among them. To distinguish such a particular application architecture from general DAGs, we specifically call them the *linear multipath graph* (LMG) model. Although the LMG model can be applied to a limited range of applications, it is meaningful since there is an increasing need for integrating independent control algorithms to develop integrated control systems or multi-functional ECUs [36–38]. In such new systems, sensor data propagate through multiple independent paths of runnables to the actuators.

**Figure 6.** Linear multipath graph (LMG) model with *m* independent paths identified by edges with different colors [27].

#### *5.2. Transformation of Control Cost Function*

For the optimization, the objective function *J*(*T*, Δ) in Equation (10) should be transformed into a function of runnable periods (*p*1, *p*2, ··· , *pn*). For that, in the same way as in Section 4.2, the period *T* is transformed as follows:

$$T = 2p\_n. \tag{22}$$

For the end-to-end delay Δ, however, we cannot simply reuse the method in Section 4.2 since we have multiple paths with possibly different lengths. Thus, Δ should be defined as the length of the longest path among them to represent the worst-case end-to-end delay. More specifically, recall that *Pi* denotes the *i*-th path of the *m* independent paths {*P*1, *P*2, ··· , *Pm*}. Then, Δ is defined as follows:

$$\Delta = \max\_{1 \le i \le m} \left( \sum\_{j \in P\_i} 2p\_j \right). \tag{23}$$

Thus, our original optimization problem is transformed as follows:

$$\begin{aligned} \text{minimize} \quad & J(p\_1, p\_2, \dots, p\_n) = 2\alpha p\_n + \beta \max\_{1 \le i \le m} \left( \sum\_{j \in P\_i} 2p\_j \right) \\ \text{subject to} \quad & \mathcal{U}(p\_1, p\_2, \dots, p\_n) = \sum\_{i=1}^n \frac{e\_i}{p\_i} \le 1, \end{aligned} \tag{24}$$

where, unlike the LPG model, the max operator introduces nonlinearity, making it difficult to develop an analytical solution. Fortunately, however, due to the LMG model's workload characteristics, we can simplify the problem by defining an equilibrium state, which is then exploited to find the optimal runnable periods in the LMG model. The equilibrium state of an LMG can be defined as follows:

**Definition 6. (Equilibrium state)** *For a set of m paths of a given LMG G, which is denoted by* P(*G*) = {*P*1, *P*2, ··· , *Pm*}*, G is in its equilibrium state if and only if*

$$\sum\_{i \in P\_1} p\_i = \sum\_{i \in P\_2} p\_i = \dots = \sum\_{i \in P\_m} p\_i. \tag{25}$$

**Theorem 1. (Equilibrium state theorem)** *If G is an LMG with its optimal runnable periods, G is always in its equilibrium state.*

**Proof of Theorem 1.** Suppose *G* has its optimal runnable periods while not being in the equilibrium state. Then we can increase the runnable periods that do not belong to the critical path without affecting the end-to-end delay. The increased runnable periods lower the system utilization, and the freed utilization can be used to further decrease the end-to-end delay by shortening the runnable periods in the critical path. Thus, *G* does not have the optimal runnable periods, which is a contradiction.
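The proof's argument can also be checked numerically. The following Python sketch (all numbers are hypothetical) takes a two-path middle part in a non-equilibrium state and equalizes the two path lengths while spending exactly the same utilization budget; the delay-determining middle part becomes strictly shorter.

```python
# Two middle runnables on parallel paths (e2 = e3 = 1), non-equilibrium
# periods p2 = 8, p3 = 4 (the sensor/actuator periods are held fixed).
e2 = e3 = 1.0
p2, p3 = 8.0, 4.0
u_mid = e2 / p2 + e3 / p3          # middle-part utilization budget
delay_mid = max(p2, p3)            # middle part of the end-to-end delay

# Equalize the two paths while spending exactly the same utilization budget.
p_eq = (e2 + e3) / u_mid           # p2' = p3' = p_eq
delay_eq = p_eq                    # both paths now have equal length

assert abs(e2 / p_eq + e3 / p_eq - u_mid) < 1e-12  # same utilization
assert delay_eq < delay_mid                        # strictly smaller delay
```

Since the non-equilibrium point can always be improved in this way, an optimal point must be in the equilibrium state.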

By the equilibrium state theorem, we can narrow down the problem space without sacrificing the optimality by excluding non-equilibrium states from the problem space. To express the equilibrium state more efficiently, Δ is re-expressed by breaking it into three parts as

$$\Delta = 2p\_1 + \max\_{1 \le i \le m} \left( \sum\_{j \in \hat{P}\_i} 2p\_j \right) + 2p\_n \tag{26}$$

with a helper notation *P̂i* = *Pi* − {1, *n*}, i.e., the middle part of the *i*-th path. Then, to explicitly express the equilibrium state, we define a new notation *p*∗ as in the following:

$$p\_\* = \sum\_{i \in \hat{P}\_1} p\_i = \sum\_{i \in \hat{P}\_2} p\_i = \dots = \sum\_{i \in \hat{P}\_m} p\_i, \tag{27}$$

which is an aggregate variable representing the path length of the middle part in the equilibrium state. Then, by using *p*∗, Δ can be re-expressed from Equations (26) and (27) as follows:

$$
\Delta = 2(p\_1 + p\_\* + p\_n). \tag{28}
$$

Finally, the control cost function *J*(*T*, Δ) from Equation (10) is rewritten as a function of (*p*1, *p*∗, *pn*) by Equations (22) and (28) as in the following:

$$J(p\_1, p\_\*, p\_n) = 2\alpha p\_n + 2\beta (p\_1 + p\_\* + p\_n). \tag{29}$$

With this transformed control cost function *J*(*p*1, *p*∗, *pn*), our original problem of *n* runnable periods is reduced to a problem of three free variables (*p*1, *p*∗, *pn*). Once they are decided, *p*∗ is distributed to the runnables along each path. For that, we use a heuristic that assigns longer *pi*s to runnables with larger *ei*s; specifically, we make each *pi* strictly proportional to *ei*. Following this assignment rule, the runnable periods *pj* for each *P̂i* are decided as follows:

$$\forall i \in [1 \dots m] \;\; \forall j \in \hat{P}\_i: \;\; p\_j = \frac{e\_j}{\sum\_{k \in \hat{P}\_i} e\_k}\, p\_\* \,. \tag{30}$$
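This distribution rule is a one-liner in practice. The following Python sketch (the helper name is ours) splits a given *p*∗ over one path's middle runnables proportionally to their execution times, so each path's middle periods sum back to exactly *p*∗ and the equilibrium state is preserved.

```python
def distribute(p_star, path_exec_times):
    """Split the aggregate middle-part length p_star over one path's runnables,
    proportionally to their execution times (Equation (30))."""
    total = sum(path_exec_times)
    return [e_j / total * p_star for e_j in path_exec_times]
```

For example, a hypothetical path with execution times [4, 6, 8] and *p*∗ = 12 gets periods that sum to 12 while keeping the 4:6:8 ratio.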

#### *5.3. Transformation of Schedulability Constraint Function*

This subsection transforms the schedulability constraint function in Equation (11) into a function of (*p*1, *p*∗, *pn*). First, the original function *U*(*p*1, *p*2, ··· , *pn*) is re-expressed by breaking it into three parts, and the middle part is arranged by grouping the runnables by the paths they belong to. The new expression can simply be comprehended as the sum of *m* per-path utilization sums as in the following:

$$\begin{split} \mathcal{U}(p\_1, p\_2, \ldots, p\_n) &= \frac{e\_1}{p\_1} + \left(\frac{e\_2}{p\_2} + \cdots + \frac{e\_{n-1}}{p\_{n-1}}\right) + \frac{e\_n}{p\_n} \\ &= \frac{e\_1}{p\_1} + \sum\_{i=1}^m \left(\sum\_{j \in \hat{P}\_i} \frac{e\_j}{p\_j}\right) + \frac{e\_n}{p\_n} . \end{split} \tag{31}$$

Then, to transform each *i*-th per-path utilization sum into a function of *p*∗, Equation (30) is applied to eliminate *pj* as in the following:

$$\begin{split} \sum\_{j \in \hat{P}\_i} \frac{e\_j}{p\_j} &= \sum\_{j \in \hat{P}\_i} \frac{e\_j}{\frac{e\_j}{\sum\_{k \in \hat{P}\_i} e\_k}\, p\_\*} = \sum\_{j \in \hat{P}\_i} \frac{\sum\_{k \in \hat{P}\_i} e\_k}{p\_\*} \\ &= |\hat{P}\_i| \, \frac{\sum\_{k \in \hat{P}\_i} e\_k}{p\_\*} = \frac{\sum\_{k \in \hat{P}\_i} |\hat{P}\_i| \, e\_k}{p\_\*}, \end{split} \tag{32}$$

where |*P̂i*| denotes the number of elements in the ordered set *P̂i*. Finally, our utilization constraint is transformed as follows:

$$\mathcal{U}(p\_1, p\_\*, p\_n) = \frac{e\_1}{p\_1} + \frac{\sum\_{i=1}^m \sum\_{k \in \hat{P}\_i} |\hat{P}\_i| \, e\_k}{p\_\*} + \frac{e\_n}{p\_n} \le 1. \tag{33}$$

#### *5.4. Finding the Optimal Runnable Periods*

After the transformation of the control cost function and the schedulability constraint function, our runnable periods optimization problem for the LMG model can be formulated with the three free variables (*p*1, *p*∗, *pn*) as follows:

$$\begin{aligned} \underset{p\_1, p\_\*, p\_n}{\text{minimize}} \quad & J(p\_1, p\_\*, p\_n) = 2\alpha p\_n + 2\beta (p\_1 + p\_\* + p\_n) \\ \text{subject to} \quad & \mathcal{U}(p\_1, p\_\*, p\_n) = \frac{e\_1}{p\_1} + \frac{\sum\_{i=1}^m \sum\_{k \in \hat{P}\_i} |\hat{P}\_i| \, e\_k}{p\_\*} + \frac{e\_n}{p\_n} \le 1. \end{aligned} \tag{34}$$

To solve the optimization problem, a Lagrange function is formulated as follows:

$$\mathcal{L} = 2\alpha p\_n + 2\beta (p\_1 + p\_\* + p\_n) - \lambda \left( \frac{e\_1}{p\_1} + \frac{\sum\_{i=1}^{m} \sum\_{k \in \hat{P}\_i} |\hat{P}\_i| \, e\_k}{p\_\*} + \frac{e\_n}{p\_n} - 1 \right). \tag{35}$$

Then, we take the partial derivatives of L with respect to *p*1, *p*∗, *pn*, and *λ*, respectively and set them to zeros as follows:

$$
\nabla \mathcal{L} = \left( \frac{\partial \mathcal{L}}{\partial p\_1}, \frac{\partial \mathcal{L}}{\partial p\_\*}, \frac{\partial \mathcal{L}}{\partial p\_n}, \frac{\partial \mathcal{L}}{\partial \lambda} \right) = 0,\tag{36}
$$

which in turn is expanded to the followings:

$$\begin{aligned} \frac{\partial \mathcal{L}}{\partial p\_1} &= 2\beta + \frac{e\_1}{p\_1^2}\lambda = 0, \\ \frac{\partial \mathcal{L}}{\partial p\_\*} &= 2\beta + \frac{\sum\_{i=1}^m \sum\_{k \in \hat{P}\_i} |\hat{P}\_i| \, e\_k}{p\_\*^2} \lambda = 0, \\ \frac{\partial \mathcal{L}}{\partial p\_n} &= 2(\alpha + \beta) + \frac{e\_n}{p\_n^2}\lambda = 0, \\ \frac{\partial \mathcal{L}}{\partial \lambda} &= -\left(\frac{e\_1}{p\_1} + \frac{\sum\_{i=1}^m \sum\_{k \in \hat{P}\_i} |\hat{P}\_i| \, e\_k}{p\_\*} + \frac{e\_n}{p\_n} - 1\right) = 0. \end{aligned} \tag{37}$$

Then, the first, second, and third of Equation (37) are rearranged by isolating *λ* in each left-hand side as follows:

$$\begin{aligned} \lambda &= -2\beta \frac{p\_1^2}{e\_1}, \\ \lambda &= -2\beta \frac{p\_\*^2}{\sum\_{i=1}^m \sum\_{k \in \hat{P}\_i} |\hat{P}\_i| \, e\_k}, \\ \lambda &= -2(\alpha + \beta) \frac{p\_n^2}{e\_n}. \end{aligned} \tag{38}$$

Then, by combining the first and second of Equation (38), we have the following:

$$2\beta \frac{p\_1^2}{e\_1} = 2\beta \frac{p\_\*^2}{\sum\_{i=1}^m \sum\_{k \in \hat{P}\_i} |\hat{P}\_i| \, e\_k} \Longrightarrow \frac{1}{p\_\*} = \frac{1}{p\_1} \sqrt{\frac{e\_1}{\sum\_{i=1}^m \sum\_{k \in \hat{P}\_i} |\hat{P}\_i| \, e\_k}}.\tag{39}$$

By combining the first and third of Equation (38), we have the following:

$$2\beta \frac{p\_1^2}{e\_1} = 2(\alpha+\beta)\frac{p\_n^2}{e\_n} \Longrightarrow \frac{1}{p\_n} = \frac{1}{p\_1}\sqrt{\frac{(\alpha+\beta)e\_1}{\beta e\_n}}.\tag{40}$$

By replacing 1/*p*∗ and 1/*pn* in the last of Equation (37) with the findings in Equations (39) and (40), we have the following:

$$\frac{1}{p\_1} \left( e\_1 + \sum\_{i=1}^m \sum\_{k \in \hat{P}\_i} |\hat{P}\_i| \, e\_k \sqrt{\frac{e\_1}{\sum\_{i=1}^m \sum\_{k \in \hat{P}\_i} |\hat{P}\_i| \, e\_k}} + e\_n \sqrt{\frac{(\alpha+\beta)e\_1}{\beta e\_n}} \right) = 1. \tag{41}$$

Finally, from Equations (39)–(41), we have the following solution:

$$\begin{aligned} p\_1 &= e\_1 + \sqrt{e\_1 \sum\_{i=1}^m \sum\_{k \in \hat{P}\_i} |\hat{P}\_i| \, e\_k} + \sqrt{\frac{(\alpha + \beta) e\_1 e\_n}{\beta}}, \\ p\_\* &= p\_1 \sqrt{\frac{\sum\_{i=1}^m \sum\_{k \in \hat{P}\_i} |\hat{P}\_i| \, e\_k}{e\_1}}, \\ p\_n &= p\_1 \sqrt{\frac{\beta e\_n}{(\alpha + \beta) e\_1}}. \end{aligned} \tag{42}$$

We have one remaining step of deciding (*p*2, *p*3, ··· , *pn*−1). For that, we distribute *p*<sup>∗</sup> to runnables in each path in proportion to their *ei*s as in Equation (30). It is also worth noting that even with the equilibrium state theorem, we cannot find the real optimal solutions since we lose the optimality while distributing *p*<sup>∗</sup> with a heuristic. Nevertheless, we can find high-quality solutions close to the real optimal runnable periods. Interested readers are referred to our previous work [27].
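The whole LMG procedure, the closed form of Equation (42) followed by the proportional distribution of Equation (30), can be sketched in Python as follows (the function name, the 0-based indexing, and the sample workload are our own). At the returned solution the utilization constraint is active and every path satisfies the equilibrium state.

```python
import math

def lmg_optimal_periods(e, paths, alpha, beta):
    """Near-optimal runnable periods for an LMG: Equation (42) plus the
    distribution rule of Equation (30). `paths` lists the middle-part runnable
    indices of each path (0-based; index 0 is the sensor, n-1 the actuator)."""
    n = len(e)
    # S = sum_i sum_{k in P^_i} |P^_i| * e_k  (the aggregate middle-part weight)
    s = sum(len(ph) * sum(e[k] for k in ph) for ph in paths)
    p1 = e[0] + math.sqrt(e[0] * s) \
         + math.sqrt((alpha + beta) * e[0] * e[-1] / beta)
    p_star = p1 * math.sqrt(s / e[0])
    pn = p1 * math.sqrt(beta * e[-1] / ((alpha + beta) * e[0]))
    p = [0.0] * n
    p[0], p[-1] = p1, pn
    for ph in paths:                       # distribute p_star per path
        total = sum(e[k] for k in ph)
        for k in ph:
            p[k] = e[k] / total * p_star
    return p, p_star
```

Because the distribution rule keeps each path's middle periods summing to *p*∗, the transformed utilization in Equation (33) equals the original utilization, and both evaluate to 1 at the solution.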

#### **6. Generalized Analytical Method for Directed Acyclic Graphs**

This section generalizes the previously explained methods for the LPG model and the LMG model to the general DAG model. Neither method is usable for a general DAG-based application due to its limited applicability. In particular, since our method for the LMG model assumes that no runnable belongs to two different paths at the same time, it is not applicable to DAGs with at least one such runnable. If we forcibly apply it, Equation (30) may yield two different, hence conflicting, results for such runnables. Thus, we need a separate method for the DAG model.

#### *6.1. DAG Model and Its Challenge*

The DAG model is already explained in Section 3.1. Hence this subsection just highlights how it is different from the LPG model and the LMG model and presents a challenge that does not exist in the previous models. As noted earlier, there is only one path in the LPG model, making it easy to define the system's end-to-end delay. In the LMG model, even though there are multiple paths, we can use the equilibrium state theorem to represent them together by their identical middle-part path length, denoted by *p*∗. Figure 7a shows a simple DAG that is neither an LPG nor an LMG. In the figure, note that *r*4 belongs to the following two different paths: < *r*1,*r*2,*r*4,*r*<sup>7</sup> > and < *r*1,*r*4,*r*<sup>7</sup> >. Since a runnable period cannot be zero, the former is always longer than the latter. Thus, unlike the LMG model, where we can always make an equilibrium state, we cannot always make one in the DAG model.

**Figure 7.** An example of DAG explaining the concept of the critical path with *rc*.

#### *6.2. Transformation of Control Cost Function*

With the above challenge, let us transform the control cost function to a function of runnable periods. For the period *T*, we can use the same method as for the LPG model and the LMG model since it is only concerned with the actuator runnable *rn*. Thus, *T* is transformed into as follows:

$$T = 2p\_n. \tag{43}$$

Unfortunately, however, when transforming the end-to-end delay Δ, we cannot simply reuse the method for the LMG model in Section 5.2 since we cannot be sure that an equilibrium state can be made. To handle this challenge, let us begin with the general definition of Δ as in the following, assuming *m* paths {*P*1, *P*2, ··· , *Pm*}:

$$\Delta = \max\_{1 \le i \le m} \left( \sum\_{j \in P\_i} 2p\_j \right). \tag{44}$$

For example, in Figure 7a, there are four paths, *P*<sup>1</sup> =< 1, 2, 3, 7 >, *P*<sup>2</sup> =< 1, 2, 4, 7 >, *P*<sup>3</sup> =< 1, 4, 7 >, and *P*<sup>4</sup> =< 1, 5, 6, 7 >. Among them, it is apparent that *P*<sup>3</sup> cannot be the critical path since it is always shorter than *P*2. However, among *P*1, *P*2, and *P*4, we cannot be sure which is the longest since all of them can be the critical path according to how we decide *pi*s.

To overcome this challenge, we propose a heuristic with a clear rule regarding which path should be the critical path. For that, with given *ei*s, the heuristic is described by the following:

$$p\_i \propto e\_i,\text{ for } 2 \le i \le n - 1. \tag{45}$$

Certainly, the most important benefit from making *pi*s simply proportional to *ei*s is that we can simply decide the critical path based on the path weights (See Definition 3) such that we can choose the heaviest path (See Definition 5) as the critical path. For example, in Figure 7a, the weights of paths, *P*<sup>1</sup> to *P*4, are calculated as 15, 17, 13, and 10, respectively. Then, *P*2, specified by the red color, turns out to be the heaviest path. By the heuristic, it is used as the critical path. The intuition behind this heuristic is that we give longer periods to runnables with longer execution times to evenly distribute the system utilization across runnables, eliminating possible bottlenecks. Note that *p*<sup>1</sup> and *pn* are excluded in Equation (45) as they have no effect on deciding the critical path.
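Under this heuristic, selecting the critical path is a one-line maximum over path weights. The following Python sketch reproduces the Figure 7a example; note that only the sum *e*1 + *e*7 = 5 is implied by the stated weights 15, 17, 13, and 10, so the split *e*1 = 2 and *e*7 = 3 below is an assumption for illustration.

```python
# Execution times consistent with Figure 7a; the split e1 = 2, e7 = 3 is an
# assumption (only e1 + e7 = 5 follows from the stated path weights).
e = {1: 2, 2: 4, 3: 6, 4: 8, 5: 2, 6: 3, 7: 3}
paths = [[1, 2, 3, 7],   # P1, weight 15
         [1, 2, 4, 7],   # P2, weight 17
         [1, 4, 7],      # P3, weight 13
         [1, 5, 6, 7]]   # P4, weight 10

# The critical path is the heaviest path (maximum sum of execution times).
critical = max(paths, key=lambda ph: sum(e[i] for i in ph))
assert critical == [1, 2, 4, 7]   # P2 is chosen, matching the text
```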

For further explanations, we introduce a new notation *rc*, which is defined as the set of runnables in the critical path excluding *r*<sup>1</sup> and *rn*. Figure 7b shows *rc* = {*r*2,*r*4}. Then, let us think of *rc* as a virtual *composite runnable* combining its member runnables, just like the ellipse covering *r*<sup>2</sup> and *r*<sup>4</sup> in the figure. Then, the critical path can be thought of as a three-runnable ordered set < *r*1,*rc*,*rn* >. For *rc*, we also define *ec* and *pc* as follows:

$$e\_c = \sum\_{i \in r\_c} e\_i \tag{46}$$

and

$$p\_c = \sum\_{i \in r\_c} p\_i. \tag{47}$$

Based on the above notations, Δ can be defined as follows:

$$\Delta = 2(p\_1 + p\_c + p\_n), \tag{48}$$

which makes the control cost function as follows with Equation (43):

$$J(p\_1, p\_c, p\_n) = 2\alpha p\_n + 2\beta (p\_1 + p\_c + p\_n). \tag{49}$$

Then, regarding how to derive *p*2, *p*3, ··· , and *pn*−<sup>1</sup> from *pc*, we use the following assignment rule following Equation (45):

$$p\_i = \frac{e\_i}{e\_c} p\_c \text{ for } 2 \le i \le n - 1. \tag{50}$$

Under the above assignment rule, the periods of the runnables in *rc* sum exactly back to *pc*, and the heaviest path remains the longest, so that Δ in Equation (48) indeed represents the worst-case end-to-end delay.
As an example, in Figure 7b, we can find that *ec* = *e*<sup>2</sup> + *e*<sup>4</sup> = 12 and *pc* = *p*<sup>2</sup> + *p*4. Once *pc* is decided, each *pi* can be derived as follows, according to Equation (50):

$$p\_2 = \frac{4}{12} p\_c, \; p\_3 = \frac{6}{12} p\_c, \; p\_4 = \frac{8}{12} p\_c, \; p\_5 = \frac{2}{12} p\_c, \text{ and } p\_6 = \frac{3}{12} p\_c. \tag{51}$$
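The example can be checked mechanically. With the Figure 7 execution times and an arbitrary illustrative value of *pc* (the value 30 below is ours), the assignment rule of Equation (50) reproduces the fractions above and makes the critical-path periods sum back to *pc*.

```python
# Middle-runnable execution times from Figure 7 (e2..e6).
e = {2: 4, 3: 6, 4: 8, 5: 2, 6: 3}
e_c = e[2] + e[4]                       # r_c = {r2, r4}, so e_c = 12
p_c = 30.0                              # illustrative value, chosen freely

p = {i: e[i] / e_c * p_c for i in e}    # assignment rule of Equation (50)

assert abs(p[2] + p[4] - p_c) < 1e-12   # critical-path periods sum to p_c
assert abs(p[3] - 6 / 12 * p_c) < 1e-12 # matches the fraction 6/12
```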

#### *6.3. Transformation of Schedulability Constraint Function*

The schedulability constraint in Equation (11) uses a function of *n* runnable periods. Thus, it is transformed into a function of the three free variables (*p*1, *pc*, *pn*), following the rule in Equation (50) as follows:

$$\begin{split} \mathcal{U}(p\_1, p\_2, \ldots, p\_n) &= \frac{e\_1}{p\_1} + \left(\frac{e\_2}{p\_2} + \cdots + \frac{e\_{n-1}}{p\_{n-1}}\right) + \frac{e\_n}{p\_n} \\ &= \frac{e\_1}{p\_1} + \left(\frac{e\_2}{\frac{e\_2}{e\_c}p\_c} + \cdots + \frac{e\_{n-1}}{\frac{e\_{n-1}}{e\_c}p\_c}\right) + \frac{e\_n}{p\_n} \\ &= \frac{e\_1}{p\_1} + (n-2)\frac{e\_c}{p\_c} + \frac{e\_n}{p\_n} .\end{split} \tag{52}$$

Now, the utilization function *U*(*p*1, *p*2, ··· , *pn*) can be replaced by a function of (*p*1, *pc*, *pn*) as in the following:

$$\mathcal{U}(p\_1, p\_c, p\_n) = \frac{e\_1}{p\_1} + (n - 2)\frac{e\_c}{p\_c} + \frac{e\_n}{p\_n}.\tag{53}$$

#### *6.4. Finding the Optimal Runnable Periods*

Based on the control cost function in Equation (49) and the utilization function in Equation (53), our runnable periods optimization problem for the DAG model can be formulated with the three free variables (*p*1, *pc*, *pn*) as follows:

$$\begin{aligned} \underset{p\_1, p\_c, p\_n}{\text{minimize}} \quad & J(p\_1, p\_c, p\_n) = 2\alpha p\_n + 2\beta (p\_1 + p\_c + p\_n) \\ \text{subject to} \quad & \mathcal{U}(p\_1, p\_c, p\_n) = \frac{e\_1}{p\_1} + (n - 2)\frac{e\_c}{p\_c} + \frac{e\_n}{p\_n} \le 1. \end{aligned} \tag{54}$$

To solve the optimization problem, a Lagrange function is formulated as follows:

$$\begin{split} \mathcal{L} &= J(p\_1, p\_c, p\_n) - \lambda \left( \mathcal{U}(p\_1, p\_c, p\_n) - 1 \right) \\ &= 2\alpha p\_n + 2\beta (p\_1 + p\_c + p\_n) - \lambda \left( \frac{e\_1}{p\_1} + (n-2)\frac{e\_c}{p\_c} + \frac{e\_n}{p\_n} - 1 \right). \end{split} \tag{55}$$

Then, we take the partial derivatives of L with respect to *p*1, *pc*, *pn*, and *λ*, respectively and set them to zeros as follows:

$$\nabla \mathcal{L} = \left( \frac{\partial \mathcal{L}}{\partial p\_1}, \frac{\partial \mathcal{L}}{\partial p\_c}, \frac{\partial \mathcal{L}}{\partial p\_n}, \frac{\partial \mathcal{L}}{\partial \lambda} \right) = 0,\tag{56}$$

which in turn is expanded to the followings:

$$\begin{aligned} \frac{\partial \mathcal{L}}{\partial p\_1} &= 2\beta + \frac{e\_1}{p\_1^2}\lambda = 0, \\ \frac{\partial \mathcal{L}}{\partial p\_c} &= 2\beta + (n-2)\frac{e\_c}{p\_c^2}\lambda = 0, \\ \frac{\partial \mathcal{L}}{\partial p\_n} &= 2(\alpha+\beta) + \frac{e\_n}{p\_n^2}\lambda = 0, \\ \frac{\partial \mathcal{L}}{\partial \lambda} &= -\left(\frac{e\_1}{p\_1} + (n-2)\frac{e\_c}{p\_c} + \frac{e\_n}{p\_n} - 1\right) = 0. \end{aligned} \tag{57}$$

Then, the first, second, and third of Equation (57) are rearranged by isolating *λ* in each left-hand side as follows:

$$\begin{aligned} \lambda &= -2\beta \frac{p\_1^2}{e\_1}, \\ \lambda &= -2\beta \frac{p\_c^2}{(n-2)e\_c}, \\ \lambda &= -2(\alpha+\beta) \frac{p\_n^2}{e\_n}. \end{aligned} \tag{58}$$

Then, by combining the first and second of Equation (58), we have the following:

$$2\beta \frac{p\_1^2}{e\_1} = 2\beta \frac{p\_c^2}{(n-2)e\_c} \Longrightarrow \frac{1}{p\_c} = \frac{1}{p\_1} \sqrt{\frac{e\_1}{(n-2)e\_c}}.\tag{59}$$

By combining the first and third of Equation (58), we have the following:

$$2\beta \frac{p\_1^2}{e\_1} = 2(\alpha + \beta) \frac{p\_n^2}{e\_n} \Longrightarrow \frac{1}{p\_n} = \frac{1}{p\_1} \sqrt{\frac{(\alpha + \beta)e\_1}{\beta e\_n}}.\tag{60}$$

By replacing 1/*pc* and 1/*pn* in the last of Equation (57) with the findings in Equations (59) and (60), we have the following:

$$\frac{1}{p\_1}\left(e\_1 + (n-2)e\_c\sqrt{\frac{e\_1}{(n-2)e\_c}} + e\_n\sqrt{\frac{(\alpha+\beta)e\_1}{\beta e\_n}}\right) = 1. \tag{61}$$

Finally, from Equations (59)–(61), we have the following solution:

$$\begin{aligned} p\_1 &= e\_1 + \sqrt{(n-2)e\_1 e\_c} + \sqrt{\frac{(\alpha + \beta)e\_1 e\_n}{\beta}} \\ p\_c &= p\_1 \sqrt{\frac{(n-2)e\_c}{e\_1}} \\ p\_n &= p\_1 \sqrt{\frac{\beta e\_n}{(\alpha + \beta)e\_1}}. \end{aligned} \tag{62}$$

After finding the optimal (*p*1, *pc*, *pn*), the remaining runnable periods (*p*2, *p*3, ··· , *pn*−1) should be decided, too. For that, the assignment rule in Equation (50) is used.

#### *6.5. Applying Our Method to Other Scheduling Algorithms*

Thus far, we assumed the EDF scheduling algorithm for the underlying RTOS scheduling. However, other scheduling algorithms such as RM are also widely used in the automotive industry. With this motivation, this subsection explains how we can apply our method to different scheduling algorithms. Fortunately, most real-time scheduling algorithms provide a schedulability analysis method based on the L&L utilization bound, where if the system utilization is less than or equal to a specific threshold value called a utilization bound, denoted by *UB*, the system is guaranteed to be schedulable. As noted earlier, *UB* for EDF is 100%, whereas *UB* for RM is 69.3%. Then, the schedulability condition is formally expressed as follows:

$$\mathcal{U}(p\_1, p\_2, \dots, p\_n) = \sum\_{i=1}^n \frac{e\_i}{p\_i} \le U\_B.\tag{63}$$

Then, our optimization problem is slightly changed from Equation (54) to the following using *UB* in the schedulability constraint:

$$\begin{aligned} \underset{p\_1, p\_c, p\_n}{\text{minimize}} \quad & J(p\_1, p\_c, p\_n) = 2\alpha p\_n + 2\beta (p\_1 + p\_c + p\_n) \\ \text{subject to} \quad & \mathcal{U}(p\_1, p\_c, p\_n) = \frac{e\_1}{p\_1} + (n - 2)\frac{e\_c}{p\_c} + \frac{e\_n}{p\_n} \le U\_B. \end{aligned} \tag{64}$$

Then, its Lagrange function is also modified as follows:

$$\begin{split} \mathcal{L} &= J(p\_1, p\_c, p\_n) - \lambda \left( \mathcal{U}(p\_1, p\_c, p\_n) - U\_B \right) \\ &= 2\alpha p\_n + 2\beta (p\_1 + p\_c + p\_n) - \lambda \left( \frac{e\_1}{p\_1} + (n-2)\frac{e\_c}{p\_c} + \frac{e\_n}{p\_n} - U\_B \right). \end{split} \tag{65}$$

Solving the above Lagrange function yields the following solution:

$$\begin{split}p\_1 &= \frac{e\_1 + \sqrt{(n-2)e\_1 e\_c} + \sqrt{\frac{(\alpha+\beta)e\_1 e\_n}{\beta}}}{U\_B} \\ p\_c &= p\_1 \sqrt{\frac{(n-2)e\_c}{e\_1}} \\ p\_n &= p\_1 \sqrt{\frac{\beta e\_n}{(\alpha + \beta)e\_1}}.\end{split} \tag{66}$$

From the above, (*p*2, *p*3, ··· , *pn*−1) are decided by the assignment rule in Equation (50). Note that we can apply our method to any scheduling algorithm whose schedulability analysis can be conducted by the L&L utilization bound method.
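Since *pc* and *pn* in Equation (66) are fixed multiples of *p*1, switching the scheduling algorithm simply rescales every period by 1/*UB*. The following small Python check illustrates this (the workload numbers are purely illustrative):

```python
import math

def dag_p1(e1, en, e_c, n, alpha, beta, ub):
    """p1 from Equation (66); ub is the scheduler's utilization bound."""
    return (e1 + math.sqrt((n - 2) * e1 * e_c)
            + math.sqrt((alpha + beta) * e1 * en / beta)) / ub

# Same workload under EDF (UB = 100%) and RM (UB = 69.3%).
p1_edf = dag_p1(2.0, 3.0, 12.0, 7, 0.01, 0.01, 1.0)
p1_rm = dag_p1(2.0, 3.0, 12.0, 7, 0.01, 0.01, 0.693)

assert p1_rm > p1_edf                        # RM forces longer periods
assert abs(p1_rm * 0.693 - p1_edf) < 1e-9    # exactly a 1/UB rescaling
```

The longer RM periods directly explain the higher absolute control costs observed for RM in the evaluation.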

#### *6.6. Algorithm*

Algorithm 1 shows a complete procedure for finding optimal runnable periods for a given DAG by our analytical method. As inputs, the algorithm accepts (i) a list of worst-case execution times for *n* runnables, (ii) a list of *m* paths in the DAG, (iii) *α* and *β* of a given control cost function, and (iv) a utilization bound *UB*. The algorithm just needs a list of paths instead of the entire structure of the DAG. Thus, the algorithm itself does not consider generating paths from a DAG. As an output, the algorithm returns a list of optimal runnable periods. Note that the algorithm's computational complexity is just O(*n* × *m*), which is caused when finding the heaviest path in line 3. This polynomial time complexity makes our analytical method practical for use with large systems.

**Algorithm 1:** Find optimal runnable periods for a DAG

```
Input:  E = <e1, e2, ..., en>   : list of runnable execution times
Input:  P = <P1, P2, ..., Pm>   : list of paths
Input:  (α, β)                  : coefficients of a control cost function
Input:  UB                      : utilization bound
Output: <p1, p2, ..., pn>       : optimal runnable periods

Function FindOptimalRunnablePeriods(E, P, α, β, UB):
1   n  ← |E|                                   /* n: number of runnables */
2   m  ← |P|                                   /* m: number of paths */
3   ec ← max_{P ∈ P} ( Σ_{i ∈ P} e_i ) − (e1 + en)
4   p1 ← ( e1 + sqrt((n − 2) e1 ec) + sqrt((α + β) e1 en / β) ) / UB
5   pc ← p1 · sqrt((n − 2) ec / e1)
6   pn ← p1 · sqrt(β en / ((α + β) e1))
7   for i ← 2 to n − 1 do
8       pi ← (ei / ec) · pc
9   return <p1, p2, ..., pn>
```
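For reference, Algorithm 1 can be transcribed into runnable Python as follows (the function name, the 0-based indexing, and the sample workload taken from Figure 7 with our assumed split *e*1 = 2 and *e*7 = 3 are not part of the original listing). At the returned solution the total utilization equals the utilization bound.

```python
import math

def find_optimal_runnable_periods(E, P, alpha, beta, UB):
    """Python transcription of Algorithm 1. E: execution times <e1..en>
    (index 0 is the sensor, n-1 the actuator); P: list of paths, each a list
    of 0-based runnable indices from sensor to actuator."""
    n = len(E)
    # Line 3: weight of the heaviest path, minus the sensor and the actuator.
    e_c = max(sum(E[i] for i in path) for path in P) - (E[0] + E[-1])
    # Lines 4-6: closed-form solution of Equation (66).
    p1 = (E[0] + math.sqrt((n - 2) * E[0] * e_c)
          + math.sqrt((alpha + beta) * E[0] * E[-1] / beta)) / UB
    p_c = p1 * math.sqrt((n - 2) * e_c / E[0])
    p_n = p1 * math.sqrt(beta * E[-1] / ((alpha + beta) * E[0]))
    # Lines 7-8: distribute p_c to the middle runnables (Equation (50)).
    return [p1] + [E[i] / e_c * p_c for i in range(1, n - 1)] + [p_n]
```

As in Algorithm 1, the complexity is dominated by the heaviest-path search over the *m* given paths, i.e., O(*n* × *m*).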

#### *6.7. Applying Our Method to Conventional Task-Based Systems*

Many control applications, not limited to the automotive industry, are still designed as a set of periodic real-time tasks. Thus, it can be beneficial if we can apply our runnable periods optimization method to such traditional control systems. For that, we first classify them into two categories: the first is for systems with independent tasks controlling multiple target plants [12,14,15], and the second is for systems composed of periodic tasks with DAG-based data dependencies [18–20].

Note that the applications in the second category have a strong resemblance to our assumed system model. If we simply assume one-to-one mappings from runnables to tasks, our method for the runnable periods optimization can be applied to systems with periodic tasks without much modification. The applications in the first category, however, cannot make use of our method since their application model disagrees with ours. For those applications, traditional control-scheduling codesign methods [9,12–15] can be used instead.

#### **7. Evaluation**

This section specifically evaluates our optimization method for the general DAG model. Readers interested in the evaluation results for the LPG and LMG models are referred to [27]. More specifically, we evaluate our optimization method by answering the following questions:

- Q1: Is our analytical method able to find near-optimal runnable periods?
- Q2: Is it practical to find real optimal solutions by the exhaustive search method?
- Q3: Is our method practically competitive when optimizing large systems?

#### *7.1. Evaluation Method*

For the evaluation, we have to consider the following: (i) workload synthesis, (ii) control cost functions, (iii) optimization algorithms, and (iv) performance metrics. In the remainder of this subsection, the above topics are discussed to explain our evaluation method.

*Workload synthesis.* As representative AUTOSAR workloads, a total of nine DAGs are artificially synthesized. Among them, the first six DAGs in Figure 8 are relatively small ones with four to six runnables, whereas the remaining three DAGs in Figure 9 have a relatively large number of runnables, ranging from 12 to 25. Note that although the DAGs are manually generated, the resulting DAGs are random without any unfair bias. The small DAGs are used to test the optimality of our method since we can find the real optimal solutions for those small DAGs in Figure 8. On the other hand, we cannot find the real optimal solutions for the large DAGs in Figure 9 due to the vast problem space. Although we thus cannot evaluate the optimality of our method with the large DAGs, they are still useful when comparing our method with other heuristic optimization methods. Each DAG in the figures is labeled by a notation (*nR*, *mL*) representing its size and complexity, where *nR* denotes *n* runnables and *mL* denotes *m* links (or edges) between them. Basically, with larger *n* and *m*, DAGs become more complex. For example, the DAG in Figure 8a, labeled by (4*R*, 5*L*), has four runnables and five links, and the DAG in Figure 9c, labeled by (25*R*, 34*L*), has 25 runnables and 34 links. For each DAG, we generate 100 sets of random runnable execution times uniformly distributed in the range of [20 ms, 150 ms].

*Control cost functions.* As our optimization objective, we use a linear control cost function as in Equation (10), which is a function of the control period (*T*) and the end-to-end delay (Δ). As a representative control cost function, we use the following as our default control cost function, unless otherwise stated:

$$J(T,\Delta) = 0.01T + 0.01\Delta.\tag{67}$$

Note the above control cost function has two coefficients *α* = 0.01 and *β* = 0.01. Here, however, the relative ratio of *α* and *β* is more important than their absolute values since the ratio represents the control cost function's relative sensitivity to the control period and the end-to-end delay. By using the same values for *α* and *β* in our default control cost function, the control cost is equally sensitive to the control period and the end-to-end delay. To represent other scenarios with varying relative sensitivities, we also use varying *α*s and *β*s in the range of [0.01, 0.05].

*Optimization algorithms.* To evaluate the optimization performance of our method, for the comparison purpose, we specifically consider the following three optimization methods:

- our analytical method proposed in Section 6;
- the exhaustive search method (denoted by *EXH*), which finds the real optimal runnable periods by searching the whole problem space;
- the particle swarm optimization method (denoted by *PSO*), an evolutionary metaheuristic for large problem spaces.

More specifically, the *EXH* method searches through the discrete integer problem space within [1 ms, 1000 ms] for each runnable period. We compare our method with the *EXH* method to evaluate the optimality of our optimization method with small DAGs, answering question Q1. To evaluate the optimization performance with large DAGs as an answer to question Q3, we compare our method with the *PSO* method.

*Performance metrics.* We mainly use two optimization performance metrics: (i) absolute control costs and (ii) normalized control costs. Absolute control costs are the raw control cost values resulting from an optimization process, whereas normalized control costs are used to compare our method with another algorithm, i.e., the *EXH* method or the *PSO* method. A normalized control cost is defined as the relative ratio of our resulting control cost, letting the other method's result be 100%. In addition, to evaluate the practicality of each optimization method, the optimization times are measured with respect to varying application complexities. For the optimization, we use a workstation with an Intel i7-9700K CPU and 64 GB RAM (Dell, Round Rock, TX, USA).

**Figure 8.** Small DAGs with varying number of runnables (denoted by *n*R) and links (denoted by *m*L).

**Figure 9.** Large DAGs used for evaluating the practicality of our method.

#### *7.2. Evaluation Results and Discussion*

Q1: Is our analytical method able to find near-optimal runnable periods? For the six DAGs from Figure 8a–f, their normalized average control costs by the *EXH* method and our method are compared in Figure 10. In the figure, our method shows near-optimal control costs with marginal performance losses compared with the *EXH* method. The minimum control cost increase is just 1.1% and the maximum is 12.3%. One interesting point is that DAG (a) shows a significantly better performance than the other DAGs. That is because our heuristic for selecting the critical path is always valid in this particular DAG shape. Note that, in DAG (a), < *r*1,*r*2,*r*3,*r*<sup>4</sup> > will be correctly chosen as the critical path by our heuristic regardless of their execution times since every other path is a subset of it. On the other hand, in other DAGs, there are multiple choices for the critical path. Nonetheless, our method must bet on a certain path based on the given execution times. However, as shown in the figure, the performance losses from the real optimal solutions are marginal.

Thus far, we have assumed the EDF scheduling algorithm. However, in the automotive industry, the RM scheduling algorithm is also widely used for scheduling real-time tasks. In this regard, Figure 11 compares the EDF case, where the utilization bound is 100%, and the RM case, where the utilization bound is 69.3%. More specifically, Figure 11a compares their normalized average control costs across the six DAGs, each of which represents the relative optimization performance compared with the *EXH* method. The figure shows that our method provides similarly good optimization results across the two scheduling algorithms. Figure 11b compares their absolute average control costs, where EDF shows a significantly lower average control cost since it can schedule workloads more efficiently with its higher utilization bound.

To investigate how varying control cost functions affect the optimization performance, Figure 12a,b show the normalized average control costs with varying *α* and *β* values, respectively. As shown in the figures, our method provides consistent optimization performance across control cost functions representing various sensitivities to the control period and the end-to-end delay.
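The roles of *α* and *β* can be sketched with a generic linear cost of the form the paper assumes (the exact function is defined in the paper's earlier sections; this form and the parameter values below are illustrative only):

```python
# Illustrative linear control cost (an assumed generic form, consistent with
# the paper's linear-cost setting): alpha weights sensitivity to the control
# period T, and beta weights sensitivity to the end-to-end delay d.
def control_cost(T, d, alpha, beta):
    return alpha * T + beta * d

# With the same period and delay, a delay-sensitive loop (large beta) incurs
# a higher cost than a period-sensitive one (large alpha).
print(control_cost(T=10, d=30, alpha=1.0, beta=0.5))  # 25.0
print(control_cost(T=10, d=30, alpha=0.5, beta=1.0))  # 35.0
```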

**Figure 10.** Normalized average control costs of our method compared with the *EXH* method.

**Figure 11.** Comparison of average control costs with the earliest deadline first (EDF) and rate monotonic (RM) scheduling algorithms: (**a**) normalized average control costs; (**b**) absolute average control costs.

Q2: Is it practical to find real optimal solutions by the exhaustive search method? Looking at Q1's results, one could argue that if the *EXH* method can find the real optimal runnable periods, why not simply use it instead of our method? This argument does not hold, however, because the *EXH* method becomes unusable for large DAGs due to the vast search space when *n* > 6. To answer the question, we evaluate our optimization method in terms of the time required for the optimization process. Table 1 shows the required optimization times of our method and the *EXH* method with varying numbers of runnables; the numbers inside parentheses are projected values. Since our method finds the runnable periods analytically, it shows negligible computational complexity, as predicted by Algorithm 1. With the *EXH* method, optimal runnable periods for seven runnables can be found in about one month. Beyond seven runnables, however, the *EXH* method is no longer feasible, as it takes more than a year even for small and medium-size systems. Due to this scalability problem, the *EXH* method cannot be used in practice, whereas our analytical method can be applied even to large industrial applications.
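The scalability wall comes from exponential growth of the search space: with *k* candidate periods per runnable, exhaustive search must evaluate *k*^*n* combinations. The value of *k* below is illustrative (the paper's actual candidate set is defined in its experimental setup); the point is the growth rate:

```python
# Why EXH stops scaling: with k candidate periods per runnable, exhaustive
# search evaluates k**n period combinations. k = 10 here is illustrative.
def search_space(n_runnables, k_candidates):
    return k_candidates ** n_runnables

for n in (4, 6, 7, 12, 25):
    print(f"n={n}: {search_space(n, 10):.2e} combinations")
# Each extra runnable multiplies the search space by k, so a method that is
# merely slow at n = 7 becomes infeasible shortly after -- matching the
# projected entries in Table 1.
```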


**Table 1.** Required optimization times.

Q3: Is our method practically competitive when optimizing large systems? Q1 and Q2 showed that (i) our analytical method works well for small systems and (ii) real optimal solutions cannot be found for large systems. One remaining question is whether our analytical method can be applied to large systems with sufficient optimization performance. To answer it, we use the three large DAGs with *n* = 12, 16, and 25 from Figure 9. For comparison, we additionally use an evolutionary metaheuristic known as particle swarm optimization (denoted by *PSO*), which efficiently searches an unknown, vast problem space with swarm intelligence; we implement the *PSO* method using the PySwarms library [40]. Figure 13 shows the normalized average control costs of our method compared with the *PSO* method, taking the control costs achieved by the *PSO* method as 100%. As shown in the figure, our method produces significantly better results than the *PSO* method, especially for the largest DAG, and we observe a decreasing trend in the normalized average control costs as DAG complexity increases. From these results, we conclude that our analytical method outperforms the traditional evolutionary optimization method.
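The paper implements *PSO* with the PySwarms library; as a self-contained illustration of the underlying global-best PSO update rule (not the paper's configuration), a minimal sketch on a toy convex cost looks like this:

```python
import numpy as np

# Minimal global-best PSO sketch. The paper uses the PySwarms library; this
# toy version only illustrates the velocity/position update idea, with
# hyperparameters (w, c1, c2) chosen as common textbook defaults.
def pso_minimize(cost, dim, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, (n_particles, dim))   # particle positions
    v = np.zeros_like(x)                         # particle velocities
    pbest, pbest_cost = x.copy(), cost(x)        # per-particle best positions
    g = pbest[np.argmin(pbest_cost)].copy()      # global best position
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        # Inertia + cognitive (toward own best) + social (toward global best).
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = x + v
        c = cost(x)
        improved = c < pbest_cost
        pbest[improved], pbest_cost[improved] = x[improved], c[improved]
        g = pbest[np.argmin(pbest_cost)].copy()
    return g, pbest_cost.min()

# Toy "control cost": the sphere function, minimized at the origin.
best_x, best_cost = pso_minimize(lambda x: (x ** 2).sum(axis=1), dim=5)
print(best_cost)  # close to 0
```

Because such metaheuristics sample the search space stochastically, their result quality depends on the particle count and iteration budget, which is consistent with the analytical method's advantage on the largest DAGs.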

**Figure 13.** Normalized average control costs of our method for large DAGs compared with the *PSO* method.

#### **8. Conclusions**

This paper formulates the runnable period optimization problem for AUTOSAR control applications and provides an analytical solution based on the Lagrange multiplier method. Our method can find near-optimal solutions that maximize a given system's control performance regardless of the size and complexity of the application. Since the complexity of automobile control applications is growing rapidly with the recent development of advanced driver assistance systems and autonomous driving applications, it is no longer feasible to use traditional ad hoc methods or time-consuming search-based optimization algorithms. Due to the analytical nature of our proposed runnable period optimization method, we believe our solution can be readily adopted in the automotive industry when designing complex industry-scale AUTOSAR control applications.
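The general shape of such a Lagrange-multiplier derivation can be illustrated as follows. This is a generic form, not the paper's exact formulation (whose cost function and constraints are defined in its earlier sections): minimize a linear cost over periods $T_i$ with weights $w_i$, subject to a utilization-style schedulability constraint with execution times $c_i$:

```latex
% Generic illustration of a Lagrange-multiplier period assignment:
\min_{T_1,\dots,T_n}\ \sum_{i=1}^{n} w_i T_i
\quad \text{s.t.} \quad \sum_{i=1}^{n} \frac{c_i}{T_i} = U_{\max}
% Lagrangian and stationarity condition:
\mathcal{L} = \sum_{i=1}^{n} w_i T_i
  + \lambda \left( \sum_{i=1}^{n} \frac{c_i}{T_i} - U_{\max} \right),
\qquad
\frac{\partial \mathcal{L}}{\partial T_i} = w_i - \lambda \frac{c_i}{T_i^2} = 0
\;\Rightarrow\;
T_i = \sqrt{\frac{\lambda\, c_i}{w_i}},
% with \lambda chosen so the utilization constraint holds with equality.
```

The closed-form stationarity condition is what makes such an approach scale: each period follows directly from the multiplier $\lambda$, avoiding any combinatorial search.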

Although our method provides a promising solution for optimizing complex applications, our method is only usable when the control cost is given or approximated as a linear function. As the approximation can induce overestimated control costs, we plan to extend our optimization method to nonlinear control cost functions in our future work.

**Author Contributions:** Conceptualization, J.-C.K.; Data curation, D.C. and T.-W.K.; Methodology, D.C.; Software, D.C. and J.-C.K.; Supervision, J.-C.K.; Visualization, T.-W.K.; Writing—original draft, T.-W.K.; Writing—review & editing, J.-C.K. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported partly by the Ministry of Land, Infrastructure, and Transport (MOLIT), Korea, through the Transportation Logistics Development Program (20TLRP-B147674-03, Development of Operation Technology for V2X Truck Platooning) and partly by the National Research Foundation (NRF) grant funded by the Korea government (MSIT; Ministry of Science and ICT) (No. 2017R1C1B5018374) and partly by Institute for Information and communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (2014-0-00065, Resilient Cyber-Physical Systems Research).

**Conflicts of Interest:** The authors declare no conflicts of interest.

#### **Abbreviations**

The following abbreviations are used in this manuscript:


DAG Directed Acyclic Graph
EDF Earliest Deadline First
PSO Particle Swarm Optimization
RM Rate Monotonic

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

MDPI St. Alban-Anlage 66 4052 Basel Switzerland Tel. +41 61 683 77 34 Fax +41 61 302 89 18 www.mdpi.com

*Applied Sciences* Editorial Office E-mail: applsci@mdpi.com www.mdpi.com/journal/applsci
