*Article* **Fighting CPS Complexity by Component-Based Software Development of Multi-Mode Systems**

**Hang Yin 1,\* and Hans Hansson 2,†**


Received: 7 October 2018; Accepted: 18 October 2018; Published: 22 October 2018

**Abstract:** Growing software complexity is an increasing challenge for the software development of modern cyber-physical systems. A classical strategy for taming this complexity is to partition system behaviors into different operational modes specified at design time. Such a multi-mode system can change behavior by switching between modes at run-time. A complementary approach for reducing software complexity is provided by component-based software engineering (CBSE), which reduces complexity by building systems from composable, reusable and independently developed software components. CBSE and the multi-mode approach are fundamentally conflicting in that component-based development conceptually is a bottom-up approach, whereas partitioning systems into operational modes is a top-down approach with its starting point from a system-wide perspective. In this article, we show that it is possible to combine and integrate these two fundamentally conflicting approaches. The key to simultaneously benefiting from the advantages of both approaches lies in the introduction of a hierarchical mode concept that provides a conceptual linkage between the bottom-up component-based approach and system level modes. As a result, systems including modes can be developed from reusable mode-aware components. The conceptual drawback of the approach—the need for extensive message exchange between components to coordinate mode-switches—is eliminated by an algorithm that collapses the component hierarchy and thereby eliminates the need for inter-component coordination. As this algorithm is used from the design to implementation level ("compilation"), the CBSE design flexibility can be combined with efficiently implemented mode handling, thereby providing the complexity reduction of both approaches, without inducing any additional design or run-time costs. 
At the more specific level, this article presents (1) a mode mapping mechanism that formally specifies the mode relation between composable multi-mode components and (2) a mode transformation technique that transforms component modes to system-wide modes to achieve efficient implementation.

**Keywords:** component-based software engineering; mode; mode-switch

#### **1. Introduction**

Growing software complexity is posing a challenge for the design of cyber-physical systems (CPS) [1]. Complexity related to CPS software is multifaceted, including specific requirements related to extra functional properties such as functional safety, resilience and timeliness. There is additionally a trade-off between the different aspects of complexity, e.g., added complexity at design-time can reduce complexity at run-time and vice versa. A key issue in making such trade-offs is the risk implied by different choices; and risks have to be balanced against benefits. For companies, this is a reality even for safety-critical systems regulated by safety standards. As a result, to maximize business benefits, it is standard practice for many companies to make a minimal, but sufficient, effort to comply with applicable regulation.

At the technological level, there are approaches developed to reduce complexity throughout the system life-cycle. Combining such approaches can potentially reduce the overall life-cycle complexity, but approaches are not always compatible, and each of them introduces both benefits and costs. This article presents the integration of two such approaches: multi-mode systems and component-based software engineering, which both provide composability but target different life-cycle phases. A specific challenge in combining the two is that they are conceptually incompatible, in the sense that component-based development is a bottom-up approach, whereas partitioning systems into operational modes is a top-down approach with its starting point from a system-wide perspective. Still, we are able to successfully integrate them in a single framework of multi-mode components, thereby providing the combined benefits of both, while keeping costs at acceptable levels. These approaches, introduced below, focus primarily on the design, configuration and run-time phases of the system life-cycle, although they also have important implications for the maintenance phase. Furthermore, other dimensions of complexity are affected by the considered technologies, including organizational complexity: both component-based software engineering (CBSE) and the multi-mode approach provide a basis not only for system partitioning, but also for partitioning of the design activities, facilitating the distribution of design tasks to different departments or even different organizations. A further implication of the enabled partitioning and the inherent, clearly-defined interfaces is that the approaches could also scale to systems-of-systems (SoS), e.g., different components or modes can correspond to different systems in an SoS.

#### *1.1. Multi-Mode Systems*

A common practice to manage software complexity is to partition system behaviors into different mutually-exclusive operational modes so that each mode corresponds to a specific system behavior. A multi-mode system [2] changes behavior by switching modes. A typical example is the control software of an airplane, which runs in different modes such as taking off, flight and landing. Multi-mode systems have also been motivated by many other considerations.


#### *1.2. Component-Based Software Engineering*

As a complementary approach to multi-mode systems, CBSE [3] advocates the reuse of independently developed software components as a promising technique for the development of complex software systems. The success of CBSE has been evidenced by a variety of component models proposed both in industry and academia [4,5]. CBSE suggests software modularity and reusability to facilitate the development of high-quality software. For instance, different functional modules or even subsystems of the control software of an airplane can be encapsulated into reusable software components that can be reused multiple times for the same system or for other systems in the same software product line.

Applying CBSE to multi-mode systems, or vice versa, has been a largely unexplored research area, possibly because multi-mode systems and CBSE are fundamentally conflicting in the sense that the former traditionally is a top-down approach, whereas the latter is a bottom-up approach. Despite this apparent conflict, our research goal in this article is to combine these approaches and benefit from the advantages of both multi-mode systems and CBSE. Hence, we propose component-based software development of multi-mode systems, characterized by the independent development and reuse of multi-mode components (i.e., components that can run in multiple modes).

#### *1.3. A Guiding Example*

As a guiding example, consider a proof-of-concept healthcare monitoring system. The system consists of two subsystems: a data acquisition subsystem and a monitoring subsystem. The data acquisition subsystem uses cameras and microphones to collect video and audio data from a ward or a private home. Video and audio data are separately encoded and encrypted before transmission. The monitoring subsystem decrypts and decodes data that it receives and reports them to the health center. The monitoring subsystem also includes an alarm that is triggered when the person being monitored encounters a dangerous situation, such as falling or suffering from a heart attack.

Our focus is on the software architecture of the monitoring subsystem (MoS), which is composed of multi-mode components. Figure 1 illustrates the component hierarchy of the system on the left and component connections on the right. The system can be considered as a top-level component MoS with three subcomponents: DaD for data decryption, the multimedia decoder MuD and EvA for event analysis. Due to the tree structure of the component hierarchy, DaD, MuD and EvA are also called the children of MoS, which consequently is their parent. The system can run in two different modes: regular monitoring mode (denoted as *Rm*) and attention mode (denoted as *Att*).

**Figure 1.** The architectural model of a multi-mode system built from multi-mode components.

The default mode of MoS is *Rm*, used when nothing special happens. To save resources in this mode, a fast video encoding/decoding algorithm can be selected and only low-resolution video is transmitted, keeping CPU and bandwidth usage low. As shown in Figure 1, small squares denote the input and output ports of a component, while each arrow connecting two ports denotes a data flow. Basically, each component receives data on its input ports, processes the data and provides output data at its output ports. Such a pipes-and-filters architectural style is fairly common for multimedia applications. The video data are first decrypted by DaD and subsequently decoded by MuD, which sends the decoded video data to the display in the health center. As represented by the dimmed color, EvA is deactivated when MoS runs in *Rm*. Meanwhile, DaD runs in a regular mode *R1* and MuD runs in a regular decoding mode *Rd*. MuD has three subcomponents: VAE for video/audio extraction, a video decoder ViD and an audio decoder AuD. ViD is the only activated subcomponent, running in a regular video decoding mode *Rvd* while MuD runs in *Rd*. VAE and AuD are deactivated, as shown in Figure 1, because no audio data are transmitted.

When the data acquisition subsystem detects an accident, both subsystems will switch to an attention mode *Att* to raise the attention of the health center. The network load increases due to the transmission of higher-quality video data and of audio data. Component EvA becomes activated, running in a regular mode *R2*, to analyze the detected event and trigger an alarm when necessary. MuD starts to run in an enhanced decoding mode *Ed*, where all its subcomponents (VAE, ViD, AuD) are activated and a different video encoding/decoding algorithm is used to support higher-resolution video. VAE runs in a regular mode *R3* to separate decrypted video and audio data and send them to ViD and AuD, respectively. As represented by the grey color in Figure 1, ViD runs in an enhanced video decoding mode *Evd* with a different video decoding algorithm for high-quality video. AuD runs in a regular audio decoding mode *Rad*. In the case of poor network conditions, MuD switches to a degraded QoS mode *Dq*, where the transmission of audio data is terminated to preserve video quality, which is considered to be more important. Therefore, AuD becomes deactivated.

We distinguish two types of components in the monitoring subsystem: primitive components and composite components. DaD, EvA, VAE, ViD and AuD are primitive components, which are directly implemented by code, while MoS and MuD are composite components composed of other components. What makes this system distinctive compared with traditional component-based systems is its constitution of multi-mode components, i.e., components that can run in different modes at run-time. The system in Figure 1 indicates a clear mapping between the modes of different components. Such a mapping is summarized in Table 1, where modes in the same column are mapped to each other for the composite components MoS and MuD. For instance, when MoS runs in mode *Att*, DaD must run in *R1*, MuD can run in either *Ed* or *Dq* and EvA must run in *R2*. Even this simple example already demonstrates the power of building multi-mode systems with multi-mode components, which has the potential to enrich system architectural variability at all levels. Since multi-mode components still comply with component-based development, the overall software design complexity is decomposed into building multi-mode components at different levels, thereby making the growing software complexity manageable.


**Table 1.** Mode mappings. (**a**) The mode mapping of MoS. (**b**) The mode mapping of MuD.

(a)

| MoS | *Rm* | *Att* |
|-----|------|-------|
| DaD | *R1* | *R1* |
| MuD | *Rd* | *Ed* or *Dq* |
| EvA | deactivated | *R2* |

(b)

| MuD | *Rd* | *Ed* | *Dq* |
|-----|------|------|------|
| VAE | deactivated | *R3* | *R3* |
| ViD | *Rvd* | *Evd* | *Evd* |
| AuD | deactivated | *Rad* | deactivated |
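To make the mapping concrete, the content of Table 1 can be written down as plain data. The following Python sketch is illustrative only (the names `MODE_MAPPING` and `allowed_child_modes` are ours, not the article's; the article specifies mode mapping formally with automata in Section 2); `"D"` stands for "deactivated".

```python
# The mode mappings of Table 1 as nested dictionaries: for each mode of
# a composite component, the allowed mode(s) of each subcomponent.
MODE_MAPPING = {
    "MoS": {
        "Rm":  {"DaD": ["R1"], "MuD": ["Rd"],       "EvA": ["D"]},
        "Att": {"DaD": ["R1"], "MuD": ["Ed", "Dq"], "EvA": ["R2"]},
    },
    "MuD": {
        "Rd": {"VAE": ["D"],  "ViD": ["Rvd"], "AuD": ["D"]},
        "Ed": {"VAE": ["R3"], "ViD": ["Evd"], "AuD": ["Rad"]},
        "Dq": {"VAE": ["R3"], "ViD": ["Evd"], "AuD": ["D"]},
    },
}

def allowed_child_modes(composite, mode, child):
    # Look up which modes a subcomponent may run in, given the mode of
    # its parent composite component.
    return MODE_MAPPING[composite][mode][child]

print(allowed_child_modes("MoS", "Att", "MuD"))  # ['Ed', 'Dq']
```

Note that the lookup for MuD in mode *Att* of MoS returns two candidates, which is exactly the non-determinism discussed in Section 2.1.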

#### *1.4. Contributions*

In achieving component-based software development of multi-mode systems, this article includes two key contributions. First, we propose a formal mode mapping description in the form of mode mapping automata (MMA) that specifies how the modes of a composite component are mapped to the modes of its subcomponents. The MMA presented in this article partially builds on the MMA initially proposed in [6], which is here substantially refined and extended. Mode mapping elegantly links modes and software component reuse. The hierarchical composition of multi-mode components easily allows one to build multi-mode systems with multi-mode behaviors at various levels. A potential drawback of this approach is the run-time overhead due to inter-component communication for coordination of the mode-switches of different components. To eliminate this run-time drawback, while still being able to design systems from reusable multi-mode components, we introduce a mode transformation technique as our second contribution. This technique transforms component modes to system-wide modes to optimize the implementation. This is obtained by flattening the hierarchical structure of component modes mapped at different levels. Mode transformation can be included in the mapping from the design to implementation level ("compilation"), after the mode mappings of all composite components in a system have been specified. An initial version of the mode transformation technique is presented in [7].

The rest of this article is structured as follows: Section 2 elaborates on the composition of multi-mode components and the mode mapping mechanism. Section 3 presents the mode transformation technique. Related work is reviewed in Section 4. Finally, Section 5 concludes the article and discusses some future work.

#### **2. The Composition of Multi-Mode Components**

As an essential step in the composition of multi-mode components, mode mapping unambiguously specifies the mode relation between different multi-mode components at design time. This section highlights the essential properties of multi-mode components and the motivation of mode mapping, followed by a thorough explanation of MMA, i.e., a formal description of mode mapping.

#### *2.1. Multi-Mode Components and Mode Mapping*

A multi-mode component supports multiple modes and has a unique configuration defined for each mode. Figure 2 illustrates the key properties of a reusable multi-mode component. The configuration for each mode depends on various factors. For instance, a multi-mode primitive component may have different mode-specific behaviors for different modes, while a multi-mode composite component may have a different set of subcomponents activated depending on its mode.

**Figure 2.** The illustration of a multi-mode component.
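The idea of one configuration per supported mode can be sketched in code. The following Python class is a minimal illustration under our own assumptions (the class and field names are not from the article); it only captures the mode-to-configuration association and the reconfiguration on a mode-switch.

```python
# A minimal sketch of a reusable multi-mode component: one configuration
# per supported mode, replaced on each mode-switch.
class MultiModeComponent:
    def __init__(self, name, configurations, initial_mode):
        # configurations: dict mapping each supported mode to its
        # mode-specific configuration (behavior, active subcomponents, ...)
        assert initial_mode in configurations
        self.name = name
        self.configurations = configurations
        self.mode = initial_mode

    @property
    def supported_modes(self):
        return set(self.configurations)

    def switch_mode(self, new_mode):
        # Reconfiguration: the configuration of the current mode is
        # replaced by the one defined for the target mode.
        assert new_mode in self.configurations
        self.mode = new_mode
        return self.configurations[new_mode]

# The composite component MuD of the guiding example, with the set of
# activated subcomponents as its per-mode configuration:
mud = MultiModeComponent(
    "MuD",
    configurations={
        "Rd": {"active": ["ViD"]},
        "Ed": {"active": ["VAE", "ViD", "AuD"]},
        "Dq": {"active": ["VAE", "ViD"]},
    },
    initial_mode="Rd",
)
print(mud.switch_mode("Ed"))  # {'active': ['VAE', 'ViD', 'AuD']}
```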

A multi-mode component can switch between certain modes at run-time, either on its own initiative or as the result of a request by another component. A mode-switch leads to the reconfiguration of the component, changing its configuration in the current mode to a new configuration in the target mode. A local mode-switch manager is used to handle the mode-switch of a multi-mode component. By having such a mode-switch manager in each component, a multi-mode component is able to exchange mode information with its parent and subcomponents via dedicated mode-switch ports (the blue ports in Figure 2) during a mode-switch, even without knowing the global mode information. These mode-switch ports do not deal with input or output data going through the component. Instead, they are only used for mode-switch coordination between a composite component and its subcomponents. Therefore, each mode-switch port is bidirectional, allowing mode-switch signals to be transmitted in both directions. For instance, in the guiding example presented in Section 1.3, when an accident is detected, MoS will switch from *Rm* to *Att*. Meanwhile, the mode-switch manager of MoS will send a signal to its subcomponents DaD, MuD and EvA, requesting them to switch mode based on the mode mapping defined in Table 1 (a). The mode-switch managers of different components are jointly responsible for propagating a mode-switch event to the affected components, keeping mode consistency between components and coordinating the mode-switches of different components. Designing the mode-switch manager is outside the scope of this article; we have previously developed distributed mode-switch algorithms [8,9], running in the mode-switch manager, for the cooperative mode-switch of different components. Here, our focus is on mode mapping, also shown in Figure 2.

Since we assume that multi-mode components are independently developed, they typically support different numbers of modes and name them differently. It is necessary to specify the relation between the modes of different components at design-time without ambiguity. Such a specification is called mode mapping. To ensure reusability, the mode mapping must never violate the following principles:


The principles imply that mode mapping should be managed by each composite component, not by its subcomponents; a primitive component requires no mode mapping. The mode mapping defined in Table 1 is simple and intuitive; however, it cannot express the initial mode of each component, which component initiates a mode-switch, or the exact mode-switch of each individual component in response to a mode-switch event. For example, when MoS switches from *Rm* to *Att*, according to Table 1 (a), MuD may switch to either *Ed* or *Dq*. Such non-determinism can be eliminated by specifying either *Ed* or *Dq* as the default new mode of MuD for this particular mode-switch scenario. To be able to formally specify all types of mode mapping rules, we propose a more powerful representation: mode mapping automata.
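The default-mode idea can be illustrated as follows. This is a Python sketch under our own assumptions (the table `DEFAULT_TARGET` and the function `resolve` are illustrative, not the article's MMA mechanism, which subsumes this): when the mapping allows several target modes, a default is chosen per mode-switch scenario.

```python
# Resolving non-determinism by a per-scenario default target mode.
# Keyed by (composite, old mode, new mode, child); values are defaults.
DEFAULT_TARGET = {
    ("MoS", "Rm", "Att", "MuD"): "Ed",  # MuD may enter Ed or Dq; pick Ed
}

def resolve(composite, old_mode, new_mode, child, candidates):
    # candidates: the allowed child modes from the mode mapping table.
    if len(candidates) == 1:
        return candidates[0]
    return DEFAULT_TARGET[(composite, old_mode, new_mode, child)]

print(resolve("MoS", "Rm", "Att", "MuD", ["Ed", "Dq"]))  # Ed
```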

#### *2.2. Mode Mapping Automata*

Let *c* be a composite component, with SC<sub>*c*</sub> being the set of subcomponents of *c* and *P<sub>c</sub>* being the parent of *c*. When *c* is running in one of its supported modes, it should always know, via its mode mapping, its current mode and the current modes of all *c<sub>i</sub>* ∈ SC<sub>*c*</sub>. Moreover, whenever the mode-switch manager of *c* notices the mode-switch of *c<sub>i</sub>* ∈ SC<sub>*c*</sub> ∪ {*c*}, it will refer to the mode mapping, which should tell which other components in (SC<sub>*c*</sub> ∪ {*c*})\{*c<sub>i</sub>*} should also switch mode as a consequence, as well as the new modes of these components.

The complete mode mapping of *c* can be formally presented by a set of MMA, which consists of one mode mapping automaton of *c* itself (denoted as *MMA<sup>s</sup><sub>c</sub>*) and one MMA for each subcomponent *c<sub>i</sub>* ∈ SC<sub>*c*</sub> (denoted as *MMA<sup>c</sup><sub>c<sub>i</sub></sub>*). Here, we call *MMA<sup>s</sup><sub>c</sub>* a self-MMA and *MMA<sup>c</sup><sub>c<sub>i</sub></sub>* a child MMA.

As an example, Figure 3 presents the set of MMA of the component MuD in Figure 1, including a self-MMA (*MMA<sup>s</sup><sub>MuD</sub>*) and three child MMA (*MMA<sup>c</sup><sub>VAE</sub>*, *MMA<sup>c</sup><sub>ViD</sub>* and *MMA<sup>c</sup><sub>AuD</sub>*). These MMA are hierarchically organized in the same way as the corresponding components. Each MMA can receive and emit internal or external signals. Internal signals are used to synchronize the self-MMA with its child MMA, while external signals interact with the local mode-switch manager for requesting and returning mode mapping results.

**Figure 3.** The role of the mode mapping of MuD at run-time. MMA, mode mapping automata.

An external signal indicates that a component is requested to switch to a particular mode. We use *x*.*E*(*y*) to denote an external signal asking *MMA<sub>x</sub>*, which is either a self-MMA or a child MMA, to switch to mode *y*. An internal signal is sent either from a self-MMA to a child MMA or from a child MMA to the self-MMA. A self-MMA only sends an internal signal to a child MMA if the current mode-switch event requires the mode-switch of the corresponding subcomponent; the self-MMA decides the new mode of this subcomponent. We use *x*.*I*(*y*) to denote an internal signal emitted by a self-MMA to the child MMA *MMA<sup>c</sup><sub>x</sub>*, asking the subcomponent *x* to switch to mode *y*. A child MMA can also send an internal signal to the self-MMA, implying that the corresponding subcomponent is requesting a mode-switch. Since mode mapping is always determined by the self-MMA, the internal signal from a child MMA only needs to contain the current mode and new mode of the corresponding subcomponent. We use *x*.*I*(*z* → *y*) to denote an internal signal emitted by a child MMA *MMA<sup>c</sup><sub>x</sub>* that requests to switch mode from *z* to *y*. Note that *z* must be present in this internal signal, as *x*.*I*(*z* → *y*) and *x*.*I*(*z′* → *y*) are two different mode-switch scenarios, which may lead to different mode mappings.
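The three signal forms can be encoded as simple immutable records. The following Python sketch is one possible encoding (the class names are ours, not the article's); it merely mirrors the notation *x*.*E*(*y*), *x*.*I*(*y*) and *x*.*I*(*z* → *y*).

```python
# One illustrative encoding of the MMA signal forms.
from dataclasses import dataclass

@dataclass(frozen=True)
class External:        # x.E(y): ask MMA_x to switch to mode y
    x: str
    y: str

@dataclass(frozen=True)
class InternalDown:    # x.I(y): self-MMA tells child MMA_x to enter y
    x: str
    y: str

@dataclass(frozen=True)
class InternalUp:      # x.I(z -> y): child MMA_x requests to go from z to y
    x: str
    z: str
    y: str

# MuD requesting to switch from Ed to Dq, as seen by the self-MMA of MoS:
sig = InternalUp("MuD", "Ed", "Dq")
print(sig)  # InternalUp(x='MuD', z='Ed', y='Dq')
```

Keeping *z* in `InternalUp` matters for the reason given above: requests from different current modes may trigger different mode mappings.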

A self- or child MMA can be formally defined as follows:

**Definition 1.** *MMA: An MMA is defined as a tuple:*

⟨S, *s*<sup>0</sup>, SI, T⟩

*where* S *is a set of states;* *s*<sup>0</sup> ∈ S *is the initial state;* SI = I ∪ E *(with* I ∩ E = ∅*) is a set of signals received or emitted during a state transition, with* I *as the set of internal signals and* E *as the set of external signals; and* T ⊆ S × SI × 2<sup>SI</sup> × S *is a set of transitions of the MMA.*
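Definition 1 can be rendered directly as a data structure. The sketch below (an illustration under our own naming, not the article's implementation) shows the tuple ⟨S, s<sup>0</sup>, SI, T⟩, where each transition carries one input signal and a *set* of output signals, matching T ⊆ S × SI × 2<sup>SI</sup> × S.

```python
# An MMA as a plain data structure, per Definition 1.
from typing import NamedTuple

class Transition(NamedTuple):
    source: str
    inp: str            # the triggering signal ("In")
    out: frozenset      # the emitted signals ("Out"), a subset of SI
    target: str

class MMA(NamedTuple):
    states: frozenset
    initial: str
    signals: frozenset  # SI = internal signals ∪ external signals
    transitions: frozenset

# The top-left transition of the self-MMA of MuD (Figure 4):
t = Transition("Rd", "MuD.E(Ed)",
               frozenset({"VAE.I(R3)", "ViD.I(Evd)", "AuD.I(Rad)"}),
               "Ed")
print(t.target)  # Ed
```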

We use a state machine for the graphical representation of an MMA, where each state is one mode and each transition is the result of a mode-switch. Ordinary states are marked by a circle, while the initial state is marked by a double circle. If the MMA is a self-MMA of *c*, then each state corresponds to a mode of *c*. If the MMA is a child MMA of *c* associated with *c<sub>i</sub>* ∈ SC<sub>*c*</sub>, then each state corresponds to a mode of *c<sub>i</sub>* or, if *c<sub>i</sub>* can be deactivated, to the deactivated status of *c<sub>i</sub>*, denoted as *D*. When a composite component is deactivated, all its enclosed components must also be deactivated. A transition *t* ∈ T is represented by an arrow from a state *s* to a state *s′*, denoted as *s* −In/Out→ *s′*, where In/Out is the label of the transition: "In" is an external or internal signal, the input that triggers the transition; "Out" is a set of external or internal signals, the output of the transition.

Figure 4 depicts *MMA<sup>s</sup><sub>MuD</sub>*, i.e., the self-MMA of MuD in the guiding example. Three states are included in this MMA, implying that MuD can run in three modes. The state transitions of *MMA<sup>s</sup><sub>MuD</sub>* and of the corresponding child MMA *MMA<sup>c</sup><sub>VAE</sub>*, *MMA<sup>c</sup><sub>ViD</sub>* and *MMA<sup>c</sup><sub>AuD</sub>* (Figure 5) are manually specified to determine the mode mapping of MuD. As an example demonstrating MMA synchronization, the top-left transition of *MMA<sup>s</sup><sub>MuD</sub>*,

*Rd* −MuD.E(*Ed*) / {VAE.I(*R3*), ViD.I(*Evd*), AuD.I(*Rad*)}→ *Ed*,

implies that MuD requests a mode-switch to *Ed*, consequently requiring its subcomponents VAE, ViD and AuD to switch to modes *R3*, *Evd* and *Rad*, respectively. Figure 5 shows that this transition of *MMA<sup>s</sup><sub>MuD</sub>* is synchronized with three transitions of the child MMA:

*D* −VAE.I(*R3*) / {VAE.E(*R3*)}→ *R3*, *Rvd* −ViD.I(*Evd*) / {ViD.E(*Evd*)}→ *Evd*, *D* −AuD.I(*Rad*) / {AuD.E(*Rad*)}→ *Rad*.

**Figure 4.** The self-mode mapping automaton of MuD.

**Figure 5.** The child mode mapping automata of MuD.

#### *2.3. MMA Composition*

The internal synchronization within a set of MMA of a composite component is actually invisible to the mode-switch manager of the composite component. What the mode-switch manager sees is the composition of these MMA. MMA composition is achieved by merging a set of MMA into a single MMA without internal signals. For instance, based on the set of MMA of MuD depicted in Figures 4 and 5, the composed MMA is illustrated in Figure 6. After MMA composition, the mode mapping of MuD is defined by a single MMA, where each state represents a mode combination of MuD and its subcomponents, i.e., *s*<sub>1</sub> = (*Rd*, *D*, *Rvd*, *D*), *s*<sub>2</sub> = (*Ed*, *R3*, *Evd*, *Rad*) and *s*<sub>3</sub> = (*Dq*, *R3*, *Evd*, *D*). All internal signals are eliminated. This composed MMA is the actual MMA referenced by the mode-switch manager of MuD, since the mode-switch manager does not care about the internal synchronization of the set of MMA. However, the composed MMA can be much more complex than any single MMA before the composition. Instead of specifying mode-switch behavior directly with the composed MMA, it is much easier to design the mode mapping with a self-MMA and the child MMA first and then compose them. The synchronization semantics of a set of MMA and the formal definition of MMA composition can be found in the extended technical report [10].

**Figure 6.** MMA composition for MuD.
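The effect of composition on a single mode-switch step can be sketched in code. The following Python function is an illustration under our own assumptions (the general synchronization semantics are in the technical report [10]): the self-MMA transition fires, each internal output signal *x*.*I*(*y*) is synchronized with the matching child MMA, and only external signals survive in the composed transition.

```python
# One composed mode-switch step: internal signals are consumed by the
# child MMA and replaced by the external signals they emit.
def compose_step(self_trans, child_mmas, child_states):
    # self_trans: (source mode, input signal, internal outputs, target mode)
    src, inp, internal_out, dst = self_trans
    new_child_states = dict(child_states)
    external_out = set()
    for sig in internal_out:                 # e.g. "VAE.I(R3)"
        child, mode = sig.split(".I(")
        mode = mode.rstrip(")")
        # Synchronize: the child MMA enters `mode` and emits x.E(mode).
        assert mode in child_mmas[child], f"{child} cannot enter {mode}"
        new_child_states[child] = mode
        external_out.add(f"{child}.E({mode})")
    composed_src = (src, *child_states.values())
    composed_dst = (dst, *new_child_states.values())
    return composed_src, inp, external_out, composed_dst

# The MuD example: states of the child MMA (D = deactivated) while MuD
# runs in Rd, i.e., the composed state s1 = (Rd, D, Rvd, D).
child_mmas = {"VAE": {"D", "R3"}, "ViD": {"Rvd", "Evd"}, "AuD": {"D", "Rad"}}
s1 = {"VAE": "D", "ViD": "Rvd", "AuD": "D"}
step = compose_step(
    ("Rd", "MuD.E(Ed)", ["VAE.I(R3)", "ViD.I(Evd)", "AuD.I(Rad)"], "Ed"),
    child_mmas, s1)
print(step[3])  # ('Ed', 'R3', 'Evd', 'Rad') -- state s2 of Figure 6
```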

#### *2.4. Mode Mapping Verification*

A crucial issue in designing mode mapping with MMA is ensuring the correctness of the mode mapping, i.e., for each input external signal from the mode-switch manager, the set of MMA should produce the expected set of external signals as the output back to the mode-switch manager. For instance, failing to synchronize an internal signal will never yield a mode mapping result.

The mode mapping of a composite component specified by MMA can be easily verified by model checking. We use the model checker UPPAAL [11] for mode mapping verification. Since UPPAAL is a convenient tool for modeling and verifying concurrent state transition systems, it is fairly straightforward to graphically model a set of MMA in UPPAAL. Using UPPAAL, we have modeled the mode mapping of MuD specified by the set of MMA in Figures 4 and 5 (the UPPAAL model is available at http://mdh.diva-portal.org/smash/record.jsf?pid=diva2%3A1244506&dswid=1426).

The behaviors of the local mode-switch manager of MuD, the self-MMA of MuD and the child MMA of its subcomponents are modeled as separate automata in UPPAAL. For instance, Figure 7 showcases three typical UPPAAL models: for the mode-switch manager of MuD, for *MMA<sup>c</sup><sub>VAE</sub>* and for *MMA<sup>s</sup><sub>MuD</sub>*. Each mode of a component is represented by a state in these UPPAAL models (e.g., *mode\_R3* represents mode *R3* in Figure 7b). These models also contain committed states, marked with "C" in a circle, which are intermediate states during a mode-switch. External and internal signals are simulated as channels synchronized between multiple UPPAAL models. For example, *VAE\_I[R3]!* denotes the internal signal VAE.I(R3) emitted by *MMA<sup>s</sup><sub>MuD</sub>*, while *VAE\_I[R3]?* denotes the same signal VAE.I(R3) received by *MMA<sup>c</sup><sub>VAE</sub>*. The UPPAAL model of *MMA<sup>s</sup><sub>MuD</sub>* in Figure 7c is consistent with *MMA<sup>s</sup><sub>MuD</sub>* in Figure 4. The reason why the UPPAAL model contains one or more intermediate states for each mode-switch is that receiving and sending each signal must be modeled sequentially in UPPAAL. This does not change the execution semantics, as all intermediate states are committed states, whose incoming and outgoing transitions are performed as a single atomic transaction. In addition, as shown in Figure 7a, the mode-switch manager of MuD consists of two states. *InitialS* is the initial state, where the mode-switch manager can send an external signal to *MMA<sup>s</sup><sub>MuD</sub>* and switch to the state *ModeSwitching*. Meanwhile, a Boolean variable *switching* is set to true, indicating an ongoing mode-switch. Depending on the current mode of MuD and the new mode indicated by the external signal from the mode-switch manager, there are four possible events, leading to different transitions among these components: (1) *k*1: MuD requests to switch from *Rd* to *Ed*; (2) *k*2: MuD requests to switch from *Ed* to *Rd*; (3) *k*3: MuD requests to switch from *Ed* to *Dq*; (4) *k*4: MuD requests to switch from *Dq* to *Ed*. Each event ID is assigned to a variable *eventID*, as shown in Figure 7c.

**Figure 7.** UPPAAL models of the mode mapping of MuD. (**a**) UPPAAL model for the mode-switch manager of MuD. (**b**) UPPAAL model for the child MMA of VAE. (**c**) UPPAAL model for the self-MMA of MuD.

Based on the UPPAAL models, we can verify that the set of MMA of MuD satisfies the expected constraints by checking properties formulated in the UPPAAL query language, which is a subset of timed computation tree logic [12]. The following are four types of properties addressing different constraints:


For instance, one such property verifies that a request to switch from *Rd* to *Ed* will make VAE, ViD and AuD switch to *R3*, *Evd* and *Rad*, respectively. This property should be verified for all possible events from *k*1–*k*4.

All these properties are satisfied, with verification times below 4 ms. Furthermore, our UPPAAL models can be used as a common template for modeling any other mode mapping specified by MMA. Due to the graphical resemblance between an MMA and the corresponding UPPAAL model, it is possible to generate UPPAAL models from MMA described by a graphical or textual domain-specific language.

#### **3. Mode Transformation**

Our previous research results [9] show that the mode-switch of a multi-mode component may lead to mode-switches of other multi-mode components in the same system, and it is not trivial to coordinate the mode-switches of different components at run-time. The local mode-switch manager of each component needs to run delicate algorithms to communicate with the parent and subcomponents of the component via dedicated mode-switch ports to switch mode cooperatively. Such inter-component communication incurs run-time computation overhead and mode-switch latency. For instance, when MoS in the healthcare monitoring system triggers a mode-switch from *Rm* to *Att*, the mode-switch event is first propagated from MoS to MuD and EvA, and MuD subsequently propagates the mode-switch event to VAE, ViD and AuD. Further, more handshake messages are exchanged between these components to keep mode consistency. The communication overhead grows as the component hierarchy becomes more complex.

The purpose of mode transformation is to eliminate the need for mode-switch coordination among different multi-mode components by centralizing mode management, and thereby achieve better run-time performance, provided that (1) all components are deployed on the same hardware platform and (2) the mode information of each component is globally accessible. As illustrated in Figure 8, mode transformation transfers the responsibility of mode-switch handling from the local mode-switch manager of each component to a single global mode-switch manager. As a result of mode transformation, each multi-mode component becomes unaware of modes. Instead, a global mode transition graph is generated for the global mode-switch manager to handle mode-switches at the system level.

**Figure 8.** The overview of mode transformation.

Our mode transformation process includes two sequential steps. First, given the mode mappings of all composite components, we construct an intermediate representation, a mode combination tree (MCT), where all the possible system modes are identified. In the second step, based on a list of possible mode-switch events defined in the system, we add transitions between the identified system modes to construct the mode transition graph. The two steps are further explained in the following subsections separately.

#### *3.1. Construction of the Mode Combination Tree*

The aim of constructing the MCT is to identify all the system modes. Let M<sub>*c*</sub> denote the set of supported modes of a component *c* and *D* denote the current mode of a deactivated component. Then, we define system modes as follows:

**Definition 2.** *System modes based on component modes: For a system composed of a set of components $\mathcal{C} = \{c_1, c_2, \cdots, c_n\}$ ($n \in \mathbb{N}$), the set of system modes is defined as $\mathcal{M}_s \subseteq \times_{i \in [1,n]} \{\mathcal{M}_{c_i} \cup \{D\}\}$. Each system mode $m \in \mathcal{M}_s$ is a mode combination of all components.*

By Definition 2, each system mode $m = (m_{c_1}, m_{c_2}, \cdots, m_{c_n})$, where $m_{c_i} \in \mathcal{M}_{c_i} \cup \{D\}$ for $i \in [1,n]$. To indicate the relationship between $c_i$ and $m_{c_i}$ more explicitly, we shall hereafter use an alternative representation of a system mode: $m = \{(c_i, m_{c_i}) \mid i \in [1,n]\}$, where $m_{c_i} \in \mathcal{M}_{c_i} \cup \{D\}$. Using the same formalism, an MCT is defined as follows:

**Definition 3.** *Mode combination tree: An MCT is a tree with a set of nodes $\mathbf{N} = \{\mathcal{N}_0, \mathcal{N}_1, \cdots, \mathcal{N}_n\}$ ($n \in \mathbb{N}$), where $\mathcal{N}_0 = \emptyset$ is the root node, and each other node $\mathcal{N}_i = \{(c_j, m_{c_j}) \mid j \in [1,k], k \in \mathbb{N}\}$ ($i \in [1,n]$), where for all $j$, $m_{c_j} \in \mathcal{M}_{c_j} \cup \{D\}$ and all $c_j$ have the same depth level in the system component hierarchy.*

By Definition 3, each non-root node of an MCT provides a mode combination of components at the same depth level. A typical MCT is displayed in Figure 9, while the construction of the MCT will be further explained later.

A few more notations and concepts need to be introduced before the formal description of the MCT construction process. First, we introduce the valid local mode combination (LMC) of a composite component $c$, which is a feasible combination of a mode of $c$ and the modes of all its subcomponents as per the local mode mapping of $c$. To define the valid LMC of a composite component formally, let $\mathcal{PC}$ and $\mathcal{CC}$ be the sets of primitive and composite components in a system, respectively, and let $Top$ be the component at the top of the component hierarchy. For each $c \in \mathcal{CC}$, a valid LMC of $c$ is formally defined as follows:

**Definition 4.** *Valid local mode combination: For $c \in \mathcal{CC}$ with $\mathcal{SC}_c = \{c_i^1, \cdots, c_i^n\}$ ($n \in \mathbb{N}$), we call the set $\mathcal{V}_c = \{(c, m_c), (c_i^1, m_{c_i^1}), \cdots, (c_i^n, m_{c_i^n})\}$ a valid LMC of $c$, where $m_c \in \mathcal{M}_c \cup \{D\}$ and $\forall k \in [1,n]$, $m_{c_i^k} \in \mathcal{M}_{c_i^k} \cup \{D\}$, if $(m_c, m_{c_i^1}, \cdots, m_{c_i^n})$ is a state of the composed MMA of $c$.*

Note that each element in $\mathcal{V}_c$ is a pair $(x, y)$, where $x \in \mathcal{SC}_c \cup \{c\}$ and $y \in \mathcal{M}_x \cup \{D\}$. For instance, the mode mapping of MoS in Table 1 (a) implies three valid LMCs of MoS: (1) {(MoS, *Rm*), (DaD, *R1*), (MuD, *Rd*), (EvA, *D*)}; (2) {(MoS, *Att*), (DaD, *R1*), (MuD, *Ed*), (EvA, *R2*)}; (3) {(MoS, *Att*), (DaD, *R1*), (MuD, *Dq*), (EvA, *R2*)}.

Based on Definition 4, we further introduce the valid LMC concerning a specific mode of a composite component *c*, which is a feasible combination of the modes of all subcomponents of *c* as per the local mode mapping of *c* when *c* is running in a particular mode. A formal definition is given as follows:

**Definition 5.** *Valid LMC concerning a specific mode: For $c \in \mathcal{CC}$ with $\mathcal{SC}_c = \{c_i^1, \cdots, c_i^n\}$ ($n \in \mathbb{N}$), if, when $c$ is running in $m_c$, $\forall c_i^k \in \mathcal{SC}_c$ ($k \in [1,n]$), $\exists m_{c_i^k}$ such that $\{(c, m_c), (c_i^1, m_{c_i^1}), \cdots, (c_i^n, m_{c_i^n})\}$ is a valid LMC of $c$, then the set $\mathcal{V}_{c,m_c} = \{(c_i^1, m_{c_i^1}), \cdots, (c_i^n, m_{c_i^n})\}$ is a valid LMC of $c$ for $m_c$.*

Depending on the mode mapping of $c$, multiple valid LMCs of $c$ may exist for $m_c$. Let $\mathcal{W}_{c,m_c}$ be the set of all valid LMCs of $c \in \mathcal{CC}$ for $m_c$; each element in $\mathcal{W}_{c,m_c}$ is a set $\mathcal{V}_{c,m_c}$, and the total number of all valid LMCs of $c$ for $m_c$ is $|\mathcal{W}_{c,m_c}|$. For instance, according to Table 1 (a), $\mathcal{W}_{\mathrm{MoS},Att} = \{\mathcal{V}^1_{\mathrm{MoS},Att}, \mathcal{V}^2_{\mathrm{MoS},Att}\}$, where $\mathcal{V}^1_{\mathrm{MoS},Att} = \{(\mathrm{DaD}, R1), (\mathrm{MuD}, Ed), (\mathrm{EvA}, R2)\}$ and $\mathcal{V}^2_{\mathrm{MoS},Att} = \{(\mathrm{DaD}, R1), (\mathrm{MuD}, Dq), (\mathrm{EvA}, R2)\}$. By traversing the states of the composed MMA of $c$ that contain $m_c$, $\mathcal{W}_{c,m_c}$ can easily be generated automatically.
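This automatic generation is straightforward to sketch. The Python fragment below is only illustrative: the tuple-based encoding of composed-MMA states is an assumption, and the MoS states are taken from the three valid LMCs listed above.

```python
def valid_lmcs_for_mode(states, c, m):
    """Collect W_{c,m}: all valid LMCs of composite component c for mode m.

    Each state of the composed MMA of c is modeled as a tuple whose first
    pair is c's own mode and whose remaining pairs are the modes of c's
    subcomponents (a hypothetical encoding for illustration).
    """
    return [frozenset(s[1:]) for s in states if s[0] == (c, m)]

# States of the composed MMA of MoS, matching the three valid LMCs above.
mos_states = [
    (("MoS", "Rm"), ("DaD", "R1"), ("MuD", "Rd"), ("EvA", "D")),
    (("MoS", "Att"), ("DaD", "R1"), ("MuD", "Ed"), ("EvA", "R2")),
    (("MoS", "Att"), ("DaD", "R1"), ("MuD", "Dq"), ("EvA", "R2")),
]

# W_{MoS,Att}: the two valid LMCs of MoS for mode Att.
w_mos_att = valid_lmcs_for_mode(mos_states, "MoS", "Att")
```

Filtering on the component's own mode and dropping that pair yields exactly the sets $\mathcal{V}_{c,m_c}$ of Definition 5.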

Next, we introduce an important operator for combining different valid LMCs:

**Definition 6.** *Valid LMC operation: Consider two sets of valid LMCs $\mathcal{W}_1 = \{\mathcal{V}_1, \mathcal{V}_2, \cdots, \mathcal{V}_m\}$ and $\mathcal{W}_2 = \{\mathcal{V}_{k+1}, \mathcal{V}_{k+2}, \cdots, \mathcal{V}_{k+n}\}$, where $m, n, k \in \mathbb{N}$ and $k \geq m$. Let $\oplus$ be an operator such that $\mathcal{W}_1 \oplus \mathcal{W}_2 = \{\mathcal{V}_i \cup \mathcal{V}_{k+j} \mid i \in [1,m], j \in [1,n]\}$. In addition, for each $l \in \mathbb{N}$, $\mathcal{W}_1 \oplus \mathcal{W}_2 \oplus \cdots \oplus \mathcal{W}_l$ can be represented as $\bigoplus_{o \in [1,l]} \mathcal{W}_o$.*

For the sake of clarity, let us illustrate the $\oplus$ operator with a small example. Suppose $\mathcal{W}_1 = \{\mathcal{V}_1, \mathcal{V}_2\}$, where $\mathcal{V}_1 = \{(a, m_a^1), (b, m_b^1)\}$ and $\mathcal{V}_2 = \{(a, m_a^2), (b, m_b^2)\}$; and $\mathcal{W}_2 = \{\mathcal{V}_3, \mathcal{V}_4\}$, where $\mathcal{V}_3 = \{(c, m_c^1), (d, m_d^1)\}$ and $\mathcal{V}_4 = \{(c, m_c^2), (d, m_d^2)\}$. Then,

$$\begin{split} \mathcal{W}\_{1} \oplus \mathcal{W}\_{2} &= \{ \mathcal{V}\_{1} \cup \mathcal{V}\_{3}, \mathcal{V}\_{1} \cup \mathcal{V}\_{4}, \mathcal{V}\_{2} \cup \mathcal{V}\_{3}, \mathcal{V}\_{2} \cup \mathcal{V}\_{4} \} \\ &= \{ \{ (a, m\_{a}^{1}), (b, m\_{b}^{1}), (c, m\_{c}^{1}), (d, m\_{d}^{1}) \}, \\ &\{ (a, m\_{a}^{1}), (b, m\_{b}^{1}), (c, m\_{c}^{2}), (d, m\_{d}^{2}) \}, \\ &\{ (a, m\_{a}^{2}), (b, m\_{b}^{2}), (c, m\_{c}^{1}), (d, m\_{d}^{1}) \}, \\ &\{ (a, m\_{a}^{2}), (b, m\_{b}^{2}), (c, m\_{c}^{2}), (d, m\_{d}^{2}) \} \} \end{split}$$
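Operationally, $\oplus$ is a cross-product of the two LMC sets followed by a pairwise union. A minimal Python sketch, modeling each LMC as a frozenset of (component, mode) pairs (this encoding and all names are illustrative, not part of the original tooling):

```python
from itertools import product

def lmc_oplus(w1, w2):
    """The oplus operator of Definition 6: every union of one LMC from
    each set. A set of LMCs is modeled as a set of frozensets."""
    return {v1 | v2 for v1, v2 in product(w1, w2)}

# The small example from the text: W1 = {V1, V2}, W2 = {V3, V4}.
V1 = frozenset({("a", "ma1"), ("b", "mb1")})
V2 = frozenset({("a", "ma2"), ("b", "mb2")})
V3 = frozenset({("c", "mc1"), ("d", "md1")})
V4 = frozenset({("c", "mc2"), ("d", "md2")})

# Yields the four combined LMCs V1∪V3, V1∪V4, V2∪V3, V2∪V4.
W12 = lmc_oplus({V1, V2}, {V3, V4})
```

Since each $\mathcal{V}$ is a set of pairs, the set union in the comprehension directly realizes $\mathcal{V}_i \cup \mathcal{V}_{k+j}$.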

Given the mode mappings of all composite components, the MCT of the system can be constructed by creating nodes top-down from the root node. For each node $\mathcal{N}$ of an MCT, let $d_\mathcal{N}$ be its depth level and $\lambda_\mathcal{N}$ be the number of new nodes created from this node. We use $\mathcal{N}_i \leftarrow \mathcal{N}_j$ to denote that a new node $\mathcal{N}_i$ is created from an old node $\mathcal{N}_j$. Moreover, let $\mathcal{M}_{Top} = \{m_T^1, m_T^2, \cdots, m_T^{|\mathcal{M}_{Top}|}\}$ be the set of supported modes of $Top$. The MCT is constructed by the following steps:

1. Create the root node $\mathcal{N}_0 = \emptyset$.
2. For each mode $m_T^i \in \mathcal{M}_{Top}$, create a new node $\mathcal{N}_i \leftarrow \mathcal{N}_0$, with $\mathcal{N}_i = \{(Top, m_T^i)\}$.
3. For each node $\mathcal{N} = \{(c_1, m_{c_1}), \cdots, (c_n, m_{c_n})\}$ containing at least one composite component, create new nodes $\mathcal{N}' \leftarrow \mathcal{N}$, with $\mathcal{N}' \in \bigoplus_{i \in [1,n],\, c_i \in \mathcal{CC}} \mathcal{W}_{c_i, m_{c_i}}$. Moreover, if $\lambda_\mathcal{N} > 1$, then for any two new nodes $\mathcal{N}' \leftarrow \mathcal{N}$ and $\mathcal{N}'' \leftarrow \mathcal{N}$, we have $\mathcal{N}' \neq \mathcal{N}''$.

4. Repeat Step 3 until all branches of the MCT have reached a leaf node.

The MCT construction process is implemented as Algorithm 1, a recursive function $constructMCT(\mathcal{N}, d_\mathcal{N})$ with two input parameters: $\mathcal{N}$, the node currently being explored, and $d_\mathcal{N}$, the depth level of $\mathcal{N}$. Initially, $\mathcal{N} = \emptyset$ and $d_\mathcal{N} = 0$. We assume that $Top$ has subcomponents; otherwise, $Top$ itself would be the entire system, and mode transformation would be meaningless. Moreover, for each component $c$ running in mode $m$, we assume that $\mathcal{W}_{c,m}$ is an indexed set such that $\mathcal{W}_{c,m}[i]$ represents the $i$-th element of $\mathcal{W}_{c,m}$.

**Algorithm 1** *constructMCT*(N , *d*<sup>N</sup> ).

1: **if** $d_\mathcal{N} = 0$ **then**
2: &nbsp;&nbsp;$\lambda_\mathcal{N} := |\mathcal{M}_{Top}|$;
3: &nbsp;&nbsp;**for** $i$ from 1 to $\lambda_\mathcal{N}$ **do**
4: &nbsp;&nbsp;&nbsp;&nbsp;$\mathcal{N}_i := \{(Top, m_T^i)\}$;
5: &nbsp;&nbsp;&nbsp;&nbsp;$constructMCT(\mathcal{N}_i, 1)$;
6: &nbsp;&nbsp;**end for**
7: **end if**
8: **if** $d_\mathcal{N} = 1$ **then**
9: &nbsp;&nbsp;$\{(Top, m)\} := \mathcal{N}$;
10: &nbsp;&nbsp;Derive $\mathcal{W}_{Top,m}$;
11: &nbsp;&nbsp;$\lambda_\mathcal{N} := |\mathcal{W}_{Top,m}|$;
12: &nbsp;&nbsp;**for** $i$ from 1 to $\lambda_\mathcal{N}$ **do**
13: &nbsp;&nbsp;&nbsp;&nbsp;$constructMCT(\mathcal{W}_{Top,m}[i], 2)$;
14: &nbsp;&nbsp;**end for**
15: **end if**
16: **if** $d_\mathcal{N} \geq 2$ **then**
17: &nbsp;&nbsp;$\{(c_1, m_{c_1}), (c_2, m_{c_2}), \cdots, (c_n, m_{c_n})\} := \mathcal{N}$;
18: &nbsp;&nbsp;**if** $\forall i \in [1,n]: c_i \in \mathcal{PC}$ **then**
19: &nbsp;&nbsp;&nbsp;&nbsp;**return**;
20: &nbsp;&nbsp;**else**
21: &nbsp;&nbsp;&nbsp;&nbsp;Derive $\mathcal{W} := \bigoplus_{i \in [1,n],\, c_i \in \mathcal{CC}} \mathcal{W}_{c_i, m_{c_i}}$;
22: &nbsp;&nbsp;&nbsp;&nbsp;$\lambda_\mathcal{N} := \prod_{i \in [1,n],\, c_i \in \mathcal{CC}} |\mathcal{W}_{c_i, m_{c_i}}|$;
23: &nbsp;&nbsp;&nbsp;&nbsp;**for** $i$ from 1 to $\lambda_\mathcal{N}$ **do**
24: &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;$constructMCT(\mathcal{W}[i], d_\mathcal{N} + 1)$;
25: &nbsp;&nbsp;&nbsp;&nbsp;**end for**
26: &nbsp;&nbsp;**end if**
27: **end if**

Once the MCT is constructed, the system modes can be derived as the set of paths from the root node to the leaf nodes of the MCT. The total number of system modes is equal to the total number of leaf nodes of the MCT. Among the system modes, the initial system mode can be recognized based on the specification of the initial modes of all components.
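The recursion of Algorithm 1 together with the path-based derivation of system modes can be sketched in Python. This is a simplified sketch, not the tool implementation: the `DemoSystem` helper, its API (`top_modes`, `valid_lmcs`, `is_primitive`) and the toy hierarchy are illustrative assumptions, and the depth-1 case of Algorithm 1 is folded into the general combination step, since $Top$ is itself composite.

```python
def construct_mct(system, node, depth, tree):
    """Build the MCT top-down. A node is a frozenset of (component, mode)
    pairs; `tree` maps each created node to its parent node, so that
    root-to-leaf paths can be recovered afterwards."""
    if depth == 0:  # root: one child per supported mode of Top
        for m in system.top_modes:
            child = frozenset({("Top", m)})
            tree[child] = node
            construct_mct(system, child, 1, tree)
        return
    composites = [(c, m) for c, m in node if not system.is_primitive(c)]
    if not composites:
        return  # leaf node: all components at this level are primitive
    combined = [frozenset()]
    for c, m in composites:  # the oplus-combination over all W_{c,m}
        combined = [v | w for v in combined for w in system.valid_lmcs(c, m)]
    for child in combined:
        tree[child] = node
        construct_mct(system, child, depth + 1, tree)

def system_modes(tree):
    """Derive system modes as the root-to-leaf paths of the MCT."""
    parents = set(tree.values())
    leaves = [n for n in tree if n not in parents]
    modes = []
    for leaf in leaves:
        mode, node = frozenset(leaf), leaf
        while node in tree:  # walk up to the root, accumulating pairs
            node = tree[node]
            mode |= node
        modes.append(mode)
    return modes

class DemoSystem:
    """A toy two-level hierarchy used only to exercise the sketch."""
    top_modes = ["m1"]
    _w = {  # indexed sets W_{c,m} for the composite components
        ("Top", "m1"): [frozenset({("a", "x"), ("b", "y")})],
        ("b", "y"): [frozenset({("c", "z")}), frozenset({("c", "z2")})],
    }
    def valid_lmcs(self, c, m):
        return self._w[(c, m)]
    def is_primitive(self, c):
        return c in {"a", "c"}

mct = {}
construct_mct(DemoSystem(), frozenset(), 0, mct)
modes = system_modes(mct)
# `modes` holds two system modes, differing only in c's mode (z vs. z2).
```

Each derived system mode is a complete mode combination of all components, mirroring the union of nodes along one MCT path.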

As an example, Figure 9 illustrates the MCT of the monitoring subsystem introduced in Section 1. The MCT consists of nine nodes $\mathcal{N}_0$–$\mathcal{N}_8$ spanning four depth levels. Represented by the respective paths of the MCT, the three identified system modes are:

$$\begin{aligned} m\_1 &= \mathcal{N}\_0 \cup \mathcal{N}\_1 \cup \mathcal{N}\_3 \cup \mathcal{N}\_6 \\ &= \{ (\mathit{MoS}, \mathit{Rm}), (\mathit{DaD}, \mathit{R1}), (\mathit{MuD}, \mathit{Rd}), (\mathit{EvA}, \mathit{D}), (\mathit{VAE}, \mathit{D}), (\mathit{ViD}, \mathit{Rvd}), (\mathit{AuD}, \mathit{D}) \} \\ m\_2 &= \mathcal{N}\_0 \cup \mathcal{N}\_2 \cup \mathcal{N}\_4 \cup \mathcal{N}\_7 \\ &= \{ (\mathit{MoS}, \mathit{Att}), (\mathit{DaD}, \mathit{R1}), (\mathit{MuD}, \mathit{Ed}), (\mathit{EvA}, \mathit{R2}), (\mathit{VAE}, \mathit{R3}), (\mathit{ViD}, \mathit{Evd}), (\mathit{AuD}, \mathit{Rad}) \} \\ m\_3 &= \mathcal{N}\_0 \cup \mathcal{N}\_2 \cup \mathcal{N}\_5 \cup \mathcal{N}\_8 \\ &= \{ (\mathit{MoS}, \mathit{Att}), (\mathit{DaD}, \mathit{R1}), (\mathit{MuD}, \mathit{Dq}), (\mathit{EvA}, \mathit{R2}), (\mathit{VAE}, \mathit{R3}), (\mathit{ViD}, \mathit{Evd}), (\mathit{AuD}, \mathit{D}) \} \end{aligned}$$

Assuming that the monitoring subsystem starts with mode *Rm*, *m*<sup>1</sup> is the initial system mode after mode transformation. Figure 10 shows the configurations of the three system modes based on the component connections in Figure 1.

**Figure 9.** The mode combination tree of the monitoring subsystem.

**Figure 10.** The configurations of different system modes after mode transformation.

The complexity of an MCT depends on the structure of the component hierarchy, the number of modes of each component and the mode mappings of the involved components. A worst-case combination of factors, such as the number of components and the number of component modes, may lead to a huge number of system modes, increasing the overhead exponentially. In practice, however, the expected number of system modes should be limited. If mode transformation becomes intractable due to extreme computation overhead, this would imply that the system is too complex for centralized mode management. It may then be more suitable to use distributed mode management without mode transformation, although the run-time overhead of the required message exchange may be substantial if the component hierarchy is deep. Alternatively, a better solution could be partial mode transformation, i.e., performing mode transformation within one or more composite components instead of the entire system. Our mode transformation technique is flexible enough to support partial mode transformation at any component level. Furthermore, we expect noticeably different behaviors in different modes. Depending on the application, it could be more efficient to merge several modes with similar global configurations into a single mode. The criteria for merging system modes are application-dependent and out of the scope of this article. Nevertheless, we believe that it is possible to partially automate the merging of system modes in a later optimization phase by certain application-independent merging rules.

#### *3.2. Deriving the Mode Transition Graph*

The constructed MCT identifies the system modes, on top of which the mode transition graph is subsequently derived based on the definition of mode-switch events. We assume that a mode-switch event is triggered by a component $c$ requesting to switch mode from $m_c^1$ to $m_c^2$, denoted as $c: m_c^1 \rightarrow m_c^2$. The triggering of each mode-switch event may lead to the mode-switches of some other components in the same system. For a system with a set of identified system modes $\mathcal{M} = \{m_1, m_2, \cdots, m_n\}$ ($n \in \mathbb{N}$), a mode-switch is a transition from $m_{old}$ to $m_{new}$, where $m_{old}, m_{new} \in \mathcal{M}$ and $m_{old} \neq m_{new}$. A mode transition graph contains all the possible transitions between these system modes and associates each transition with the corresponding mode-switch event. Similar to an MMA, each state of a mode transition graph can be graphically represented by a circle, with the initial state marked by a double circle. A graphical illustration of the mode transition graph can be found in Figure 11.

**Figure 11.** Deriving the mode transition graph of the monitoring subsystem. CTM, component target mode.

The key issue in deriving the mode transition graph is to identify, for each mode-switch event, the system modes $m_{old}$ and $m_{new}$ between which a system mode-switch is possible. Consider a mode-switch event $k$ identified as $c: m_c^1 \rightarrow m_c^2$. The only condition for the triggering of $k$ is that the triggering source $c$ is currently running in mode $m_c^1$. Hence, for each $k$, any system mode $m_{old}$ with $(c, m_c^1) \in m_{old}$ can easily be identified. Note that more than one system mode could be identified as $m_{old}$: depending on the current system mode, a mode-switch event may enable different transitions.

In contrast to $m_{old}$, only one system mode can be the $m_{new}$ for each mode-switch event $k$. The identification of $m_{new}$ for $k$ is more difficult because it depends not only on $m_c^2$, but also on the target modes of the other components. We identify the $m_{new}$ for each mode-switch event with the assistance of a component target mode (CTM) table. A CTM table has $n_1$ rows and $n_2$ columns, where $n_1$ is the number of components of the system and $n_2$ is the number of mode-switch events. An example of a CTM table is shown above the mode transition graph in Figure 11. In the CTM table, each row is associated with a component, each column is associated with a mode-switch event and each cell contains the target mode $m_c$ of the corresponding component $c$ for the corresponding mode-switch event $k$. A cell marked $X$ indicates that $m_c$ is independent of $k$, i.e., $k$ does not lead to a mode-switch of $c$.

A CTM table can be automatically constructed offline based on the list of mode-switch events and the mode mapping of each composite component. Let $m_c^k$ be the target mode of $c$ for $k$ in a CTM table. Taking advantage of the CTM table, the new system mode $m_{new}$ for each mode-switch event $k$ can be identified as follows: for each system mode $m = \{(c_i, m_{c_i}) \mid i \in [1,n], n \in \mathbb{N}\}$, if for all $i$ with $m_{c_i}^k \neq X$ in the CTM table (i.e., $k$ leads to the mode-switch of $c_i$ to a new mode $m_{c_i}^k$), we have $m_{c_i} = m_{c_i}^k$, then $m$ is the $m_{new}$ for $k$. Algorithm 2 describes the process of building the mode transition graph, with a search space of $O(|\mathcal{M}| \cdot |\mathcal{K}|)$.

#### **Algorithm 2** *constructMTG*(C,M, K).

1: $\mathcal{C} = \{c_1, \cdots, c_o\}$ ($o \in \mathbb{N}$); {*The set of all components*}
2: $\mathcal{M} = \{m_1, \cdots, m_n\}$ ($n \in \mathbb{N}$); {*The set of identified system modes*}
3: $\mathcal{K} = \{k_1, \cdots, k_l\}$ ($l \in \mathbb{N}$); {*The set of all mode-switch events*}
4: **for all** $k_i \in \mathcal{K}$, where $i \in [1,l]$ and $k_i = c: m_c^1 \rightarrow m_c^2$ **do**
5: &nbsp;&nbsp;**if** $\exists m_j \in \mathcal{M}$ s.t. ($\forall c_p \in \mathcal{C}$ with $m_{c_p}^{k_i} \neq X$, $(c_p, m_{c_p}^{k_i}) \in m_j$) **then**
6: &nbsp;&nbsp;&nbsp;&nbsp;$m_{new} := m_j$;
7: &nbsp;&nbsp;&nbsp;&nbsp;**for all** $m_j \in \mathcal{M}$ **do**
8: &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**if** $(c, m_c^1) \in m_j$ **then**
9: &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;$addTransition(m_j, m_{new}, k_i)$; {*Add a transition from $m_j$ to $m_{new}$ labeled with $k_i$*}
10: &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**end if**
11: &nbsp;&nbsp;&nbsp;&nbsp;**end for**
12: &nbsp;&nbsp;**end if**
13: **end for**
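The identification of $m_{new}$ and $m_{old}$ in Algorithm 2 can be sketched in Python as follows. The dictionary-based encodings of the event list and the CTM table, and all names, are illustrative assumptions rather than the tool's actual data structures.

```python
def construct_mtg(components, modes, events, ctm):
    """Sketch of Algorithm 2: derive the mode transition graph.

    Assumed (illustrative) inputs:
      - components: names of all components,
      - modes: system modes, each a frozenset of (component, mode) pairs,
      - events: maps each event k to its triggering pair (c, m1),
      - ctm: the CTM table, mapping (component, k) to a target mode,
        or "X" when k does not affect that component.
    Returns the transitions as (m_old, m_new, k) triples.
    """
    transitions = []
    for k, (c, m1) in events.items():
        # m_new: the unique system mode matching every CTM target for k
        matches = [m for m in modes
                   if all((cp, ctm[(cp, k)]) in m
                          for cp in components if ctm[(cp, k)] != "X")]
        if not matches:
            continue  # k leads to no identified system mode
        m_new = matches[0]
        # m_old: any system mode in which c currently runs in m1
        for m in modes:
            if (c, m1) in m:
                transitions.append((m, m_new, k))
    return transitions

# A two-component example: event e1 switches c1 from a to b and, per the
# CTM table, forces c2 from p to q.
m1 = frozenset({("c1", "a"), ("c2", "p")})
m2 = frozenset({("c1", "b"), ("c2", "q")})
mtg = construct_mtg(
    ["c1", "c2"], [m1, m2],
    {"e1": ("c1", "a")},
    {("c1", "e1"): "b", ("c2", "e1"): "q"},
)
# mtg contains the single transition (m1, m2, "e1").
```

The outer loop over events and the inner scans over system modes give the same $O(|\mathcal{M}| \cdot |\mathcal{K}|)$ search space noted above.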

Figure 11 presents the workflow for deriving the mode transition graph of the monitoring subsystem. The CTM table is derived from two inputs: (1) the mode mappings of the composite components MoS and MuD, specified by MMAs, and (2) the possible mode-switch events. For this example, four mode-switch events, $k_1$ to $k_4$, are specified at design time: $k_1$ and $k_2$ are triggered by MoS for switching between modes *Rm* and *Att*, while $k_3$ and $k_4$ are triggered by MuD for switching between modes *Ed* and *Dq*. The target modes of all components in the monitoring subsystem for all mode-switch events are listed in the CTM table. The MCT in Figure 9 has already identified three system modes: $m_1$ (the initial mode), $m_2$ and $m_3$. Based on the CTM table, transitions between the system modes are then added for each possible mode-switch event, yielding the mode transition graph. The mode transition graph helps the global mode-switch manager keep track of the current system mode and makes the system switch to the right target mode when a mode-switch is triggered.

After mode transformation, the local mode-switch managers are replaced with a single global mode-switch manager, whose complexity is much lower than that of each local mode-switch manager. A local mode-switch manager needs complex algorithms [9] to coordinate mode-switches between a composite component and its subcomponents, such as checking component states before a mode-switch is performed, handling multiple concurrent mode-switch events triggered by different components and handling emergency mode-switch events, which are more critical than regular mode-switch events. By contrast, the global mode-switch manager only manages system modes based on a single CTM table.

Mode transformation assumes no dynamic change of modes or mode mappings at any level. If there is a need to change the modes of a component or its mode mapping (e.g., adding new modes, removing modes, changing mode names), then mode transformation must be applied again from scratch. The chain effect of such a change must be considered when it is propagated to other components: the change of the mode mapping of one component may entail changes to the mode mappings of its parent component and subcomponents. The architecture designer should decide how to update the MMAs of the components impacted by a change. Once all MMAs are in place, mode transformation can always be automated in the same way.

A potential drawback of mode transformation is the loss of potential concurrency between local mode-switch managers. If multiple mode-switch events are triggered concurrently and affect disjoint sets of components, distributed mode management before mode transformation allows these mode-switch events to be handled concurrently, whereas after transformation the different mode-switch events have to be handled sequentially by the global mode-switch manager. Nonetheless, the centralized mode management after mode transformation eliminates inter-component communication, which is a complex process [9]. Hence, mode transformation is still more likely to yield a faster mode-switch.

The correctness of the two steps of mode transformation has been verified by manual theorem proving. All the detailed theorems and proofs can be found in the extended technical report [10].

#### *3.3. Concrete Implementation of Mode Transformation*

A prototype tool, MCORE [13] (the Multi-mode COmponent Reuse Environment), has been developed to support the modeling of multi-mode systems with multi-mode components by integrating mode mapping and mode transformation. Compared with other component-based development tools, a distinguishing feature of MCORE is the reuse of multi-mode software components. As far as we know, MCORE is the first (and possibly only) tool for building multi-mode systems with multi-mode components. MCORE can potentially be used as a preprocessor for Rubus ICE [14], an IDE for the Rubus component model [15] developed by Arcticus Systems (http://www.arcticus-systems.com/). As an industrial component model, Rubus targets the component-based development of vehicular systems. Rubus supports multi-mode systems; however, modes can only be specified at the system level, and the reuse of multi-mode components is not supported. This limitation can be alleviated by MCORE. After mode transformation, a system model built from multi-mode components in MCORE complies with the Rubus component model. Hence, the system model designed in MCORE can be imported into Rubus ICE for further analysis, test and code generation.

#### **4. Related Work**

The extended MECHATRONICUML [16,17] (EUML) allows the hierarchical composition of reconfigurable components, which are comparable to our multi-mode components. EUML introduces an additional reconfiguration port for each component, which resembles the dedicated mode-switch ports of a multi-mode component. In EUML, the reconfiguration of a composite component is handled by two dedicated subcomponents, which play similar roles as the local mode-switch manager of a multi-mode component. Unlike our approach, EUML does not pre-define component configurations at design-time, thus allowing more flexible reconfiguration at run-time. Compared with such reconfigurable systems, multi-mode systems built by multi-mode components are more predictable due to static configurations specified at design-time.

Pop et al. proposed an Oracle-based approach [18] that also supports the reuse of multi-mode components. Component behaviors are abstracted into a global property network, in which a component mode is treated as a property dependent on other property values. The change of one property is propagated throughout the property network, potentially leading to changes of other properties. At the end of propagation, component modes are updated top-down. Similar to our mode transformation, a finite-state machine called the Oracle is constructed offline to guarantee a predictable update time of the property network. However, the mapping between component modes is not systematically specified in the Oracle-based approach.

Weimer et al. proposed a set of input-output blocks for building multi-mode systems [19]. Each multi-mode component contains a set of mode blocks (MBs), while each MB includes all the components used for the corresponding mode. The mode-switch of a component is achieved by switching the currently selected MB, controlled by a supervisor block (SB). These blocks were implemented in Simulink [20]. Another similar work is the mode-oriented design [21] in Gaspard2 [22]. A multi-mode component is represented by a macro component, which consists of a state graph component and one or more mode-switch components. A mode-switch component plays the same role as the MB in [19]. Both approaches in [19,21] use completely different components for different modes, whereas in our approach, it is possible to share some components and connections between different modes. Hence, our approach is more suitable for the reuse of multi-mode components.

Mode-switch has been addressed in a number of component models, e.g., SaveCCM [23], COMDES-II [24] and MyCCM-HI [25], to name a few. There are also some other component models that have been commercialized, e.g., Koala [26] (targeting consumer electronics) and Rubus [15] (targeting ground vehicles). These component models have different notions of mode-switch handling. Koala and SaveCCM both use a special connector *switch* to achieve the structural diversity of a component. *switch* selects outgoing connections based on input data. In COMDES-II, a state-machine component switches component configurations in different modes. Rubus only considers system-level mode, which is in line with our system mode after mode transformation. MyCCM-HI supports mode-aware components whose mode-switch is controlled by a mode automaton associated with each component. Another component model supporting component reconfiguration is Fractal [27]. Each Fractal component has a membrane (a container for local controllers) that is able to control the reconfiguration of the component.

Mode-switch has also been covered by some programming and specification languages, such as AADL [28], Giotto [29], TDL [30], the extended Darwin [31] and mode-automata [32]. In AADL, component mode-switch is represented by a state machine, including states, transitions and input/output event ports used for mode-switch triggering. Both Giotto and TDL are time-triggered languages for embedded programming, which require periodic checking of conditions to decide whether to trigger a mode-switch or not. The extended Darwin [31] extends the existing Architecture Description Language Darwin [33] by incorporating the notion of mode. The mode of a composite component is directly related to the modes of its subcomponents. Yet, the mapping between modes is unclear in [31]. Mode-automata is a programming model supporting the description of running modes of reactive systems. The behavior of a system is a sequence of modes, each of which corresponds to a collection of execution states. Our MMA differs from mode-automata in the sense that mode-automata specifies the hierarchical structure of system-wide modes, whereas MMA specifies the local mode mapping within composite components.

Dynamic software product lines (DSPL) [34], which originates from the conventional software product lines (SPL) [35] for producing a family of software systems, is an emerging technique for developing adaptive systems. Different systems configured from the same SPL share certain common features, whereas the SPL uses variation points to distinguish the unique features of each system. DSPL allows the binding of variation points at run-time so that a system can dynamically change configurations on the fly to accommodate the changing environment. DSPL is becoming more adaptive [36]; however, to the best of our knowledge, DSPL only considers global system configurations without considering reuse of adaptive software components.

Different types of automata have been proposed for component-based systems and multi-mode systems. For instance, constraint automata [37] is used to model the functional coordination of components, thereby enabling the formal verification of coordination mechanisms. Besides, multi-mode automata [38] is intended for compositional analysis of multi-mode real-time systems. The MMA presented in this article serves as a formalism for a unique and dedicated purpose: mode mapping, which to our knowledge has not been addressed by other existing automata.

Criado et al. [39] proposed a method for an adaptive component-based architecture using model transformation. Software architecture can be dynamically constructed based on transformation rules defined in a repository. Their proposal was applied to component-based GUIs for web applications. Compared to our approach, their adaptation runs at the system level only.

#### **5. Conclusions and Future Work**

Partitioning system behaviors into modes and component-based software engineering are both successful methods for taming the growing software complexity of modern cyber-physical systems (CPS). Combining the two methods is still an under-researched area, due to their conflicting natures: multi-mode systems are built top-down, while component-based systems are built bottom-up. In this article, we combine the advantages of both methods and propose the component-based software development of multi-mode systems, characterized by the reuse of multi-mode components, i.e., components that can run in different modes and switch mode guided by a local mode-switch manager. We specify the local mode mapping of each composite component by mode mapping automata. Mode mapping is then complemented by a mode transformation technique that transforms component modes into system modes for centralized mode management. This improves mode-switch performance, since the transformation eliminates the need for inter-component communication to coordinate a mode-switch at run-time, thereby reducing mode-switch overhead and shortening mode-switch time. Mode transformation is an optional and flexible process: it can be applied to the entire system, provided that the mode information of all components is globally accessible and all software components are deployed on the same hardware platform, or only within certain composite components. It can even be performed iteratively. For instance, in scenarios where systems are built from composite components provided by different vendors that do not want to reveal the internal structure of their components to the integrator, each vendor could apply mode transformation at the level of their respective composite component, and the integrator could then compose the resulting mode mappings into a system-wide centralized mode management.

The healthcare monitoring system introduced in this article is only a proof-of-concept guiding example. In future work, our software development approach should be further evaluated, and its applicability and concrete implementation should be explored in more substantial real-world systems before deployment. Moreover, some remaining effort is needed to fully complete the development of our prototype tool MCORE and its integration with the commercial tool Rubus ICE developed by Arcticus Systems. This will allow us to develop reusable multi-mode software components in MCORE as a preprocessor of Rubus ICE, perform mode transformation therein and then export the system model with global system modes to Rubus ICE for further analysis, test and code generation. Still, the actual effects in terms of resulting improvements, additional and/or reduced efforts, improved quality, etc., throughout the life-cycle of a CPS require empirical evidence much beyond what is presented in this article.

At a more general level, this article presents essential bricks and related glue for a small part of the wall needed to tame the complexity of CPS-related development and life-cycle challenges to a level that allows future deployment of the many technical solutions required to address several of the key challenges of modern society successfully. Many more bricks are however needed, as well as the glue that enables their successful composition.

**Author Contributions:** methodology, H.H. (Hang Yin); validation, H.H. (Hang Yin); formal analysis, H.H. (Hang Yin); writing—original draft preparation, H.H. (Hang Yin); writing—review and editing, H.H. (Hans Hansson); supervision, H.H. (Hans Hansson); project administration, H.H. (Hans Hansson).

**Funding:** This work was funded by the Swedish Research Council via the framework project ARROWS (ref. 90447401) and Mälardalen University.

**Acknowledgments:** The authors would like to thank Arcticus Systems for discussions and support.

**Conflicts of Interest:** The authors declare no conflict of interest.


© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Article* **Adaptive Time-Triggered Multi-Core Architecture**

#### **Roman Obermaisser \*, Hamidreza Ahmadian, Adele Maleki, Yosab Bebawy, Alina Lenz and Babak Sorkhpour**

Department of Electrical Engineering and Computer Science, University of Siegen, 57068 Siegen, Germany; hamidreza.ahmadian@uni-siegen.de (H.A.); Adele.Maleki@uni-siegen.de (A.M.); Yosab.Bebawy@uni-siegen.de (Y.B.); alina.lenz@uni-siegen.de (A.L.); Babak.Sorkhpour@uni-siegen.de (B.S.)

**\*** Correspondence: roman.obermaisser@uni-siegen.de; Tel.: +49-271-740-3332

Received: 27 September 2018; Accepted: 18 January 2019; Published: 22 January 2019

**Abstract:** The static resource allocation in time-triggered systems offers significant benefits for the safety arguments of dependable systems. However, adaptation is a key factor for energy efficiency and fault recovery in Cyber-Physical Systems (CPS). This paper introduces the Adaptive Time-Triggered Multi-Core Architecture (ATMA), which supports adaptation using multi-schedule graphs while preserving the key properties of time-triggered systems including implicit synchronization, temporal predictability and avoidance of resource conflicts. ATMA is an overall architecture for safety-critical CPS based on a network-on-a-chip with building blocks for context agreement and adaptation. Context information is established in a globally consistent manner, providing the foundation for the temporally aligned switching of schedules in the network interfaces. A meta-scheduling algorithm computes schedule graphs and avoids state explosion with reconvergence horizons for events. For each tile, the relevant part of the schedule graph is efficiently stored using difference encodings and interpreted by the adaptation logic. The architecture was evaluated using an FPGA-based implementation and example scenarios employing adaptation for improved energy efficiency. The evaluation demonstrated the benefits of adaptation while showing the overhead and the trade-off between the degree of adaptation and the memory consumption for multi-schedule graphs.

**Keywords:** time-triggered system; real-time; cyber-physical systems; adaptation; scheduling; multi-core

#### **1. Introduction**

Safety-critical CPS demand assured services under all considered load and fault assumptions in order to minimize the risk to people, property and the environment. Therefore, electronic systems are subject to domain-specific certification processes (e.g., ISO 26262, IEC 61508 and DO-178/DO-254), which provide documented evidence and safety arguments.

Time-triggered systems facilitate the establishment of safety arguments and have thus become prevalent in many safety-critical applications. Schedule tables are defined at development time and determine global points in time for the initiation of activities, such as sending a message or starting a computation. Major benefits are temporal predictability, temporal composability and support for fault containment [1]. Time-triggered systems simplify the system design and reduce the probability of design faults by offering implicit synchronization, implicit flow control and transparent fault tolerance. By deriving all control signals from the global time base, there is no control flow between application components, which can be independently developed and seamlessly integrated. Furthermore, a priori knowledge about the permitted temporal behavior can be used by network guardians or operating systems for isolating faulty messages or tasks, thereby preventing fault propagation via shared resources. This fault containment is a prerequisite for active redundancy as well as modular and incremental certification [2,3].

Time-triggered operation has been realized at different levels. Many safety-critical distributed systems are deployed with time-triggered communication networks such as Time-Triggered Ethernet (TTE), Time-Triggered Protocol (TTP), FlexRay and IEEE 802.1Qbv/TSN [1,4]. Time-triggered operating systems and hypervisors (e.g., ARINC653 [5]) adopt scheduling tables for cyclic time-based executions of partitions to virtualize the processor. Time-triggered multi-core architectures (e.g., TTMPSoC [6] and COMPSoC [7]) use time-triggered Network-on-Chips (NoCs) in analogy to the time-triggered networks in distributed systems.

At the same time, adaptive system behaviors upon relevant events are desirable to improve energy efficiency, reliability and context awareness. For example, execution slack enables energy management, such as voltage/frequency scaling and clock gating. Information about faults can serve for fault recovery by redistributing application services on the system's remaining resources. Changing environmental conditions or operational modes may demand different application services (e.g., take-off vs. in-flight of an airplane).

Time-triggered systems can support this adaptation through the deployment of precomputed schedules for the relevant events [8]. However, the major challenge is preserving the properties of time-triggered systems, such as temporal predictability, fault containment and implicit synchronization. Therefore, all building blocks must consistently switch between schedules. This requires system-wide consistent information about the context events determining the adaptation. In addition, the adaptation must work correctly in the presence of faults to prevent the introduction of vulnerabilities. Further requirements are bounded times of adaptation for fault recovery and the efficient storage of large numbers of schedules.

This paper introduces the *Adaptive Time-triggered Multi-core Architecture (ATMA)* that fulfills these requirements. The architecture establishes a system-wide consistent agreement on context events as well as robust and efficient switching between schedules.

Prior work has addressed computing schedule graphs for time-triggered systems (e.g., [9,10]). However, the combination of agreement, adaptation and meta-scheduling as part of a time-triggered multi-core architecture supporting implicit synchronization, fault containment and timeliness is an open research problem. Furthermore, the presented techniques for minimizing the overhead of adaptation (e.g., difference encoding of multi-schedule graphs, and adjustable reconvergence horizons for events) enable scalability and the deployment in resource-constrained applications.

The paper builds on previous work of the authors where a non-adaptive time-triggered multi-core architecture [6] was introduced as well as individual components for agreement [11] and adaptation [12]. This paper introduces the overall architecture of an adaptive time-triggered multi-core architecture along with the interplay of agreement, adaptation and meta scheduling. The paper provides experimental results for the overall architecture and shows the suitability for improved energy efficiency and fault tolerance. In addition, the paper introduces adaptation concepts for time-triggered systems and describes the services and system properties, which are essential to preserve implicit synchronization, temporal predictability and avoidance of resource conflicts.

The remainder of the paper is structured as follows. Section 2 analyzes the challenges and requirements for adaptation in time-triggered systems. The ATMA is the focus of Section 3. Section 4 describes the computation of multiple schedules, each serving for certain context events. Section 5 introduces agreement services, which establish chip-wide consistent context information. The context information is used for adaptive communication in Section 6. Section 7 presents example scenarios and the experimental evaluation.

#### **2. Adaptation in Time-Triggered Systems**

Adaptation in time-triggered systems is motivated by higher energy efficiency, fault recovery and the adjustment to changing environmental conditions. However, the fundamental properties and strengths of time-triggered systems must be preserved in order to obtain suitability for safety-critical systems.

#### *2.1. Properties of Time-Triggered Systems*

Time-triggered systems exhibit specific properties, which result from the dispatching of cyclic activities driven by the global time base and precomputed schedule tables:


These properties are essential characteristics of a time-triggered system and must be preserved despite adaptation. Henceforth, we call these properties the *ATMA properties*. In the ATMA, these properties are determined by the schedules computed offline, as well as the correct dispatching and schedule switching at runtime.

In order to realize these properties, a schedule encompasses three dimensions as depicted in Figure 1. The schedule defines in the temporal dimension when activities need to be dispatched with respect to the global time base. In the spatial dimension, the schedule defines the triggers for the different resources of the multi-core architecture such as Network Interfaces (NIs), routers and communication links. The third dimension corresponds to the contextual dimension, where different plans for the context events are distinguished. Overall, the time-triggered schedule thus defines what activities shall be dispatched for each resource and each relevant context event at which global points in time.

**Figure 1.** Deployment of time-triggered schedules (**left**) on the building blocks of the ATMA, such as NIs and routers (**right**).

*Time-Triggered Schedules Multi-Core Architecture*

In general, the schedules must be deployed in multiple building blocks of the architecture, as depicted on the right-hand side of Figure 1. Each schedule is fragmented along the spatial dimension, and the resulting parts are assigned to the resources: NIs are the targets in the case of NoCs with source-based routing, and routers in the case of distributed routing. Furthermore, in order to obtain the ATMA properties, consistency of the schedules in the temporal and spatial dimensions is essential. For instance, messages must be passed without resource conflicts along the communication links in the spatial and temporal dimensions, while satisfying deadlines and precedence constraints.
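To illustrate the three dimensions of a time-triggered schedule described above, the following Python sketch models a schedule as a lookup keyed by context, resource and global time. This is our own simplified model for exposition, not part of ATMA; all names are illustrative.

```python
from collections import defaultdict

class Schedule:
    """Toy model of a three-dimensional time-triggered schedule:
    (context, resource) -> sorted list of (global_time, activity)."""

    def __init__(self):
        self._table = defaultdict(list)

    def add(self, context, resource, time, activity):
        # Contextual dimension: 'context' distinguishes plans per context event.
        # Spatial dimension: 'resource' is an NI, router or communication link.
        # Temporal dimension: 'time' is a point on the global time base.
        self._table[(context, resource)].append((time, activity))
        self._table[(context, resource)].sort()

    def dispatch(self, context, resource, time):
        # Activities to dispatch for this resource and context at this instant.
        return [a for t, a in self._table[(context, resource)] if t == time]

s = Schedule()
s.add("no_fault", "NI0", 10, "inject msg m1")
s.add("no_fault", "router1", 12, "forward m1")
print(s.dispatch("no_fault", "NI0", 10))  # ['inject msg m1']
```

The point of the sketch is only that every dispatching decision is fully determined by the triple of context, resource and global time, which is what makes each schedule analyzable in isolation.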

#### *2.2. Need for Adaptation*

In the following, the motivation for adaptation in time-triggered systems is detailed.

#### 2.2.1. Energy Efficiency

Energy efficiency is relevant for many safety-critical systems. Examples are battery-operated devices (e.g., medical equipment and wireless sensors for railway systems [13]) and systems with thermal constraints (e.g., avionics [14]). While techniques for energy management such as Dynamic Voltage and Frequency Scaling (DVFS) and clock gating are common in many application areas (e.g., consumer electronics), their applicability for safety-critical systems is often limited. Certification and the computation of Worst-Case Execution Times (WCETs) for multi-core processors are challenging by themselves [15] and further complicated by energy management. If the dynamic slack of one task is exploited for dynamic modifications of frequencies or clock gating, unpredictable timing effects and fault propagation can occur for other tasks due to shared resources such as caches and I/O.

In safety-critical systems, we can distinguish between two types of energy sensitivity with respect to safety certification:


Both types can be effectively supported by adaptable time-triggered systems, which promise to improve energy efficiency without detrimental effects on temporal predictability and fault containment. Different schedules serve for different potential system conditions, such as various dynamic-slack values. Each schedule can be analyzed in isolation, as in a fully static time-triggered system.

#### 2.2.2. Lower Cost for Fault-Tolerance by Fault Recovery and Reduced Redundancy Degrees

In safety-critical applications, an embedded computer system has to provide its services with a dependability that is better than the dependability of any of its constituent components. Considering the failure-rate data of available electronic components, the required level of dependability can only be achieved if the system supports fault tolerance.

N-modular redundancy is a widely deployed technique, where the assumptions concerning failure modes and failure rates determine the necessary replication degrees. For example, a triple-triple redundant primary flight computer is deployed in the Boeing 777 aircraft [16]. However, emerging application areas with safety-critical embedded systems and stringent cost pressure for electronic equipment cannot afford the high cost associated with excessive replication. For example, fully autonomous vehicles depend on ASIL-D [17] functions for environmental sensing and vehicle control. At the same time, the extreme cost pressure of the automotive industry precludes the massive deployment of redundant components.

Fault recovery is a viable alternative by switching to configurations that do not use failed resources. The following fault-recovery approaches can be deployed to achieve fault-tolerance:

1. **Modifying allocation of services to resources.** The system design involves scheduling and allocations decisions in order to perform a spatial and temporal mapping of the application (e.g., computational service, message-based communication) to the resources (e.g., routers, communication links, processing cores). Fault recovery by reconfiguration activates a configuration with a changed mapping, thereby avoiding the use of failed resources.


ATMA supports this goal by precomputing configurations for different types of faults. However, in order to comply with the requirements of safety-critical systems and to have a consistent fault-recovery, a number of challenges need to be addressed. The key challenges are the completeness of considering all potential faults, while also analyzing each configuration individually to ensure correct system states (e.g., no resource collisions, satisfaction of precedence constraints, timeliness). Another major challenge is the fault-tolerance of the fault recovery mechanism itself. A fault affecting the reconfiguration must not lead to an incorrect system state such as the partitioning of the system into subsystems with the old and the new configuration.

#### *2.3. Challenges for Adaptation*

The main challenges for adopting an adaptive system behavior in a time-triggered system are as follows:

#### 2.3.1. System-Wide Consistent State After Adaptation

A fundamental requirement for a time-triggered multi-core architecture is a consistent state of all building blocks at any point in time (e.g., tiles, NIs, routers). System-wide consistency of the active schedules with respect to the temporal and contextual dimensions is a prerequisite to maintain the ATMA properties. The consistency upon adaptation depends on:


#### 2.3.2. Bounded Time for Adaptation

The delay of reacting to context events often determines the utility of adaptation (e.g., exploitation of slack for energy efficiency). In particular, adaptation within bounded time is essential for fault recovery where the dynamics of the controlled object determine the permitted time intervals without service provision (e.g., maximum actuator freezing time).

#### 2.3.3. Fault-Tolerant Adaptation

The adaptation must be fault tolerant to ensure that a hardware or software fault does not bring the system into an erroneous state through faulty schedule switching.

#### 2.3.4. Avoidance of State Explosion

The scheduling algorithm for computing the schedules needs to avoid state explosion. For example, enforcing reconvergence horizons for context events prevents an exponential growth of the number of schedules with increasing numbers of context events [19].

Furthermore, a memory-efficient representation and storage of the schedules is required. Since schedules for different context events will typically differ only in small parts, differential representations are most suitable.

#### **3. Adaptive Time-Triggered Multi-Core Architecture**

#### *3.1. Architectural Building Blocks*

Figure 2 gives an overview of the *Adaptive Time-triggered Multi-core Architecture (ATMA)*. The architecture encompasses tiles, which are interconnected by a NoC. The NoC consists of routers, each of which is connected to other routers and to tiles using communication links. A tile includes three parts: cores for the application services (cf. green area in Figure 2), adaptation logic (cf. orange area in Figure 2) and the NI for accessing the NoC (cf. blue area in Figure 2).

**Figure 2.** Adaptive time-triggered multi-core architecture.

The cores of a tile can be heterogeneous encompassing processors that are managed by an operating system, processors with bare-metal application software or state machines implemented in hardware. Regardless of the implementation, message-based ports provide the interface of the cores towards the adaptation logic and the NI.

The adaptation logic is the key element for the consistent and timely processing of context events. It includes the following building blocks:


The NI serves as an interface to the NoC for the processing cores, injecting messages into the NoC and delivering received messages from the NoC to the cores. The NI generates the message segments (known as packets and flits) in case a message needs to be injected into the interconnect. In the opposite direction, i.e., for incoming messages, the NI assembles each message from the received segments and stores it to be read by the connected core.

#### *3.2. Local and Global Adaptation*

We distinguish between local and global adaptations. *Local adaptations* are those changes in a subsystem, which do not introduce any changes in the use of shared resources. For example, dynamic slack in a processor core would result in the early completion of a task and the generation of a message before its transmission is due according to the time-triggered schedule. DVFS can be used locally within the processor to complete the task and produce the message just before the scheduled transmission time. In this case, energy is saved locally without any implications on the rest of the system and the time-triggered schedule.

*Global adaptations*, in contrast, are those changes in a subsystem, which result in a temporal or spatial change in the usage of the shared resources. For example, dynamic slack of a sender can be used to transmit a message earlier. The receivers can then start their computations earlier with more time until their deadlines. The longer time budget can be used for DVFS in the receivers, thereby saving more energy because several receivers are clocked down instead of a single sender. However, a new schedule is required for global adaptation in order to preserve the ATMA properties. Such global adaptations and schedule changes must not introduce an inconsistency in the system and thus need to be harmonized over the entire chip.
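The distinction between local and global adaptation can be made operational in a simple way: an adaptation is local exactly when it leaves the temporal and spatial usage of shared resources untouched. The following minimal Python sketch illustrates this criterion with entirely hypothetical names; the set-of-slots representation of NoC usage is our own simplification.

```python
def classify_adaptation(old_injections, new_injections):
    """Classify an adaptation as 'local' or 'global'.

    Each argument is a set of (link, time_slot) pairs describing the
    temporal and spatial usage of the shared NoC. If the usage is
    unchanged, no new system-wide schedule is needed."""
    return "local" if old_injections == new_injections else "global"

base = {("link0", 40), ("link1", 44)}

# DVFS inside a core that finishes just before the scheduled transmission:
# the NoC usage is identical, so no coordination is required.
print(classify_adaptation(base, base))  # local

# Transmitting the message earlier occupies different slots on the NoC:
# a new, chip-wide consistent schedule is required.
print(classify_adaptation(base, {("link0", 30), ("link1", 34)}))  # global
```

The design point mirrored here is that only the second case forces a schedule switch, which is why global adaptations must be harmonized over the entire chip.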

To establish a chip-wide aligned adaptation of subsystems, the operation of the dispatchers of different tiles shall be harmonized by a *global time base* to have a common understanding of the time, despite different clock domains. The global time base is a low-frequency digital clock, which is based on the concept of a sparse time base [20]. The global time base provides a system-wide *notion of the time* between different components.

#### *3.3. Fault Tolerance*

ATMA offers inherent temporal predictability and fault containment for processing cores. The adaptation unit is deployed with a schedule that provides a priori knowledge about the permitted temporal behavior of messages. Using this knowledge, the adaptation unit blocks untimely messages from the processing cores, thereby preventing resource contention in the NoC. Likewise, the adaptation units and the network interfaces autonomously insert the source-based routing information into messages, thus a faulty processing core cannot influence the correct routing of messages in the NoC.
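The blocking of untimely messages described above amounts to a schedule lookup: a message from a processing core is forwarded only if the active schedule permits an injection for that port at the current slot. The following Python fragment sketches this idea; the function name, port names and schedule format are our own hypothetical choices.

```python
def guard(schedule, port, slot):
    """Temporal firewall sketch: forward a message only if the active
    schedule permits an injection for this port at this slot.

    schedule: dict mapping port -> set of permitted injection slots."""
    return slot in schedule.get(port, set())

sched = {"p0": {4, 12}, "p1": {8}}
print(guard(sched, "p0", 4))   # True: timely message passes
print(guard(sched, "p0", 5))   # False: untimely message is blocked
print(guard(sched, "p2", 4))   # False: unknown port has no permitted slots
```

Because the check uses only a priori knowledge from the schedule, a faulty core cannot provoke resource contention in the NoC regardless of what it sends.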

While faults of the processing cores are rigorously contained, faults affecting the adaptation units, context-agreement units, network interfaces and routers have the potential to cause the failure of the entire multi-core chip. In previous work on time-triggered multi-core architectures, the network interfaces and routers have thus been denoted as a *trusted subsystem* [6]. The risk of a failure due to design faults can be minimized through a rigorous design of these building blocks.

Two strategies can be distinguished for dealing with random faults of the trusted subsystem:


Another prerequisite for correct adaptation is the correctness of the schedules. The state of the art offers algorithms and tools supporting the verification of time-triggered schedules (e.g., TTPVerify [1] (p. 489)), which can be applied on each node of the multi-schedule graph.

#### **4. Meta Scheduling**

The meta scheduler [24] is an offline tool for computing time-triggered schedules considering the system's contextual, spatial and temporal dimensions. Different schedules are computed for all relevant context events and are deployed in the NoC adaptation units and the execution adaptation. For a given context, the corresponding schedules determine the dispatching of each resource at any point in time within the system's period.

Two types of schedules are taken into consideration: The computational schedules are computed for the tasks and define their allocation to cores and their start times. The communication schedules define the paths and injection times of the messages on the NoC. The computational and communication schedules must be synchronized as the messages to be sent are computed by the tasks.

#### *4.1. Input Models*

The meta scheduler requires the following three types of models as input (cf. Figure 3):


**Figure 3.** Meta scheduler overview.

#### *4.2. Meta Scheduling*

The meta scheduler computes a Multi-schedule Graph (MG) using the application, platform and context models. The MG is a directed acyclic graph of time-triggered schedules, which are precomputed at development time. At any instant during runtime, the time-triggered system is in one of the nodes of the multi-schedule graph. This node defines the temporal and spatial allocation of all computational and communication resources. Upon a relevant event from the context model, the node is left and the system traverses to another node of the multi-schedule graph.

To compute the MG, the meta scheduler repeatedly invokes a scheduler (cf. Figure 3). The scheduler takes an application model and a platform model as inputs and computes a time-triggered schedule fulfilling the scheduling constraints (e.g., collision avoidance, precedence constraints, and deadlines). The decision variables are the allocations of tasks to cores, start times of tasks, messages paths along routers, message injection times and fetch times for the adaptation manager to read the latest instance of the context vector.

Decision variables also include parameters for improving energy efficiency such as the time intervals for clock gating and DVFS. The time-triggered schedule specifies the frequencies of cores and routers in different time intervals of the time-triggered schedule [24,26]. The resulting overhead such as the time needed to adjust the voltage needs to be considered in the constraints of the optimization problem.

The state of the art encompasses a broad spectrum of algorithms (e.g., genetic algorithms, SAT/SMT, and MILP) to solve this static scheduling problem.

The algorithm of the meta scheduler is depicted in Figure 4. The meta scheduler starts in an initial state *S*<sup>0</sup> assuming the absence of faults, slack events and resource alerts. It invokes the scheduler to obtain a schedule for *S*<sup>0</sup>. The meta scheduler then performs a time step until one of the events from the context model can occur.

Some events such as the occurrence of a particular dynamic-slack value occur at specific points in time (e.g., termination of a task after 50% of its WCET). For events that can occur at any time such as faults and resource alerts, predefined sampling points for fault detection or resource levels are assumed.

The meta scheduler applies the earliest context event, which results in a changed application or platform model. For example, a fault event results in the removal of a resource from the platform model. A dynamic slack event results in a shorter execution time of a task. Thereafter, the meta scheduler invokes the scheduler again to compute a new schedule *S*<sup>1</sup> for the updated application and platform models. Some decision variables are fixed in the new scheduling problem, namely those that correspond to actions before the time of the processed context event.

Subsequently, the meta scheduler continues to perform time steps, each time applying the context event, invoking the scheduler and adding the new schedule to the MG. In this process, the meta scheduler considers the different potential state traces. For example, the second potential context event will be applied in schedule *S*<sup>0</sup> as well as in schedule *S*<sup>1</sup>. Therefore, each context event results in a branching point of the MG, since the context event may occur or remain inactive.

The meta scheduler considers mutually exclusive events. For example, a certain dynamic slack value for a task precludes another dynamic slack event for the same task.

A major challenge in meta scheduling is the avoidance of state explosion in the MG. The meta scheduling addresses this challenge using the following techniques:


Decision variables after this horizon are fixed, in analogy to the decision variables corresponding to actions before the time of the context event (see Line 21 in Figure 4). Thereby, reconvergence of paths in the MG is ensured.

```
01 initial application model AM
02 initial platform model PM
03 initial context model CM
04 initial multi-schedule graph SG=∅
05 initially fixed decision variables FIX=∅
06 proc meta-scheduler(AM,PM,CM,FIX,prev)
07   invoke scheduler(AM,PM,FIX) to obtain schedule S
08   S={<d,t(d)>}  // decision variable d with action time t(d)
09   n = <S,CM>    // new node for schedule graph
10   if n ∈ SG     // deja vu
11     connect previous node prev to existing node n in SG
12   else
13     add n to SG
14     if (prev ≠ NULL) connect node prev to new node n in SG
15     while CM ≠ ∅
16       e=earliest context event from CM with event time t(e)
17       EX=context events that are mutually exclusive with e
18       CM'=CM \ (EX ∪ {e})
19       AM'=result of applying e to AM
20       PM'=result of applying e to PM
21       FIX={<d,t(d)> ∈ S | t(d) ≤ t(e) ∨ t(d) ≥ t(e)+HORIZON}
22       recursively invoke meta-scheduler(AM',PM',CM',FIX,n)
23     end
24   endif
25 end
26 meta-scheduling: invoke meta-scheduler(AM,PM,CM,FIX,NULL)
```
#### **Figure 4.** Meta-scheduling algorithm.
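To make the recursion of Figure 4 concrete, the following runnable Python sketch mirrors its structure under simplifying assumptions of ours: the `Event` class, the `toy_scheduler` stub and the `HORIZON` value are illustrative stand-ins, a "schedule" is reduced to a task-to-start-time mapping, and the while loop of lines 15–23 is read as one branch per remaining context event in temporal order.

```python
from dataclasses import dataclass

HORIZON = 10  # illustrative reconvergence horizon

@dataclass(frozen=True)
class Event:
    name: str
    time: int
    group: str = ""  # events sharing a non-empty group are mutually exclusive

    def excludes(self, other):
        return bool(self.group) and self.group == other.group and self != other

    def apply(self, am, pm):
        # Toy model update: record the event in the application model.
        return am | {self.name}, pm

def meta_schedule(am, pm, cm, fix, prev, graph, schedule_fn):
    s = schedule_fn(am, pm, fix)               # invoke scheduler (line 07)
    node = (s, frozenset(cm))                  # node n = <S,CM> (line 09)
    if node in graph:                          # deja vu (line 10)
        if prev is not None:
            graph[prev].add(node)
        return
    graph[node] = set()                        # add n to SG (line 13)
    if prev is not None:
        graph[prev].add(node)
    for e in sorted(cm, key=lambda ev: (ev.time, ev.name)):
        ex = {x for x in cm if x.excludes(e)}
        cm2 = frozenset(cm - ex - {e})         # line 18
        am2, pm2 = e.apply(am, pm)             # lines 19-20
        # fix decisions before the event or after the horizon (line 21)
        fix2 = frozenset((d, t) for d, t in s
                         if t <= e.time or t >= e.time + HORIZON)
        meta_schedule(am2, pm2, cm2, fix2, node, graph, schedule_fn)

def toy_scheduler(am, pm, fix):
    # Stub scheduler: keep fixed decisions, give new tasks arbitrary slots.
    sched = dict(fix)
    for i, task in enumerate(sorted(am)):
        sched.setdefault(task, i * 5)
    return frozenset(sched.items())

graph = {}
cm = frozenset({Event("slack_t1", 12, "t1"), Event("fault_r0", 20)})
meta_schedule(frozenset({"t1", "t2"}), "4x4-noc", cm,
              frozenset(), None, graph, toy_scheduler)
print(len(graph))  # 5 nodes in the resulting toy multi-schedule graph
```

The toy scheduler obviously does not check collision or precedence constraints; the sketch only shows how the recursion, the mutual-exclusion pruning and the fixing of decision variables interact to build the MG.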

The size of the MG depends on the application, platform and context models. There is a linear relationship between the size of the MG and the number of tasks and messages in the application model, since the schedule must provide a resource allocation for each of these elements from the application model. In addition, the size of the MG depends linearly on the number of resources, which are deployed with dedicated schedules such as cores in case of source-based routing. For those resources, which are only referred to with indices by the schedules, there is a logarithmic relationship between the schedule size and the number of resources (e.g., routers in case of source-based routing). Without reconvergence, there would be an exponential relationship between the number of events in the context model and the schedule size. With a constant reconvergence horizon, there is a polynomial dependency between the number of events and the size of the MG [19].

#### *4.3. Tile-Specific Schedule Extraction and Difference Encoding*

Even with reconvergence, the meta scheduler can still generate MGs with hundreds or sometimes thousands of nodes. Storing those schedules in the memory space of the tiles would consume significant chip resources. Therefore, we extract an individual MG for each tile and we introduce difference encoding. This transformation of MGs is performed by the *MG compressor* in Figure 5.

The meta scheduler computes a MG where each node provides a schedule for the entire system with all tiles. However, an event will typically lead to changes in message injections at only a small subset of the tiles. Therefore, the graph compressor extracts graphs for the individual tiles from the MG. Each of the resulting tile-specific graphs contains a subset of the original nodes and the nodes contain a subset of the decision variables. Consequently, the resulting graphs are significantly smaller than the original MG.

**Figure 5.** Multi-schedule graph compressor.

Each tile is deployed with its complete base schedule, but it stores only the differences of the other schedules in the graph-based structure. As shown in Figure 5, the MG generator extracts for each tile the schedule information and their changes and stores them in a format that is suitable for the NoC adaptation unit. The NoC adaptation unit, in turn, uses the schedule information to control the source-based NoC and to change the NI behavior at runtime based on the global context information that is provided by the context agreement unit.

The effectiveness of the difference encoding depends on the number of tasks and messages with changed temporal/spatial resource allocations after an event. The ratio between the reconvergence horizon and the makespan provides a lower bound for the compression ratio, if the number of messages and tasks per unit of time is uniform throughout the makespan.
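A minimal sketch of such a difference encoding, assuming schedules are simple mappings from activities to injection times (our own toy format, not the on-chip representation), could look as follows:

```python
def encode_diff(base, other):
    """Difference encoding of `other` against `base` (dicts mapping
    activity -> time): keep only changed entries, mark removals with None."""
    diff = {k: v for k, v in other.items() if base.get(k) != v}
    removed = {k: None for k in base if k not in other}
    return {**diff, **removed}

def apply_diff(base, diff):
    """Reconstruct the full schedule from the base schedule and a diff."""
    out = dict(base)
    for k, v in diff.items():
        if v is None:
            out.pop(k, None)
        else:
            out[k] = v
    return out

base = {"inject m1": 10, "inject m2": 40, "inject m3": 90}
slack = {"inject m1": 10, "inject m2": 25, "inject m3": 90}  # m2 moved earlier
d = encode_diff(base, slack)
print(d)                              # {'inject m2': 25}
print(apply_diff(base, d) == slack)   # True
```

As in the text, only the base schedule is stored completely; every other node of the tile-specific graph stores just its (typically very small) diff, which the adaptation logic applies on a schedule switch.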

The effectiveness of the schedule extraction is determined by the number of resources (e.g., tiles in the case of source-based routing) with changed scheduling information after an event. In the worst case, all resources are affected by an event, thus yielding no benefit from the schedule extraction. However, typically only a small fraction of the resources requires updated schedules after an event. For example, the extraction results in an average reduction of the schedule size by 71% in the three experimental scenarios described in Section 7.

The compression also has an impact on the fault tolerance of the multi-core architecture. On the one hand, the reduced size of the schedule information decreases the susceptibility to Single Event Upsets (SEUs). On the other hand, an SEU affecting the schedule information for a task or a message can have a more severe effect, potentially corrupting also the parameters of subsequent tasks and messages.

#### **5. Context Monitor and Agreement Unit**

All decisions of the adaptation units are taken in a distributed manner at each tile. This avoids a central tile in the role of a managing device that collects the information of all tiles and pushes the new schedule to the entire system; such a tile would represent a single point of failure and could degrade scalability as well as performance. However, with this distributed decision making, we must ensure that all tiles are aware of the same global context to achieve a coherent distributed decision process. Inconsistencies in the global view can lead to inconsistent schedule changes, which in turn can cause collisions on the NoC and deadline misses.

As shown in Figure 6, the context monitors and the context-agreement units in ATMA establish a global view on the context using the following three steps: (1) context reporting; (2) context distribution; and (3) context convergence. At the beginning of the process, each tile has only a local view on the context and is not aware of the status of the other tiles. The three steps are explained in the following.

**Figure 6.** Timeline of agreement for an example context event.

#### *5.1. Context Reporting*

The context monitor focuses on the collection of the local context events within the associated context-agreement units.

We differentiate events based on their temporal properties and their safety implications:


The context reporting is performed by context monitors in ATMA. These monitors observe the system state with respect to relevant indicators serving as context. The context typically comprises both state information from within the computer system and from the environment. Therefore, monitors and sensors are realized in hardware and software. Software monitors are driver functions that can be used by the application to give direct feedback to the agreement unit. Hardware monitors contain sensors to evaluate the system or the environment. The output of the monitors consists of a bit for every potential event, indicating whether the event actually occurred. For example, an event can represent the exceeding of a threshold value at a sensor (e.g., battery level and temperature).

This information is encoded in a bit string in which each event is mapped to a single bit; the bit indicates whether the event occurred. Since each tile prepares this context string locally, it encodes only the locally observed events, and we therefore refer to it as the local context. This local context bit string is the information that is distributed in the following phase.

The monitors write the context information into dedicated ports. Each context event has a dedicated bit in one of the ports.
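The local-context encoding described above can be sketched as follows; the event names and their bit positions are purely illustrative, not part of the architecture definition:

```python
# Hypothetical event-to-bit mapping; names and positions are illustrative only.
EVENT_BITS = {"battery_low": 0, "temp_high": 1, "slack_T1": 2}

def encode_local_context(observed_events):
    """Pack locally observed events into a bit string: bit i is set
    iff the event mapped to position i occurred on this tile."""
    ctx = 0
    for ev in observed_events:
        ctx |= 1 << EVENT_BITS[ev]
    return ctx

local_ctx = encode_local_context({"battery_low", "slack_T1"})
print(format(local_ctx, "03b"))  # '101' -> bits 0 and 2 set
```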

#### *5.2. Context Distribution*

The agreement on the context is realized by a broadcast protocol, which sends the messages with the context using a ring relay between all tiles, meaning that a message is sent by each tile to its neighbor and is incrementally relayed until it reaches the original sender [11].

The protocol is triggered periodically and executed by each context-agreement unit at the same time. Figure 7 shows which events are agreed upon in an example scenario. The start instant of the agreement process serves as a deadline *d*<sub>1</sub> for the event reporting of period 1 and all events that happen before *d*<sub>1</sub> are taken into consideration. Events happening after *d*<sub>1</sub> are considered for the next period, even if they occur during the context distribution phase of the protocol. Upon the trigger, the context-agreement unit reads the context ports and assembles a local context vector by concatenating the context information for different events. Once the information is gathered, the local context vector is sent to the other tiles within an agreement message that identifies the context string using the source tile id [11].
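The assignment of events to agreement periods can be sketched as follows, assuming (purely for illustration) that the reporting deadline of period *n* falls at *n* times the agreement period:

```python
def assign_period(event_time_us, period_us):
    """Events before deadline d_n = n * period belong to period n;
    an event at or after d_n is deferred to the next period, even if
    it occurs during the context distribution phase."""
    return int(event_time_us // period_us) + 1

PERIOD = 2000  # us, hypothetical agreement period
print(assign_period(1999, PERIOD))  # 1: before d1 -> considered in period 1
print(assign_period(2000, PERIOD))  # 2: at/after d1 -> deferred to period 2
```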

This can be done via the NoC or via a dedicated network [11]. The use of the NoC involves no hardware overhead for the implementation, but the agreement messages need to be added to the scheduling problem. This extension can make the communication more difficult to schedule. In such cases, a dedicated second network can be used. For the evaluation in Section 7, we implemented a FIFO ring structure where the agreement messages can be sent at arbitrary times without consideration of the application communication at the NoC. This implementation has been shown to save more energy than a NoC implementation in a benchmark setup [11].

Each tile in the FIFO ring sends its local context vector to its direct neighbors. This way, no collisions can occur, as the links that are used are predefined. Once the local context is sent, the tile also receives the local contexts of its neighbors. The tile extracts the new information and saves it locally. Afterwards, it relays the received context to the next neighbors. In this way, the local context is transmitted in a ring-like structure between all neighbors until it returns to its original sender. At that point, the sender knows that it has received all context information and can proceed to build the global context vector. This exchange takes *n* transmission hops, with *n* being the number of tiles in the network, as the ring needs to be passed completely by all messages. Therefore, the execution time increases with the number of tiles.
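A minimal simulation of this ring relay, assuming an idealized lossless ring, illustrates that every tile has seen every local context vector after *n* hops:

```python
def ring_distribute(n):
    """Simulate the FIFO ring for n tiles: in every hop, each tile forwards
    the vector it last received to its next neighbour. Returns True if,
    after n hops, every tile has seen the vector of every other tile."""
    seen = [{i} for i in range(n)]        # tile i starts knowing only its own vector
    in_flight = list(range(n))            # origin of the vector currently held at each tile
    for _ in range(n):                    # n transmission hops around the ring
        in_flight = [in_flight[(i - 1) % n] for i in range(n)]  # shift one hop
        for i, origin in enumerate(in_flight):
            seen[i].add(origin)
    return all(s == set(range(n)) for s in seen)

print(ring_distribute(4))  # True: four hops suffice for four tiles
```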

**Figure 7.** Events before the context reporting deadline *dn* of the respective period *n* are considered for the agreed context *Cn*. Events happening between the agreement start and the agreed global context are considered for the next period.

#### *5.3. Context Convergence*

Once all messages with context information have been received, the global context vector is produced. In this phase, the global context can also be derived from redundant information using majority voting. This is important if the same context event is observed redundantly by multiple tiles.

If the context events are observed only by one dedicated tile, the local context information is simply concatenated according to the predefined global context-vector layout in which each tile has a predefined interval where its local information is to be placed.
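The convergence step can be sketched as below; redundantly observed bits are majority-voted, while bits with a single dedicated observer are taken over directly (modelled here by an OR, which is equivalent to concatenation when each such bit is set by exactly one tile). The bit layout is illustrative:

```python
def converge(reports, redundant_bits):
    """Build the global context vector from the per-tile reports.
    Bits observed redundantly by several tiles are majority-voted;
    all other bits come from their single reporting tile (via OR)."""
    n = len(reports)
    width = max(r.bit_length() for r in reports)
    result = 0
    for bit in range(width):
        votes = sum((r >> bit) & 1 for r in reports)
        if bit in redundant_bits:
            value = 1 if votes > n // 2 else 0   # majority vote
        else:
            value = 1 if votes > 0 else 0        # sole observer reported it
        result |= value << bit
    return result

# bit 0 observed redundantly by all tiles, bit 1 by one dedicated tile
print(bin(converge([0b01, 0b01, 0b11, 0b00], {0})))  # 0b11
```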

#### **6. NoC Adaptation Unit**

Robust and chip-wide aligned switching of schedules at all tiles is a major requirement to ensure consistency and to preserve the properties of time-triggered systems. The NoC adaptation unit supports this requirement based on the assumption of a consistent context vector and correctness of the MG.

The NoC adaptation unit performs time-triggered dispatching of messages by deploying precomputed schedules, each of which is mapped to a particular set of context events. This unit receives the global context vector from the context agreement unit and triggers the ports for transmission of messages to support adaptive time-triggered communication. The global context vector is saved in a dedicated register within the NoC adaptation unit.

The adaptation unit operates in the following four steps:


Figure 8 represents the internal structure of the NoC adaptation unit, which is composed of the following three internal building blocks: (1) the context register; (2) the Linked-List Multi-schedule Graph (LLMG) that stores the MG as a linked list; and (3) the adaptation manager.

**Figure 8.** Internal structure of the NoC adaptation unit.

The global context vector is stored in the *context register* to be read by the compare-mask building block. The global context vector is fetched at a scheduled time, which is defined by the LLMG.

Figure 9 shows an example of an LLMG, which is stored as a circular linked list of instant entries. Each entry in the linked list is associated with an instant of time and an address in the schedule file.

Two types of instant entries are provided: message entries and branching entries. Table 1 presents the content of the two types of entries. The message entries contain Type, Instant, PortID and Next values. The branching entries contain Type, Instant, BitMask, NextTaken and NextNotTaken values:



**Table 1.** Structure of the schedule entries.

Figure 9 presents an example LLMG, in which two different events E1 and E2 occur. The type of the entry is distinguished by the color, where message entries are shown in white and the branching entries are shown in gray. The number in each entry represents the address of the entry in the schedule file and the context events are shown on the links between the entries.

**Figure 9.** Linked-List Multi-schedule Graph (LLMG).

The process of selecting the correct schedule starts by tracing the entries of the LLMG, based on the address and next pointers. The adaptation manager is triggered at the instant given by the entry. In the example in Figure 9, entry 1 is a message entry, so the message of the corresponding *Port ID* is transmitted at the *Instant* specified by the entry. In this entry type, the *Next* value points to the address of the next entry. Entry 2 is a branching entry, which points to two different entries. The selection of the next schedule depends on the occurrence of the specified context event, which is determined from the global context vector and the BitMask. If E1 occurs, entry 3 is selected as the next entry by *NextTaken*. If E1 does not occur, entry 6 is selected by *NextNotTaken*. The same operation is applied for the other entries of the LLMG. After all entries have been processed, the next pointer of the last entry is followed, which points back to the first entry of the period.
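The traversal described above can be sketched as follows; the entry addresses, instants, port names and the event-to-bit mapping are hypothetical and only loosely follow the example of Figure 9:

```python
# Illustrative LLMG entries (addresses, instants and ports are hypothetical).
# Message entry:   ("MSG", instant, port_id, next_addr)
# Branching entry: ("BR",  instant, bit_mask, next_taken, next_not_taken)
LLMG = {
    1: ("MSG", 10, "P0", 2),
    2: ("BR",  20, 0b01, 3, 6),   # branch on event E1 (bit 0)
    3: ("MSG", 30, "P1", 1),      # path taken if E1 occurred
    6: ("MSG", 30, "P2", 1),      # path taken if E1 did not occur
}

def traverse(start, context, steps):
    """Follow the linked list: record the port of each message entry,
    and at a branching entry select NextTaken iff the masked context
    bit indicates that the event occurred."""
    addr, dispatched = start, []
    for _ in range(steps):
        entry = LLMG[addr]
        if entry[0] == "MSG":
            _, instant, port, nxt = entry
            dispatched.append((instant, port))
            addr = nxt
        else:
            _, instant, mask, taken, not_taken = entry
            addr = taken if (context & mask) else not_taken
    return dispatched

print(traverse(1, 0b01, 3))  # E1 occurred -> [(10, 'P0'), (30, 'P1')]
print(traverse(1, 0b00, 3))  # E1 absent   -> [(10, 'P0'), (30, 'P2')]
```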

The adaptation manager serves the ports by triggering the message transmissions. The selection of the schedules is fully dependent on the received global context vector and is implicitly consistent with the other tiles, because the context vector is globally consistent and the actions of the different tiles within each schedule are temporally aligned by the meta scheduler.

The adaptation manager consists of a state machine and a compare mask. The state machine reads the input from the context vector register and the LLMG for switching schedules. Figure 10 presents the states and transitions of the state machine.

The state machine wakes up at an instant of time defined by the LLMG. At this instant, the type of the LLMG entry is checked. For a message entry, the state machine reads the PortID and injects the message of the specified port at that instant. After the injection, the next entry is fetched via the Next value.

In the case of a branching entry, the global context vector is fetched from the context register at the specified instant of time. In parallel, the BitMask is read from the LLMG. The compare mask combines the global context vector and the BitMask to extract the specified event. The resulting event bit is 0 or 1, indicating whether the event has occurred: a bit value of 1 means the event has occurred, so the NextTaken entry is selected; a bit value of 0 means the event has not occurred, so the NextNotTaken entry is selected. After the selection of the next address, the state machine waits for the dispatching time of the next entry.

**Figure 10.** Adaptation state machine.

#### **7. Results and Discussion**

The introduced architecture was instantiated for example scenarios in order to validate the adaptation services. In addition, the scenarios served to evaluate the improvements with respect to energy efficiency and to investigate the overhead with respect to memory and logic.

#### *7.1. Zynq Prototype*

The architecture was instantiated on a Xilinx Zynq-7000 SoC ZC706 FPGA board. The System-on-Chip (SoC) of this board consisted of an ARM-based processing system and a programmable-logic unit on a single die.

The hardware platform comprised four tiles interconnected by a time-triggered NoC [27]. Each tile was deployed with a network interface, a NoC adaptation unit and a context agreement unit. The Nostrum NoC [28] served as the basis for the implementation of the adaptable NoC. One tile was located in the processing system and contained two ARM Cortex-A9 processor cores. The other tiles were implemented in the programmable logic, where each tile contained a single core realized as a soft-core MicroBlaze-processor. The resource consumption is shown in Table 2.


**Table 2.** FPGA resource utilization.

The meta scheduler and the MG compressor served for the generation of the schedules of the tiles, which were loaded to the dedicated memory of each corresponding tile. The meta scheduler is an implementation of the pseudo code in Figure 4. An existing optimal scheduler [29] for time-triggered networks was extended to use energy efficiency as the objective function. This optimal scheduler was implemented with IBM CPLEX and repeatedly invoked by the meta scheduler.

#### *7.2. Slack-Based Adaptation Scenarios*

Figures 11–13 show the three scenarios for the evaluation. In each scenario, different tasks with different WCETs and precedence relationships were hosted by the four tiles. The tasks had a period of 2 ms and it was assumed that each task could be subjected to dynamic slack of 50%. Table 3 summarizes the input models (i.e., AM, PM and CM) for the meta scheduling in the three scenarios.

*Designs* **2019**, *3*, 7

**Figure 12.** Scenario 2.

**Table 3.** Inputs models for scenarios.


Dynamic slack is the time difference between the WCET of a task and the actual point in time at which the task finishes. Slack can be used to save energy, as in each execution some or all tasks finish either as planned or earlier. If no slack occurs, only one schedule is used, as shown for example in Figure 11. In the case of a slack of 50% (e.g., T1 finishes 250 µs earlier), we changed the communication schedule to use the remaining time of T1's execution budget to start T2 (and consequently T3 and T4) earlier and achieve a shorter makespan for the system, as shown in Figure 14.

**Figure 14.** Example for slack in Task T1 of Scenario 1.
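The effect of slack on the makespan can be sketched as follows, assuming a strictly sequential chain T1 → T2 → T3 → T4 in which each task starts when its predecessor actually finishes; the WCET values are illustrative:

```python
def makespan(wcets_us, slack_fractions):
    """Makespan of a sequential task chain: each task starts when its
    predecessor actually finishes, so dynamic slack shortens the total."""
    t = 0.0
    for wcet, slack in zip(wcets_us, slack_fractions):
        t += wcet * (1 - slack)   # actual execution time of this task
    return t

wcets = [500, 500, 500, 500]              # us, hypothetical WCETs
print(makespan(wcets, [0, 0, 0, 0]))      # 2000.0: no slack, full makespan
print(makespan(wcets, [0.5, 0, 0, 0]))    # 1750.0: T1 finishes 250 us early
```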

Energy reduction was achieved by clock-gating all tiles, their dedicated IPs and the interconnect. Clock gating was performed between the completion of the makespan and the start of the subsequent period. In other words, all tiles as well as the NoC were in sleep mode from the termination of the last task until the start of the next period. This procedure was repeated every period.

#### *7.3. Evaluation and Results*

The evaluation was performed on the introduced prototype using the UCD90120A device, which is a 12-rail addressable power-supply sequencer and monitor. This chip was mounted on the evaluation board and was accessible to the processing system via the PMBus/I2C communication bus.

The experimental results encompass the power consumption for all combinations of slack events, thus showing the power savings depending on the completion times of tasks. Table 4 summarizes the numerical measurement results for the three scenarios. In each scenario, all possible slack combinations of different tasks were taken into consideration for the computation of the multi-schedule graph [30]. Each row of the table corresponds to an observed makespan value and contains the number of schedules in the MG with this makespan, the corresponding power consumption in milliwatts and the power reduction percentage achieved by the proposed adaptation mechanisms. In addition, the table provides the average power reduction percentage under the assumption of a uniform probability distribution of the slack event combinations. This value is pessimistic, because in reality high slack values are common (e.g., as described in the literature on WCET analysis).


**Table 4.** Evaluation Results for the three scenarios.

In addition, the memory size for the storage of the generated schedules is indicated in Table 5. In general, the number of schedules would increase exponentially with the number of tasks. However, the mechanisms for reconvergence, tile-specific schedule partitioning and difference encoding result in a significant reduction of the state space and the memory consumption. The baseline memory consumption and the memory consumption with difference encoding and tile-specific schedule partitioning are shown in Table 5 as well.


**Table 5.** Memory usage in different scenarios.

The architectural building blocks for adaptation imposed delays, which added up to 20 clock cycles for a schedule change after a context event in the prototype implementation. The context monitor imposed an implementation-specific latency for the detection of context events. In addition, there was an additional delay of up to one sampling period for asynchronous context events. The prototype contained a hardware implementation of the context monitor in conjunction with synchronous context events involving a delay of two cycles.

The context agreement unit imposed a delay for distributing the context information among all tiles and establishing the globally consistent context vector. In the implementation, four clock cycles were needed for forwarding the context information between tiles using the FIFOs. Hence, 16 clock cycles were needed for the prototype system with four tiles.

The state machine of the adaptation unit in the prototype required two additional clock cycles for each message compared to the non-adaptive NoC in order to process the compressed schedule information and to traverse the linked lists.

#### **8. Conclusions**

The presented time-triggered multi-core architecture supports adaptation with multi-schedule graphs, while preserving significant properties of time-triggered systems including freedom from interference at shared resources, implicit synchronization, timeliness, implicit flow control and fault containment.

Architectural building blocks for agreement establish a consistent chip-wide view on context events, which is used by the adaptation unit in each tile for temporally aligned changes of schedules.

The meta-scheduler of the architecture introduces techniques for avoiding state explosion such as reconvergence of adaptation paths with bounded horizons for context events. In addition, memory consumption is minimized using difference-encoding and the tile-specific extraction of schedule information.

The meta-scheduler of the architecture computes Multi-schedule Graphs (MGs), which incorporate for each combination of context events the corresponding fixed scheduling decisions. These decisions include the start times of jobs, message injection times, messages paths and parameters for energy management such as time intervals with different frequency values for cores and routers.

For each schedule of the MG, deadlines, precedence constraints, resource contention and the adaptation overheads (e.g., delays for DVFS and delays for establishment of globally consistent context vectors) are considered. Consequently, correctness can be verified at design time and the presented time-triggered multi-core architecture enables adaptation for safety-critical embedded systems, where significant improvements with respect to energy efficiency and fault recovery can be obtained.

Plans for future work include the experimental evaluation of the fault tolerance using fault-injection experiments and the extension of ATMA towards a hierarchical architecture for the interconnection of adaptable multi-core chips via reconfigurable off-chip communication networks.

**Author Contributions:** R.O. contributed to the overall conceptualization, the adaptive multi-core architecture and the scheduling algorithms. H.A. contributed to the overall conceptualization, the adaptive multi-core architecture and the experimental evaluation. A.M. contributed to the conceptualization, implementation of the adaptation services and the experimental evaluation. Y.B. contributed to the prototype implementation and the experimental evaluation. A.L. contributed to the conceptualization and implementation of the agreement services. B.S. contributed to the scheduling algorithms.

**Funding:** This work was supported by the European project SAFEPOWER under the Grant Agreement No. 687902.

**Conflicts of Interest:** The authors declare no conflict of interest.


#### **Abbreviations**

The following abbreviations are used in this manuscript:


#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
