Article

Improving Characteristics of LUT-Based Sequential Blocks for Cyber-Physical Systems

by Alexander Barkalov 1,2,*, Larysa Titarenko 1,3 and Kazimierz Krzywicki 4,*
1 Institute of Metrology, Electronics and Computer Science, University of Zielona Gora, Ul. Licealna 9, 65-417 Zielona Gora, Poland
2 Department of Computer Science and Information Technology, Vasyl Stus’ Donetsk National University (in Vinnytsia), 600-Richya Str. 21, 21021 Vinnytsia, Ukraine
3 Department of Infocommunication Engineering, Faculty of Infocommunications, Kharkiv National University of Radio Electronics, Nauky Avenue 14, 61166 Kharkiv, Ukraine
4 Department of Technology, The Jacob of Paradies University, Ul. Teatralna 25, 66-400 Gorzow Wielkopolski, Poland
* Authors to whom correspondence should be addressed.
Energies 2022, 15(7), 2636; https://doi.org/10.3390/en15072636
Submission received: 11 February 2022 / Revised: 25 March 2022 / Accepted: 1 April 2022 / Published: 4 April 2022
(This article belongs to the Special Issue Control Part of Cyber-Physical Systems: Modeling, Design and Analysis)

Abstract

A method is proposed for optimizing the circuits of sequential devices used in cyber-physical systems (CPSs) implemented with field-programmable gate arrays (FPGAs). Hardware optimization is an important problem in implementing the digital parts of CPSs. In this article, we discuss the case in which Mealy finite state machines (FSMs) represent the behaviour of sequential devices. The proposed method targets FSM circuits implemented with the look-up table (LUT) elements of an FPGA chip and aims to reduce the LUT count of Mealy FSMs with extended state codes. The method is based on finding a partition of the set of internal states into classes of compatible states. To reduce the LUT count, we propose a special kind of state code named composite state codes. A composite code includes two parts. The first part is the binary code of a state as an element of some partition class. The second part is the code of the corresponding partition class. Using composite state codes allows us to obtain FPGA-based FSM circuits with exactly two levels of logic. If some conditions hold, then any FSM function from the first level is implemented by a single LUT. The second level is represented as a network of multiplexers. Each multiplexer generates either an FSM output or an input memory function. An example of synthesis is shown. The experiments show that the proposed approach reduces hardware compared with two methods from Vivado, JEDI-based FSMs, and extended state assignment. Depending on the complexity of an FSM, the LUT count is reduced on average from 15.46 to 68.59 percent. The advantages of the proposed approach grow with FSM complexity. An additional positive effect of the proposed method is a decrease in the latency time.

1. Introduction

Our time is characterised by the wide application of various cyber-physical systems (CPSs) in many areas of human activity [1,2,3,4,5,6]. A typical CPS includes a digital part interacting with physical objects [7]. Very often, various sequential blocks can be found in the digital parts of CPSs [3,7]. To improve the overall quality of a CPS digital part, it is necessary to optimize the characteristics of its sequential blocks. In the current paper, the model of a Mealy finite state machine (FSM) [8] represents the behaviour of these blocks.
The model of a Mealy FSM [9] is one of the basic models used in designing circuits of sequential devices [9,10]. Because of this, there are a large number of methods for synthesizing Mealy FSM logic circuits [11,12]. One of the main goals of these methods is to reach optimal values of the basic characteristics of the resulting FSM circuits [10,13]. These characteristics are: (1) the hardware amount (in the case of VLSI, the chip area occupied by a circuit), (2) the performance, and (3) the power consumption. As a rule, it is not possible to achieve a simultaneous optimum for these three characteristics. For example, a decrease in the occupied chip area is often associated with an increase in the number of circuit levels, which leads to a decrease in performance [9,11]. Many studies show that the chip area occupied by an FSM circuit has a decisive influence on both the latency time and power consumption [14]. At the same time, it is important that reducing the area increases the delay time of the circuit as little as possible. In this paper, we propose such a method, focused on implementing the FSM circuit using the resources of FPGAs [5,15,16]. The proposed method develops ideas related to the use of extended state codes [17] and twofold state assignment [5]. The proposed approach belongs to the methods of structural decomposition [17].
Now, many digital systems are implemented using FPGA chips [5]. As follows from the analysis of the VLSI market [16], the largest manufacturer of FPGA chips is Xilinx [18]. For this reason, we focus our current research on Xilinx solutions. An FSM circuit is represented as a composition of look-up table (LUT) elements, programmable flip-flops, inter-slice multiplexers, programmable interconnects, a synchronization tree, and programmable input-outputs.
Our current article is devoted to improving the LUT count of two-level LUT-based Mealy FSM circuits based on extended state codes (ESCs) [17]. The main shortcoming of ESC-based FSMs is a significant increase in the number of flip-flops compared to their minimum possible number. This disadvantage leads to two negative phenomena. First, it increases the number of outputs of the synchronization tree connected with the state code register (SCR). Second, an increase in the state code length (number of bits) complicates the interconnect system. The negative impact of these two factors is reflected in the increased power consumption of FSM circuits. This is why it is so important to reduce the number of flip-flops in the SCR (without increasing the number of LUT levels of the resulting FSM circuit). The desire to eliminate this shortcoming is the main motivation of our current research. Therefore, the problem under consideration is formulated as follows: the development of a method for implementing circuits of LUT-based Mealy FSMs that allows for the simultaneous reduction of the number of LUTs and flip-flops in a two-level FSM with extended state codes.
The main contribution of this paper is the following:
  • A new method for representing FSM state codes is proposed. The proposed composite state codes (CSCs) consist of class codes and codes of class elements.
  • The proposed method allows us to obtain FPGA-based circuits with fewer LUTs than circuits of equivalent Mealy FSMs implemented using known basic state encoding approaches (maximum binary, one-hot, JEDI), as well as extended state codes. A positive side effect of the proposed method is a slight improvement in the temporal characteristics of the obtained FSM circuits in relation to their counterparts based on other state assignment approaches.
  • The gain from the application of the proposed method increases as the number of FSM inputs and states increases.
The novelty of our article lies in the development of a design method aimed at reducing the length of state codes for two-level LUT-based Mealy FSMs. The method is based on using the composite state codes proposed in this paper. Reducing the length of state codes decreases the number of flip-flops in FSM state registers compared to equivalent FSMs with extended state codes. As a result, the number of input memory functions is also reduced, which in turn reduces the LUT counts of the resulting circuits.
The biggest challenge in reducing the LUT counts in the circuits of FPGA-based FSMs is to solve this problem with a minimum decrease in FSM performance. We solved this problem by reducing the number of state variables while keeping the same number of logic levels as in optimized ESC-based FSM circuits.
The rest of the article is organized as follows. The basic information about LUT-based Mealy FSMs is discussed in Section 2. Section 3 is devoted to the analysis of works related to FSM design. The background of the proposed method is shown in Section 4. An example of synthesis for a CSC-based FSM is shown in Section 5. Section 6 describes and analyzes the experimental results. A brief summary of the results is given in Section 7.

2. Background Information

To design a Mealy FSM logic circuit, it is necessary to create systems of Boolean functions (SBFs) representing the circuit [8,9]. These SBFs show the dependence of FSM outputs and input memory functions (IMFs) on FSM inputs and state variables. The FSM outputs form a set $O = \{o_1, \ldots, o_N\}$. Elements of the following sets are used as arguments of these SBFs: the FSM inputs from a set $I = \{i_1, \ldots, i_L\}$ and state variables from a set $T = \{T_1, \ldots, T_R\}$. The state variables encode internal states from a set $S = \{s_1, \ldots, s_M\}$. In this article, we use maximum binary state encoding [9], in which the number of state variables $T_r \in T$ is determined as
$$R = \lceil \log_2 M \rceil. \tag{1}$$
The state codes $K(s_m)$ are kept in the state code register (SCR). As a rule, the register has informational inputs of D type [19,20]. To load state codes into the SCR, the input memory functions are used. They form a set $D = \{D_1, \ldots, D_R\}$.
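As a quick illustration of (1), the following Python sketch computes the number of state variables for maximum binary encoding. The function name `min_state_bits` is our own choice for this illustration and does not come from the article.

```python
def min_state_bits(num_states):
    """Return R = ceil(log2(M)) from (1) using exact integer arithmetic."""
    if num_states < 1:
        raise ValueError("an FSM needs at least one state")
    # (M - 1).bit_length() equals ceil(log2(M)) for every M >= 1
    return (num_states - 1).bit_length()


print(min_state_bits(8))  # an FSM with M = 8 states needs 3 state variables
```

Integer arithmetic is used instead of `math.log2` to avoid floating-point rounding for large state counts.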
Two systems of functions represent logic circuits of so-called P Mealy FSMs (Figure 1). They are the following:
$$D = D(T, I); \tag{2}$$
$$O = O(T, I). \tag{3}$$
In Figure 1, the block of functions is synthesized using the SBFs (2) and (3). The SCR consists of $R$ flip-flops and keeps the state codes $K(s_m)$. The pulse Reset allows us to load the code with all zeros into the SCR. As a rule, this combination of state variables encodes an initial state $s_1 \in S$. The pulse Clock allows loading state codes into the SCR.
To get the systems (2) and (3), an FSM direct structure table (DST) is used [8]. The DST is constructed using either a state transition table (STT) [9] or a state transition graph [5,9]. In our paper, we use an STT as a tool for representing a Mealy FSM.
There are five columns in an STT [9]: the current state $s_m$; the next state $s_T$; the input signal $I_h$, which is a conjunction of inputs (or their complements) determining the transition from $s_m$ into $s_T$; the collection of outputs $O_h$ produced during the transition from $s_m$ into $s_T$; and $h$, the number of the interstate transition ($h \in \{1, \ldots, H\}$).
The creation of a DST begins with state assignment. During this step, abstract states $s_m \in S$ are represented by their binary codes $K(s_m)$. Next, a set of input memory functions can be obtained. Compared to an STT, a DST includes three additional columns [8]. They are: the code $K(s_m)$, the code $K(s_T)$, and a collection of IMFs $D_h \subseteq D$ equal to 1 to load the code $K(s_T)$ into the SCR.
In this paper, we consider the case in which SBFs (2) and (3) are implemented using the internal resources of FPGA chips. There are many configurable logic blocks (CLBs) in FPGAs produced by Xilinx [18,21]. To get an FSM circuit, it is necessary to connect CLBs using internal programmable interconnections [15]. Three CLB elements are enough to get an FSM circuit: LUTs implement the logic, internal multiplexers expand the number of LUT inputs, and a collection of D flip-flops forms the state register. Using the notation of [11], we denote as $NI_{LUT}$-LUT a LUT having $NI_{LUT}$ inputs and a single output. A Boolean function depending on up to $NI_{LUT}$ variables is represented by a single-LUT logic circuit. Various methods of functional decomposition (FD) [22,23] are used if some FSM functions depend on more than $NI_{LUT}$ arguments. It is known that FD-based FSMs are represented by multi-level circuits with complicated systems of “spaghetti-type” interconnections [22].
If all LUTs have the same number of inputs, then such a logic basis is inflexible. It means that in some cases, only a part of the available inputs will be used. At the same time, in other cases, the LUTs need to be combined to increase the number of inputs. To reduce the impact of interconnects on such a join, it is important to have internal fast interconnects between some LUTs. In Xilinx solutions, these CLBs are combined into slices [18]. For example, the SLICEL of Virtex-7 includes four 6-LUTs, eight flip-flops and 27 multiplexers [24]. A part of SLICEL is shown in Figure 2.
As follows from Figure 2, using the resources of a single SLICEL allows the generation of up to 11 Boolean functions ($f_1$–$f_{11}$). The outputs $f_2$, $f_3$, $f_6$, $f_7$ are outputs of the corresponding 6-LUTs. They can be connected with the data inputs of flip-flops. Each 6-LUT can be organized as two 5-LUTs with shared inputs. These 5-LUTs generate functions $f_1$–$f_8$. The output of each 5-LUT can be connected with the informational input of a flip-flop. Because of this, there are 8 flip-flops in the circuit of a SLICEL. The slice contains three internal multiplexers (MX1–MX3) which can be used for creating either two 7-LUTs or a single 8-LUT. These multiplexers have special control inputs (MC1–MC3), which can be used as additional inputs of LUTs having $NI_{LUT}$ equal to either 7 or 8. For example, using the input MC1, we can get a 7-LUT by combining LUT1 and LUT2. This 7-LUT implements function $f_9$. Using MC1–MC3 simultaneously allows us to combine all 6-LUTs into a single 8-LUT with output $f_{11}$.
In this paper, we use multiplexers to generate functions (2) and (3). We denote a multiplexer having $K$ data inputs as $K$-MX. Using a single 6-LUT, we can implement the circuit of a 4-MX. A 4-MX has two control inputs and four data inputs. Using an internal multiplexer, we can organize an 8-MX with the help of two 6-LUTs. For example, using MC2, LUT3 and LUT4 gives us an 8-MX. Its circuit has only a slightly bigger delay than the circuit of a 4-MX [24]. This is possible due to the fast interconnections inside a slice. If a 16-MX has the control inputs $T_1$–$T_4$, then its circuit includes four 6-LUTs controlled by $T_3$ and $T_4$. The inputs MC1 and MC2 are connected with the same control input $T_2$. The input MC3 is connected with the most significant control bit $T_1$. Obviously, to implement a 32-MX, we should use two slices and inter-slice interconnections. Because of this, a 32-MX is much slower than a 16-MX.
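The 16-MX wiring described above can be checked with a small behavioural model. The Python sketch below is only an illustration: the function names and the ordering of the data inputs (index $8T_1 + 4T_2 + 2T_3 + T_4$) are assumptions made here, not something fixed by the article.

```python
def mux4(data, t3, t4):
    """4-MX: one 6-LUT with four data inputs and two control inputs."""
    return data[2 * t3 + t4]


def mux16(data, t1, t2, t3, t4):
    """16-MX built from four 4-MX stages and the three slice multiplexers."""
    if len(data) != 16:
        raise ValueError("a 16-MX needs exactly 16 data inputs")
    # First stage: four 6-LUTs, all controlled by T3 and T4.
    lut_out = [mux4(data[4 * k:4 * k + 4], t3, t4) for k in range(4)]
    # MX1 and MX2 share the control input T2 (MC1 = MC2 = T2).
    low_half = lut_out[1] if t2 else lut_out[0]
    high_half = lut_out[3] if t2 else lut_out[2]
    # MX3 is driven by the most significant control bit T1 (MC3 = T1).
    return high_half if t1 else low_half


data = list(range(16))
print(mux16(data, 1, 0, 1, 1))  # selects data input 8*1 + 4*0 + 2*1 + 1
```

The model only captures the selection logic; it says nothing about delays, which is where the intra-slice and inter-slice variants actually differ.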
In LUT-based FSMs, the flip-flops of SCR are distributed among LUTs generating functions (2). Due to it, the SCR is hidden inside the same slices where the input memory functions are generated. There are two blocks in LUT-based P Mealy FSM (Figure 3).
The block of state variables (BSV) implements SBF (2). The state variables $T_r \in T$ are kept in the distributed SCR. To control the SCR operation, the clearing (Reset) and synchronization (Clock) signals enter BSV. The outputs $o_n \in O$ are generated by the block of output functions (BOF). This block implements SBF (3).
In the best case, there is exactly a single level of LUTs in the circuit of a P FSM. This is possible if each function $\varphi_i \in D \cup O$ depends on not more than $NI_{LUT}$ arguments. However, for modern LUTs, the following relation holds: $NI_{LUT} \le 6$ [18,25]. To diminish the number of LUTs in an FSM logic circuit, it would be necessary to increase the value of $NI_{LUT}$. However, such an increase leads to higher values of both the power consumption and the latency time of a LUT. This is why the number of LUT inputs is kept small. If the numbers of arguments in functions $\varphi_i \in D \cup O$ exceed the number of LUT inputs, then both the number of LUTs and the number of logic levels in FSM circuits increase. To improve these characteristics of LUT-based FSM circuits, it is necessary to improve the existing design methods.
In addition, it is necessary to optimize the system of connections between different slices of an FSM circuit. As shown in [26], the interconnections are responsible for around 70% of power consumption. In addition, they are responsible for the major part of FSM circuit latency time [23]. As shown in [11], improving the interconnections allows us to decrease the minimum latency time and power dissipation of FPGA-based circuits. Using either twofold state codes [5] or extended state codes [17] can improve the characteristics of the interconnection system.

3. Related Work

Various design methods have been proposed for designing LUT-based FSMs [5,19,20,23,27,28,29,30]. These methods should be applied if the number of arguments $NA(\varphi_i)$ exceeds the value of $NI_{LUT}$ for at least one function $\varphi_i \in D \cup O$ [5]. These methods can improve the LUT count, the maximum operating frequency, or the consumed power [31]. Sometimes, these methods try to find a solution in which more than a single characteristic is optimized. In this paper, we propose a method for improving the LUT count of FPGA-based Mealy FSMs.
To diminish the values of $NA(\varphi_i)$, various methods of state assignment may be used [9,10]. The number of state variables ranges from the value given by (1), which corresponds to maximum binary codes, to $M$, which corresponds to the one-hot state assignment. These approaches are used in many academic and industrial CAD tools. The well-known academic systems are, for example, SIS [13] and ABC by Berkeley [32,33]. The manufacturers of FPGA chips also have their CAD packages. For example, Xilinx has the CAD tool Vivado [34], whereas Intel (Altera) has the package Quartus [35].
It is very difficult to choose the best universal method of state assignment. For example, FSM circuits based on the maximum binary and one-hot state codes (OHCs) [28] are compared in [36]. As follows from [36], using OHCs improves the basic characteristics of rather complex FSMs having more than 16 states. However, FSM characteristics depend strongly on the number of inputs, too [5]. The results of research reported in [28] show that OHCs worsen FSM characteristics if $L > 10$.
So, the characteristics of LUT-based circuits depend on the value of $L + R$. Depending on this parameter, either the maximum binary state codes or OHCs improve the LUT count and/or the latency time for a particular FSM. Therefore, both of these state assignment approaches should be checked for a given FSM. We have investigated the efficiency of both methods in our research. Both methods are used in the CAD tool Vivado [34] by Xilinx [18]. They are named Auto and One-hot, respectively. We used Vivado because this system operates with the Virtex-7 chips used in our experiments. It is noted in [13] that one of the best deterministic methods of state assignment is JEDI [28]. For this reason, we compare JEDI-based FSMs with the FSMs proposed in the current paper.
In this paper, we propose a new design method that reduces the number of LUTs (LUT count) in circuits of FPGA-based Mealy FSMs. The proposed method belongs to the methods of structural decomposition (SD) [11]. The SD-based FSM optimization is achieved by using some intermediate logic levels between the arguments of SBFs (2) and (3) and the functions $\varphi_i \in D \cup O$. Because of this, the number of functions increases, but these functions have significantly fewer arguments than functions (2) and (3). These methods are analysed, for example, in [11].
The proposed method can be viewed as an evolution of the methods of twofold state assignment [5]. The methods of this group are based on constructing a partition $\Pi_S$ of the set $S$ into classes of compatible states. Each state $s_m \in S$ corresponds to a set $I(s_m)$. This set consists of inputs $i_l \in I$ which determine next states for a given state. Let the symbols $M_j$ and $R_j$ stand for the number of states in some set $S^j \subseteq S$ and the number of bits in the maximum binary codes of these states, respectively. In the case of twofold state assignment, it is necessary to use an additional code for the relation $s_m \notin S^j$. Therefore, the value of $R_j$ is determined by the following expression:
$$R_j = \lceil \log_2 (M_j + 1) \rceil. \tag{4}$$
We name states $s_m \in S^j$ compatible if the following condition holds:
$$R_j + L_j \le NI_{LUT}. \tag{5}$$
In (5), the symbol $L_j$ stands for the number of inputs determining transitions from states $s_m \in S^j$. These inputs are combined into a set $I^j \subseteq I$.
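Conditions (4) and (5) are simple enough to check mechanically. The following Python sketch is an illustration only; the function and parameter names are hypothetical, and the numeric examples are ours rather than taken from the article.

```python
def class_code_bits(num_states_in_class):
    """R_j from (4): ceil(log2(M_j + 1)), valid for M_j >= 1."""
    # For every integer M_j >= 1, ceil(log2(M_j + 1)) == M_j.bit_length().
    return num_states_in_class.bit_length()


def is_compatible(num_states_in_class, num_class_inputs, lut_inputs):
    """Condition (5): R_j + L_j must not exceed the number of LUT inputs."""
    return class_code_bits(num_states_in_class) + num_class_inputs <= lut_inputs


# Three states driven by three inputs fit into a 5-input LUT (2 + 3 <= 5) ...
print(is_compatible(3, 3, 5))
# ... but a fourth state needs a third code bit and breaks the condition.
print(is_compatible(4, 3, 5))
```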
In [5], a method is proposed which allows the creation of the partition $\Pi_S = \{CS^1, \ldots, CS^J\}$ with the minimum number of classes, $J$. Each class $CS^j \in \Pi_S$ includes only compatible states. Each class $CS^j \in \Pi_S$ determines sets $I^j \subseteq I$ and $O^j \subseteq O$. The set $O^j \subseteq O$ consists of outputs $o_n \in O$ generated during transitions from states $s_m \in CS^j$.
In the case of twofold state assignment, two different codes determine a state $s_m \in S$ [5]. To determine this state as an element of the set of states, we use a code $K(s_m)$. The code $C(s_m)$ determines the state $s_m \in S$ as an element of some class of compatibility $CS^j \in \Pi_S$. Each class $CS^j \in \Pi_S$ determines a collection of partial functions generated by the corresponding block of LUTs. These partial functions are partial outputs $o_n \in O^j$ and partial IMFs $D_r \in D^j$. The set $D^j \subseteq D$ includes IMFs generated during the transitions from the states $s_m \in CS^j$. We denote these partial functions as $o_n^j$ and $D_r^j$, where the superscript $j$ shows that the functions are determined by the class $CS^j \in \Pi_S$. Due to the validity of (5), each partial function is represented by a circuit consisting of a single LUT.
The main disadvantage of this approach is the need for a transformation of K ( s m ) into C ( s m ) . The transformation is executed by a special block of code transformation. This block consumes some internal resources of an FPGA chip. Moreover, it adds a delay to the total cycle time of a resulting FSM circuit.
An improvement of this approach is proposed in [17]. The improvement consists of using only the codes $C(s_m)$. These codes were named extended state codes (ESCs). Using ESCs allows the elimination of the block of code transformation. We use the symbol $P_E$ to show that a Mealy FSM is based on extended state codes. To encode states from all classes of compatibility, $R_E$ state variables are used:
$$R_E = R_1 + R_2 + \cdots + R_J. \tag{6}$$
The value of $R_j$ ($j \in \{1, \ldots, J\}$) is determined by (4). To encode states $s_m \in CS^j$, the state variables from a set $T^j \subseteq T$ are used. There are $R_E$ elements in the set $T$, where $T = T^1 \cup T^2 \cup \cdots \cup T^J$.
The logic circuit of $P_E$ FSMs is represented by a structural diagram (Figure 4).
The logic circuits of $P_E$ FSMs have two logic levels. The first level is a block of partial functions (BPF) represented by $J$ blocks (Block1–BlockJ). These blocks implement the systems of partial functions:
$$D^j = D^j(T^j, I^j); \tag{7}$$
$$O^j = O^j(T^j, I^j). \tag{8}$$
The second level of logic is represented by BlockOR. This block includes the SCR having $R_E$ flip-flops. These flip-flops are controlled by the pulses Reset and Clock. In the best case, there are exactly $N + R_E$ LUTs in the circuit of BlockOR. This block implements disjunctions of the partial functions:
$$o_n = o_n^1 \vee o_n^2 \vee \cdots \vee o_n^J \quad (n \in \{1, \ldots, N\}); \tag{9}$$
$$D_r = D_r^1 \vee D_r^2 \vee \cdots \vee D_r^J \quad (r \in \{1, \ldots, R_E\}). \tag{10}$$
In (9) and (10), the superscript gives the number of the block generating the particular partial function. The functions (10) are inputs of flip-flops. The state variables are outputs of these flip-flops.
Obviously, each class $CS^j \in \Pi_S$ determines three sets: $T^j \subseteq T$, $O^j \subseteq O$ and $D^j \subseteq D$, where there are $R_E$ elements in the set $D$. The first two sets are already defined. The set $D^j \subseteq D$ includes input memory functions generated during transitions from states $s_m \in S^j$.
Consider a Mealy FSM $U_1$ represented by its STT (Table 1). The following characteristics of $U_1$ follow from Table 1: the number of states $M = 8$, the number of inputs $L = 6$, the number of outputs $N = 7$, and the number of transitions $H = 19$. We denote as $P_E(U_k)$ that a Mealy FSM $U_k$ is implemented using the model $P_E$. To synthesize the FSM $P_E(U_1)$, it is necessary to find a partition $\Pi_S$. The number of classes, $J$, depends on the value of $NI_{LUT}$. We discuss the case in which the logic circuit of $P_E(U_1)$ is synthesized using LUTs with $NI_{LUT} = 5$ inputs.
To find the partition $\Pi_S$ with the minimum number of classes of compatible states, we can use a method proposed in [5]. Using this method gives the partition $\Pi_S = \{CS^1, CS^2, CS^3\}$, where $CS^1 = \{s_1, s_3, s_7\}$, $CS^2 = \{s_2, s_4, s_5\}$, and $CS^3 = \{s_6, s_8\}$. This gives the values $J = 3$, $M_1 = M_2 = 3$ and $M_3 = 2$.
Using (4) gives $R_1 = R_2 = R_3 = 2$. This determines the sets $T^1 = \{T_1, T_2\}$, $T^2 = \{T_3, T_4\}$, $T^3 = \{T_5, T_6\}$, and $T = \{T_1, \ldots, T_6\}$ with $R_E = 6$. Obviously, using (1) gives the minimum possible number of state variables $R = 3$.
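The arithmetic for $U_1$ can be reproduced with a few lines of Python. This is a minimal sketch under the class sizes stated above; the function name `extended_code_length` is hypothetical.

```python
def extended_code_length(class_sizes):
    """Return ([R_1, ..., R_J], R_E) for the class sizes M_1, ..., M_J."""
    # (4): R_j = ceil(log2(M_j + 1)), which equals M_j.bit_length() for M_j >= 1
    r_j = [m.bit_length() for m in class_sizes]
    return r_j, sum(r_j)  # (6): R_E is the sum of all R_j


r_j, r_e = extended_code_length([3, 3, 2])  # class sizes of the partition for U_1
r_min = (8 - 1).bit_length()                # (1) with M = 8 states
print(r_j, r_e, r_min)                      # [2, 2, 2] 6 3
```

The gap between $R_E = 6$ and $R = 3$ is the overhead in flip-flops that the rest of the article sets out to remove.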
If a state $s_m$ belongs to a class $CS^j \in \Pi_S$, then only some state variables $T_r \in T^j$ should differ from zero in the extended state code $C(s_m)$. At the same time, if $T_r \notin T^j$, then $T_r = 0$ ($r \in \{1, \ldots, R_E\}$). One of the possible outcomes of such a state assignment is represented by Table 2. For example, the following ESCs can be found from Table 2: $C(s_1) = 010000$, $C(s_2) = 000100$, $C(s_6) = 000001$, and so on.
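One way to generate codes with this structure is sketched below in Python. It is an assumption-laden illustration: the $k$-th state of each class simply receives the binary number $k$ (counting from 1) in the field of its class, which reproduces the three codes quoted above but is not guaranteed to match the remaining rows of Table 2.

```python
def extended_state_codes(classes):
    """Assign extended state codes; each class owns its own field of R_j bits."""
    # (4): R_j = ceil(log2(M_j + 1)) == M_j.bit_length() for M_j >= 1
    widths = [len(states).bit_length() for states in classes]
    total = sum(widths)  # (6): R_E
    codes = {}
    offset = 0
    for states, width in zip(classes, widths):
        # In-class codes start from 1, so the all-zero field means "not in this class".
        for k, state in enumerate(states, start=1):
            field = format(k, "b").zfill(width)
            codes[state] = "0" * offset + field + "0" * (total - offset - width)
        offset += width
    return codes


pi_s = [["s1", "s3", "s7"], ["s2", "s4", "s5"], ["s6", "s8"]]
codes = extended_state_codes(pi_s)
print(codes["s1"], codes["s2"], codes["s6"])  # 010000 000100 000001
```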
As follows from experiments [17], this approach allows an increase in performance up to 15.9% compared with equivalent FSMs based on the twofold state assignment. The growth of operating frequency is accompanied by a slight growth in the LUT count (up to 7.7%).
However, the approach of [17] has a serious drawback: the number of state variables can significantly exceed the minimum possible number determined by (1). This increases the number of flip-flops in the SCR. In turn, this increases the number of buffers of the synchronization tree required by a $P_E$ FSM logic circuit compared with circuits of equivalent FSMs based on the twofold state assignment. In addition, the number of interconnections is increased.
Now we can sum up and perform a qualitative analysis of the discussed issues. The well-known state assignment methods (the maximum binary codes, one-hot codes, JEDI) do not guarantee a decrease in the number of arguments of all Boolean functions representing an FSM logic circuit. The greater the difference between the total number of FSM inputs and state variables, on the one hand, and the number of LUT inputs, on the other, the higher the probability that methods of functional decomposition must be applied to SBFs (2) and (3). In this case, it is expedient to use methods of structural decomposition, which make it possible to obtain FSM circuits with a guaranteed number of logic levels. In addition, these methods allow getting rid of the “spaghetti-type” interconnection system inherent in LUT-based FSM circuits based on functional decomposition. One of the best structural decomposition methods is based on the use of extended state codes. However, this method is associated with a significant increase in the number of state variables in relation to the minimum value determined by (1). If this shortcoming is eliminated and the main advantages of the extended state codes are preserved, then it is possible to improve the basic characteristics of the FSM circuits (the LUT counts and performance) in comparison with their counterparts based on extended state codes.
In our current paper we propose an approach which allows an improvement of LUT count for circuits of Mealy FSMs based on the partition of the set S by classes of compatible states.

4. Main Idea of the Proposed Method

The proposed method is based on finding a partition $\Pi_C = \{CS^1, \ldots, CS^{J_C}\}$ of the set $S$ into $J_C$ classes of compatible states. In this case, states $s_m \in CS^j$ are encoded by codes $C(s_m)$ using $R_A$ state variables, where
$$R_A = \max(\lceil \log_2 M_1 \rceil, \ldots, \lceil \log_2 M_{J_C} \rceil). \tag{11}$$
To encode a class $CS^j \in \Pi_C$ by a class code $K(CS^j)$, $R_C$ bits are necessary, where
$$R_C = \lceil \log_2 J_C \rceil. \tag{12}$$
We propose to represent a state $s_m \in CS^j$ by the code $CC(s_m)$, which we name a composite state code (CSC). This code is the following:
$$CC(s_m) = K(CS^j) * C(s_m). \tag{13}$$
In (13), the sign “∗” denotes the concatenation of the codes. There are $R_{CC}$ state variables in the code (13). The value of $R_{CC}$ is determined as
$$R_{CC} = R_C + R_A. \tag{14}$$
To encode the classes, we use the class variables from the set $T_C = \{T_1, \ldots, T_{R_C}\}$. To encode states as elements of classes $CS^j \in \Pi_C$, we use the state variables from the set $T_A = \{T_{R_C + 1}, \ldots, T_{R_{CC}}\}$. Together, these sets form a set $T = T_C \cup T_A$ having $R_{CC}$ elements.
Each class $CS^j \in \Pi_C$ determines the following three sets: $I^j$, $O^j$, and $D^j$. These sets have been defined before. There is no set $T^j \subseteq T$, because the states of each class are encoded using the same state variables $T_r \in T_A$. Each class $CS^j \in \Pi_C$ determines the following partial functions:
$$D^j = D^j(T_A, I^j); \tag{15}$$
$$O^j = O^j(T_A, I^j). \tag{16}$$
In (15) and (16), the following relation holds: $j \in \{1, \ldots, J_C\}$.
To get the functions $D_r \in D$ and $o_n \in O$, it is necessary to multiplex the partial functions. To do this, $N + R_{CC}$ multiplexers should be used. The partial functions are used as data inputs of these multiplexers. The selection of a particular partial function is determined by the class variables $T_r \in T_C$. Therefore, the multiplexers generate the following SBFs:
$$D_r = D_r(T_C, D_r^1, \ldots, D_r^{J_C}) \quad (r \in \{1, \ldots, R_{CC}\}); \tag{17}$$
$$o_n = o_n(T_C, o_n^1, \ldots, o_n^{J_C}) \quad (n \in \{1, \ldots, N\}). \tag{18}$$
So, SBFs (15) and (16) determine a block of partial functions (BPF). The SBF (17) determines a multiplexer of state variables (MXSV), and the SBF (18) determines a multiplexer of outputs (MXO). Together, the SBFs (15)–(18) determine the structural diagram of the $P_C$ Mealy FSM shown in Figure 5.
There are three logic blocks in a $P_C$ Mealy FSM. Their functions are clear from the previous text. The block of partial functions implements SBFs (15) and (16). The multiplexer of state variables (MXSV) implements SBF (17). Its circuit includes $R_{CC}$ multiplexers having $R_C$ control inputs and up to $J_C$ data inputs. The outputs of these multiplexers are connected with the inputs of the flip-flops forming the state code register (SCR). To control the SCR, the clearing (Reset) and synchronization (Clock) pulses enter MXSV. There are $R_{CC}$ flip-flops in the circuit of the SCR. There are $N$ multiplexers in the circuit of MXO. The selection of a particular partial function $o_n^j$ is executed under the control of the class variables $T_r \in T_C$.
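The selection performed by each multiplexer of MXSV and MXO, as in (17) and (18), can be modelled behaviourally. The Python sketch below is an illustration only: it assumes that class $CS^j$ is encoded by the binary number $j - 1$ and that unused data inputs read as 0, neither of which is prescribed by the article.

```python
def select_partial(class_code, partial_values):
    """Return the partial function value selected by the class variables T_C.

    class_code     -- string of R_C bits, e.g. "1" or "01" ("" when J_C = 1)
    partial_values -- values of one partial function from Block 1 ... Block J_C
    """
    index = int(class_code, 2) if class_code else 0
    # Assumption: data inputs beyond J_C are unused and read as 0.
    return partial_values[index] if index < len(partial_values) else 0


# Two classes (J_C = 2, R_C = 1): the class code "1" selects the second block.
print(select_partial("1", [0, 1]))
```

Unlike the disjunction used in BlockOR, this selection does not require the inactive blocks to output zeros, which is why all classes can share the same variables $T_A$.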
If C S j Π C , then there are R C j variables in the codes of states s m C S j , where
R C j = l o g 2 M j .
The comparison of formulae (4) and (19) shows that classes of the partition Π C can include more elements than classes of the partition Π E . This is determined by the absence of 1 in the formula (19).
For example, if there is N I L U T = 5 , then the following partition Π C can be constructed for Mealy FSM U 1 : Π C = { C S 1 , C S 2 } . Therefore, there are J C = 2 classes for FSM P C ( U 1 ) instead of J = 3 for the equivalent FSM P E ( U 1 ) . There are the following classes of compatible states in the discussed case: C S 1 = { s 1 , s 3 , s 6 , s 7 } and C S 2 = { s 2 , s 4 , s 5 , s 8 } .
In the general case, the following conditions hold:
$$J_C \le J; \tag{20}$$
$$R \le R_{CC} < R_E. \tag{21}$$
In the case of $U_1$, we find that $R = 3$, $R_E = 6$, $R_C = 1$, $R_A = 2$, and $R_{CC} = 3$. Therefore, in the discussed case, $R_{CC} = R = 3$. In addition, $J_C = 2 < J = 3$. Because of this, we can expect that the circuit of $P_C(U_1)$ will have fewer LUTs and interconnections than the circuit of $P_E(U_1)$. We check this in the next section. In addition, we can expect that $P_C$ Mealy FSMs have at least the same performance as equivalent $P_E$ Mealy FSMs. The experiments reported in Section 6 show that our approach improves the basic characteristics of LUT-based circuits of Mealy FSMs.
This article proposes a method for the logic synthesis of $P_C$ Mealy FSMs. The method produces logic circuits of LUT-based FSMs in which each LUT has $NI_{LUT}$ inputs. The synthesis process starts from an FSM state transition table. The proposed method includes the following steps:
  • Constructing the partition $\Pi_C$ of the set of states by classes of compatible states.
  • Encoding of FSM states by composite state codes $CC(s_m)$.
  • Creating the direct structure table of the $P_C$ Mealy FSM.
  • Creating tables of blocks of partial functions for classes $CS_j \in \Pi_C$.
  • Creating the table representing the multiplexer of outputs.
  • Creating the table representing the multiplexer of state variables.
  • Constructing SBFs representing BPF, MXSV, and MXO.
  • Implementing the LUT-based circuit of the $P_C$ Mealy FSM using the FPGA chip's internal resources.
We use the methods from [5] to create the partition $\Pi_C$. The main goal of these methods is to minimize the LUT counts in the resulting Mealy FSM circuits. Whenever possible, each class of compatible states should include the maximum possible number of states. This helps to minimize the value of $J_C$. The classes are created in a way that minimizes the number of shared outputs. This optimizes the number of LUTs in the circuit of MXO. Any multiplexer from the second level of an FSM circuit is implemented by a single LUT if the following condition holds:
$$R_C + J_C \le NI_{LUT}. \tag{22}$$
Even if condition (22) is violated, the multiplexers can still be implemented as single-level circuits. This is possible if the number of partial functions for a given function $\varphi_i \in D \cup O$ does not exceed the value $NI_{LUT} - R_C$.
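The two checks described above can be sketched in Python as follows. The function name `single_lut_multiplexers` and the numeric values in the second call are hypothetical and serve only as an illustration; the first call uses the values of $U_1$ with 5-LUTs.

```python
def single_lut_multiplexers(r_c: int, j_c: int, partial_counts: dict, ni_lut: int) -> dict:
    """For each function, report whether its multiplexer fits into one LUT.

    r_c            -- number of class variables (multiplexer control inputs)
    j_c            -- number of classes (upper bound on data inputs)
    partial_counts -- function name -> number of its partial functions
    ni_lut         -- number of LUT inputs
    """
    if r_c + j_c <= ni_lut:
        # Condition (22): every multiplexer fits into a single LUT.
        return {name: True for name in partial_counts}
    # Otherwise, check each function against the bound NI_LUT - R_C.
    return {name: count <= ni_lut - r_c for name, count in partial_counts.items()}


# Values for U_1: R_C = 1, J_C = 2, 5-input LUTs.
print(single_lut_multiplexers(1, 2, {"o1": 2, "o2": 1}, 5))

# Hypothetical larger FSM where condition (22) is violated (3 + 5 > 6).
print(single_lut_multiplexers(3, 5, {"o1": 3, "o2": 5}, 6))
```

In the hypothetical second case, only the function with three partial functions still fits into a single 6-LUT.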

5. Example of Synthesis

We use the symbol $P_C(U_a)$ to show that the model of the $P_C$ Mealy FSM (Figure 5) is used to implement the circuit of an FSM $U_a$. In this section, we show how to design the circuit of Mealy FSM $P_C(U_1)$ using 5-LUTs. The synthesis process starts from Table 1.
Step 1. In the previous section, using Table 1 and 5-LUTs, we obtained the partition $\Pi_C = \{CS_1, CS_2\}$. The partition includes the classes $CS_1 = \{s_1, s_3, s_6, s_7\}$ and $CS_2 = \{s_2, s_4, s_5, s_8\}$. Therefore, each class includes four states $s_m \in S$. These classes determine the sets $I_1 = \{i_1, i_2, i_5\}$, $O_1 = \{o_1, o_2, o_3, o_4\}$, $I_2 = \{i_3, i_4, i_6\}$, and $O_2 = \{o_1, o_3, \ldots, o_7\}$. Therefore, $L_1 = L_2 = 3$. Using (19) gives $R_C^1 = R_C^2 = 2$. Hence, $R_C^j + L_j = 5 = NI_{LUT}$ $(j \in \{1, 2\})$. This means that condition (5) holds for the given FSM and K-LUTs. Therefore, it is possible to use the model $P_C(U_1)$. The total number of elements in the sets $O_j$ $(j \in \{1, 2\})$ determines how many LUTs are necessary to generate the partial output functions.
The sets $I_1$ and $I_2$ have no shared inputs $(I_1 \cap I_2 = \emptyset)$. This relation shows that the system of interconnections between FSM inputs and LUTs of BPF is optimal.
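The checks of Step 1 can be reproduced with the short sketch below. It assumes that condition (5) has the form $R_C^j + L_j \le NI_{LUT}$, which is consistent with the equality reported above but is not quoted in this section; the helper name `class_fits_one_lut` is illustrative.

```python
NI_LUT = 5  # number of LUT inputs used in the example

# Classes of compatible states and their input sets, as listed in Step 1.
classes = {
    "CS1": {"states": ["s1", "s3", "s6", "s7"], "inputs": {"i1", "i2", "i5"}},
    "CS2": {"states": ["s2", "s4", "s5", "s8"], "inputs": {"i3", "i4", "i6"}},
}


def class_fits_one_lut(states, inputs, ni_lut):
    """ASSUMED condition (5): state-code bits plus class inputs fit into one LUT."""
    state_bits = (len(states) - 1).bit_length()  # ceil(log2(M_j)), formula (19)
    return state_bits + len(inputs) <= ni_lut


fits = {
    name: class_fits_one_lut(c["states"], c["inputs"], NI_LUT)
    for name, c in classes.items()
}
shared_inputs = classes["CS1"]["inputs"] & classes["CS2"]["inputs"]

print(fits)           # both classes satisfy the condition
print(shared_inputs)  # empty set: no FSM input feeds both blocks
```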
Step 2. As we have found, $R_C^1 = R_C^2 = 2$. Using (14) gives $R_{CC} = 3$. Now, we have the sets $T = \{T_1, T_2, T_3\}$, $T_C = \{T_1\}$, and $T_A = \{T_2, T_3\}$. One of the possible outcomes of the encoding is shown in Figure 6.
So, the classes are encoded in the following way: $K(CS_1) = 0$ and $K(CS_2) = 1$. For example, the following relation holds: $C(s_1) = C(s_2) = 00$. Using the codes of classes of compatible states gives the following composite state codes: $CC(s_1) = 000$ and $CC(s_2) = 100$. Using the same approach, we can find the CSCs for all states $s_m \in S$.
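The concatenation of a class code and a state code can be sketched as follows. Only $CC(s_1) = 000$ and $CC(s_2) = 100$ are confirmed by the text; the numbering of the remaining states inside each class follows the order in which they are listed and is an assumption, because the encoding of Figure 6 is not reproduced here.

```python
# Classes and class codes taken from the example.
classes = {
    "CS1": ["s1", "s3", "s6", "s7"],
    "CS2": ["s2", "s4", "s5", "s8"],
}
class_code = {"CS1": "0", "CS2": "1"}  # K(CS_1) = 0, K(CS_2) = 1


def composite_codes(classes, class_code):
    """Build CC(s_m) = K(CS_j) followed by C(s_m) for every state."""
    codes = {}
    for name, states in classes.items():
        width = (len(states) - 1).bit_length()  # ceil(log2(M_j))
        for index, state in enumerate(states):
            # ASSUMPTION: states are numbered within a class in listed order.
            local = format(index, "b").zfill(width) if width > 0 else ""
            codes[state] = class_code[name] + local
    return codes


codes = composite_codes(classes, class_code)
for state in sorted(codes):
    print(state, codes[state])
```

Every composite code has $R_{CC} = 3$ bits, and all eight codes are distinct.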
Step 3. Compared to the STT (Table 1), the DST includes three additional columns. They are: $CC(s_C)$ with the CSC of the current state $s_C \in S$; $CC(s_T)$ with the CSC of the state of transition $s_T \in S$; $D_h$ with the IMFs equal to 1 to load the code $CC(s_T)$ into the SCR. In the discussed example, the DST is represented by Table 3.
Step 4. The DST (Table 3) determines the contents of the tables of blocks of partial functions $\mathrm{BPF}_j$ $(j \in \{1, \ldots, J_C\})$. In these tables, the column $CC(s_C)$ is replaced by the column $C(s_C)$; the column $O_h$ is replaced by the column $O_h^j$; the column $D_h$ is replaced by the column $D_h^j$. The superscript $j$ indicates that these functions are generated by the block $\mathrm{BPF}_j$ $(j \in \{1, \ldots, J_C\})$.
In the discussed case, there are two blocks of partial functions. Table 4 represents the block $\mathrm{BPF}_1$ and Table 5 represents the block $\mathrm{BPF}_2$. There are $H_1 = 10$ rows in Table 4 and $H_2 = 9$ rows in Table 5. Together, these tables have exactly $H = 19$ rows (the number of rows in Table 3).
Step 5. The table of MXO has the following columns: $o_n$, Block. The second column is divided into $J_C$ sub-columns. If a partial output $o_n^j \in O$ is present in the table of $\mathrm{BPF}_j$, then there is 1 at the intersection of the column $j$ and the row $o_n$. Otherwise, this intersection is marked by 0. The table is constructed using the tables of blocks $\mathrm{BPF}_j$. In the discussed case, this is Table 6.
Step 6. The table of MXSV has the following columns: $D_r$, Block. As in the previous case, the column Block is divided into $J_C$ sub-columns. If a partial IMF $D_r^j$ is present in the table of $\mathrm{BPF}_j$, then there is 1 at the intersection of the column $j$ and the row $D_r$. Otherwise, this intersection is marked by 0. The table is constructed using the tables of blocks $\mathrm{BPF}_j$. In the discussed case, this is Table 7.
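The construction of such a table can be sketched in Python for the outputs. The sets below are the sets $O_1$ and $O_2$ from Step 1; the resulting rows are derived from these sets and are not copied from Table 6. The same helper could be applied to the partial IMFs to obtain the table of MXSV.

```python
# Outputs o1..o7 and the sets of outputs generated by each block (Step 1).
outputs = [f"o{n}" for n in range(1, 8)]
outputs_per_block = [
    {"o1", "o2", "o3", "o4"},              # O_1, block BPF_1
    {"o1", "o3", "o4", "o5", "o6", "o7"},  # O_2, block BPF_2
]


def membership_table(functions, blocks):
    """Row per function; one 0/1 flag per block (1 if the block generates it)."""
    return {f: [int(f in block) for block in blocks] for f in functions}


mxo_table = membership_table(outputs, outputs_per_block)
for name, flags in mxo_table.items():
    print(name, flags)

# Each 1 in the table corresponds to one partial output function.
partial_outputs = sum(sum(flags) for flags in mxo_table.values())
print(partial_outputs)
```

The number of ones in the table equals the number of partial output functions, which is the number of LUTs needed for the partial outputs when each of them fits into one LUT.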
Step 7. The BPF is represented by SBFs (15) and (16). These systems are constructed using the tables of $\mathrm{BPF}_j$ $(j \in \{1, \ldots, J_C\})$. The partial functions depend on product terms which are conjunctions of $S_m$ and $I_h$. The conjunction $S_m$ is determined by the code $C(s_m)$. For example, $S_1 = \bar{T}_2 \bar{T}_3 = S_2$.
For example, using Table 4 gives the following sum-of-products for the functions $D_1^1$ and $o_1^1$:
$$D_1^1 = F_1 \lor F_3 \lor [F_4 \lor F_5] \lor F_9 = \bar{T}_2 \bar{T}_3 i_1 \lor \bar{T}_2 \bar{T}_3 \bar{i}_1 \bar{i}_2 \lor \bar{T}_2 T_3 i_5 \lor T_2 T_3 \bar{i}_1 i_5; \qquad o_1^1 = F_1 \lor F_7 \lor F_{10} = \bar{T}_2 \bar{T}_3 i_1 \lor T_2 \bar{T}_3 \lor T_2 T_3 \bar{i}_1 \bar{i}_5.$$
Using Table 5 gives the following sum-of-products for the functions $D_1^2$ and $o_1^2$:
$$D_1^2 = [F_2 \lor F_3] \lor [F_4 \lor F_5] = \bar{T}_2 \bar{T}_3 i_3 \lor T_2 \bar{T}_3; \qquad o_1^2 = F_1 \lor F_6 = \bar{T}_2 \bar{T}_3 i_3 \lor T_2 \bar{T}_3 i_3 i_6.$$
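A partial function of this kind can be evaluated, and its number of arguments counted, with the sketch below. The cubes encode the four product terms of $D_1^1$ as read from the expression above (a variable mapped to 0 appears inverted); the helper names are illustrative.

```python
# Product terms of D_1^1: variable -> required value (0 means inverted literal).
D11_CUBES = [
    {"T2": 0, "T3": 0, "i1": 1},
    {"T2": 0, "T3": 0, "i1": 0, "i2": 0},
    {"T2": 0, "T3": 1, "i5": 1},
    {"T2": 1, "T3": 1, "i1": 0, "i5": 1},
]


def eval_sop(cubes, assignment):
    """Return 1 if at least one product term is satisfied by the assignment."""
    return int(any(
        all(assignment[var] == value for var, value in cube.items())
        for cube in cubes
    ))


def support(cubes):
    """Sorted list of variables the sum-of-products depends on."""
    return sorted({var for cube in cubes for var in cube})


print(support(D11_CUBES))       # arguments of D_1^1
print(len(support(D11_CUBES)))  # must not exceed the number of LUT inputs
print(eval_sop(D11_CUBES, {"T2": 0, "T3": 0, "i1": 1, "i2": 0, "i5": 0}))
```

The function depends on five variables, so it fits into a single 5-LUT.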
Using Table 6 gives the SBF representing the MXO. This SBF is created in a trivial way. In the discussed case, this is the following system:
$$o_1 = \bar{T}_1 o_1^1 \lor T_1 o_1^2; \quad o_2 = \bar{T}_1 o_2^1; \quad o_3 = \bar{T}_1 o_3^1 \lor T_1 o_3^2; \quad o_4 = \bar{T}_1 o_4^1 \lor T_1 o_4^2; \quad o_5 = T_1 o_5^2; \quad o_6 = T_1 o_6^2; \quad o_7 = T_1 o_7^2.$$
Using Table 7 gives the SBF representing the MXSV. This SBF is also created in a trivial way. In the discussed case, this is the following system:
$$D_1 = \bar{T}_1 D_1^1 \lor T_1 D_1^2; \quad D_2 = \bar{T}_1 D_2^1 \lor T_1 D_2^2; \quad D_3 = \bar{T}_1 D_3^1 \lor T_1 D_3^2.$$
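The behaviour of a second-level multiplexer can be sketched as follows. The sketch assumes a single class variable $T_1$ with block 1 selected by $T_1 = 0$ and block 2 by $T_1 = 1$, as in the example; both function names are illustrative. The exhaustive loop checks that selecting the partial function of the active class gives the same result as the sum-of-products form.

```python
from itertools import product


def mux_sop(t1: int, f1: int, f2: int) -> int:
    """Sum-of-products form: NOT(T1) AND f1 OR T1 AND f2."""
    return ((1 - t1) & f1) | (t1 & f2)


def select_active_block(t1: int, partials: dict) -> int:
    """Pick the partial function of the active class; a missing block yields 0."""
    active_block = 2 if t1 else 1
    return partials.get(active_block, 0)


# Both descriptions agree for every combination of T1 and partial values.
agree = all(
    mux_sop(t1, f1, f2) == select_active_block(t1, {1: f1, 2: f2})
    for t1, f1, f2 in product((0, 1), repeat=3)
)
print(agree)

# A function generated only by block 1 is 0 whenever class 2 is active.
print(select_active_block(1, {1: 1}))
```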
Step 8. Using the obtained SBFs, we can get a logic circuit of Mealy FSM $P_C(U_1)$. It is shown in Figure 7.
Because each partial function includes no more than $NI_{LUT} = 5$ arguments, there are 6 LUTs implementing the partial IMFs $D_r^j$ and 10 LUTs implementing the partial outputs $o_n^j$. Therefore, there are 16 LUTs in the circuit of BPF. To implement IMFs (17), $R_{CC} = 3$ LUTs are sufficient. There are 7 LUTs in the circuit of MXO. Therefore, there are 26 5-LUTs in the logic circuit of Mealy FSM $P_C(U_1)$. These LUTs are connected using three buses. The BusIT combines wires with inputs $i_l \in I$ and state variables $T_r \in T_A$. The BusDOT includes wires with partial IMFs $D_r^j$, partial outputs $o_n^j$, and state variables $T_r \in T_C$ used as control inputs of multiplexers MXO and MXSV. The BusT is an output bus of the distributed SCR. This bus includes wires with state variables $T_r \in T_C \cup T_A$. The buffers of the synchronization tree control the flip-flops connected with outputs of LUT17–LUT19.
We can compare the LUT counts for Mealy FSMs $P_C(U_1)$ and $P_E(U_1)$. In both cases, we use 5-LUTs. We have synthesized the logic circuit of $P_E(U_1)$. There is $J = 3$ for $P_E(U_1)$. There are 10 LUTs in the circuit of Block1, 11 LUTs in Block2, and 4 LUTs in Block3. These three blocks represent the block of partial functions having 25 5-LUTs. There are 10 LUTs in the circuit of BlockOR. Therefore, there are 35 5-LUTs in the logic circuit of Mealy FSM $P_E(U_1)$. It means that using the model $P_C(U_1)$ instead of $P_E(U_1)$ decreases the LUT count by a factor of 35/26 ≈ 1.35. At the same time, both circuits have the same number of logic levels.
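The arithmetic behind this comparison can be retraced with the sketch below. The per-block LUT counts are the ones stated in the text; the dictionary keys and the relative saving printed at the end are illustrative additions computed from those counts.

```python
# LUT counts per block, as stated for the example FSM U_1 with 5-LUTs.
luts_pc = {"partial IMFs": 6, "partial outputs": 10, "MXSV": 3, "MXO": 7}
luts_pe = {"Block1": 10, "Block2": 11, "Block3": 4, "BlockOR": 10}

total_pc = sum(luts_pc.values())
total_pe = sum(luts_pe.values())

ratio = round(total_pe / total_pc, 2)                         # reduction factor
saving = round(100.0 * (total_pe - total_pc) / total_pe, 1)   # percent of P_E count

print(total_pc, total_pe, ratio, saving)
```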
To get the electrical circuit of Mealy FSM $P_C(U_1)$, it is necessary to execute the step of technology mapping [36]. This requires sophisticated CAD tools. In the case of circuits implemented with the internal resources of Virtex-7, the industrial package Vivado [34] should be used. Vivado executes the steps of technology mapping (such as mapping, placement, and so on). Obtaining an FSM circuit allows the determination of its real characteristics such as the number of LUTs and the minimum latency time. The latency time gives the maximum value of the synchronization frequency. In addition, the value of power consumption is determined for the maximum operating frequency.
It follows from the discussion of SLICEL that we cannot use Vivado to get the 5-LUT-based circuit of Mealy FSM $P_C(U_1)$. However, in Section 6, we show the results of experiments conducted using Vivado and the library of benchmark FSMs [37].

6. Experimental Results

We conducted a number of experiments to compare the basic characteristics of $P_C$-based Mealy FSMs with the characteristics of FSM circuits based on some other models. The benchmark FSMs from the library [37] are used for the experiments. The 48 benchmarks used are represented by their state transition tables. The tables are stored as files in the KISS2 format. The basic characteristics of the benchmarks (the values of parameters $M$, $L$, and $N$) have a wide range. Because of this, these benchmarks are used in many studies as a basis for comparing different FSM design methods. We do not show the characteristics of the benchmark FSMs in this article. They can be found, for example, in [17].
We executed the experiments using a personal computer with the following characteristics: CPU: Intel Core i7 6700K, Memory: 16 GB RAM 2400 MHz CL15. In addition, we use the Virtex-7 VC709 Evaluation Platform (this platform is based on the FPGA chip xc7vx690tffg1761-2) [38] and the CAD tool Vivado v2019.1 (64-bit) [34]. There is $NI_{LUT} = 6$ for FPGAs of Virtex-7. We use the reports of Vivado to get the results of experiments. To enter Vivado, we use the CAD tool K2F [5]. This tool creates VHDL code from files represented in the KISS2 format.
Three parameters have been compared on the basis of our experiments, namely, the chip areas occupied by FSM circuits, the performance, and the area-time products. To estimate the area, we use the LUT counts taken from the reports of Vivado. The performance is represented by the latency time which is achievable for each benchmark FSM. The latency time is shown in Vivado reports. The latency time is inversely proportional to the maximum operating frequency. Thus, the shorter the latency time, the higher the frequency of synchronization pulses can be. The area-time products are calculated by multiplying the LUT counts by the latency times. In our experiments, we use five FSM models. Three of them are P-FSMs based on state codes with the minimum length (Auto), OHCs with the maximum number of state variables (One-hot), or some intermediate number of state variables (JEDI). The first two methods are the internal methods of Vivado. Because we try to improve the characteristics of $P_E$-based FSMs, we also use this model in our research. Finally, we use the model of $P_C$-based FSMs proposed in the current paper.
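The relations between these three parameters can be sketched as follows. The sketch assumes that the latency time is reported in nanoseconds; the LUT count and latency in the example call are hypothetical values and are not taken from the result tables.

```python
def max_frequency_mhz(latency_ns: float) -> float:
    """Maximum operating frequency in MHz for a latency given in nanoseconds."""
    return 1000.0 / latency_ns


def area_time_product(lut_count: int, latency_ns: float) -> float:
    """Area-time product: LUT count multiplied by the latency time."""
    return lut_count * latency_ns


# HYPOTHETICAL circuit: 20 LUTs with a latency of 5 ns.
luts, latency = 20, 5.0
print(max_frequency_mhz(latency))        # frequency grows as latency shrinks
print(area_time_product(luts, latency))  # smaller is better
```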
As in our previous research [17], we use the relation between the values of $R + L$ and $NI_{LUT}$ to divide the benchmarks [37] into five categories. For LUTs of Virtex-7, there is $NI_{LUT} = 6$. We use this value to divide the benchmarks into the categories. The FSMs are trivial (category 0) if the sum of $R$ and $L$ does not exceed 6. The FSMs are simple (category 1) if the sum does not exceed 12. The FSMs are average (category 2) if the sum does not exceed 18. The FSMs are big (category 3) if the sum does not exceed 24. Otherwise, the benchmark FSMs are very big (category 4). It is shown in [5] that there is a direct dependence between the improvement of FSM characteristics due to using SD-based methods and the category number.
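This classification rule can be written as a small function. The thresholds 6, 12, 18, and 24 come from the text; expressing them as multiples of $NI_{LUT}$ is an assumption that generalizes the stated values, and the function name and the sample arguments are illustrative.

```python
def fsm_category(r: int, l: int, ni_lut: int = 6) -> int:
    """Category 0..4 of an FSM with R state variables and L inputs.

    ASSUMPTION: the thresholds are NI_LUT, 2*NI_LUT, 3*NI_LUT, 4*NI_LUT,
    which gives 6, 12, 18, 24 for NI_LUT = 6 as stated in the text.
    """
    total = r + l
    for category in range(4):
        if total <= ni_lut * (category + 1):
            return category
    return 4  # very big FSMs


# Sample (hypothetical) values of R and L.
for r, l in [(3, 3), (4, 3), (5, 8), (5, 19), (6, 19)]:
    print(r, l, fsm_category(r, l))
```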
For our conditions, the benchmarks [37] are distributed by categories as follows. Category 0 consists of the FSMs bbtas, dk17, dk27, dk512, ex3, ex5, lion, lion9, mc, modulo12, and shiftreg. Category 1 consists of the FSMs bbara, bbsse, beecount, cse, dk14, dk15, dk16, donfile, ex2, ex4, ex6, ex7, keyb, mark1, opus, s27, s386, s840, and sse. Category 2 contains the FSMs ex1, kirkman, planet, planet1, pma, s1, s1488, s1494, s1a, s208, styr, and tma. There is a single FSM, sand, in the category of big benchmarks. Four FSMs (s420, s510, s820, and s832) belong to category 4.
The results of the experiments are shown in Tables 8–23. These tables are organized in the same manner. The table columns are marked by the names of the investigated methods. The names of the benchmarks are written in the table rows. The rows “Total” contain the sums of the values for each column. The row “Percentage” includes the percentage of the summarized characteristics of FSM circuits produced by other methods relative to $P_C$-based FSMs. We use the model of P Mealy FSM as a starting point for the methods Auto, One-hot, and JEDI.
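The computation of the “Total” and “Percentage” rows can be sketched as follows. The LUT counts in the sketch are hypothetical placeholders, not values from Tables 8–23, and the key `PC` denoting the reference column is an illustrative choice.

```python
def total_row(table: dict) -> dict:
    """Sum the per-benchmark values of every method (the row Total)."""
    totals = {}
    for values in table.values():
        for method, value in values.items():
            totals[method] = totals.get(method, 0) + value
    return totals


def percentage_row(totals: dict, reference: str = "PC") -> dict:
    """Express every total as a percentage of the reference method's total."""
    ref = totals[reference]
    return {method: round(100.0 * value / ref, 2) for method, value in totals.items()}


# HYPOTHETICAL LUT counts for two benchmarks and three methods.
table = {
    "bench_a": {"Auto": 30, "PE": 24, "PC": 20},
    "bench_b": {"Auto": 45, "PE": 34, "PC": 30},
}
totals = total_row(table)
print(totals)
print(percentage_row(totals))
```

With this convention the reference column always reads 100%, and a value above 100% for another method means that it needs more resources than the $P_C$-based circuits.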
Let us analyse the experimental results taken from the tables. The following information can be found in these tables: (1) the LUT counts for all benchmarks (Table 8); (2) the LUT counts for benchmarks of category 0 (Table 9); (3) the LUT counts for benchmarks of category 1 (Table 10); (4) the LUT counts for benchmarks of categories 2–4 (Table 11); (5) the latency time for all benchmarks (Table 12); (6) the latency time for benchmarks of category 0 (Table 13); (7) the latency time for benchmarks of category 1 (Table 14); (8) the latency time for benchmarks of categories 2–4 (Table 15); (9) the maximum operating frequency for all benchmarks (Table 16); (10) the maximum operating frequency for benchmarks of category 0 (Table 17); (11) the maximum operating frequency for benchmarks of category 1 (Table 18); (12) the maximum operating frequency for benchmarks of categories 2–4 (Table 19); (13) the area-time products for all benchmarks (Table 20); (14) the area-time products for benchmarks of category 0 (Table 21); (15) the area-time products for benchmarks of category 1 (Table 22); (16) the area-time products for benchmarks of categories 2–4 (Table 23).
As follows from Table 8, the $P_C$-based FSMs require fewer LUTs than the other investigated methods. Our approach produces circuits having 44.87% fewer 6-LUTs than equivalent Auto-based FSMs, 68.59% fewer 6-LUTs than equivalent One-hot-based FSMs, and 19.31% fewer 6-LUTs than equivalent JEDI-based FSMs. While developing our method, we hoped that $P_C$-based FSMs would require fewer LUTs than equivalent $P_E$-based FSMs. As follows from the last column of Table 8, our assumption turns out to be correct. Our approach produces circuits having, on average, 15.46% fewer 6-LUTs than equivalent $P_E$-based FSMs.
As follows from Table 9, our approach loses compared to both Auto-based FSMs (6.04% loss) and JEDI-based FSMs (7.58% loss). However, Table 9 reflects the results for the simplest FSMs (category 0). Let us point out that, even in this case, our approach gives a gain compared to One-hot-based (30.3%) and $P_E$-based (7.58%) FSMs.
Analysis of Table 10 and Table 11 shows that the $P_C$-based FSMs have circuits with fewer LUTs compared with all other investigated approaches. Compared with Auto-based FSMs, the gain is 33.99% (category 1) or 52.45% (categories 2–4). Compared with One-hot-based FSMs, the gain is 73.27% (category 1) or 69.85% (categories 2–4). Compared with JEDI-based FSMs, the gain is 11.22% (category 1) or 24.12% (categories 2–4). Compared with $P_E$-based FSMs, the gain is 12.87% (category 1) or 16.95% (categories 2–4). Therefore, the gain from applying the proposed approach in relation to $P_E$-based FSMs increases as the complexity of the FSM (the category number) increases.
As follows from Table 12, our approach produces faster LUT-based FSM circuits relative to the other investigated methods. The average gain ranges from 3.01% (compared with $P_E$-based FSMs) to 17.71% (compared with One-hot-based FSMs).
For category 0 (Table 13), our approach provides a minimal gain relative to Auto-based FSMs (0.19%) and One-hot-based FSMs (3.11%). At the same time, $P_C$-based FSMs are a bit slower than their counterparts based on either JEDI (0.49%) or extended state codes (0.01%). Such a loss is insignificant. Analysis of Table 14 and Table 15 shows that our approach gives a gain relative to all other design methods starting from category 1. For category 1 (Table 14), there is the following gain in FSM performance: 16.25% compared with Auto, 16.13% compared with One-hot, 8.03% compared with JEDI-based FSMs, and 3.14% compared with $P_E$-based counterparts. The gain grows with the category number. This follows from Table 15 containing the experimental results for categories 2–4. For these categories, there is the following gain: (1) 26.03% regarding Auto; (2) 27.12% regarding One-hot; (3) 16.32% regarding JEDI-based FSMs and (4) 4.52% regarding $P_E$-based FSMs.
Using the latency time, we can obtain the values of the maximum operating frequency. This characteristic for all benchmarks is shown in Table 16. As follows from Table 16, our approach produces faster LUT-based FSM circuits relative to the other investigated methods. The average gain ranges from 3.99% (compared with $P_E$-based FSMs) to 29.04% (compared with One-hot-based FSMs).
For category 0 (Table 17), our approach provides a minimal gain relative to Auto-based FSMs (0.05%) and One-hot-based FSMs (2.5%). At the same time, $P_C$-based FSMs are a bit slower than their counterparts based on either JEDI (0.85%) or extended state codes (0.01%). Such a loss is insignificant. Analysis of Table 18 and Table 19 shows that our approach gives a gain relative to all other design methods starting from category 1. For category 1 (Table 18), there is the following gain in FSM performance: 27.6% compared with Auto, 27.7% compared with One-hot, 22.46% compared with JEDI-based FSMs, and 3.88% compared with $P_E$-based counterparts. The gain grows with the category number. For categories 2–4 (Table 19), we have the following gain: (1) 43.61% regarding Auto; (2) 43.76% regarding One-hot; (3) 36.17% regarding JEDI-based FSMs and (4) 6.11% regarding $P_E$-based FSMs.
The main goal of our method was to reduce the number of LUTs (the chip area occupied by an FSM circuit) compared to $P_E$-based FSMs. The results of the experiments show that this goal has been achieved. In addition, our approach simultaneously allows an increase in the maximum operating frequency (equivalently, a decrease in the latency time). Because of this, our approach produces FSM circuits with the best values of area-time products. The corresponding values are shown in Table 20. Our approach provides the following average gain: (1) 83.10% regarding Auto; (2) 112.15% regarding One-hot; (3) 36.8% regarding JEDI and (4) 20.26% regarding $P_E$-based FSMs. Analysis of Table 21 and Table 23 shows that the gain obtained by our approach increases with the FSM category.
For category 0 (Table 21), our approach loses out to two of the other approaches: 4.54% relative to Auto-based FSMs and 6.68% relative to JEDI-based FSMs. However, for this category, our approach has a gain compared with One-hot-based (35.95%) and $P_E$-based FSMs (6.99%).
As follows from Table 22, our approach provides the following gain: (1) 59.86% regarding Auto; (2) 105.58% regarding One-hot; (3) 21.59% regarding JEDI; (4) 16.53% regarding $P_E$-based FSMs. As follows from Table 23, our approach provides the following gain: (1) 95.58% regarding Auto; (2) 118.97% regarding One-hot; (3) 44.08% regarding JEDI; (4) 22.22% regarding $P_E$-based FSMs.
So, the results of our experiments show that the proposed approach can be used instead of other models starting from simple FSMs (category 1). Our approach allows an improvement in LUT counts, maximum operating frequency (minimum latency time), and area-time products compared with other investigated design methods. We think that our approach has rather good potential and can be used in CAD systems targeting FPGA-based Mealy FSMs.

7. Conclusions

Very often, modern CPSs use FPGAs for implementing circuits of their digital parts. Modern FPGAs are very complicated devices having up to 7 billion transistors [15]. They are very efficient platforms for implementing various digital systems. As the complexity of the digital parts of CPSs increases, the gap between the very large number of system inputs and the very small number of LUT inputs widens. Modern LUTs have around six inputs. Using internal multiplexers allows for LUTs that have up to eight inputs [24]. However, this value is still rather small compared with the numbers of literals in SBFs representing FSM circuits. As a result, various FD-based design methods are applied to implement an FSM circuit [22]. However, the FD-based FSM circuits are multi-level. In addition, they have very complicated systems of spaghetti-type interconnections.
In the case of LUT-based design, the structural decomposition allows for significant improvement in the FSM basic characteristics. They are much better than the corresponding characteristics of equivalent FD-based circuits [22]. As follows from [17], the ESC-based FSMs have better performance than their counterparts based on twofold state assignment [5]. However, this gain is connected with increasing the LUT counts in ESC-based FSMs compared with FSM circuits obtained using the twofold state assignment. This is the main drawback of ESC-based methods [17].
This drawback may be eliminated by applying the composite state codes proposed in this article. Such a code is a concatenation of the class code and the code of a state as an element of this class. This approach leads to two-level FSM circuits which require fewer LUTs than their ESC-based counterparts. The gain in the number of LUTs is around 15.46%. At the same time, the CSC-based FSMs have slightly better performance than their ESC-based counterparts (around 3% relative to the latency time). Therefore, the proposed method is a good alternative to both the FSM design methods based on functional decomposition [22] and the method [17] based on extended state codes. Because of this, the approach can be used for optimizing the characteristics of sequential blocks used in the digital parts of cyber-physical systems.
We see two directions of future research. The first of them is the adaptation of the proposed method for optimizing the characteristics of LUT-based Moore FSMs. The second is related to the following. It is known that one of the most important areas of the development of modern cyber-physical systems is the protection of confidentiality of information [39]. This means that the opacity of the systems should be as high as possible [40]. This applies to the full extent to the sequential blocks of CPSs. The method proposed in our work is not aimed at increasing the level of security. Realizing the importance of this problem, we plan to develop an appropriate method to ensure the security of sequential blocks.

Author Contributions

Conceptualization, A.B., L.T. and K.K.; methodology, A.B., L.T. and K.K.; formal analysis, A.B., L.T. and K.K.; writing—original draft preparation, A.B., L.T. and K.K.; supervision, A.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available in the article.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
BOF: block of output functions
BPF: block of partial functions
BSV: block of state variables
CLB: configurable logic block
CPS: cyber-physical system
CSC: composite state code
DST: direct structure table
ESC: extended state code
FD: functional decomposition
FSM: finite state machine
FPGA: field-programmable gate array
IMF: input memory function
LUT: look-up table
MXO: multiplexer of outputs
MXSV: multiplexer of state variables
OHC: one-hot code
RSC: register of state codes
SBF: systems of Boolean functions
SCR: state code register
SD: structural decomposition
STT: state transition table

References

  1. Alur, R. Principles of Cyber-Physical Systems; MIT Press: Cambridge, MA, USA, 2015.
  2. Suh, S.C.; Tanik, U.J.; Carbone, J.N.; Eroglu, A. Applied Cyber-Physical Systems; Springer: New York, NY, USA, 2014.
  3. Marwedel, P. Embedded System Design: Embedded Systems Foundations of Cyber-Physical Systems, and the Internet of Things, 3rd ed.; Springer International Publishing: Berlin/Heidelberg, Germany, 2018.
  4. Wiśniewski, R.; Bazydło, G.; Szcześniak, P.; Wojnakowski, M. Petri net-based specification of cyber-physical systems oriented to control direct matrix converters with space vector modulation. IEEE Access 2019, 7, 23407–23420.
  5. Barkalov, A.; Titarenko, L.; Mielcarek, K.; Chmielewski, S. Logic Synthesis for FPGA-Based Control Units—Structural Decomposition in Logic Design; Lecture Notes in Electrical Engineering; Springer: Berlin/Heidelberg, Germany, 2020; Volume 636.
  6. Wiśniewski, R.; Benysek, G.; Gomes, L.; Kania, D.; Simos, T.; Zhou, M. IEEE Access Special Section: Cyber-Physical Systems. IEEE Access 2019, 7, 157688–157692.
  7. Gajski, D.D.; Abdi, S.; Gerstlauer, A.; Schirner, G. Embedded System Design: Modeling, Synthesis and Verification; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2009.
  8. Baranov, S. Logic Synthesis of Control Automata; Kluwer Academic Publishers: Dordrecht, The Netherlands, 1994.
  9. Micheli, G.D. Synthesis and Optimization of Digital Circuits; McGraw-Hill: Cambridge, MA, USA, 1994.
  10. Czerwinski, R.; Kania, D. Finite State Machine Logic Synthesis for Complex Programmable Logic Devices; Lecture Notes in Electrical Engineering; Springer: Berlin/Heidelberg, Germany, 2013; Volume 231.
  11. Barkalov, A.; Titarenko, L.; Krzywicki, K. Structural Decomposition in FSM Design: Roots, Evolution, Current State—A Review. Electronics 2021, 10, 1174.
  12. Zając, W.; Andrzejewski, G.; Krzywicki, K.; Królikowski, T. Finite State Machine Based Modelling of Discrete Control Algorithm in LAD Diagram Language with Use of New Generation Engineering Software. Procedia Comput. Sci. 2019, 159, 2560–2569.
  13. Sentovich, E.M.; Singh, K.J.; Lavagno, L.; Moon, C.; Murgai, R.; Saldanha, A.; Savoj, H.; Stephan, P.R.; Brayton, R.K.; Sangiovanni-Vincentelli, A. SIS: A System for Sequential Circuit Synthesis; University of California: Berkely, CA, USA, 1992.
  14. Grout, I. Digital Systems Design with FPGAs and CPLDs; Elsevier Science: Amsterdam, The Netherlands, 2011.
  15. Trimberger, S.M. Field-Programmable Gate Array Technology; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2012.
  16. Ruiz-Rosero, J.; Ramirez-Gonzalez, G.; Khanna, R. Field Programmable Gate Array Applications—A Scientometric Review. Computation 2019, 7, 63.
  17. Barkalov, A.; Titarenko, L.; Krzywicki, K.; Saburova, S. Improving Characteristics of LUT-Based Mealy FSMs with Twofold State Assignment. Electronics 2021, 10, 901.
  18. Xilinx FPGAs. Available online: https://www.xilinx.com/products/silicon-devices/fpga.html (accessed on 11 February 2022).
  19. Salauyou, V.; Ostapczuk, M. State Assignment of Finite-State Machines by Using the Values of Output Variables. In Theory and Applications of Dependable Computer Systems. DepCoS-RELCOMEX 2020. Advances in Intelligent Systems and Computing; Zamojski, W., Mazurkiewicz, J., Sugier, J., Walkowiak, T., Kacprzyk, J., Eds.; Springer: Cham, Switzerland, 2020; Volume 1173, pp. 543–553.
  20. Kubica, M.; Kania, D. Technology mapping oriented to adaptive logic modules. Bull. Pol. Acad. Sci. 2019, 67, 947–956.
  21. Kuon, I.; Tessier, R.; Rose, J. FPGA architecture: Survey and challenges—Found trends. Electr. Des. Autom. 2008, 2, 135–253.
  22. Scholl, C. Functional Decomposition with Application to FPGA Synthesis; Kluwer Academic Publishers: Boston, MA, USA, 2001.
  23. Kubica, M.; Kania, D. Decomposition of multi-level functions oriented to configurability of logic blocks. Bull. Pol. Acad. Sci. 2017, 67, 317–331.
  24. Chapman, K. Multiplexer Design Techniques for Datapath Performance with Minimized Routing Resources; Xilinx All Programmable; Xilinx, Inc.: San Jose, CA, USA, 2014; pp. 1–32.
  25. Altera. Cyclone IV Device Handbook. Available online: http://www.altera.com/literature/hb/cyclone-iv/cyclone4-handbook.pdf (accessed on 11 February 2022).
  26. Mishchenko, A.; Brayton, R.; Jiang, J.H.R.; Jang, S. Scalable don’t-care-based logic optimization and resynthesis. ACM Trans. Reconfigurable Technol. Syst. (TRETS) 2011, 4, 1–23.
  27. Sklarova, D.; Sklarov, V.A.; Sudnitson, A. Design of FPGA-Based Circuits Using Hierarchical Finite State Machines; TUT Press: Tallinn, Estonia, 2012.
  28. Sklyarov, V. Synthesis and implementation of RAM-based finite state machines in FPGAs. In International Workshop on Field Programmable Logic and Applications; Springer: Berlin/Heidelberg, Germany, 2000; pp. 718–727.
  29. Mishchenko, A.; Chattarejee, S.; Brayton, R. Improvements to technology mapping for LUT-based FPGAs. IEEE Trans. CAD 2006, 27, 240–253.
  30. Kubica, M.; Kania, D.; Kulisz, J. A technology mapping of fsms based on a graph of excitations and outputs. IEEE Access 2019, 7, 16123–16131.
  31. El-Maleh, A.H. A Probabilistic Tabu Search State Assignment Algorithm for Area and Power Optimization of Sequential Circuits. Arab. J. Sci. Eng. 2020, 45, 6273–6285.
  32. ABC System. Available online: https://people.eecs.berkeley.edu/alanmi/abc/ (accessed on 11 February 2022).
  33. Brayton, R.; Mishchenko, A. ABC: An Academic Industrial-Strength Verification Tool. In Computer Aided Verification; Touili, T., Cook, B., Jackson, P., Eds.; Springer: Berlin/Heidelberg, Germany, 2010; pp. 24–40.
  34. Vivado Design Suite User Guide: Synthesis; UG901 (v2019.1). Available online: https://www.xilinx.com/support/documentation/sw_manuals/xilinx2019_1/ug901-vivado-synthesis.pdf (accessed on 11 February 2022).
  35. Quartus Prime. Available online: https://www.intel.pl/content/www/pl/pl/software/programmable/quartus-prime/overview.html (accessed on 11 February 2022).
  36. Khatri, S.P.; Gulati, K. Advanced Techniques in Logic Synthesis, Optimizations and Applications; Springer: New York, NY, USA, 2011.
  37. McElvain, K. LGSynth93 Benchmark; Mentor Graphics: Wilsonville, OR, USA, 1993.
  38. VC709 Evaluation Board for the Virtex-7 FPGA User Guide; UG887 (v1.6); Xilinx, Inc.: San Jose, CA, USA, 2019.
  39. An, L.; Yang, G.H. Opacity enforcement for confidential robust control in linear cyber-physical systems. IEEE Trans. Autom. Control 2019, 65, 1234–1241.
  40. An, L.; Yang, G.H. Enhancement of opacity for distributed state estimation in cyber-physical systems. Automatica 2022, 136, 110087.
Figure 1. Structural diagram of P Mealy FSM.
Figure 2. Part of SLICEL of Virtex-7.
Figure 3. Structural diagram of LUT-based P Mealy FSM.
Figure 4. Structural diagram of $P_E$ Mealy FSM (Adapted from [17]).
Figure 5. Structural diagram of $P_C$ Mealy FSM.
Figure 6. Composite state codes of Mealy FSM $P_C(U_1)$.
Figure 7. Logic circuit of Mealy FSM $P_C(U_1)$.
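Table 1. State transition table of $U_1$.
$s_m$ | $s_T$ | $I_h$ | $O_h$ | $h$
$s_1$ | $s_2$ | $i_1$ | $o_1 o_2$ | 1
 | $s_3$ | $\bar{i}_1 i_2$ | $o_4$ | 2
 | $s_4$ | $\bar{i}_1 \bar{i}_2$ | $o_3$ | 3
$s_2$ | $s_3$ | $i_3$ | $o_1 o_5$ | 4
 | $s_5$ | $\bar{i}_3 i_4$ | $o_6$ | 5
 | $s_2$ | $\bar{i}_3 \bar{i}_4$ | $o_4 o_7$ | 6
$s_3$ | $s_4$ | $i_5 i_2$ | $o_2 o_4$ | 7
 | $s_5$ | $i_5 \bar{i}_2$ | $o_3$ | 8
 | $s_6$ | $\bar{i}_5$ | $o_2$ | 9
$s_4$ | $s_4$ | $i_4$ | $o_3 o_5$ | 10
 | $s_5$ | $\bar{i}_4$ | $o_6$ | 11
$s_5$ | $s_1$ | $i_3 i_6$ | $o_1 o_5$ | 12
 | $s_6$ | $i_3 \bar{i}_6$ | $o_4 o_7$ | 13
 | $s_7$ | $\bar{i}_3$ | $o_3$ | 14
$s_6$ | $s_7$ | 1 | $o_1 o_2 o_4$ | 15
$s_7$ | $s_1$ | $i_1$ | - | 16
 | $s_8$ | $\bar{i}_1 i_5$ | $o_3$ | 17
 | $s_7$ | $\bar{i}_1 \bar{i}_5$ | $o_1 o_2$ | 18
$s_8$ | $s_1$ | 1 | $o_6$ | 19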
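Table 2. Extended state codes for FSM $P_E(U_1)$.
Code fields: CS 1 = $T_1 T_2$, CS 2 = $T_3 T_4$, CS 3 = $T_5 T_6$.
$s_m$ | $T_1$ | $T_2$ | $T_3$ | $T_4$ | $T_5$ | $T_6$
$s_1$ | 0 | 1 | 0 | 0 | 0 | 0
$s_2$ | 0 | 0 | 0 | 1 | 0 | 0
$s_3$ | 1 | 0 | 0 | 0 | 0 | 0
$s_4$ | 0 | 0 | 1 | 0 | 0 | 0
$s_5$ | 0 | 0 | 1 | 1 | 0 | 0
$s_6$ | 0 | 0 | 0 | 0 | 0 | 1
$s_7$ | 1 | 1 | 0 | 0 | 0 | 0
$s_8$ | 0 | 0 | 0 | 0 | 1 | 0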
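Table 3. Direct structure table of $P_C(U_1)$.
$s_C$ | $CC(s_C)$ | $s_T$ | $CC(s_T)$ | $I_h$ | $O_h$ | $D_h$ | $h$
$s_1$ | 000 | $s_2$ | 100 | $i_1$ | $o_1 o_2$ | $D_1$ | 1
 | | $s_3$ | 001 | $\bar{i}_1 i_2$ | $o_4$ | $D_3$ | 2
 | | $s_4$ | 101 | $\bar{i}_1 \bar{i}_2$ | $o_3$ | $D_1 D_3$ | 3
$s_2$ | 100 | $s_3$ | 001 | $i_3$ | $o_1 o_5$ | $D_3$ | 4
 | | $s_5$ | 110 | $\bar{i}_3 i_4$ | $o_6$ | $D_1 D_2$ | 5
 | | $s_2$ | 100 | $\bar{i}_3 \bar{i}_4$ | $o_4 o_7$ | $D_1$ | 6
$s_3$ | 001 | $s_4$ | 101 | $i_5 i_2$ | $o_2 o_4$ | $D_1 D_3$ | 7
 | | $s_5$ | 110 | $i_5 \bar{i}_2$ | $o_3$ | $D_1 D_2$ | 8
 | | $s_6$ | 010 | $\bar{i}_5$ | $o_2$ | $D_2$ | 9
$s_4$ | 101 | $s_4$ | 101 | $i_4$ | $o_3 o_5$ | $D_1 D_3$ | 10
 | | $s_5$ | 110 | $\bar{i}_4$ | $o_6$ | $D_1 D_2$ | 11
$s_5$ | 110 | $s_1$ | 000 | $i_3 i_6$ | $o_1 o_5$ | - | 12
 | | $s_6$ | 010 | $i_3 \bar{i}_6$ | $o_4 o_7$ | $D_2$ | 13
 | | $s_7$ | 011 | $\bar{i}_3$ | $o_3$ | $D_2 D_3$ | 14
$s_6$ | 010 | $s_7$ | 011 | 1 | $o_1 o_2 o_4$ | $D_2 D_3$ | 15
$s_7$ | 011 | $s_1$ | 000 | $i_1$ | - | - | 16
 | | $s_8$ | 111 | $\bar{i}_1 i_5$ | $o_3$ | $D_1 D_2 D_3$ | 17
 | | $s_7$ | 011 | $\bar{i}_1 \bar{i}_5$ | $o_1 o_2$ | $D_2 D_3$ | 18
$s_8$ | 111 | $s_1$ | 000 | 1 | $o_6$ | - | 19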
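The $D_h$ column of Table 3 can be cross-checked against the composite codes: in every row, the listed input memory functions match the positions of the ones in $CC(s_T)$. The following minimal Python sketch illustrates this reading, assuming D flip-flops (so that $D_r = 1$ loads a 1 into $T_r$). The code table is copied from Table 3; the dictionary name `CC` and the helper `excitation` are illustrative and do not come from the article.

```python
# Composite state codes CC(s_m) = T1 T2 T3, copied from Table 3.
CC = {
    "s1": "000", "s2": "100", "s3": "001", "s4": "101",
    "s5": "110", "s6": "010", "s7": "011", "s8": "111",
}

def excitation(target_state):
    """List the input memory functions equal to 1 for a transition into target_state.

    Assumption: D flip-flops, so D_r = 1 exactly when bit r of the target code is 1.
    """
    code = CC[target_state]
    return [f"D{r}" for r, bit in enumerate(code, start=1) if bit == "1"]

print(excitation("s4"))  # ['D1', 'D3'], as in rows 3, 7 and 10 of Table 3
print(excitation("s1"))  # [], as in rows 12, 16 and 19 of Table 3
```

Under the same assumption, the helper can be applied to any target state to check the remaining rows.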
Table 4. Table of BPF1 of Mealy FSM $P_C(U_1)$.
$s_C$ | $C(s_C)$ | $s_T$ | $C(s_T)$ | $I_h$ | $O_h^1$ | $D_h^1$ | $h$
$s_1$ | 00 | $s_2$ | 100 | $i_1$ | $o_1^1 o_2^1$ | $D_1^1$ | 1
 | | $s_3$ | 001 | $\bar{i}_1 i_2$ | $o_4^1$ | $D_3^1$ | 2
 | | $s_4$ | 101 | $\bar{i}_1 \bar{i}_2$ | $o_3^1$ | $D_1^1 D_3^1$ | 3
$s_3$ | 01 | $s_4$ | 101 | $i_5 i_2$ | $o_2^1 o_4^1$ | $D_1^1 D_3^1$ | 4
 | | $s_5$ | 110 | $i_5 \bar{i}_2$ | $o_3^1$ | $D_1^1 D_2^1$ | 5
 | | $s_6$ | 010 | $\bar{i}_5$ | $o_2^1$ | $D_2^1$ | 6
$s_6$ | 10 | $s_7$ | 011 | 1 | $o_1^1 o_2^1 o_4^1$ | $D_2^1 D_3^1$ | 7
$s_7$ | 11 | $s_1$ | 000 | $i_1$ | - | - | 8
 | | $s_8$ | 111 | $\bar{i}_1 i_5$ | $o_3^1$ | $D_1^1 D_2^1 D_3^1$ | 9
 | | $s_7$ | 011 | $\bar{i}_1 \bar{i}_5$ | $o_1^1 o_2^1$ | $D_2^1 D_3^1$ | 10
Table 5. Table of BPF2 of Mealy FSM $P_C(U_1)$.
$s_C$ | $C(s_C)$ | $s_T$ | $C(s_T)$ | $I_h$ | $O_h^2$ | $D_h^2$ | $h$
$s_2$ | 00 | $s_3$ | 001 | $i_3$ | $o_1^2 o_5^2$ | $D_3^2$ | 1
 | | $s_5$ | 110 | $\bar{i}_3 i_4$ | $o_6^2$ | $D_1^2 D_2^2$ | 2
 | | $s_2$ | 100 | $\bar{i}_3 \bar{i}_4$ | $o_4^2 o_7^2$ | $D_1^2$ | 3
$s_4$ | 01 | $s_4$ | 101 | $i_4$ | $o_3^2 o_5^2$ | $D_1^2 D_3^2$ | 4
 | | $s_5$ | 110 | $\bar{i}_4$ | $o_6^2$ | $D_1^2 D_2^2$ | 5
$s_5$ | 10 | $s_1$ | 000 | $i_3 i_6$ | $o_1^2 o_5^2$ | - | 6
 | | $s_6$ | 010 | $i_3 \bar{i}_6$ | $o_4^2 o_7^2$ | $D_2^2$ | 7
 | | $s_7$ | 011 | $\bar{i}_3$ | $o_3^2$ | $D_2^2 D_3^2$ | 8
$s_8$ | 11 | $s_1$ | 000 | 1 | $o_6^2$ | - | 9
Table 6. Table of MXO for Mealy FSM $P_C(U_1)$.
$o_n$ | Block 1 | Block 2
$o_1$ | 1 | 1
$o_2$ | 1 | 0
$o_3$ | 1 | 1
$o_4$ | 1 | 1
$o_5$ | 0 | 1
$o_6$ | 0 | 1
$o_7$ | 0 | 1
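The Block columns of Table 6 agree with the output columns of Tables 4 and 5: an output has a 1 under a block exactly when it appears in that block's table. The short Python sketch below reproduces the seven rows from the two output sets. The sets are transcribed by hand from Tables 4 and 5, and the names `block_outputs` and `mxo_row` are hypothetical helpers, not part of the article's method.

```python
# Outputs that appear in the O_h columns of Table 4 (block 1) and Table 5 (block 2).
block_outputs = {
    1: {"o1", "o2", "o3", "o4"},
    2: {"o1", "o3", "o4", "o5", "o6", "o7"},
}

def mxo_row(output):
    """Return the 'Block 1 Block 2' membership bits for one FSM output."""
    return "".join("1" if output in block_outputs[k] else "0" for k in (1, 2))

for n in range(1, 8):
    print(f"o{n}", mxo_row(f"o{n}"))
```

Read this way, only $o_1$, $o_3$ and $o_4$ are produced by both blocks.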
Table 7. Table of MXSV for Mealy FSM $P_C(U_1)$.
$D_r$ | Block 1 | Block 2
$D_1$ | 1 | 1
$D_2$ | 1 | 1
$D_3$ | 1 | 1
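Table 8. Experimental results (LUT counts for all benchmarks).
Benchmark | Auto | One-Hot | JEDI | $P_E$ FSM | $P_C$ FSM | Category
bbara | 17 | 17 | 10 | 13 | 11 | 1
bbsse | 33 | 37 | 24 | 26 | 21 | 1
bbtas | 5 | 5 | 5 | 5 | 5 | 0
beecount | 19 | 19 | 14 | 14 | 12 | 1
cse | 40 | 66 | 36 | 34 | 31 | 1
dk14 | 16 | 27 | 10 | 12 | 11 | 1
dk15 | 15 | 16 | 12 | 9 | 7 | 1
dk16 | 15 | 34 | 12 | 11 | 9 | 1
dk17 | 5 | 12 | 5 | 5 | 5 | 0
dk27 | 3 | 5 | 4 | 4 | 4 | 0
dk512 | 10 | 10 | 9 | 9 | 9 | 0
donfile | 31 | 31 | 24 | 21 | 19 | 1
ex1 | 70 | 74 | 53 | 44 | 37 | 2
ex2 | 9 | 9 | 8 | 9 | 8 | 1
ex3 | 9 | 9 | 9 | 9 | 9 | 0
ex4 | 15 | 13 | 12 | 11 | 10 | 1
ex5 | 9 | 9 | 9 | 10 | 10 | 0
ex6 | 24 | 36 | 22 | 22 | 20 | 1
ex7 | 4 | 5 | 4 | 6 | 4 | 1
keyb | 43 | 61 | 40 | 38 | 36 | 1
kirkman | 42 | 58 | 39 | 37 | 31 | 2
lion | 2 | 5 | 2 | 4 | 4 | 0
lion9 | 6 | 11 | 5 | 6 | 5 | 0
mark1 | 23 | 23 | 20 | 20 | 18 | 1
mc | 4 | 7 | 4 | 6 | 4 | 0
modulo12 | 7 | 7 | 7 | 7 | 7 | 0
opus | 28 | 28 | 22 | 25 | 23 | 1
planet | 131 | 131 | 88 | 87 | 72 | 2
planet1 | 131 | 131 | 88 | 87 | 72 | 2
pma | 94 | 94 | 86 | 80 | 70 | 2
s1 | 65 | 99 | 61 | 61 | 51 | 2
s1488 | 124 | 131 | 108 | 96 | 83 | 2
s1494 | 126 | 132 | 110 | 94 | 82 | 2
s1a | 49 | 81 | 43 | 47 | 38 | 2
s208 | 12 | 31 | 10 | 11 | 9 | 2
s27 | 6 | 18 | 6 | 8 | 7 | 1
s386 | 26 | 39 | 22 | 22 | 20 | 1
s420 | 10 | 31 | 9 | 10 | 9 | 4
s510 | 48 | 48 | 32 | 31 | 28 | 4
s8 | 9 | 9 | 9 | 12 | 9 | 1
s820 | 88 | 82 | 68 | 59 | 54 | 4
s832 | 80 | 79 | 62 | 61 | 50 | 4
sand | 132 | 132 | 114 | 108 | 97 | 3
shiftreg | 2 | 6 | 2 | 6 | 4 | 0
sse | 33 | 37 | 30 | 29 | 27 | 1
styr | 93 | 120 | 81 | 79 | 65 | 2
tma | 45 | 39 | 39 | 36 | 31 | 2
Total | 1808 | 2104 | 1489 | 1441 | 1248 |
Percentage, % | 144.87 | 168.59 | 119.31 | 115.46 | 100.00 |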
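The Percentage row of Table 8 is consistent with dividing each column total by the $P_C$ FSM total. A minimal Python sketch of that arithmetic follows; the totals are copied from Table 8, while the function name `percentages` and the `baseline` parameter are illustrative choices.

```python
# Column totals copied from the "Total" row of Table 8.
totals = {"Auto": 1808, "One-Hot": 2104, "JEDI": 1489, "P_E FSM": 1441, "P_C FSM": 1248}

def percentages(totals, baseline="P_C FSM"):
    """Express every total as a percentage of the baseline column, rounded to 2 decimals."""
    base = totals[baseline]
    return {name: round(100.0 * value / base, 2) for name, value in totals.items()}

for name, pct in percentages(totals).items():
    print(f"{name}: {pct:.2f}")
```

The values above 100 for $P_E$ FSM and One-Hot (115.46 and 168.59) correspond to the 15.46 and 68.59 percent range quoted in the abstract.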
Table 9. Experimental results (LUT counts for benchmarks of category 0).
Benchmark | Auto | One-Hot | JEDI | $P_E$ FSM | $P_C$ FSM
bbtas | 5 | 5 | 5 | 5 | 5
dk17 | 5 | 12 | 5 | 5 | 5
dk27 | 3 | 5 | 4 | 4 | 4
dk512 | 10 | 10 | 9 | 9 | 9
ex3 | 9 | 9 | 9 | 9 | 9
ex5 | 9 | 9 | 9 | 10 | 10
lion | 2 | 5 | 2 | 4 | 4
lion9 | 6 | 11 | 5 | 6 | 5
mc | 4 | 7 | 4 | 6 | 4
modulo12 | 7 | 7 | 7 | 7 | 7
shiftreg | 2 | 6 | 2 | 6 | 4
Total | 62 | 86 | 61 | 71 | 66
Percentage, % | 93.94 | 130.30 | 92.42 | 107.58 | 100.00
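Table 10. Experimental results (LUT counts for benchmarks of category 1).
Benchmark | Auto | One-Hot | JEDI | $P_E$ FSM | $P_C$ FSM
bbara | 17 | 17 | 10 | 13 | 11
bbsse | 33 | 37 | 24 | 26 | 21
beecount | 19 | 19 | 14 | 14 | 12
cse | 40 | 66 | 36 | 34 | 31
dk14 | 16 | 27 | 10 | 12 | 11
dk15 | 15 | 16 | 12 | 9 | 7
dk16 | 15 | 34 | 12 | 11 | 9
donfile | 31 | 31 | 24 | 21 | 19
ex2 | 9 | 9 | 8 | 9 | 8
ex4 | 15 | 13 | 12 | 11 | 10
ex6 | 24 | 36 | 22 | 22 | 20
ex7 | 4 | 5 | 4 | 6 | 4
keyb | 43 | 61 | 40 | 38 | 36
mark1 | 23 | 23 | 20 | 20 | 18
opus | 28 | 28 | 22 | 25 | 23
s27 | 6 | 18 | 6 | 8 | 7
s386 | 26 | 39 | 22 | 22 | 20
s8 | 9 | 9 | 9 | 12 | 9
sse | 33 | 37 | 30 | 29 | 27
Total | 406 | 525 | 337 | 342 | 303
Percentage, % | 133.99 | 173.27 | 111.22 | 112.87 | 100.00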
Table 11. Experimental results (LUT counts for benchmarks of categories 2–4).
Benchmark | Auto | One-Hot | JEDI | $P_E$ FSM | $P_C$ FSM
ex1 | 70 | 74 | 53 | 44 | 37
kirkman | 42 | 58 | 39 | 37 | 31
planet | 131 | 131 | 88 | 87 | 72
planet1 | 131 | 131 | 88 | 87 | 72
pma | 94 | 94 | 86 | 80 | 70
s1 | 65 | 99 | 61 | 61 | 51
s1488 | 124 | 131 | 108 | 96 | 83
s1494 | 126 | 132 | 110 | 94 | 82
s1a | 49 | 81 | 43 | 47 | 38
s208 | 12 | 31 | 10 | 11 | 9
styr | 93 | 120 | 81 | 79 | 65
tma | 45 | 39 | 39 | 36 | 31
sand | 132 | 132 | 114 | 108 | 97
s420 | 10 | 31 | 9 | 10 | 9
s510 | 48 | 48 | 32 | 31 | 28
s820 | 88 | 82 | 68 | 59 | 54
s832 | 80 | 79 | 62 | 61 | 50
Total | 1340 | 1493 | 1091 | 1028 | 879
Percentage, % | 152.45 | 169.85 | 124.12 | 116.95 | 100.00
Table 12. Experimental results (the latency time for all benchmarks, nsec).
Benchmark | Auto | One-Hot | JEDI | $P_E$ FSM | $P_C$ FSM | Category
bbara | 5.171 | 5.171 | 4.712 | 4.690 | 4.583 | 1
bbsse | 6.367 | 5.913 | 5.484 | 4.731 | 4.607 | 1
bbtas | 4.898 | 4.898 | 4.852 | 4.991 | 4.989 | 0
beecount | 6.002 | 6.002 | 5.338 | 5.216 | 5.022 | 1
cse | 6.829 | 6.111 | 5.614 | 5.323 | 5.197 | 1
dk14 | 5.218 | 5.792 | 5.159 | 5.029 | 4.911 | 1
dk15 | 5.194 | 5.395 | 5.132 | 4.976 | 4.854 | 1
dk16 | 5.892 | 5.721 | 5.073 | 4.906 | 4.786 | 1
dk17 | 5.018 | 5.988 | 5.015 | 5.003 | 5.024 | 0
dk27 | 4.854 | 4.953 | 4.898 | 5.085 | 5.095 | 0
dk512 | 5.095 | 5.095 | 5.006 | 4.804 | 4.800 | 0
donfile | 5.434 | 5.435 | 4.910 | 4.803 | 4.670 | 1
ex1 | 6.625 | 7.155 | 5.654 | 5.353 | 5.235 | 2
ex2 | 5.036 | 5.036 | 4.997 | 4.960 | 4.774 | 1
ex3 | 5.132 | 5.132 | 5.108 | 4.972 | 4.968 | 0
ex4 | 5.526 | 5.627 | 5.186 | 5.068 | 4.972 | 1
ex5 | 5.548 | 5.548 | 5.520 | 5.494 | 5.490 | 0
ex6 | 5.897 | 6.105 | 5.663 | 5.309 | 5.071 | 1
ex7 | 4.999 | 4.979 | 4.985 | 4.789 | 4.714 | 1
keyb | 6.392 | 6.970 | 5.937 | 5.406 | 5.128 | 1
kirkman | 7.073 | 6.494 | 6.382 | 5.645 | 5.489 | 2
lion | 4.940 | 4.902 | 4.942 | 4.996 | 5.000 | 0
lion9 | 4.871 | 5.399 | 4.845 | 4.828 | 4.825 | 0
mark1 | 6.158 | 6.158 | 5.676 | 5.334 | 5.188 | 1
mc | 5.085 | 5.116 | 5.079 | 5.099 | 5.085 | 0
modulo12 | 4.831 | 4.831 | 4.828 | 4.805 | 4.801 | 0
opus | 6.017 | 6.017 | 5.608 | 5.453 | 5.174 | 1
planet | 7.535 | 7.535 | 6.364 | 5.830 | 5.657 | 2
planet1 | 7.535 | 7.535 | 6.364 | 5.830 | 5.657 | 2
pma | 6.841 | 6.841 | 5.888 | 5.561 | 5.430 | 2
s1 | 6.830 | 7.361 | 6.363 | 5.832 | 5.679 | 2
s1488 | 7.220 | 7.579 | 6.362 | 5.737 | 5.586 | 2
s1494 | 6.694 | 6.861 | 6.085 | 5.812 | 5.642 | 2
s1a | 6.520 | 5.669 | 5.911 | 5.419 | 5.286 | 2
s208 | 5.736 | 5.667 | 5.594 | 5.397 | 5.252 | 2
s27 | 5.032 | 5.222 | 5.022 | 4.889 | 4.807 | 1
s386 | 5.947 | 5.765 | 5.582 | 5.295 | 5.191 | 1
s420 | 8.781 | 8.587 | 8.529 | 7.501 | 7.035 | 4
s510 | 8.500 | 8.500 | 8.452 | 7.259 | 6.793 | 4
s8 | 5.555 | 5.588 | 5.518 | 5.164 | 4.950 | 1
s820 | 8.929 | 8.837 | 8.578 | 7.241 | 6.791 | 4
s832 | 8.642 | 8.832 | 8.419 | 7.031 | 6.523 | 4
sand | 8.623 | 8.623 | 7.885 | 7.085 | 6.303 | 3
shiftreg | 3.807 | 3.794 | 3.620 | 3.896 | 3.900 | 0
sse | 6.367 | 5.913 | 5.726 | 5.393 | 5.194 | 1
styr | 7.267 | 7.697 | 6.866 | 5.806 | 5.633 | 2
tma | 6.102 | 6.766 | 6.092 | 5.695 | 5.550 | 2
Total | 288.57 | 291.11 | 270.82 | 254.74 | 247.31 |
Percentage, % | 116.68 | 117.71 | 109.51 | 103.01 | 100.00 |
Table 13. Experimental results (the latency time for benchmarks of category 0, nsec).
Benchmark | Auto | One-Hot | JEDI | $P_E$ FSM | $P_C$ FSM
bbtas | 4.898 | 4.898 | 4.852 | 4.991 | 4.989
dk17 | 5.018 | 5.988 | 5.015 | 5.003 | 5.024
dk27 | 4.854 | 4.953 | 4.898 | 5.085 | 5.095
dk512 | 5.095 | 5.095 | 5.006 | 4.804 | 4.800
ex3 | 5.132 | 5.132 | 5.108 | 4.972 | 4.968
ex5 | 5.548 | 5.548 | 5.520 | 5.494 | 5.490
lion | 4.940 | 4.902 | 4.942 | 4.996 | 5.000
lion9 | 4.871 | 5.399 | 4.845 | 4.828 | 4.825
mc | 5.085 | 5.116 | 5.079 | 5.099 | 5.085
modulo12 | 4.831 | 4.831 | 4.828 | 4.805 | 4.801
shiftreg | 3.807 | 3.794 | 3.620 | 3.896 | 3.900
Total | 54.08 | 55.66 | 53.71 | 53.97 | 53.98
Percentage, % | 100.19 | 103.11 | 99.51 | 99.99 | 100.00
Table 14. Experimental results (the latency time for benchmarks of category 1, nsec).
Benchmark | Auto | One-Hot | JEDI | $P_E$ FSM | $P_C$ FSM
bbara | 5.171 | 5.171 | 4.712 | 4.690 | 4.583
bbsse | 6.367 | 5.913 | 5.484 | 4.731 | 4.607
beecount | 6.002 | 6.002 | 5.338 | 5.216 | 5.022
cse | 6.829 | 6.111 | 5.614 | 5.323 | 5.197
dk14 | 5.218 | 5.792 | 5.159 | 5.029 | 4.911
dk15 | 5.194 | 5.395 | 5.132 | 4.976 | 4.854
dk16 | 5.892 | 5.721 | 5.073 | 4.906 | 4.786
donfile | 5.434 | 5.435 | 4.910 | 4.803 | 4.670
ex2 | 5.036 | 5.036 | 4.997 | 4.960 | 4.774
ex4 | 5.526 | 5.627 | 5.186 | 5.068 | 4.972
ex6 | 5.897 | 6.105 | 5.663 | 5.309 | 5.071
ex7 | 4.999 | 4.979 | 4.985 | 4.789 | 4.714
keyb | 6.392 | 6.970 | 5.937 | 5.406 | 5.128
mark1 | 6.158 | 6.158 | 5.676 | 5.334 | 5.188
opus | 6.017 | 6.017 | 5.608 | 5.453 | 5.174
s27 | 5.032 | 5.222 | 5.022 | 4.889 | 4.807
s386 | 5.947 | 5.765 | 5.582 | 5.295 | 5.191
s8 | 5.555 | 5.588 | 5.518 | 5.164 | 4.950
sse | 6.367 | 5.913 | 5.726 | 5.393 | 5.194
Total | 109.03 | 108.92 | 101.32 | 96.74 | 93.79
Percentage, % | 116.25 | 116.13 | 108.03 | 103.14 | 100.00
Table 15. Experimental results (the latency time for benchmarks of categories 2–4, nsec).
Benchmark | Auto | One-Hot | JEDI | $P_E$ FSM | $P_C$ FSM
ex1 | 6.625 | 7.155 | 5.654 | 5.353 | 5.235
kirkman | 7.073 | 6.494 | 6.382 | 5.645 | 5.489
planet | 7.535 | 7.535 | 6.364 | 5.830 | 5.657
planet1 | 7.535 | 7.535 | 6.364 | 5.830 | 5.657
pma | 6.841 | 6.841 | 5.888 | 5.561 | 5.430
s1 | 6.830 | 7.361 | 6.363 | 5.832 | 5.679
s1488 | 7.220 | 7.579 | 6.362 | 5.737 | 5.586
s1494 | 6.694 | 6.861 | 6.085 | 5.812 | 5.642
s1a | 6.520 | 5.669 | 5.911 | 5.419 | 5.286
s208 | 5.736 | 5.667 | 5.594 | 5.397 | 5.252
styr | 7.267 | 7.697 | 6.866 | 5.806 | 5.633
tma | 6.102 | 6.766 | 6.092 | 5.695 | 5.550
sand | 8.623 | 8.623 | 7.885 | 7.085 | 6.303
s420 | 8.781 | 8.587 | 8.529 | 7.501 | 7.035
s510 | 8.500 | 8.500 | 8.452 | 7.259 | 6.793
s820 | 8.929 | 8.837 | 8.578 | 7.241 | 6.791
s832 | 8.642 | 8.832 | 8.419 | 7.031 | 6.523
Total | 125.45 | 126.54 | 115.79 | 104.03 | 99.54
Percentage, % | 126.03 | 127.12 | 116.32 | 104.52 | 100.00
Table 16. Experimental results (the maximum operating frequency for all benchmarks).
Benchmark | Auto | One-Hot | JEDI | $P_E$ FSM | $P_C$ FSM | Category
bbara | 193.39 | 193.39 | 212.21 | 252.44 | 262.22 | 1
bbsse | 157.06 | 169.12 | 182.34 | 238.38 | 248.07 | 1
bbtas | 204.16 | 204.16 | 206.12 | 200.38 | 200.45 | 0
beecount | 166.61 | 166.61 | 187.32 | 241.72 | 249.13 | 1
cse | 146.43 | 163.64 | 178.12 | 247.86 | 252.43 | 1
dk14 | 191.64 | 172.65 | 193.85 | 223.84 | 233.63 | 1
dk15 | 192.53 | 185.36 | 194.87 | 226.97 | 236.02 | 1
dk16 | 169.72 | 174.79 | 197.13 | 253.82 | 264.93 | 1
dk17 | 199.28 | 167.00 | 199.39 | 199.87 | 199.03 | 0
dk27 | 206.02 | 201.90 | 204.18 | 196.65 | 196.28 | 0
dk512 | 196.27 | 196.27 | 199.75 | 208.17 | 208.32 | 0
donfile | 184.03 | 184.00 | 203.65 | 248.19 | 258.11 | 1
ex1 | 150.94 | 139.76 | 176.87 | 276.82 | 291.01 | 2
ex2 | 198.57 | 198.57 | 200.14 | 241.61 | 251.45 | 1
ex3 | 194.86 | 194.86 | 195.76 | 201.12 | 201.29 | 0
ex4 | 180.96 | 177.71 | 192.83 | 237.31 | 247.14 | 1
ex5 | 180.25 | 180.25 | 181.16 | 182.01 | 182.14 | 0
ex6 | 169.57 | 163.80 | 176.59 | 238.35 | 248.21 | 1
ex7 | 200.04 | 200.84 | 200.60 | 240.83 | 250.14 | 1
keyb | 156.45 | 143.47 | 168.43 | 224.98 | 235.01 | 1
kirkman | 141.38 | 154.00 | 156.68 | 227.15 | 242.19 | 2
lion | 202.43 | 204.00 | 202.35 | 200.18 | 200.02 | 0
lion9 | 205.30 | 185.22 | 206.38 | 207.13 | 207.24 | 0
mark1 | 162.39 | 162.39 | 176.18 | 227.47 | 237.76 | 1
mc | 196.66 | 195.47 | 196.87 | 196.12 | 196.65 | 0
modulo12 | 207.00 | 207.00 | 207.13 | 208.12 | 208.31 | 0
opus | 166.20 | 166.20 | 178.32 | 213.40 | 223.26 | 1
planet | 132.71 | 132.71 | 187.14 | 251.54 | 266.78 | 2
planet1 | 132.71 | 132.71 | 187.14 | 251.54 | 266.78 | 2
pma | 146.18 | 146.18 | 169.83 | 239.83 | 254.17 | 2
s1 | 146.41 | 135.85 | 157.16 | 221.47 | 236.09 | 2
s1488 | 138.50 | 131.94 | 157.18 | 244.31 | 259.03 | 2
s1494 | 149.39 | 145.75 | 164.34 | 242.05 | 257.24 | 2
s1a | 153.37 | 176.40 | 169.17 | 214.53 | 229.17 | 2
s208 | 174.34 | 176.46 | 178.76 | 255.28 | 270.42 | 2
s27 | 198.73 | 191.50 | 199.13 | 238.53 | 248.04 | 1
s386 | 168.15 | 173.46 | 179.15 | 218.87 | 228.63 | 1
s420 | 173.88 | 176.46 | 177.25 | 263.32 | 283.14 | 4
s510 | 177.65 | 177.65 | 198.32 | 297.76 | 317.22 | 4
s8 | 180.02 | 178.95 | 181.23 | 213.65 | 223.04 | 1
s820 | 152.00 | 153.16 | 176.58 | 268.10 | 287.26 | 4
s832 | 145.71 | 153.23 | 173.78 | 274.22 | 293.31 | 4
sand | 115.97 | 115.97 | 126.82 | 221.14 | 239.65 | 3
shiftreg | 262.67 | 263.57 | 276.26 | 256.69 | 256.41 | 0
sse | 157.06 | 169.12 | 174.63 | 205.41 | 215.53 | 1
styr | 137.61 | 129.92 | 145.64 | 232.24 | 247.53 | 2
tma | 163.88 | 147.80 | 164.14 | 235.59 | 250.18 | 2
Total | 8127.08 | 8061.22 | 8718.87 | 10,906.96 | 11,360.06 |
Percentage, % | 71.54 | 70.96 | 76.75 | 96.01 | 100.00 |
Table 17. Experimental results (the maximum operating frequency for benchmarks of category 0).
Benchmark | Auto | One-Hot | JEDI | $P_E$ FSM | $P_C$ FSM
bbtas | 204.16 | 204.16 | 206.12 | 200.38 | 200.45
dk17 | 199.28 | 167.00 | 199.39 | 199.87 | 199.03
dk27 | 206.02 | 201.90 | 204.18 | 196.65 | 196.28
dk512 | 196.27 | 196.27 | 199.75 | 208.17 | 208.32
ex3 | 194.86 | 194.86 | 195.76 | 201.12 | 201.29
ex5 | 180.25 | 180.25 | 181.16 | 182.01 | 182.14
lion | 202.43 | 204.00 | 202.35 | 200.18 | 200.02
lion9 | 205.30 | 185.22 | 206.38 | 207.13 | 207.24
mc | 196.66 | 195.47 | 196.87 | 196.12 | 196.65
modulo12 | 207.00 | 207.00 | 207.13 | 208.12 | 208.31
shiftreg | 262.67 | 263.57 | 276.26 | 256.69 | 256.41
Total | 2254.90 | 2199.70 | 2275.35 | 2256.44 | 2256.14
Percentage, % | 99.95 | 97.50 | 100.85 | 100.01 | 100.00
Table 18. Experimental results (the maximum operating frequency for benchmarks of category 1).
Benchmark | Auto | One-Hot | JEDI | $P_E$ FSM | $P_C$ FSM
bbara | 193.39 | 193.39 | 212.21 | 252.44 | 262.22
bbsse | 157.06 | 169.12 | 182.34 | 238.38 | 248.07
beecount | 166.61 | 166.61 | 187.32 | 241.72 | 249.13
cse | 146.43 | 163.64 | 178.12 | 247.86 | 252.43
dk14 | 191.64 | 172.65 | 193.85 | 223.84 | 233.63
dk15 | 192.53 | 185.36 | 194.87 | 226.97 | 236.02
dk16 | 169.72 | 174.79 | 197.13 | 253.82 | 264.93
donfile | 184.03 | 184.00 | 203.65 | 248.19 | 258.11
ex2 | 198.57 | 198.57 | 200.14 | 241.61 | 251.45
ex4 | 180.96 | 177.71 | 192.83 | 237.31 | 247.14
ex6 | 169.57 | 163.80 | 176.59 | 238.35 | 248.21
ex7 | 200.04 | 200.84 | 200.60 | 240.83 | 250.14
keyb | 156.45 | 143.47 | 168.43 | 224.98 | 235.01
mark1 | 162.39 | 162.39 | 176.18 | 227.47 | 237.76
opus | 166.20 | 166.20 | 178.32 | 213.40 | 223.26
s27 | 198.73 | 191.50 | 199.13 | 238.53 | 248.04
s386 | 168.15 | 173.46 | 179.15 | 218.87 | 228.63
s8 | 180.02 | 178.95 | 181.23 | 213.65 | 223.04
sse | 157.06 | 169.12 | 174.63 | 205.41 | 215.53
Total | 3339.55 | 3335.57 | 3576.72 | 4433.63 | 4612.75
Percentage, % | 72.40 | 72.31 | 77.54 | 96.12 | 100.00
Table 19. Experimental results (the maximum operating frequency for benchmarks of categories 2–4).
Benchmark | Auto | One-Hot | JEDI | $P_E$ FSM | $P_C$ FSM
ex1 | 150.94 | 139.76 | 176.87 | 276.82 | 291.01
kirkman | 141.38 | 154.00 | 156.68 | 227.15 | 242.19
planet | 132.71 | 132.71 | 187.14 | 251.54 | 266.78
planet1 | 132.71 | 132.71 | 187.14 | 251.54 | 266.78
pma | 146.18 | 146.18 | 169.83 | 239.83 | 254.17
s1 | 146.41 | 135.85 | 157.16 | 221.47 | 236.09
s1488 | 138.50 | 131.94 | 157.18 | 244.31 | 259.03
s1494 | 149.39 | 145.75 | 164.34 | 242.05 | 257.24
s1a | 153.37 | 176.40 | 169.17 | 214.53 | 229.17
s208 | 174.34 | 176.46 | 178.76 | 255.28 | 270.42
styr | 137.61 | 129.92 | 145.64 | 232.24 | 247.53
tma | 163.88 | 147.80 | 164.14 | 235.59 | 250.18
sand | 115.97 | 115.97 | 126.82 | 221.14 | 239.65
s420 | 173.88 | 176.46 | 177.25 | 263.32 | 283.14
s510 | 177.65 | 177.65 | 198.32 | 297.76 | 317.22
s820 | 152.00 | 153.16 | 176.58 | 268.10 | 287.26
s832 | 145.71 | 153.23 | 173.78 | 274.22 | 293.31
Total | 2532.63 | 2525.95 | 2866.80 | 4216.89 | 4491.17
Percentage, % | 56.39 | 56.24 | 63.83 | 93.89 | 100.00
Table 20. Experimental results (the area-time products for all benchmarks).
Benchmark | Auto | One-Hot | JEDI | $P_E$ FSM | $P_C$ FSM | Category
bbara | 87.91 | 87.91 | 47.12 | 60.98 | 50.41 | 1
bbsse | 210.11 | 218.78 | 131.62 | 123.00 | 96.74 | 1
bbtas | 24.49 | 24.49 | 24.26 | 24.95 | 24.94 | 0
beecount | 114.04 | 114.04 | 74.74 | 73.02 | 60.26 | 1
cse | 273.17 | 403.32 | 202.11 | 180.99 | 161.10 | 1
dk14 | 83.49 | 156.39 | 51.59 | 60.35 | 54.02 | 1
dk15 | 77.91 | 86.32 | 61.58 | 44.78 | 33.98 | 1
dk16 | 88.38 | 194.52 | 60.87 | 53.97 | 43.08 | 1
dk17 | 25.09 | 71.86 | 25.08 | 25.02 | 25.12 | 0
dk27 | 14.56 | 24.76 | 19.59 | 20.34 | 20.38 | 0
dk512 | 50.95 | 50.95 | 45.06 | 43.23 | 43.20 | 0
donfile | 168.45 | 168.48 | 117.85 | 100.87 | 88.74 | 1
ex1 | 463.76 | 529.48 | 299.66 | 235.52 | 193.71 | 2
ex2 | 45.32 | 45.32 | 39.97 | 44.64 | 38.20 | 1
ex3 | 46.19 | 46.19 | 45.97 | 44.75 | 44.71 | 0
ex4 | 82.89 | 73.15 | 62.23 | 55.75 | 49.72 | 1
ex5 | 49.93 | 49.93 | 49.68 | 54.94 | 54.90 | 0
ex6 | 141.53 | 219.78 | 124.58 | 116.80 | 101.41 | 1
ex7 | 20.00 | 24.90 | 19.94 | 28.73 | 18.86 | 1
keyb | 274.85 | 425.18 | 237.49 | 205.43 | 184.61 | 1
kirkman | 297.07 | 376.62 | 248.91 | 208.86 | 170.15 | 2
lion | 9.88 | 24.51 | 9.88 | 19.98 | 20.00 | 0
lion9 | 29.23 | 59.39 | 24.23 | 28.97 | 24.13 | 0
mark1 | 141.63 | 141.63 | 113.52 | 106.68 | 93.38 | 1
mc | 20.34 | 35.81 | 20.32 | 30.59 | 20.34 | 0
modulo12 | 33.82 | 33.82 | 33.80 | 33.63 | 33.60 | 0
opus | 168.47 | 168.47 | 123.37 | 136.31 | 119.01 | 1
planet | 987.11 | 987.11 | 560.01 | 507.17 | 407.29 | 2
planet1 | 987.11 | 987.11 | 560.01 | 507.17 | 407.29 | 2
pma | 643.04 | 643.04 | 506.39 | 444.86 | 380.08 | 2
s1 | 443.96 | 728.74 | 388.14 | 355.75 | 289.62 | 2
s1488 | 895.31 | 992.88 | 687.11 | 550.74 | 463.61 | 2
s1494 | 843.43 | 905.66 | 669.34 | 546.35 | 462.65 | 2
s1a | 319.49 | 459.18 | 254.18 | 254.70 | 200.88 | 2
s208 | 68.83 | 175.68 | 55.94 | 59.37 | 47.26 | 2
s27 | 30.19 | 93.99 | 30.13 | 39.11 | 33.65 | 1
s386 | 154.62 | 224.84 | 122.80 | 116.48 | 103.83 | 1
s420 | 87.81 | 266.19 | 76.76 | 75.01 | 63.32 | 4
s510 | 407.99 | 407.99 | 270.45 | 225.03 | 190.19 | 4
s8 | 49.99 | 50.29 | 49.66 | 61.97 | 44.55 | 1
s820 | 785.71 | 724.64 | 583.29 | 427.23 | 366.70 | 4
s832 | 691.38 | 697.69 | 521.97 | 428.91 | 326.14 | 4
sand | 1138.23 | 1138.23 | 898.91 | 765.20 | 611.41 | 3
shiftreg | 7.61 | 22.76 | 7.24 | 23.37 | 15.60 | 0
sse | 210.11 | 218.78 | 171.79 | 156.41 | 140.24 | 1
styr | 675.82 | 923.65 | 556.17 | 458.66 | 366.14 | 2
tma | 274.59 | 263.87 | 237.60 | 205.02 | 172.05 | 2
Total | 12,745.82 | 14,768.32 | 9522.93 | 8371.63 | 6961.17 |
Percentage, % | 183.10 | 212.15 | 136.80 | 120.26 | 100.00 |
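The entries of Table 20 appear to be the LUT count from Table 8 multiplied by the latency from Table 12, in units of LUTs times nanoseconds. The Python sketch below checks this for a few $P_C$ FSM entries. The sample values are copied from Tables 8, 12 and 20; the tolerance of 0.05 is an assumption that allows for the three-decimal rounding of the published latencies, and the function name `area_time` is illustrative.

```python
# (LUT count from Table 8, latency in ns from Table 12, area-time product from Table 20)
# for the P_C FSM column of four benchmarks.
samples = {
    "bbara": (11, 4.583, 50.41),
    "cse": (31, 5.197, 161.10),
    "keyb": (36, 5.128, 184.61),
    "sand": (97, 6.303, 611.41),
}

def area_time(lut_count, latency_ns):
    """Area-time product: number of LUTs multiplied by latency in nanoseconds."""
    return lut_count * latency_ns

TOLERANCE = 0.05  # assumed slack for rounding in the published tables

for name, (luts, latency, published) in samples.items():
    computed = area_time(luts, latency)
    status = "ok" if abs(computed - published) < TOLERANCE else "mismatch"
    print(f"{name}: computed {computed:.2f}, published {published:.2f} ({status})")
```

Because the product combines both metrics, a circuit that is both smaller and faster improves it more than either table alone would suggest.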
Table 21. Experimental results (the area-time products for benchmarks of category 0).
Benchmark | Auto | One-Hot | JEDI | $P_E$ FSM | $P_C$ FSM
bbtas | 24.49 | 24.49 | 24.26 | 24.95 | 24.94
dk17 | 25.09 | 71.86 | 25.08 | 25.02 | 25.12
dk27 | 14.56 | 24.76 | 19.59 | 20.34 | 20.38
dk512 | 50.95 | 50.95 | 45.06 | 43.23 | 43.20
ex3 | 46.19 | 46.19 | 45.97 | 44.75 | 44.71
ex5 | 49.93 | 49.93 | 49.68 | 54.94 | 54.90
lion | 9.88 | 24.51 | 9.88 | 19.98 | 20.00
lion9 | 29.23 | 59.39 | 24.23 | 28.97 | 24.13
mc | 20.34 | 35.81 | 20.32 | 30.59 | 20.34
modulo12 | 33.82 | 33.82 | 33.80 | 33.63 | 33.60
shiftreg | 7.61 | 22.76 | 7.24 | 23.37 | 15.60
Total | 312.09 | 444.47 | 305.10 | 349.79 | 326.93
Percentage, % | 95.46 | 135.95 | 93.32 | 106.99 | 100.00
Table 22. Experimental results (the area-time products for benchmarks of category 1).
Benchmark | Auto | One-Hot | JEDI | $P_E$ FSM | $P_C$ FSM
bbara | 87.91 | 87.91 | 47.12 | 60.98 | 50.41
bbsse | 210.11 | 218.78 | 131.62 | 123.00 | 96.74
beecount | 114.04 | 114.04 | 74.74 | 73.02 | 60.26
cse | 273.17 | 403.32 | 202.11 | 180.99 | 161.10
dk14 | 83.49 | 156.39 | 51.59 | 60.35 | 54.02
dk15 | 77.91 | 86.32 | 61.58 | 44.78 | 33.98
dk16 | 88.38 | 194.52 | 60.87 | 53.97 | 43.08
donfile | 168.45 | 168.48 | 117.85 | 100.87 | 88.74
ex2 | 45.32 | 45.32 | 39.97 | 44.64 | 38.20
ex4 | 82.89 | 73.15 | 62.23 | 55.75 | 49.72
ex6 | 141.53 | 219.78 | 124.58 | 116.80 | 101.41
ex7 | 20.00 | 24.90 | 19.94 | 28.73 | 18.86
keyb | 274.85 | 425.18 | 237.49 | 205.43 | 184.61
mark1 | 141.63 | 141.63 | 113.52 | 106.68 | 93.38
opus | 168.47 | 168.47 | 123.37 | 136.31 | 119.01
s27 | 30.19 | 93.99 | 30.13 | 39.11 | 33.65
s386 | 154.62 | 224.84 | 122.80 | 116.48 | 103.83
s8 | 49.99 | 50.29 | 49.66 | 61.97 | 44.55
sse | 210.11 | 218.78 | 171.79 | 156.41 | 140.24
Total | 2423.08 | 3116.09 | 1842.98 | 1766.28 | 1515.76
Percentage, % | 159.86 | 205.58 | 121.59 | 116.53 | 100.00
Table 23. Experimental results (the area-time products for benchmarks of categories 2–4).
Benchmark | Auto | One-Hot | JEDI | $P_E$ FSM | $P_C$ FSM
ex1 | 463.76 | 529.48 | 299.66 | 235.52 | 193.71
kirkman | 297.07 | 376.62 | 248.91 | 208.86 | 170.15
planet | 987.11 | 987.11 | 560.01 | 507.17 | 407.29
planet1 | 987.11 | 987.11 | 560.01 | 507.17 | 407.29
pma | 643.04 | 643.04 | 506.39 | 444.86 | 380.08
s1 | 443.96 | 728.74 | 388.14 | 355.75 | 289.62
s1488 | 895.31 | 992.88 | 687.11 | 550.74 | 463.61
s1494 | 843.43 | 905.66 | 669.34 | 546.35 | 462.65
s1a | 319.49 | 459.18 | 254.18 | 254.70 | 200.88
s208 | 68.83 | 175.68 | 55.94 | 59.37 | 47.26
styr | 675.82 | 923.65 | 556.17 | 458.66 | 366.14
tma | 274.59 | 263.87 | 237.60 | 205.02 | 172.05
sand | 1138.23 | 1138.23 | 898.91 | 765.20 | 611.41
s420 | 87.81 | 266.19 | 76.76 | 75.01 | 63.32
s510 | 407.99 | 407.99 | 270.45 | 225.03 | 190.19
s820 | 785.71 | 724.64 | 583.29 | 427.23 | 366.70
s832 | 691.38 | 697.69 | 521.97 | 428.91 | 326.14
Total | 10,010.66 | 11,207.77 | 7374.85 | 6255.56 | 5118.48
Percentage, % | 195.58 | 218.97 | 144.08 | 122.22 | 100.00
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

MDPI and ACS Style

Barkalov, A.; Titarenko, L.; Krzywicki, K. Improving Characteristics of LUT-Based Sequential Blocks for Cyber-Physical Systems. Energies 2022, 15, 2636. https://doi.org/10.3390/en15072636