Improving Characteristics of LUT-Based Sequential Blocks for Cyber-Physical Systems

Barkalov, Alexander; Titarenko, Larysa; Krzywicki, Kazimierz

doi:10.3390/en15072636


by Alexander Barkalov ^1,2,*, Larysa Titarenko ^1,3 and Kazimierz Krzywicki ^4,*

¹ Institute of Metrology, Electronics and Computer Science, University of Zielona Gora, Ul. Licealna 9, 65-417 Zielona Gora, Poland

² Department of Computer Science and Information Technology, Vasyl Stus’ Donetsk National University (in Vinnytsia), 600-Richya Str. 21, 21021 Vinnytsia, Ukraine

³ Department of Infocommunication Engineering, Faculty of Infocommunications, Kharkiv National University of Radio Electronics, Nauky Avenue 14, 61166 Kharkiv, Ukraine

⁴ Department of Technology, The Jacob of Paradies University, Ul. Teatralna 25, 66-400 Gorzow Wielkopolski, Poland

^* Authors to whom correspondence should be addressed.

Energies 2022, 15(7), 2636; https://doi.org/10.3390/en15072636

Submission received: 11 February 2022 / Revised: 25 March 2022 / Accepted: 1 April 2022 / Published: 4 April 2022

(This article belongs to the Special Issue Control Part of Cyber-Physical Systems: Modeling, Design and Analysis)


Abstract: A method is proposed for optimizing the circuits of sequential devices used in cyber-physical systems (CPSs) implemented with field programmable gate arrays (FPGAs). Hardware optimization is an important problem in implementing the digital parts of CPSs. In this article, we discuss the case in which Mealy finite state machines (FSMs) represent the behaviour of sequential devices. The proposed method targets FSM circuits implemented with look-up table (LUT) elements of FPGA chips and aims to reduce the LUT count of Mealy FSMs with extended state codes. The method is based on finding a partition of the set of internal states into classes of compatible states. To reduce the LUT count, we propose a special kind of state codes named composite state codes. A composite code includes two parts: the binary code of a state as an element of some partition class, and the code of the corresponding partition class. Using composite state codes allows us to obtain FPGA-based FSM circuits with exactly two levels of logic. If some conditions hold, then any FSM function from the first level is implemented by a single LUT. The second level is represented as a network of multiplexers. Each multiplexer generates either an FSM output or an input memory function. An example of synthesis is shown. The experiments show that the proposed approach reduces hardware compared with two methods from Vivado, JEDI-based FSMs, and extended state assignment. Depending on the complexity of an FSM, the LUT count is reduced on average by 15.46 to 68.59 percent. The advantages of the proposed approach grow with FSM complexity. An additional positive effect of the proposed method is a decrease in the latency time.

Keywords: Mealy FSM; FPGA; LUT count; synthesis; extended state codes; composite state codes; cyber-physical systems

1. Introduction

Our time is characterised by the wide application of various cyber-physical systems (CPSs) in many areas of human activity [1,2,3,4,5,6]. A typical CPS includes a digital part interacting with physical objects [7]. Very often, various sequential blocks can be found in the digital parts of CPSs [3,7]. To improve the overall quality of a CPS digital part, it is necessary to optimize the characteristics of its sequential blocks. In the current paper, the model of a Mealy finite state machine (FSM) [8] represents the behaviour of these blocks.

The model of Mealy FSM [9] is one of the basic models used in designing the circuits of sequential devices [9,10]. As a result, there are many methods for synthesizing Mealy FSM logic circuits [11,12]. One of the main goals of these methods is to reach optimal values of the basic characteristics of the resulting FSM circuits [10,13]. These characteristics are: (1) the hardware amount (in the case of VLSI, the chip area occupied by a circuit), (2) the performance, and (3) the power consumption. As a rule, it is not possible to achieve a simultaneous optimum for these three characteristics. For example, a decrease in the occupied chip area is often associated with an increase in the number of circuit levels, which leads to a decrease in performance [9,11]. Many studies show that the chip area occupied by an FSM circuit has a decisive influence on both the latency time and the power consumption [14]. At the same time, it is important that reducing the area increases the delay of the circuit as little as possible. In this paper, we propose such a method, focused on implementing FSM circuits with the resources of FPGAs [5,15,16]. The proposed method develops ideas related to the use of extended state codes [17] and twofold state assignment [5]. The proposed approach belongs to the methods of structural decomposition [17].

Nowadays, many digital systems are implemented using FPGA chips [5]. As follows from the analysis of the VLSI market [16], the largest manufacturer of FPGA chips is Xilinx [18]. For this reason, we focus our current research on Xilinx solutions. An FSM circuit is represented as a composition of look-up table (LUT) elements, programmable flip-flops, inter-slice multiplexers, programmable interconnects, a synchronization tree, and programmable input-outputs.

Our current article is devoted to improving the LUT count of two-level LUT-based Mealy FSM circuits based on extended state codes (ESCs) [17]. The main shortcoming of ESC-based FSMs is a significant increase in the number of flip-flops compared to the minimum possible number. This disadvantage has two negative consequences. First, it increases the number of outputs of the synchronization tree connected with the state code register (SCR). Second, an increase in the state code length (number of bits) complicates the interconnect system. Both factors increase the power consumption of FSM circuits. This is why it is so important to reduce the number of flip-flops in the SCR (without increasing the number of LUT levels of the resulting FSM circuit). Eliminating this shortcoming is the main motivation of our current research. Therefore, the problem under consideration is formulated as follows: to develop a method for implementing circuits of LUT-based Mealy FSMs that simultaneously reduces the number of LUTs and flip-flops in a two-level FSM with extended state codes.

The main contributions of this paper are the following:

A new method for representing FSM state codes is proposed. The proposed composite state codes (CSCs) consist of class codes and codes of class elements.
The proposed method yields FPGA-based circuits with fewer LUTs than circuits of equivalent Mealy FSMs implemented using known basic state encoding approaches (maximum binary, one-hot, JEDI), as well as extended state codes. A positive side effect of the proposed method is a slight improvement in the temporal characteristics of the obtained FSM circuits in relation to their counterparts based on other state assignment approaches.
The gain from applying the proposed method increases as the number of FSM inputs and states increases.

The novelty of our article lies in the development of a design method aimed at reducing the length of state codes for two-level LUT-based Mealy FSMs. The method is based on the composite state codes proposed in this paper. Reducing the length of state codes decreases the number of flip-flops in FSM state registers compared to equivalent FSMs with extended state codes. As a result, the number of input memory functions is also reduced, which in turn reduces the LUT counts of the resulting circuits.

The biggest challenge in reducing the LUT counts of FPGA-based FSM circuits is to do so with a minimum decrease in FSM performance. We solved this problem by reducing the number of state variables while keeping the same number of logic levels as in optimized ESC-based FSMs.

The rest of the article is organized as follows. Section 2 presents the basic information about LUT-based Mealy FSMs. Section 3 analyzes works related to FSM design. The main idea of the proposed method is presented in Section 4. An example of synthesis for a CSC-based FSM is shown in Section 5. Section 6 describes and analyzes the experimental results. A brief summary of the results is given in Section 7.

2. Background Information

To design a Mealy FSM logic circuit, it is necessary to create systems of Boolean functions (SBFs) representing the circuit [8,9]. These SBFs show the dependences of FSM outputs and input memory functions (IMFs) on FSM inputs and state variables. The FSM outputs form a set $O = \{o_1, \dots, o_N\}$. The arguments of these SBFs are the FSM inputs from a set $I = \{i_1, \dots, i_L\}$ and the state variables from a set $T = \{T_1, \dots, T_R\}$. The state variables encode internal states from a set $S = \{s_1, \dots, s_M\}$. In this article, we use maximum binary state encoding [9], in which the number of state variables $T_r \in T$ is determined as

$$R = \lceil \log_2 M \rceil. \tag{1}$$

The state codes $K(s_m)$ are kept in the state code register (SCR). As a rule, the register has informational inputs of D type [19,20]. To load state codes into the SCR, the input memory functions are used. They form a set $D = \{D_1, \dots, D_R\}$.

Two systems of functions represent the logic circuits of so-called P Mealy FSMs (Figure 1):

$$D = D(T, I); \tag{2}$$

$$O = O(T, I). \tag{3}$$

In Figure 1, the block of functions is synthesized using the SBFs (2) and (3). The SCR consists of $R$ flip-flops and keeps the state codes $K(s_m)$. The pulse Reset loads the all-zeros code into the SCR. As a rule, this combination of state variables encodes the initial state $s_1 \in S$. The pulse Clock loads state codes into the SCR.

To get the systems (2) and (3), an FSM direct structure table (DST) is used [8]. The DST is constructed using either a state transition table (STT) [9] or a state transition graph [5,9]. In our paper, we use an STT as a tool for representing a Mealy FSM.

There are five columns in an STT [9]: a current state $s_m$; a next state $s_T$; an input signal $I_h$, which is a conjunction of inputs (or their complements) determining the transition from $s_m$ into $s_T$; a collection of outputs $O_h$ produced during the transition from $s_m$ into $s_T$; and $h$, the number of the interstate transition ($h \in \{1, \dots, H\}$).
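The five STT columns map naturally onto a small record type. The following is a minimal Python sketch under stated assumptions: the class name `Transition`, the function `step`, and the three-row table are hypothetical illustrations and are not taken from Table 1 of the article.

```python
from dataclasses import dataclass


@dataclass
class Transition:
    current: str        # current state s_m
    nxt: str            # next state s_T
    condition: dict     # input signal I_h as {input name: required value}
    outputs: frozenset  # collection of outputs O_h
    h: int              # transition number


# Hypothetical STT with M = 2 states, L = 1 input, N = 2 outputs, H = 3 rows.
STT = [
    Transition("s1", "s2", {"i1": 1}, frozenset({"o1"}), 1),
    Transition("s1", "s1", {"i1": 0}, frozenset(), 2),
    Transition("s2", "s1", {}, frozenset({"o2"}), 3),  # unconditional transition
]


def step(stt, state, inputs):
    """Return (next state, outputs) for the first STT row whose condition holds."""
    for row in stt:
        if row.current == state and all(
            inputs[name] == value for name, value in row.condition.items()
        ):
            return row.nxt, row.outputs
    raise ValueError(f"no transition from {state} for inputs {inputs}")
```

Because the outputs are attached to transitions rather than to states, `step` returns them together with the next state, which is the defining property of the Mealy model.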

The creation of a DST begins with the state assignment. During this step, abstract states $s_m \in S$ are represented by their binary codes $K(s_m)$. Next, the set of input memory functions can be obtained. Compared to an STT, a DST includes three additional columns [8]: the code $K(s_m)$, the code $K(s_T)$, and a collection of IMFs $D_h \subseteq D$ that are equal to 1 to load the code $K(s_T)$ into the SCR.
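The step from an STT to a DST can be sketched in a few lines of Python. This is an illustration only: the helper names `binary_codes` and `build_dst`, the tuple layout, and the four-row table are assumptions, and the codes are assigned in listing order, which is just one of many possible state assignments.

```python
def ceil_log2(n):
    """Integer ceil(log2(n)) for n >= 1, as used in formula (1)."""
    return (n - 1).bit_length()


def binary_codes(states):
    """Assign maximum binary codes K(s_m) in listing order (an arbitrary choice)."""
    r = max(1, ceil_log2(len(states)))
    return {s: format(k, f"0{r}b") for k, s in enumerate(states)}


def build_dst(stt, codes):
    """Extend each STT row with K(s_m), K(s_T) and the set D_h of IMFs equal to 1."""
    dst = []
    for current, nxt, condition, outputs, h in stt:
        # D_r belongs to D_h when bit r of the next-state code is 1.
        d_h = {f"D{r}" for r, bit in enumerate(codes[nxt], start=1) if bit == "1"}
        dst.append((current, codes[current], nxt, codes[nxt], condition, outputs, d_h, h))
    return dst


# Hypothetical STT rows: (s_m, s_T, I_h, O_h, h); not Table 1 of the article.
stt = [
    ("s1", "s2", "i1", {"o1"}, 1),
    ("s1", "s3", "~i1", {"o2"}, 2),
    ("s2", "s3", "1", {"o1", "o2"}, 3),
    ("s3", "s1", "1", set(), 4),
]
codes = binary_codes(["s1", "s2", "s3"])
dst = build_dst(stt, codes)
```

With this assignment the initial state `s1` receives the all-zeros code, so the transition back to `s1` needs no IMF equal to 1.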

In this paper, we consider the case in which SBFs (2) and (3) are implemented using internal resources of FPGA chips. FPGAs produced by Xilinx contain many configurable logic blocks (CLBs) [18,21]. To get an FSM circuit, it is necessary to connect CLBs using internal programmable interconnections [15]. Three kinds of CLB elements are enough to build an FSM circuit: LUTs, which implement the logic; internal multiplexers, which expand the number of LUT inputs; and D flip-flops, which form the state register. Using the notation of [11], we denote as $NI_{LUT}$-LUT a LUT having $NI_{LUT}$ inputs and a single output. A Boolean function depending on up to $NI_{LUT}$ variables is represented by a single-LUT logic circuit. Various methods of functional decomposition (FD) [22,23] are used if some FSM functions depend on more than $NI_{LUT}$ arguments. It is known that FD-based FSMs are represented by multi-level circuits with complicated systems of “spaghetti-type” interconnections [22].

If all LUTs have the same number of inputs, then such a logic basis is inflexible. It means that in some cases, only a part of the available inputs will be used. At the same time, in other cases, the LUTs need to be combined to increase the number of inputs. To reduce the impact of interconnects on such a join, it is important to have internal fast interconnects between some LUTs. In Xilinx solutions, these CLBs are combined into slices [18]. For example, the SLICEL of Virtex-7 includes four 6-LUTs, eight flip-flops and 27 multiplexers [24]. A part of SLICEL is shown in Figure 2.

As follows from Figure 2, the resources of a single SLICEL allow the generation of up to 11 Boolean functions ($f_1$–$f_{11}$). The outputs $f_2, f_3, f_6, f_7$ are outputs of the corresponding 6-LUTs. They can be connected with the data inputs of flip-flops. Each 6-LUT can be organized as two 5-LUTs with shared inputs. These 5-LUTs generate functions $f_1$–$f_8$. The output of each 5-LUT can be connected with the informational input of a flip-flop. For this reason, there are 8 flip-flops in the circuit of SLICEL. The slice contains three internal multiplexers (MX1–MX3) which can be used for creating either two 7-LUTs or a single 8-LUT. These multiplexers have special control inputs (MC1–MC3), which can be used as additional inputs of LUTs having $NI_{LUT}$ equal to either 7 or 8. For example, using the input MC1, we can get a 7-LUT by combining LUT1 and LUT2. This 7-LUT implements function $f_9$. Using MC1–MC3 simultaneously allows us to combine all 6-LUTs into a single 8-LUT with output $f_{11}$.

In this paper, we use multiplexers to generate functions (2) and (3). We denote a multiplexer having $K$ data inputs as $K$-MX. Using a single 6-LUT, we can implement a circuit of a 4-MX, which has two control inputs and four data inputs. Using an internal multiplexer, we can organize an 8-MX with the help of two 6-LUTs. For example, using MC2, LUT3 and LUT4 gives us an 8-MX. Its circuit has only a slightly bigger delay than the circuit of a 4-MX [24]. This is possible due to the fast interconnections inside a slice. If a 16-MX has the control inputs $T_1$–$T_4$, then its circuit includes four 6-LUTs controlled by $T_3$–$T_4$. The inputs MC1 and MC2 are connected with the same control input $T_2$. The input MC3 is connected with the most significant control bit $T_1$. Obviously, to implement a 32-MX, we should use two slices and inter-slice interconnections. For this reason, a 32-MX is much slower than a 16-MX.
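The resource figures quoted above (one 6-LUT per 4-MX, four 6-LUTs per slice) can be turned into a rough estimator. The function name `mux_resources` and its parameters are hypothetical, and the estimate is a simplification: it counts only the 6-LUTs holding data inputs and ignores any extra logic needed to merge the outputs of several slices.

```python
from math import ceil


def mux_resources(k, data_per_lut=4, luts_per_slice=4):
    """Estimate (6-LUT count, slice count) for a K-MX.

    Assumes each 6-LUT realizes a 4-MX and each slice holds four 6-LUTs,
    as described for SLICEL; inter-slice combining logic is not counted.
    """
    luts = ceil(k / data_per_lut)
    slices = ceil(luts / luts_per_slice)
    return luts, slices
```

Under these assumptions a 16-MX still fits into one slice, whereas a 32-MX is the first size that spills into a second slice, which matches the remark about its lower speed.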

In LUT-based FSMs, the flip-flops of the SCR are distributed among the LUTs generating functions (2). As a result, the SCR is hidden inside the same slices where the input memory functions are generated. There are two blocks in a LUT-based P Mealy FSM (Figure 3).

The block of state variables (BSV) implements SBF (2). The state variables $T_r \in T$ are kept in the distributed SCR. To control the SCR operation, the clearing (Reset) and synchronization (Clock) signals enter the BSV. The outputs $o_n \in O$ are generated by the block of output functions (BOF). This block implements SBF (3).

In the best case, there is exactly a single level of LUTs in the circuit of a P FSM. This is possible if each function $\phi_i \in D \cup O$ depends on not more than $NI_{LUT}$ arguments. However, for modern LUTs, the following relation holds: $NI_{LUT} \leq 6$ [18,25]. To diminish the number of LUTs in an FSM logic circuit, it is necessary to increase the value of $NI_{LUT}$. However, such an increase leads to higher power consumption and latency time of a LUT. This is why the number of LUT inputs is kept so small. If the numbers of arguments in functions $\phi_i \in D \cup O$ exceed the number of LUT inputs, then both the number of LUTs and the number of logic levels in the FSM circuit increase. To improve these characteristics of LUT-based FSM circuits, it is necessary to improve the existing design methods.

In addition, it is necessary to optimize the system of connections between different slices of an FSM circuit. As shown in [26], the interconnections are responsible for around 70% of power consumption. They are also responsible for the major part of the FSM circuit latency time [23]. As shown in [11], improving the interconnections decreases the minimum latency time and power dissipation of FPGA-based circuits. Using either twofold state codes [5] or extended state codes [17] can improve the characteristics of the interconnection system.

3. Related Work

Various design methods have been proposed for designing LUT-based FSMs [5,19,20,23,27,28,29,30]. These methods should be applied if the number of arguments $NA(\phi_i)$ exceeds the value of $NI_{LUT}$ for at least one function $\phi_i \in D \cup O$ [5]. These methods can improve the LUT count, the maximum operating frequency, or the consumed power [31]. Sometimes, these methods try to find a solution in which more than a single characteristic is optimized. In this paper, we propose a method for improving the LUT count of FPGA-based Mealy FSMs.

To diminish the values of $NA(\phi_i)$, various methods of state assignment may be used [9,10]. The number of state variables ranges from the value (1), corresponding to maximum binary codes, to $M$, corresponding to the one-hot state assignment. These approaches are used in many academic and industrial CAD tools. The well-known academic systems are, for example, SIS [13] and ABC by Berkeley [32,33]. The manufacturers of FPGA chips also have their CAD packages. For example, Xilinx has the CAD Vivado [34], whereas Intel (Altera) has the package Quartus [35].

It is very difficult to choose the best universal method of state assignment. For example, FSM circuits based on the maximum binary and one-hot state codes (OHCs) [28] are compared in [36]. As follows from [36], using OHCs improves the basic characteristics of rather complex FSMs having more than 16 states. However, FSM characteristics depend strongly on the number of inputs, too [5]. The results reported in [28] show that OHCs worsen FSM characteristics if $L > 10$.

So, the characteristics of LUT-based circuits depend on the value of $L + R$. Depending on this parameter, either the maximum binary state codes or OHCs improve the LUT count and/or the latency time of a particular FSM. Therefore, both state assignment approaches should be checked for a given FSM, and we have investigated the efficiency of both in our research. Both methods are used in the CAD tool Vivado [34] by Xilinx [18], where they are named Auto and One-hot, respectively. We used Vivado because this system operates with the Virtex-7 chips used in our experiments. It is noted in [13] that one of the best deterministic methods of state assignment is JEDI [28]. For this reason, we compare JEDI-based FSMs with the FSMs proposed in the current paper.

In this paper, we propose a new design method that reduces the number of LUTs (LUT count) in circuits of FPGA-based Mealy FSMs. The proposed method belongs to the methods of structural decomposition (SD) [11]. SD-based FSM optimization is achieved by introducing intermediate logic levels between the arguments of SBFs (2) and (3) and the functions $\phi_i \in D \cup O$. As a result, the number of functions increases, but these functions have significantly fewer arguments than functions (2) and (3). These methods are analysed, for example, in [11].

The proposed method can be viewed as an evolution of the methods of twofold state assignment [5]. The methods of this group are based on constructing a partition $\Pi_S$ of the set $S$ into classes of compatible states. Each state $s_m \in S$ corresponds to a set $I(s_m)$, which consists of the inputs $i_l \in I$ that determine the next states for this state. Let the symbols $M_j$ and $R_j$ stand for the number of states in some set $S_j \subseteq S$ and the number of bits in the maximal binary codes of these states, respectively. In the case of twofold state assignment, an additional code is necessary to represent the relation $s_m \notin S_j$. Therefore, the value of $R_j$ is determined by the following expression:

$$R_j = \lceil \log_2 (M_j + 1) \rceil. \tag{4}$$

We name states $s_m \in S_j$ compatible if the following condition holds:

$$R_j + L_j \leq NI_{LUT}. \tag{5}$$

In (5), the symbol $L_j$ stands for the number of inputs determining transitions from the states $s_m \in S_j$. These inputs are combined into a set $I^j \subseteq I$.
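Conditions (4) and (5) translate directly into a compatibility test. The following Python sketch assumes the sets $I(s_m)$ are available as a dictionary; the function names and the example sets are hypothetical and do not reproduce Table 1.

```python
def ceil_log2(n):
    """Integer ceil(log2(n)) for n >= 1."""
    return (n - 1).bit_length()


def is_compatible(states, inputs_of, ni_lut):
    """Check condition (5), R_j + L_j <= NI_LUT, for a candidate set S_j."""
    r_j = ceil_log2(len(states) + 1)                      # formula (4)
    i_j = set().union(*(inputs_of[s] for s in states))    # the set I^j
    return r_j + len(i_j) <= ni_lut


# Hypothetical sets I(s_m) for four abstract states a, b, c, d.
inputs_of = {
    "a": {"i1", "i2"},
    "b": {"i2"},
    "c": {"i3"},
    "d": {"i4", "i5"},
}
```

For `ni_lut=5`, the set `{a, b, c}` needs $R_j = 2$ code bits and $L_j = 3$ inputs and is therefore compatible, whereas adding `d` raises both terms and violates (5).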

In [5], a method is proposed which creates the partition $\Pi_S = \{CS^1, \dots, CS^J\}$ with the minimum number of classes, $J$. Each class $CS^j \in \Pi_S$ includes only compatible states and determines the sets $I^j \subseteq I$ and $O^j \subseteq O$. The set $O^j \subseteq O$ consists of the outputs $o_n \in O$ generated during transitions from the states $s_m \in CS^j$.

In the case of twofold state assignment, two different codes determine a state $s_m \in S$ [5]. The code $K(s_m)$ determines this state as an element of the set of states $S$. The code $C(s_m)$ determines the state $s_m \in S$ as an element of some class of compatibility $CS^j \in \Pi_S$. Each class $CS^j \in \Pi_S$ determines a collection of partial functions generated by the corresponding block of LUTs. These partial functions are partial outputs $o_n \in O^j$ and partial IMFs $D_r \in D^j$. The set $D^j \subseteq D$ includes the IMFs generated during the transitions from the states $s_m \in CS^j$. We denote these partial functions as $o_n^j$ and $D_r^j$, where the superscript $j$ shows that the functions are determined by the class $CS^j \in \Pi_S$. Due to the validity of (5), each partial function is represented by a circuit consisting of a single LUT.

The main disadvantage of this approach is the need for a transformation of $K(s_m)$ into $C(s_m)$. The transformation is executed by a special block of code transformation. This block consumes some internal resources of an FPGA chip. Moreover, it adds a delay to the total cycle time of the resulting FSM circuit.

An improvement of this approach is proposed in [17]. The improvement consists in using only the codes $C(s_m)$. These codes were named the extended state codes (ESCs). Using ESCs eliminates the block of code transformation. We use the symbol $P_E$ to show that a Mealy FSM is based on the extended state codes. To encode states from all classes of compatibility, $R_E$ state variables are used:

$$R_E = R_1 + R_2 + \dots + R_J. \tag{6}$$

The value of $R_j$ ($j \in \{1, \dots, J\}$) is determined by (4). To encode states $s_m \in CS^j$, the state variables from a set $T^j \subset T$ are used. There are $R_E$ elements in the set $T$, where $T = T^1 \cup T^2 \cup \dots \cup T^J$.

The logic circuit of a $P_E$ FSM is represented by a structural diagram (Figure 4).

The logic circuits of $P_E$ FSMs have two logic levels. The first level is a block of partial functions (BPF) represented by $J$ blocks (Block1–BlockJ). These blocks implement the systems of partial functions:

$$D^j = D^j(T^j, I^j); \tag{7}$$

$$O^j = O^j(T^j, I^j). \tag{8}$$

The second level of logic is represented by BlockOR. This block includes the SCR having $R_E$ flip-flops. These flip-flops are controlled by the pulses Reset and Clock. In the best case, there are exactly $N + R_E$ LUTs in the circuit of BlockOR. This block implements disjunctions of the partial functions:

$$o_n = o_n^1 \lor o_n^2 \lor \dots \lor o_n^J \quad (n \in \{1, \dots, N\}); \tag{9}$$

$$D_r = D_r^1 \lor D_r^2 \lor \dots \lor D_r^J \quad (r \in \{1, \dots, R_E\}). \tag{10}$$

In (9) and (10), the superscript is the number of the block generating the particular partial function. The functions (10) are inputs of the flip-flops. The state variables are outputs of these flip-flops.

Obviously, each class $CS^j \in \Pi_S$ determines three sets: $T^j \subseteq T$, $O^j \subseteq O$ and $D^j \subseteq D$, where there are $R_E$ elements in the set $D$. The first two sets are already defined. The set $D^j \subseteq D$ includes the input memory functions generated during transitions from the states $s_m \in CS^j$.

Consider a Mealy FSM $U_1$ represented by its STT (Table 1). The following characteristics of $U_1$ follow from Table 1: the number of states $M = 8$, the number of inputs $L = 6$, the number of outputs $N = 7$, and the number of transitions $H = 19$. We write $P_E(U_k)$ to denote that a Mealy FSM $U_k$ is implemented using the model $P_E$. To synthesize the FSM $P_E(U_1)$, it is necessary to find a partition $\Pi_S$. The number of classes, $J$, depends on the value of $NI_{LUT}$. We discuss the case in which the logic circuit of $P_E(U_1)$ is synthesized using LUTs with $NI_{LUT} = 5$ inputs.

To find the partition $\Pi_S$ with the minimum number of classes of compatible states, we can use the method proposed in [5]. Using this method gives the partition $\Pi_S = \{CS^1, CS^2, CS^3\}$, where $CS^1 = \{s_1, s_3, s_7\}$, $CS^2 = \{s_2, s_4, s_5\}$, and $CS^3 = \{s_6, s_8\}$. This gives the values $J = 3$, $M_1 = M_2 = 3$ and $M_3 = 2$.

Using (4) gives $R_1 = R_2 = R_3 = 2$. This determines the sets $T^1 = \{T_1, T_2\}$, $T^2 = \{T_3, T_4\}$, $T^3 = \{T_5, T_6\}$, and $T = \{T_1, \dots, T_6\}$ with $R_E = 6$. Obviously, using (1) gives the minimum possible number of state variables $R = 3$.
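These values can be reproduced by evaluating formulas (1), (4) and (6) on the class sizes of the partition. The snippet below is only a numerical check; the variable names are ours, and `ceil_log2` is a small helper introduced for the illustration.

```python
def ceil_log2(n):
    """Integer ceil(log2(n)) for n >= 1."""
    return (n - 1).bit_length()


# Partition of the states of U_1 for NI_LUT = 5, as given in the text.
partition_s = {
    "CS1": ["s1", "s3", "s7"],
    "CS2": ["s2", "s4", "s5"],
    "CS3": ["s6", "s8"],
}

# Formula (4): one extra code is reserved for "state is not in this class".
r_j = {name: ceil_log2(len(members) + 1) for name, members in partition_s.items()}
r_e = sum(r_j.values())                                   # formula (6)
r_min = ceil_log2(sum(len(m) for m in partition_s.values()))  # formula (1), M = 8
```

The extended codes therefore use twice as many state variables as the maximum binary encoding for this FSM.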

If a state $s_m$ belongs to a class $CS^j \in \Pi_S$, then only state variables $T_r \in T^j$ may differ from zero in the extended state code $C(s_m)$. At the same time, if $T_r \notin T^j$, then $T_r = 0$ ($r \in \{1, \dots, R_E\}$). One of the possible outcomes of such a state assignment is represented by Table 2. For example, the following ESCs can be found from Table 2: $C(s_1) = 010000$, $C(s_2) = 000100$, $C(s_6) = 000001$, and so on.
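One way to generate codes with this property is to give every class its own bit field and to number the class members from 1, keeping the all-zeros field for "not in this class". The Python sketch below follows that idea. The numbering order inside each class is an assumption: it reproduces the three codes quoted above, but the remaining codes it produces are not claimed to match Table 2.

```python
def ceil_log2(n):
    """Integer ceil(log2(n)) for n >= 1."""
    return (n - 1).bit_length()


def extended_state_codes(classes):
    """Assign extended state codes: nonzero bits only inside the class's own field."""
    widths = [ceil_log2(len(members) + 1) for members in classes]  # formula (4)
    total = sum(widths)                                            # formula (6)
    codes = {}
    offset = 0
    for members, width in zip(classes, widths):
        # Field value 0 is reserved for states outside the class.
        for k, state in enumerate(members, start=1):
            field = format(k, f"0{width}b")
            codes[state] = "0" * offset + field + "0" * (total - offset - width)
        offset += width
    return codes


esc = extended_state_codes(
    [["s1", "s3", "s7"], ["s2", "s4", "s5"], ["s6", "s8"]]
)
```

Because every code is zero outside its own field, a partial function of one class evaluates independently of the fields belonging to the other classes.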

As follows from experiments [17], this approach allows an increase in performance up to 15.9% compared with equivalent FSMs based on the twofold state assignment. The growth of operating frequency is accompanied by a slight growth in the LUT count (up to 7.7%).

However, the approach of [17] has a serious drawback: the number of state variables can significantly exceed the minimum possible number determined by (1). This increases the number of flip-flops in the SCR. In turn, this increases the number of buffers of the synchronization tree required by a $P_E$ FSM logic circuit compared with circuits of equivalent FSMs based on the twofold state assignment. In addition, the number of interconnections is increased.

Now we can sum up and perform a qualitative analysis of the discussed issues. The well-known state assignment methods (the maximum binary codes, one-hot codes, JEDI) do not guarantee a decrease in the number of arguments of all Boolean functions representing an FSM logic circuit. The greater the difference between the total number of FSM inputs and state variables, on the one hand, and the number of LUT inputs, on the other, the higher the probability that functional decomposition of SBFs (2) and (3) must be applied. In this case, it is expedient to use methods of structural decomposition, which make it possible to obtain FSM circuits with a guaranteed number of logic levels. In addition, these methods get rid of the “spaghetti-type” interconnection system inherent in LUT-based FSM circuits based on functional decomposition. One of the best structural decomposition methods is based on the use of extended state codes. However, this method is associated with a significant increase in the number of state variables in relation to the minimum value determined by (1). If this shortcoming is eliminated and the main advantages of the extended state codes are preserved, then it is possible to improve the basic characteristics of the FSM circuits (the LUT counts and performance) in comparison with their counterparts based on extended state codes.

In our current paper we propose an approach which allows an improvement of LUT count for circuits of Mealy FSMs based on the partition of the set S by classes of compatible states.

4. Main Idea of the Proposed Method

The proposed method is based on finding a partition $\Pi_C = \{CS^1, \dots, CS^{J_C}\}$ of the set $S$ into $J_C$ classes of compatible states. In this case, states $s_m \in CS^j$ are encoded by codes $C(s_m)$ using $R_A$ state variables, where

$$R_A = \max(\lceil \log_2 M_1 \rceil, \dots, \lceil \log_2 M_{J_C} \rceil). \tag{11}$$

To encode a class $CS^j \in \Pi_C$ by a class code $K(CS^j)$, $R_C$ bits are necessary, where

$$R_C = \lceil \log_2 J_C \rceil. \tag{12}$$

We propose to represent a state $s_m \in CS^j$ by the code $CC(s_m)$, which we name a composite state code (CSC). This code is the following:

$$CC(s_m) = K(CS^j) * C(s_m). \tag{13}$$

In (13), the sign “∗” denotes the concatenation of the codes. There are $R_{CC}$ state variables in the code (13). The value of $R_{CC}$ is determined as

$$R_{CC} = R_C + R_A. \tag{14}$$

To encode the classes, we use the class variables from the set $T_C = \{T_1, \dots, T_{R_C}\}$. To encode states as elements of classes $CS^j \in \Pi_C$, we use the state variables from the set $T_A = \{T_{R_C + 1}, \dots, T_{R_{CC}}\}$. Together, these sets form a set $T = T_C \cup T_A$ having $R_{CC}$ elements.
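Formulas (11) to (14) can be illustrated by a short Python sketch that builds composite codes by concatenating a class code with a within-class code. The function name, the order in which classes and states are numbered, and the five-state example partition are all assumptions made for the illustration.

```python
def ceil_log2(n):
    """Integer ceil(log2(n)) for n >= 1."""
    return (n - 1).bit_length()


def composite_state_codes(classes):
    """Return (R_C, R_A, codes) with codes[s] = K(CS^j) * C(s), formula (13)."""
    r_a = max(ceil_log2(len(members)) for members in classes)  # formula (11)
    r_c = ceil_log2(len(classes))                              # formula (12)
    codes = {}
    for j, members in enumerate(classes):
        class_code = format(j, f"0{r_c}b") if r_c else ""
        for k, state in enumerate(members):
            state_code = format(k, f"0{r_a}b") if r_a else ""
            codes[state] = class_code + state_code             # concatenation
    return r_c, r_a, codes


# Hypothetical partition with J_C = 2 classes of sizes 3 and 2.
r_c, r_a, cc = composite_state_codes([["a", "b", "c"], ["d", "e"]])
```

Unlike the extended state codes, no code value is reserved for "not in this class", because class membership is carried by the leading $R_C$ bits.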

Each class $CS^j \in \Pi_C$ determines the following three sets: $I^j$, $O^j$, and $D^j$. These sets have been defined before. There is no set $T^j \subset T$, because the states of each class are encoded using the same state variables $T_r \in T_A$. Each class $CS^j \in \Pi_C$ determines the following partial functions:

$$D^j = D^j(T_A, I^j); \tag{15}$$

$$O^j = O^j(T_A, I^j). \tag{16}$$

In (15) and (16), the following relation holds: $j \in \{1, \dots, J_C\}$.

To get the functions $D_r \in D$ and $o_n \in O$, it is necessary to multiplex the partial functions. To do this, $N + R_{CC}$ multiplexers should be used. The partial functions are used as data inputs of these multiplexers. The selection of a particular partial function is determined by the class variables $T_r \in T_C$. Therefore, the multiplexers generate the following SBFs:

$$D_r = D_r(T_C, D_r^1, \dots, D_r^{J_C}) \quad (r \in \{1, \dots, R_{CC}\}); \tag{17}$$

$$o_n = o_n(T_C, o_n^1, \dots, o_n^{J_C}) \quad (n \in \{1, \dots, N\}). \tag{18}$$

So, SBFs (15) and (16) determine a block of partial functions (BPF). The SBF (17) determines a multiplexer of state variables (MXSV), and the SBF (18) determines a multiplexer of outputs (MXO). Together, the SBFs (15)–(18) determine the structural diagram of the $P_C$ Mealy FSM shown in Figure 5.

There are three logic blocks in

P_{C}

Mealy FSM. Their functions are clear from the previous text. The block of partial functions implements SBFs (15) and (16). The multiplexer of state variables (MXSV) implements SBF (17). Its circuit includes

R_{C C}

multiplexers having

R_{C}

control inputs and up to

J_{C}

data inputs. The outputs of these multiplexers are connected to the inputs of the flip-flops that form the state code register, RSC. To control the RSC, the clearing and synchronization pulses enter MXSV. There are

R_{C C}

flip-flops in the circuit of RSC. There are N multiplexers in the circuit of

M X O

. The selection of a particular partial function

o_{n}^{j}

is executed under the control of the class variables

T_{r} \in T_{C}

.

If

C S^{j} \in Π_{C}

, then there are

R_{C j}

variables in the codes of states

s_{m} \in C S^{j}

, where

R_{C j} = ⌈ l o g_{2} M_{j} ⌉ .

(19)

The comparison of formulae (4) and (19) shows that classes of the partition

Π_{C}

can include more elements than classes of the partition

Π_{E}

. This is determined by the absence of 1 in the formula (19).

For example, if there is

N I_{L U T} = 5

, then the following partition

Π_{C}

can be constructed for Mealy FSM

U_{1}

:

Π_{C} = {C S^{1}, C S^{2}}

. Therefore, there are

J_{C} = 2

classes for FSM

P_{C} (U_{1})

instead of

J = 3

for the equivalent FSM

P_{E} (U_{1})

. There are the following classes of compatible states in the discussed case:

C S^{1} = {s_{1}, s_{3}, s_{6}, s_{7}}

and

C S^{2} = {s_{2}, s_{4}, s_{5}, s_{8}}

.

In the general case, the following conditions hold:

J_{C} \leq J;

(20)

R \leq R_{C C} < R_{E} .

(21)

In the case of

U_{1}

, we can find that

R = 3

,

R_{E} = 6

,

R_{A} = 2

,

R_{C} = 1

, and

R_{C C} = 3

. Therefore, in the discussed case, there is

R_{C C} = R = 3

. In addition, there is

J_{C} = 2 < J = 3

. Consequently, we can expect that, in this case, the circuit of

P_{C} (U_{1})

will have fewer LUTs and interconnections than the circuit of

P_{E} (U_{1})

. We will check this in the next Section. In addition, we can expect that

P_{C}

Mealy FSMs have, at least, the same performance as equivalent

P_{E}

Mealy FSMs. The experiments reported in Section 6 show that our approach allows an improvement of the basic characteristics of LUT-based circuits of Mealy FSMs.

A method of

P_{C}

Mealy FSMs logic synthesis is proposed in our current article. The method produces logic circuits of LUT-based FSMs in which each LUT has

N I_{L U T}

inputs. We start the synthesis process from an FSM state transition table. The proposed method includes the following steps:

1. Constructing the partition $Π_{C}$ of the set of states by classes of compatible states.
2. Encoding of FSM states by composite state codes $C C (s_{m})$.
3. Creating the direct structure table of $P_{C}$ Mealy FSM.
4. Creating tables of blocks of partial functions for classes $C S^{j} \in Π_{C}$.
5. Creating the table representing the multiplexer of outputs.
6. Creating the table representing the multiplexer of state variables.
7. Constructing SBFs representing BPF, MXSV, and MXO.
8. Implementing the LUT-based circuit of $P_{C}$ Mealy FSM using FPGA chip’s internal resources.

We use the methods [5] to create the partition

Π_{C}

. The main goal of these methods is to minimize the LUT counts in the resulting Mealy FSM circuits. If possible, each class of compatible states should include the maximum possible number of states. This helps to minimize the value of

J_{C}

. The classes are created so as to minimize the number of shared outputs, which reduces the number of LUTs in the circuit of MXO. Any multiplexer from the second level of an FSM circuit is implemented by a single LUT if the following condition holds:

R_{C} + J_{C} \leq N I_{L U T} .

(22)

Even if condition (22) is violated, the multiplexers can still be implemented as single-level circuits. This is possible if the number of partial functions for a given function

ϕ_{i} \in D \cup O

does not exceed the value

N I_{L U T} - R_{C}

.
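
Condition (22) and its relaxed form reduce to one inequality on the number of multiplexer inputs. The sketch below is a minimal illustration; the function name `mux_fits_single_lut` and the second set of numbers are hypothetical.

```python
def mux_fits_single_lut(select_inputs, data_inputs, ni_lut):
    """Check whether a multiplexer fits into one LUT with ni_lut inputs.

    select_inputs: number of class variables used as control inputs.
    data_inputs:   number of partial functions actually multiplexed
                   (J_C in condition (22), or fewer for a given function).
    """
    return select_inputs + data_inputs <= ni_lut


# One class variable, two classes, 5-input LUTs: condition (22) holds.
print(mux_fits_single_lut(1, 2, 5))  # True

# Hypothetical case: two class variables and four classes violate (22),
# but a function with only three partial functions still fits one LUT.
print(mux_fits_single_lut(2, 4, 5))  # False
print(mux_fits_single_lut(2, 3, 5))  # True
```

The last call corresponds to the relaxed condition: the number of partial functions of the given function does not exceed the number of LUT inputs left after the control inputs are connected.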

5. Example of Synthesis

We use the symbol

P_{C} (U_{a})

to show that the model of

P_{C}

Mealy FSM (Figure 5) is used to implement the circuit of an FSM

U_{a}

. In this Section, we show how to design the circuit of Mealy FSM

P_{C} (U_{1})

using 5-LUTs. The synthesis process starts from Table 1.

Step 1. In the previous section, using Table 1 and 5-LUTs, we have got the partition

Π_{C} = {C S^{1}, C S^{2}}

. The partition includes the classes

C S^{1} = {s_{1}, s_{3}, s_{6}, s_{7}}

and

C S^{2} = {s_{2}, s_{4}, s_{5}, s_{8}}

. Therefore, each class includes four states

s_{m} \in S

. These classes determine the sets

I^{1} = {i_{1}, i_{2}, i_{5}}

,

O^{1} = {o_{1}, o_{2}, o_{3}, o_{4}}

,

I^{2} = {i_{3}, i_{4}, i_{6}}

, and

O^{2} = {o_{1}, o_{3}, \dots, o_{7}}

. Therefore, there is

L_{1} = L_{2} = 3

. Using (19) gives

R_{C 1} = R_{C 2} = 2

. There is

R_{C j} + L_{j} = 5 = N I_{L U T} (j \in {1, 2})

. This means that condition (5) holds for the given FSM and 5-LUTs. Therefore, it is possible to use the model

P_{C} (U_{1})

. The total number of elements in the sets

O^{j} (j \in {1, 2})

determines how many LUTs are necessary to generate the partial output functions.

The sets

I^{1}

and

I^{2}

have no shared inputs

(I^{1} \cap I^{2} = \emptyset)

. This relation shows that the system of interconnections between FSM inputs and LUTs of BPF is optimal: each FSM input is connected to the LUTs of only one class.
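
The checks made in Step 1 can be reproduced with a few lines of Python. This is only a sketch built from the sets listed above; the dictionary layout and the function name `fits_lut` are illustrative choices.

```python
from math import ceil, log2

NI_LUT = 5  # number of LUT inputs used in the example

# Classes of compatible states and their input sets, as listed in Step 1.
classes = {
    1: {"states": ["s1", "s3", "s6", "s7"], "inputs": {"i1", "i2", "i5"}},
    2: {"states": ["s2", "s4", "s5", "s8"], "inputs": {"i3", "i4", "i6"}},
}


def fits_lut(cls, ni_lut):
    """Check R_Cj + L_j <= NI_LUT for one class, with R_Cj from formula (19)."""
    r_cj = ceil(log2(len(cls["states"])))
    l_j = len(cls["inputs"])
    return r_cj + l_j <= ni_lut


all_fit = all(fits_lut(cls, NI_LUT) for cls in classes.values())
shared_inputs = classes[1]["inputs"] & classes[2]["inputs"]

print(all_fit)        # True
print(shared_inputs)  # set()
```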

Step 2. As we have found, there is

R_{C 1} = R_{C 2} = 2

. Using (14) gives

R_{C C} = 3

. Now, we have the sets

T = {T_{1}, T_{2}, T_{3}}

,

T_{C} = {T_{1}}

, and

T_{A} = {T_{2}, T_{3}}

. One of the possible outcomes of the encoding is shown in Figure 6.

So, the classes are encoded in the following way:

K (C S^{1}) = 0

and

K (C S^{2}) = 1

. For example, the following relation holds:

C (s_{1}) = C (s_{2}) = 00

. Using the codes of classes of compatible states gives the following composite state codes:

C C (s_{1}) = 000

and

C C (s_{2}) = 100

. Using the same approach, we can find the CSCs for all states

s_{m} \in S

.
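
The construction of composite state codes is a concatenation of the class code and the in-class state code. In the sketch below, only the class codes and the codes C(s_1) = C(s_2) = 00 come from the text. The in-class codes of the remaining states are assigned by list position, which is an assumption made for illustration and may differ from Figure 6. The function name `composite_codes` is hypothetical.

```python
def composite_codes(partition, class_codes, state_bits):
    """Build CC(s) = K(class) + C(s) for every state.

    Assumption: in-class codes follow the order of states in each list.
    """
    codes = {}
    for j, states in partition.items():
        for index, state in enumerate(states):
            in_class_code = format(index, f"0{state_bits}b")
            codes[state] = class_codes[j] + in_class_code
    return codes


partition = {1: ["s1", "s3", "s6", "s7"], 2: ["s2", "s4", "s5", "s8"]}
class_codes = {1: "0", 2: "1"}  # K(CS^1) = 0, K(CS^2) = 1

cc = composite_codes(partition, class_codes, state_bits=2)
print(cc["s1"], cc["s2"])  # 000 100
```

States from different classes may share the same in-class code, but their composite codes differ in the class part.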

Step 3. Compared to STT (Table 1), the DST includes three additional columns. They are:

C C (s_{C})

including a CSC of the current state

s_{C} \in S

;

C C (s_{T})

with a CSC of the state of transition

s_{T} \in S

;

D_{h}

with IMFs equal to 1 to load the code

C C (s_{T})

into SCR. In the discussed example, DST is represented by Table 3.

Step 4. The DST (Table 3) determines contents of tables of blocks of partial functions BPF

^{j} (j \in {1, \dots, J_{C}})

. In these tables, the column

C C (s_{C})

is replaced by the column

C (s_{C})

; the column

O_{h}

is replaced by the column

O_{h}^{j}

; the column

D_{h}

is replaced by the column

D_{h}^{j}

. The superscript j indicates that these functions are generated by the block BPF

^{j} (j \in {1, \dots, J_{C}})

.

In the discussed case, there are two blocks of partial functions. Table 4 represents the block

B P F^{1}

and Table 5 represents the block

B P F^{2}

. There are

H_{1} = 10

rows in Table 4, and

H_{2} = 9

rows in Table 5. Together, these tables have exactly

H = 19

rows (as the number of rows in Table 3).

Step 5. There are the following columns in the table of MXO:

o_{n}

,

B l o c k

. The second column is divided into

J_{C}

sub-columns. If a partial output

o_{n}^{j} \in O

is present in the table of BPF

^{j}

, then there is 1 at the intersection of the column j and the row

o_{n}

. Otherwise, this intersection is marked by 0. The table is constructed using the tables of blocks BPF

^{j}

. In the discussed case, this is Table 6.
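
The rule for filling the table of MXO can be sketched in Python. The example uses only the sets O^1 and O^2 listed in Step 1; the function name `mxo_table` and the dictionary representation are illustrative assumptions.

```python
# Partial output sets of the two classes, as listed in Step 1.
O1 = {"o1", "o2", "o3", "o4"}
O2 = {"o1", "o3", "o4", "o5", "o6", "o7"}


def mxo_table(partial_output_sets, outputs):
    """For each output o_n, mark 1 in sub-column j if o_n is produced by BPF^j."""
    return {o: [int(o in s) for s in partial_output_sets] for o in outputs}


outputs = [f"o{n}" for n in range(1, 8)]
table = mxo_table([O1, O2], outputs)

for name, marks in table.items():
    print(name, marks)
```

A row with two ones (for example, o1) needs a real multiplexer controlled by the class variable, while a row with a single one needs only the corresponding partial function gated by the class code.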

Step 6. There are the following columns in the table of MXSV:

D_{r}

,

B l o c k

. As in the previous case, the column

B l o c k

is divided into

J_{C}

sub-columns. If a partial IMF

D_{r}^{j}

is present in the table of BPF

^{j}

, then there is 1 at the intersection of the column j and the row

D_{r}

. Otherwise, this intersection is marked by 0. The table is constructed using the tables of blocks BPF

^{j}

. In the discussed case, this is Table 7.

Step 7. The BPF is represented by SBFs (15) and (16). These systems are constructed using tables of BPF

^{j} (j \in {1, \dots, J_{C}})

. The partial functions depend on product terms which are conjunctions of

S_{m}

and

I_{h}

. The conjunction

S_{m}

is determined by the code

C (s_{m})

. For example, there is

S_{1} = \bar{T_{2}} \bar{T_{3}} = S_{2}

.

For example, using Table 4 gives us the following sum-of-products for functions

D_{1}^{1}

and

o_{1}^{1}

:

\begin{matrix} D_{1}^{1} = F_{1} \lor F_{3} \lor [F_{4} \lor F_{5}] \lor F_{9} = \bar{T_{2}} \bar{T_{3}} i_{1} \lor \bar{T_{2}} \bar{T_{3}} \bar{i_{1}} \bar{i_{2}} \lor \bar{T_{2}} T_{3} i_{5} \lor T_{2} T_{3} \bar{i_{1}} i_{5}; \\ o_{1}^{1} = F_{1} \lor F_{7} \lor F_{10} = \bar{T_{2}} \bar{T_{3}} i_{1} \lor T_{2} \bar{T_{3}} \lor T_{2} T_{3} \bar{i_{1}} \bar{i_{5}} . \end{matrix}

(23)

Using Table 5 gives us the following sum-of-products for functions

D_{1}^{2}

and

o_{1}^{2}

:

\begin{matrix} D_{1}^{2} = [F_{2} \lor F_{3}] \lor [F_{4} \lor F_{5}] = \bar{T_{2}} \bar{T_{3}} i_{3} \lor T_{2} \bar{T_{3}}; \\ o_{1}^{2} = F_{1} \lor F_{6} = \bar{T_{2}} \bar{T_{3}} i_{3} \lor T_{2} \bar{T_{3}} i_{3} i_{6} . \end{matrix}

(24)

Using Table 6 gives the SBF representing the MXO. This SBF is created in the trivial way. In the discussed case, this is the following system:

\begin{matrix} o_{1} = \bar{T_{1}} o_{1}^{1} \lor T_{1} o_{1}^{2}; o_{2} = \bar{T_{1}} o_{2}^{1}; \\ o_{3} = \bar{T_{1}} o_{3}^{1} \lor T_{1} o_{3}^{2}; o_{4} = \bar{T_{1}} o_{4}^{1} \lor T_{1} o_{4}^{2}; \\ o_{5} = T_{1} o_{5}^{2}; o_{6} = T_{1} o_{6}^{2}; o_{7} = T_{1} o_{7}^{2} . \end{matrix}

(25)

Using Table 7 gives the SBF representing the MXSV. This SBF is created in the trivial way, too. In the discussed case, this is the following system:

\begin{matrix} D_{1} = \bar{T_{1}} D_{1}^{1} \lor T_{1} D_{1}^{2}; D_{2} = \bar{T_{1}} D_{2}^{1} \lor T_{1} D_{2}^{2}; \\ D_{3} = \bar{T_{1}} D_{3}^{1} \lor T_{1} D_{3}^{2} . \end{matrix}

(26)
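
Each equation of system (26) is a 2-to-1 multiplexer controlled by the class variable T_1. A minimal Python model is given below; the helper name `mux2` is hypothetical, and the exhaustive loop only confirms that the Boolean form matches plain selection.

```python
from itertools import product


def mux2(t1, f_class1, f_class2):
    """Boolean form used in (26): (not T1 and f^1) or (T1 and f^2)."""
    return ((1 - t1) & f_class1) | (t1 & f_class2)


# Exhaustive check over all 0/1 combinations of the three arguments.
matches_selection = all(
    mux2(t1, f1, f2) == (f2 if t1 else f1)
    for t1, f1, f2 in product((0, 1), repeat=3)
)
print(matches_selection)  # True
```

Such a function has three arguments, so it fits into a single 5-LUT.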

Step 8. Using the obtained SBFs, we can get a logic circuit of Mealy FSM

P_{C} (U_{1})

. It is shown in Figure 7.

Because each partial function includes no more than

N I_{L U T} = 5

arguments, there are 6 LUTs implementing the partial IMFs

D_{r}^{j}

and 10 LUTs implementing the partial outputs

o_{n}^{j}

. Therefore, there are 16 LUTs in the circuit of BPF. The IMFs (17) require only

R_{C C} = 3

LUTs. There are 7 LUTs in the circuit of MXO. Therefore, there are 26 5-LUTs in the logic circuit of Mealy FSM

P_{C} (U_{1})

. These LUTs are connected using three buses. The

B u s I T

combines wires with inputs

i_{l} \in I

and state variables

T_{r} \in T_{A}

. The

B u s D O T

includes wires with partial IMFs

D_{r}^{j} \in D

, partial outputs

o_{n}^{j} \in O

, and state variables

T_{r} \in T_{C}

used as control inputs of multiplexers MXO and MXSV. The BusT is an output bus of the distributed SCR. This bus includes wires with state variables

T_{r} \in T_{C} \cup T_{A}

. The buffers of the synchronization tree control the flip-flops connected with outputs of LUT17-LUT19.

We can compare LUT counts for Mealy FSMs

P_{C} (U_{1})

and

P_{E} (U_{1})

. In both cases, we use 5-LUTs. We have synthesized the logic circuit of

P_{E} (U_{1})

. There is

J = 3

for

P_{E} (U_{1})

. There are 10 LUTs in the circuit of Block1, 11 elements in Block2, and 4 elements in Block3. These three blocks represent the block of partial functions having 25 5-LUTs. There are 10 LUTs in the circuit of BlockOR. Therefore, there are 35 5-LUTs in the logic circuit of Mealy FSM

P_{E} (U_{1})

. It means that using the model

P_{C} (U_{1})

instead of

P_{E} (U_{1})

reduces the LUT count by a factor of 35/26 ≈ 1.35. At the same time, both circuits have the same number of logic levels.
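
The LUT counts quoted above can be tallied as follows. The block names are taken from the text; the dictionary layout is only an illustrative choice.

```python
# LUT counts per block, as quoted in the text for 5-LUTs.
pc_blocks = {"BPF partial IMFs": 6, "BPF partial outputs": 10, "MXSV": 3, "MXO": 7}
pe_blocks = {"Block1": 10, "Block2": 11, "Block3": 4, "BlockOR": 10}

total_pc = sum(pc_blocks.values())
total_pe = sum(pe_blocks.values())
ratio = total_pe / total_pc
saving_percent = 100 * (total_pe - total_pc) / total_pe

print(total_pc, total_pe)        # 26 35
print(round(ratio, 2))           # 1.35
print(round(saving_percent, 1))  # 25.7
```

In other words, for this example the composite-code circuit uses about a quarter fewer LUTs than the circuit based on extended state codes.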

To get the electrical circuit of Mealy FSMs

P_{C} (U_{1})

, it is necessary to execute the step of technology mapping [36]. This requires sophisticated CAD tools. For circuits implemented with the internal resources of Virtex-7, the industrial package Vivado [34] should be used. Vivado executes the implementation steps (such as mapping, placement, and so on). Obtaining an FSM circuit allows the determination of its real characteristics, such as the number of LUTs and the minimum latency time. The latency time determines the maximum synchronization frequency. In addition, the power consumption is determined for the maximum operating frequency.

It follows from the discussion of SLICEL that we cannot use Vivado to get the 5-LUT-based circuit of Mealy FSMs

P_{C} (U_{1})

. However, in Section 6, we show results of experiments conducted using Vivado and the library of benchmark FSMs [19].

6. Experimental Results

We conducted extensive experiments to compare the basic characteristics of

P_{C}

-based Mealy FSMs with characteristics of FSM circuits based on some other models. The benchmark FSMs from the library [37] are used for the experiments. The 48 benchmarks used are represented by their state transition tables, stored as files in the KISS2 format. The basic characteristics of the benchmarks (the values of parameters M, L, and N) span a wide range. For this reason, these benchmarks are used in many studies as a basis for comparing different FSM design methods. We do not show the characteristics of benchmark FSMs in this article. They can be found, for example, in [17].

We execute the experiments using a personal computer with the following characteristics: CPU: Intel Core i7 6700 K 4.2@4.4 GHz, Memory: 16 GB RAM 2400 MHz CL15. In addition, we use the Virtex-7 VC709 Evaluation Platform (this platform is based on the following FPGA chip: xc7vx690tffg1761-2) [38] and CAD tool Vivado v2019.1 (64-bit) [34]. There is

N I_{L U T} = 6

for FPGAs of Virtex-7. We use reports of Vivado to get the results of experiments. To enter Vivado, we use the CAD tool K2F [5]. This tool allows the creation of VHDL codes on the base of files represented in the KISS2 format.

Three parameters have been compared on the basis of our experiments, namely, the chip area occupied by the FSM circuits, the performance, and the area-time products. To estimate the area, we use the LUT counts taken from the Vivado reports. The performance is represented by the latency time achievable for each benchmark FSM, which is also taken from the Vivado reports. The latency time is inversely proportional to the maximum operating frequency. Thus, the shorter the latency time, the higher the frequency of synchronization pulses can be. The area-time products are calculated by multiplying the LUT counts by the latency times. In our experiments, we use five FSM models. Three of them are P-FSMs based on state codes with the minimum length (Auto), OHCs with the maximum number of state variables (One-hot), or state codes with some intermediate number of state variables (JEDI). The first two methods are internal methods of Vivado. Because we try to improve the characteristics of

P_{E}

-based FSMs, we use this model in our research. Obviously, we use the model of

P_{C}

-based FSMs proposed in the current paper.
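
The relations between the latency time, the maximum operating frequency, and the area-time product can be sketched as follows. The function names are hypothetical, and the numeric values in the usage lines are purely illustrative; they are not taken from the experimental tables.

```python
def max_frequency_mhz(latency_ns):
    """Maximum operating frequency in MHz for a latency given in nanoseconds."""
    return 1000.0 / latency_ns


def area_time_product(lut_count, latency_ns):
    """Area-time product in LUT x ns: LUT count multiplied by latency time."""
    return lut_count * latency_ns


# Illustrative values only (not measured data).
print(max_frequency_mhz(5.0))      # 200.0
print(area_time_product(26, 5.0))  # 130.0
```

A circuit that is both smaller and faster improves the area-time product twice, which is why the gains in this metric are larger than the gains in LUT count or latency taken separately.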

As in our previous research [17], we use the relation between the values of

R + L

and

N I_{L U T}

to divide the benchmarks [37] into five categories. For LUTs of Virtex-7, there is

N I_{L U T} = 6

. We use this value to divide the benchmarks into the categories. The FSMs are trivial (category 0) if the sum of R and L does not exceed 6. The FSMs are simple (category 1) if the sum does not exceed 12. The FSMs are average (category 2) if the sum does not exceed 18. The FSMs are big (category 3) if the sum does not exceed 24. Otherwise, the benchmark FSMs are very big (category 4). It is shown in [5] that the improvement of FSM characteristics obtained with SD-based methods grows with the category number.
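
The classification rule can be written as a short function. The thresholds 6, 12, 18, and 24 come from the text; expressing them as multiples of the number of LUT inputs is an assumption made for this sketch, and the function name `fsm_category` and the sample arguments are hypothetical.

```python
def fsm_category(r, l, ni_lut=6):
    """Return the category (0..4) of an FSM from the sum R + L.

    Assumption: the thresholds are 1x, 2x, 3x, and 4x the number of LUT inputs.
    """
    total = r + l
    for category in range(4):
        if total <= (category + 1) * ni_lut:
            return category
    return 4


# Illustrative values of R and L (not taken from the benchmark tables).
print(fsm_category(3, 3))   # 0
print(fsm_category(4, 7))   # 1
print(fsm_category(6, 19))  # 4
```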

For our conditions, the benchmarks [37] are distributed by categories as follows. Category 0 consists of the FSMs bbtas, dk17, dk27, dk512, ex3, ex5, lion, lion9, mc, modulo12, and shiftreg. The following FSMs form category 1: bbara, bbsse, beecount, cse, dk14, dk15, dk16, donfile, ex2, ex4, ex6, ex7, keyb, mark1, opus, s27, s386, s840, and sse. Category 2 contains the FSMs ex1, kirkman, planet, planet1, pma, s1, s1488, s1494, s1a, s208, styr, and tma. A single FSM, sand, belongs to category 3 (big benchmarks). Four FSMs (s420, s510, s820, and s832) belong to category 4.

The results of experiments are shown in Tables 8–23. These tables are organized in the same manner. The table columns are marked by the names of the investigated methods. The names of benchmarks are written in the table rows. The row “Total” contains the sum of values for each column. The row “Percentage” gives the summarized characteristics of FSM circuits produced by the other methods as a percentage of those of

P_{C}

-based FSMs. We use the model of P Mealy FSM as a starting point for methods Auto, One-hot, and JEDI.

Let us analyse the experimental results taken from the tables. The following information can be found in these tables: (1) the LUT counts for all benchmarks (Table 8); (2) the LUT counts for benchmarks of category 0 (Table 9); (3) the LUT counts for benchmarks of category 1 (Table 10); (4) the LUT counts for benchmarks of categories 2–4 (Table 11); (5) the latency time for all benchmarks (Table 12); (6) the latency time for benchmarks of category 0 (Table 13); (7) the latency time for benchmarks of category 1 (Table 14); (8) the latency time for benchmarks of categories 2–4 (Table 15); (9) the maximum operating frequency for all benchmarks (Table 16); (10) the maximum operating frequency for benchmarks of category 0 (Table 17); (11) the maximum operating frequency for benchmarks of category 1 (Table 18); (12) the maximum operating frequency for benchmarks of categories 2–4 (Table 19); (13) the area-time products for all benchmarks (Table 20); (14) the area-time products for benchmarks of category 0 (Table 21); (15) the area-time products for benchmarks of category 1 (Table 22); (16) the area-time products for benchmarks of categories 2–4 (Table 23).

As follows from Table 8, the

P_{C}

-based FSMs require fewer LUTs than the other investigated methods. Our approach produces circuits having 44.87% fewer 6-LUTs than equivalent Auto-based FSMs, 68.59% fewer 6-LUTs than equivalent One-hot-based FSMs, and 19.31% fewer 6-LUTs than equivalent JEDI-based FSMs. While developing our method, we hoped that

P_{C}

-based FSMs will require fewer LUTs in comparison with equivalent

P_{E}

-based FSMs. As follows from the last column of Table 8, our assumption turns out to be correct. Our approach produces circuits having, on average, 15.46% fewer 6-LUTs than equivalent

P_{E}

-based FSMs.

As follows from Table 9, our approach loses to both Auto-based FSMs (6.04% loss) and JEDI-based FSMs (7.58% loss). However, Table 9 reflects the results for the simplest FSMs (category 0). Let us point out that, even in this case, our approach gives a gain compared to One-hot-based (30.3%) and

P_{E}

-based (7.58%) FSMs.

Analysis of Table 10 and Table 11 shows that the

P_{C}

-based FSMs have circuits with fewer LUTs compared with all other investigated approaches. Compared with Auto-based FSMs, the gain is 33.99% (category 1) or 52.45% (categories 2–4). Compared with One-hot-based FSMs, the gain is 73.27% (category 1) or 69.85% (categories 2–4). Compared with JEDI-based FSMs, the gain is 11.22% (category 1) or 24.12% (categories 2–4). Compared with

P_{E}

-based FSMs, the gain is 12.87% (category 1) or 16.95% (categories 2–4). Therefore, the gain from applying the proposed approach in relation to

P_{E}

-based FSMs increases as the complexity of the FSM increases (increasing the category number).

As follows from Table 12, our approach produces faster LUT-based FSM circuits relative to other investigated methods. The average win is from 3.01% (compared with

P_{E}

-based FSMs) to 17.71% relative to One-hot based FSMs.

For category 0 (Table 13), our approach provides minimal gain relative to Auto-based FSMs (0.19%) and One-hot-based FSMs (3.11%). At the same time,

P_{C}

-based FSMs are a bit slower than their counterparts based on either JEDI (0.49%) or extended state codes (0.01%). Such a loss is insignificant. Analysis of Table 14 and Table 15 shows that our approach gives a gain relative to all other design methods starting from category 1. For category 1 (Table 14), there is the following gain in FSM performance: 16.25% compared with Auto, 16.13% compared with One-hot, 8.03% compared with JEDI-based FSMs, and 3.14% compared with

P_{E}

-based counterparts. The gain grows as the category increases. This follows from Table 15 containing experimental results for categories 2–4. For these categories, there is the following gain: (1) 26.03% regarding Auto; (2) 27.12% regarding One-hot; (3) 16.32% regarding JEDI-based FSMs and (4) 4.52% regarding

P_{E}

-based FSMs.

Obviously, using the latency time we can obtain the values of maximum operating frequency. This characteristic for all benchmarks is shown in Table 16. As follows from Table 16, our approach produces faster LUT-based FSM circuits relative to other investigated methods. The average win is from 3.99% (compared with

P_{E}

-based FSMs) to 29.04% relative to One-hot based FSMs.

For category 0 (Table 17), our approach provides minimal gain relative to Auto-based FSMs (0.05%) and One-hot-based FSMs (2.5%). At the same time,

P_{C}

-based FSMs are a bit slower than their counterparts based on either JEDI (0.85%) or extended state codes (0.01%). Such a loss is insignificant. Analysis of Table 18 and Table 19 shows that our approach gives a gain relative to all other design methods starting from category 1. For category 1 (Table 18), there is the following gain in FSM performance: 27.6% compared with Auto, 27.7% compared with One-hot, 22.46% compared with JEDI-based FSMs, and 3.88% compared with

P_{E}

-based counterparts. The gain grows as the category increases. For categories 2–4 (Table 19), we have the following gain: (1) 43.61% regarding Auto; (2) 43.76% regarding One-hot; (3) 36.17% regarding JEDI-based FSMs and (4) 6.11% regarding

P_{E}

-based FSMs.

The main goal of our method was to reduce the number of LUTs (the chip area occupied by FSM circuit) compared to

P_{E}

-based FSMs. The results of experiments show that this goal has been achieved. In addition, our approach simultaneously increases the maximum operating frequency (which is equivalent to decreasing the latency time). As a result, our approach produces FSM circuits with the best values of area-time products. The corresponding values are shown in Table 20. Our approach provides the following average gain: (1) 83.10% regarding Auto; (2) 112.15% regarding One-hot; (3) 36.8% regarding JEDI and (4) 20.26% regarding

P_{E}

-based FSMs. Analysis of Tables 21–23 shows that the gain obtained by our approach increases with the FSM category.

For category 0 (Table 21), our approach loses out to the other two approaches: 4.54% lost relative to Auto-based FSMs and 6.68% lost relative to JEDI-based FSMs. However, for this category, our approach has gain compared with One-hot (35.95%) and

P_{E}

-based FSMs (6.99%).

As follows from Table 22, our approach provides the following gain: (1) 59.86% regarding Auto; (2) 105.58% regarding One-hot; (3) 21.59% regarding JEDI; (4) 16.53% regarding

P_{E}

-based FSMs. As follows from Table 23, our approach provides the following gain: (1) 95.58% regarding Auto; (2) 118.97% regarding One-hot; (3) 44.08% regarding JEDI; (4) 22.22% regarding

P_{E}

-based FSMs.

So, the results of our experiments show that the proposed approach can be used instead of other models starting from simple FSMs (category 1). Our approach allows an improvement in LUT counts, maximum operating frequency (minimum latency time), and area-time products compared with other investigated design methods. We think that our approach has rather good potential and can be used in CAD systems targeting FPGA-based Mealy FSMs.

7. Conclusions

Very often, modern CPSs use FPGAs for implementing circuits of their digital parts. Modern FPGAs are very complicated devices having up to 7 billion transistors [15]. They are very efficient platforms for implementing various digital systems. As the complexity of the digital parts of CPSs increases, the gap widens between a very large number of system inputs and a very small number of LUT inputs. Modern LUTs have around six inputs. Using internal multiplexers allows for LUTs that have up to eight inputs [24]. However, this value is still rather small compared with the numbers of literals in SBFs representing FSM circuits. As a result, various FD-based design methods are applied to implement an FSM circuit [22]. However, the FD-based FSM circuits are multi-level. In addition, they have very complicated systems of spaghetti-type interconnections.

In the case of LUT-based design, the structural decomposition allows for significant improvement in the FSM basic characteristics. They are much better than the corresponding characteristics of equivalent FD-based circuits [22]. As follows from [17], the ESC-based FSMs have better performance than their counterparts based on twofold state assignment [5]. However, this gain comes at the cost of higher LUT counts in ESC-based FSMs compared with FSM circuits obtained using the twofold state assignment. This is the main drawback of ESC-based methods [17].

This drawback may be eliminated by applying the composite state codes proposed in this article. Such a code is a concatenation of the class code and the code of a state as an element of this class. This approach leads to two-level FSM circuits which require fewer LUTs than their ESC-based counterparts. The gain in the number of LUTs is around 15.46%. At the same time, the CSC-based FSMs have slightly better performance than their ESC-based counterparts (around 3% relative to the latency time). Therefore, the proposed method is a good alternative to both the FSM design methods based on functional decomposition [22] and the method [17] based on extended state codes. For this reason, this approach can be used for optimizing characteristics of sequential blocks used in digital parts of cyber-physical systems.

We see two directions of future research. The first of them is the adaptation of the proposed method for optimizing the characteristics of LUT-based Moore FSMs. The second is related to the following. It is known that one of the most important areas of the development of modern cyber-physical systems is the protection of confidentiality of information [39]. This means that the opacity of the systems should be as high as possible [40]. This applies to the full extent to the sequential blocks of CPSs. The method proposed in our work is not aimed at increasing the level of security. Realizing the importance of this problem, we plan to develop an appropriate method to ensure the security of sequential blocks.

Author Contributions

Conceptualization, A.B., L.T. and K.K.; methodology, A.B., L.T. and K.K.; formal analysis, A.B., L.T. and K.K.; writing—original draft preparation, A.B., L.T. and K.K.; supervision, A.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available in the article.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

BOF	block of output functions
BPF	block of partial functions
BSV	block of state variables
CLB	configurable logic block
CPS	cyber-physical system
CSC	composite state code
DST	direct structure table
ESC	extended state code
FD	functional decomposition
FSM	finite state machine
FPGA	field-programmable gate array
IMF	input memory function
LUT	look-up table
MXO	multiplexer of outputs
MXSV	multiplexer of state variables
OHC	one-hot code
RSC	register of state codes
SBF	systems of Boolean functions
SCR	state code register
SD	structural decomposition
STT	state transition table

References

Alur, R. Principles of Cyber-Physical Systems; MIT Press: Cambridge, MA, USA, 2015.
Suh, S.C.; Tanik, U.J.; Carbone, J.N.; Eroglu, A. Applied Cyber-Physical Systems; Springer: New York, NY, USA, 2014.
Marwedel, P. Embedded System Design: Embedded Systems Foundations of Cyber-Physical Systems, and the Internet of Things, 3rd ed.; Springer International Publishing: Berlin/Heidelberg, Germany, 2018.
Wiśniewski, R.; Bazydło, G.; Szcześniak, P.; Wojnakowski, M. Petri net-based specification of cyber-physical systems oriented to control direct matrix converters with space vector modulation. IEEE Access 2019, 7, 23407–23420.
Barkalov, A.; Titarenko, L.; Mielcarek, K.; Chmielewski, S. Logic Synthesis for FPGA-Based Control Units—Structural Decomposition in Logic Design; Lecture Notes in Electrical Engineering; Springer: Berlin/Heidelberg, Germany, 2020; Volume 636.
Wiśniewski, R.; Benysek, G.; Gomes, L.; Kania, D.; Simos, T.; Zhou, M. IEEE Access Special Section: Cyber-Physical Systems. IEEE Access 2019, 7, 157688–157692.
Gajski, D.D.; Abdi, S.; Gerstlauer, A.; Schirner, G. Embedded System Design: Modeling, Synthesis and Verification; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2009.
Baranov, S. Logic Synthesis of Control Automata; Kluwer Academic Publishers: Dordrecht, The Netherlands, 1994.
Micheli, G.D. Synthesis and Optimization of Digital Circuits; McGraw-Hill: Cambridge, MA, USA, 1994.
Czerwinski, R.; Kania, D. Finite State Machine Logic Synthesis for Complex Programmable Logic Devices; Lecture Notes in Electrical Engineering; Springer: Berlin/Heidelberg, Germany, 2013; Volume 231.
Barkalov, A.; Titarenko, L.; Krzywicki, K. Structural Decomposition in FSM Design: Roots, Evolution, Current State—A Review. Electronics 2021, 10, 1174.
Zając, W.; Andrzejewski, G.; Krzywicki, K.; Królikowski, T. Finite State Machine Based Modelling of Discrete Control Algorithm in LAD Diagram Language with Use of New Generation Engineering Software. Procedia Comput. Sci. 2019, 159, 2560–2569.
Sentovich, E.M.; Singh, K.J.; Lavagno, L.; Moon, C.; Murgai, R.; Saldanha, A.; Savoj, H.; Stephan, P.R.; Brayton, R.K.; Sangiovanni-Vincentelli, A. SIS: A System for Sequential Circuit Synthesis; University of California: Berkeley, CA, USA, 1992.
Grout, I. Digital Systems Design with FPGAs and CPLDs; Elsevier Science: Amsterdam, The Netherlands, 2011.
Trimberger, S.M. Field-Programmable Gate Array Technology; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2012.
Ruiz-Rosero, J.; Ramirez-Gonzalez, G.; Khanna, R. Field Programmable Gate Array Applications—A Scientometric Review. Computation 2019, 7, 63.
Barkalov, A.; Titarenko, L.; Krzywicki, K.; Saburova, S. Improving Characteristics of LUT-Based Mealy FSMs with Twofold State Assignment. Electronics 2021, 10, 901.
Xilinx FPGAs. Available online: https://www.xilinx.com/products/silicon-devices/fpga.html (accessed on 11 February 2022).
Salauyou, V.; Ostapczuk, M. State Assignment of Finite-State Machines by Using the Values of Output Variables. In Theory and Applications of Dependable Computer Systems. DepCoS-RELCOMEX 2020. Advances in Intelligent Systems and Computing; Zamojski, W., Mazurkiewicz, J., Sugier, J., Walkowiak, T., Kacprzyk, J., Eds.; Springer: Cham, Switzerland, 2020; Volume 1173, pp. 543–553.
Kubica, M.; Kania, D. Technology mapping oriented to adaptive logic modules. Bull. Pol. Acad. Sci. 2019, 67, 947–956.
Kuon, I.; Tessier, R.; Rose, J. FPGA architecture: Survey and challenges. Found. Trends Electron. Des. Autom. 2008, 2, 135–253.
Scholl, C. Functional Decomposition with Application to FPGA Synthesis; Kluwer Academic Publishers: Boston, MA, USA, 2001.
Kubica, M.; Kania, D. Decomposition of multi-level functions oriented to configurability of logic blocks. Bull. Pol. Acad. Sci. 2017, 67, 317–331.
Chapman, K. Multiplexer Design Techniques for Datapath Performance with Minimized Routing Resources; Xilinx All Programmable; Xilinx, Inc.: San Jose, CA, USA, 2014; pp. 1–32.
Altera. Cyclone IV Device Handbook. Available online: http://www.altera.com/literature/hb/cyclone-iv/cyclone4-handbook.pdf (accessed on 11 February 2022).
Mishchenko, A.; Brayton, R.; Jiang, J.H.R.; Jang, S. Scalable don’t-care-based logic optimization and resynthesis. ACM Trans. Reconfigurable Technol. Syst. (TRETS) 2011, 4, 1–23.
Sklarova, D.; Sklarov, V.A.; Sudnitson, A. Design of FPGA-Based Circuits Using Hierarchical Finite State Machines; TUT Press: Tallinn, Estonia, 2012.
Sklyarov, V. Synthesis and implementation of RAM-based finite state machines in FPGAs. In International Workshop on Field Programmable Logic and Applications; Springer: Berlin/Heidelberg, Germany, 2000; pp. 718–727.
Mishchenko, A.; Chattarejee, S.; Brayton, R. Improvements to technology mapping for LUT-based FPGAs. IEEE Trans. CAD 2006, 27, 240–253. [Google Scholar]
Kubica, M.; Kania, D.; Kulisz, J. A technology mapping of fsms based on a graph of excitations and outputs. IEEE Access 2019, 7, 16123–16131. [Google Scholar] [CrossRef]
El-Maleh, A.H. A Probabilistic Tabu Search State Assignment Algorithm for Area and Power Optimization of Sequential Circuits. Arab. J. Sci. Eng. 2020, 45, 6273–6285. [Google Scholar] [CrossRef]
ABC System. Available online: https://people.eecs.berkeley.edu/alanmi/abc/ (accessed on 11 February 2022).
Brayton, R.; Mishchenko, A. ABC: An Academic Industrial-Strength Verification Tool. In Computer Aided Verification; Touili, T., Cook, B., Jackson, P., Eds.; Springer: Berlin/Heidelberg, Germany, 2010; pp. 24–40. [Google Scholar]
Vivado Design Suite User Guide: Synthesis; UG901 (v2019.1). Available online: https://www.xilinx.com/support/documentation/sw_manuals/xilinx2019_1/ug901-vivado-synthesis.pdf (accessed on 11 February 2022).
Quartus Prime. Available online: https://www.intel.pl/content/www/pl/pl/software/programmable/quartus-prime/overview.html (accessed on 11 February 2022).
Khatri, S.P.; Gulati, K. Advanced Techniques in Logic Synthesis, Optimizations and Applications; Springer: New York, NY, USA, 2011. [Google Scholar]
McElvain, K. LGSynth93 Benchmark; Mentor Graphics: Wilsonville, OR, USA, 1993. [Google Scholar]
VC709 Evaluation Board for the Virtex-7 FPGA User Guide; UG887 (v1.6); Xilinx, Inc.: San Jose, CA, USA, 2019.
An, L.; Yang, G.H. Opacity enforcement for confidential robust control in linear cyber-physical systems. IEEE Trans. Autom. Control 2019, 65, 1234–1241. [Google Scholar] [CrossRef]
An, L.; Yang, G.H. Enhancement of opacity for distributed state estimation in cyber-physical systems. Automatica 2022, 136, 110087. [Google Scholar] [CrossRef]

Figure 1. Structural diagram of P Mealy FSM.

Figure 2. Part of SLICEL of Virtex-7.

Figure 3. Structural diagram of LUT-based P Mealy FSM.

Figure 4. Structural diagram of $P_{E}$ Mealy FSM (adapted from [17]).

Figure 5. Structural diagram of $P_{C}$ Mealy FSM.

Figure 6. Composite state codes of Mealy FSM $P_{C}(U_{1})$.

Figure 7. Logic circuit of Mealy FSM $P_{C}(U_{1})$.

Table 1. State transition table of $U_{1}$.

$s_{m}$	$s_{T}$	$I_{h}$	$O_{h}$	h
$s_{1}$	$s_{2}$	$i_{1}$	$o_{1} o_{2}$	1
	$s_{3}$	$\bar{i_{1}} i_{2}$	$o_{4}$	2
	$s_{4}$	$\bar{i_{1}} \bar{i_{2}}$	$o_{3}$	3
$s_{2}$	$s_{3}$	$i_{3}$	$o_{1} o_{5}$	4
	$s_{5}$	$\bar{i_{3}} i_{4}$	$o_{6}$	5
	$s_{2}$	$\bar{i_{3}} \bar{i_{4}}$	$o_{4} o_{7}$	6
$s_{3}$	$s_{4}$	$i_{5} i_{2}$	$o_{2} o_{4}$	7
	$s_{5}$	$i_{5} \bar{i_{2}}$	$o_{3}$	8
	$s_{6}$	$\bar{i_{5}}$	$o_{2}$	9
$s_{4}$	$s_{4}$	$i_{4}$	$o_{3} o_{5}$	10
	$s_{5}$	$\bar{i_{4}}$	$o_{6}$	11
$s_{5}$	$s_{1}$	$i_{3} i_{6}$	$o_{1} o_{5}$	12
	$s_{6}$	$i_{3} \bar{i_{6}}$	$o_{4} o_{7}$	13
	$s_{7}$	$\bar{i_{3}}$	$o_{3}$	14
$s_{6}$	$s_{7}$	1	$o_{1} o_{2} o_{4}$	15
$s_{7}$	$s_{1}$	$i_{1}$	–	16
	$s_{8}$	$\bar{i_{1}} i_{5}$	$o_{3}$	17
	$s_{7}$	$\bar{i_{1}} \bar{i_{5}}$	$o_{1} o_{2}$	18
$s_{8}$	$s_{1}$	1	$o_{6}$	19
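
Table 1 can also be read as executable data. The following is a minimal Python sketch for illustration only and is not part of the proposed method. The names `U1`, `step`, and `run` are hypothetical, inputs that are not supplied are assumed to be 0, and rows 3 and 8 are assumed to use the complemented $i_{2}$, as in Table 3.

```python
# Transition table of U1 (Table 1).
# Each row is (condition, next_state, outputs); a condition maps an input
# name to the value it must have, and {} stands for the constant 1.
# Assumption: rows 3 and 8 use the complement of i2, as in Table 3.
U1 = {
    "s1": [({"i1": 1}, "s2", {"o1", "o2"}),
           ({"i1": 0, "i2": 1}, "s3", {"o4"}),
           ({"i1": 0, "i2": 0}, "s4", {"o3"})],
    "s2": [({"i3": 1}, "s3", {"o1", "o5"}),
           ({"i3": 0, "i4": 1}, "s5", {"o6"}),
           ({"i3": 0, "i4": 0}, "s2", {"o4", "o7"})],
    "s3": [({"i5": 1, "i2": 1}, "s4", {"o2", "o4"}),
           ({"i5": 1, "i2": 0}, "s5", {"o3"}),
           ({"i5": 0}, "s6", {"o2"})],
    "s4": [({"i4": 1}, "s4", {"o3", "o5"}),
           ({"i4": 0}, "s5", {"o6"})],
    "s5": [({"i3": 1, "i6": 1}, "s1", {"o1", "o5"}),
           ({"i3": 1, "i6": 0}, "s6", {"o4", "o7"}),
           ({"i3": 0}, "s7", {"o3"})],
    "s6": [({}, "s7", {"o1", "o2", "o4"})],
    "s7": [({"i1": 1}, "s1", set()),
           ({"i1": 0, "i5": 1}, "s8", {"o3"}),
           ({"i1": 0, "i5": 0}, "s7", {"o1", "o2"})],
    "s8": [({}, "s1", {"o6"})],
}


def step(state, inputs):
    """Return (next_state, outputs) for one Mealy transition."""
    for cond, nxt, outs in U1[state]:
        # Inputs missing from the dictionary are treated as 0 (assumption).
        if all(inputs.get(name, 0) == val for name, val in cond.items()):
            return nxt, outs
    raise ValueError(f"no transition from {state} for inputs {inputs}")


def run(state, input_sequence):
    """Apply a sequence of input assignments and record each transition."""
    trace = []
    for inputs in input_sequence:
        state, outs = step(state, inputs)
        trace.append((state, sorted(outs)))
    return trace
```

For example, `run("s1", [{"i1": 1}, {"i3": 1}, {"i5": 0}, {}])` follows rows 1, 4, 9, and 15 of the table and ends in $s_{7}$.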

Table 2. Extended state codes for FSM $P_{E}(U_{1})$.

$s_{m}$	${CS}^{1}$ ($T_{1} T_{2}$)	${CS}^{2}$ ($T_{3} T_{4}$)	${CS}^{3}$ ($T_{5} T_{6}$)
$s_{1}$	01	00	00
$s_{2}$	00	01	00
$s_{3}$	10	00	00
$s_{4}$	00	10	00
$s_{5}$	00	11	00
$s_{6}$	00	00	01
$s_{7}$	11	00	00
$s_{8}$	00	00	10

Table 3. Direct structure table of $P_{C}(U_{1})$.

$s_{C}$	$CC (s_{C})$	$s_{T}$	$CC (s_{T})$	$I_{h}$	$O_{h}$	$D_{h}$	h
$s_{1}$	000	$s_{2}$	100	$i_{1}$	$o_{1} o_{2}$	$D_{1}$	1
		$s_{3}$	001	$\bar{i_{1}} i_{2}$	$o_{4}$	$D_{3}$	2
		$s_{4}$	101	$\bar{i_{1}} \bar{i_{2}}$	$o_{3}$	$D_{1} D_{3}$	3
$s_{2}$	100	$s_{3}$	001	$i_{3}$	$o_{1} o_{5}$	$D_{3}$	4
		$s_{5}$	110	$\bar{i_{3}} i_{4}$	$o_{6}$	$D_{1} D_{2}$	5
		$s_{2}$	100	$\bar{i_{3}} \bar{i_{4}}$	$o_{4} o_{7}$	$D_{1}$	6
$s_{3}$	001	$s_{4}$	101	$i_{5} i_{2}$	$o_{2} o_{4}$	$D_{1} D_{3}$	7
		$s_{5}$	110	$i_{5} \bar{i_{2}}$	$o_{3}$	$D_{1} D_{2}$	8
		$s_{6}$	010	$\bar{i_{5}}$	$o_{2}$	$D_{2}$	9
$s_{4}$	101	$s_{4}$	101	$i_{4}$	$o_{3} o_{5}$	$D_{1} D_{3}$	10
		$s_{5}$	110	$\bar{i_{4}}$	$o_{6}$	$D_{1} D_{2}$	11
$s_{5}$	110	$s_{1}$	000	$i_{3} i_{6}$	$o_{1} o_{5}$	–	12
		$s_{6}$	010	$i_{3} \bar{i_{6}}$	$o_{4} o_{7}$	$D_{2}$	13
		$s_{7}$	011	$\bar{i_{3}}$	$o_{3}$	$D_{2} D_{3}$	14
$s_{6}$	010	$s_{7}$	011	1	$o_{1} o_{2} o_{4}$	$D_{2} D_{3}$	15
$s_{7}$	011	$s_{1}$	000	$i_{1}$	–	–	16
		$s_{8}$	111	$\bar{i_{1}} i_{5}$	$o_{3}$	$D_{1} D_{2} D_{3}$	17
		$s_{7}$	011	$\bar{i_{1}} \bar{i_{5}}$	$o_{1} o_{2}$	$D_{2} D_{3}$	18
$s_{8}$	111	$s_{1}$	000	1	$o_{6}$	–	19

Table 4. Table of $BPF^{1}$ of Mealy FSM $P_{C}(U_{1})$.

$s_{C}$	$C (s_{C})$	$s_{T}$	$CC (s_{T})$	$I_{h}$	$O_{h}^{1}$	$D_{h}^{1}$	h
$s_{1}$	00	$s_{2}$	100	$i_{1}$	$o_{1}^{1} o_{2}^{1}$	$D_{1}^{1}$	1
		$s_{3}$	001	$\bar{i_{1}} i_{2}$	$o_{4}^{1}$	$D_{3}^{1}$	2
		$s_{4}$	101	$\bar{i_{1}} \bar{i_{2}}$	$o_{3}^{1}$	$D_{1}^{1} D_{3}^{1}$	3
$s_{3}$	01	$s_{4}$	101	$i_{5} i_{2}$	$o_{2}^{1} o_{4}^{1}$	$D_{1}^{1} D_{3}^{1}$	4
		$s_{5}$	110	$i_{5} \bar{i_{2}}$	$o_{3}^{1}$	$D_{1}^{1} D_{2}^{1}$	5
		$s_{6}$	010	$\bar{i_{5}}$	$o_{2}^{1}$	$D_{2}^{1}$	6
$s_{6}$	10	$s_{7}$	011	1	$o_{1}^{1} o_{2}^{1} o_{4}^{1}$	$D_{2}^{1} D_{3}^{1}$	7
$s_{7}$	11	$s_{1}$	000	$i_{1}$	–	–	8
		$s_{8}$	111	$\bar{i_{1}} i_{5}$	$o_{3}^{1}$	$D_{1}^{1} D_{2}^{1} D_{3}^{1}$	9
		$s_{7}$	011	$\bar{i_{1}} \bar{i_{5}}$	$o_{1}^{1} o_{2}^{1}$	$D_{2}^{1} D_{3}^{1}$	10

Table 5. Table of $BPF^{2}$ of Mealy FSM $P_{C}(U_{1})$.

$s_{C}$	$C (s_{C})$	$s_{T}$	$CC (s_{T})$	$I_{h}$	$O_{h}^{2}$	$D_{h}^{2}$	h
$s_{2}$	00	$s_{3}$	001	$i_{3}$	$o_{1}^{2} o_{5}^{2}$	$D_{3}^{2}$	1
		$s_{5}$	110	$\bar{i_{3}} i_{4}$	$o_{6}^{2}$	$D_{1}^{2} D_{2}^{2}$	2
		$s_{2}$	100	$\bar{i_{3}} \bar{i_{4}}$	$o_{4}^{2} o_{7}^{2}$	$D_{1}^{2}$	3
$s_{4}$	01	$s_{4}$	101	$i_{4}$	$o_{3}^{2} o_{5}^{2}$	$D_{1}^{2} D_{3}^{2}$	4
		$s_{5}$	110	$\bar{i_{4}}$	$o_{6}^{2}$	$D_{1}^{2} D_{2}^{2}$	5
$s_{5}$	10	$s_{1}$	000	$i_{3} i_{6}$	$o_{1}^{2} o_{5}^{2}$	–	6
		$s_{6}$	010	$i_{3} \bar{i_{6}}$	$o_{4}^{2} o_{7}^{2}$	$D_{2}^{2}$	7
		$s_{7}$	011	$\bar{i_{3}}$	$o_{3}^{2}$	$D_{2}^{2} D_{3}^{2}$	8
$s_{8}$	11	$s_{1}$	000	1	$o_{6}^{2}$	–	9
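
The relation between the per-class state codes in Tables 4 and 5 and the composite codes in Table 3 can be checked with a short script. This is a minimal sketch based only on the values in these tables. The one-bit class codes (0 for the first class, 1 for the second) and the bit order (class bit first) are read off the tabulated codes rather than quoted from the method description, and the function names are hypothetical.

```python
# Per-class state codes C(s_m), as listed in Tables 4 and 5.
CLASS_CODES = {
    1: {"s1": "00", "s3": "01", "s6": "10", "s7": "11"},
    2: {"s2": "00", "s4": "01", "s5": "10", "s8": "11"},
}

# Assumed one-bit class codes, inferred from the composite codes in Table 3.
CLASS_BIT = {1: "0", 2: "1"}


def composite_codes():
    """Build CC(s_m) by prefixing the class bit to the in-class state code."""
    return {
        state: CLASS_BIT[k] + code
        for k, states in CLASS_CODES.items()
        for state, code in states.items()
    }


def input_memory_functions(target_code):
    """List the D_r that must equal 1 to load the given composite code."""
    return [f"D{r}" for r, bit in enumerate(target_code, start=1) if bit == "1"]
```

Under these assumptions, the target state $s_{4}$ receives the code 101, which yields $D_{1} D_{3}$, in agreement with rows 3, 7, and 10 of Table 3.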

Table 6. Table of MXO for Mealy FSM $P_{C}(U_{1})$.

$o_{n}$	Block 1	Block 2
$o_{1}$	1	1
$o_{2}$	1	0
$o_{3}$	1	1
$o_{4}$	1	1
$o_{5}$	0	1
$o_{6}$	0	1
$o_{7}$	0	1
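
The entries of Table 6 follow from the outputs that appear in Tables 4 and 5: an entry is 1 when at least one state of the corresponding block generates the output. The sketch below reproduces the table under that reading. The dictionary `BLOCK_OUTPUTS` and the function `mxo_table` are illustrative names, not part of the original method.

```python
# Outputs produced on transitions from each state, grouped by block
# (collected from the O_h columns of Tables 4 and 5).
BLOCK_OUTPUTS = {
    1: {"s1": {"o1", "o2", "o3", "o4"},
        "s3": {"o2", "o3", "o4"},
        "s6": {"o1", "o2", "o4"},
        "s7": {"o1", "o2", "o3"}},
    2: {"s2": {"o1", "o4", "o5", "o6", "o7"},
        "s4": {"o3", "o5", "o6"},
        "s5": {"o1", "o3", "o4", "o5", "o7"},
        "s8": {"o6"}},
}


def mxo_table(block_outputs, n_outputs=7):
    """Return {output: (flag for block 1, flag for block 2, ...)}."""
    table = {}
    for n in range(1, n_outputs + 1):
        name = f"o{n}"
        table[name] = tuple(
            int(any(name in outs for outs in block_outputs[k].values()))
            for k in sorted(block_outputs)
        )
    return table
```

Calling `mxo_table(BLOCK_OUTPUTS)` gives, for instance, `(1, 0)` for $o_{2}$ and `(0, 1)` for $o_{5}$, matching the corresponding rows of Table 6.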

Table 7. Table of MXSV for Mealy FSM $P_{C}(U_{1})$.

$D_{r}$	Block 1	Block 2
$D_{1}$	1	1
$D_{2}$	1	1
$D_{3}$	1	1

Table 8. Experimental results (LUT counts for all benchmarks).

Benchmark	Auto	One-Hot	JEDI	$P_{E}$ FSM	$P_{C}$ FSM	Category
bbara	17	17	10	13	11	1
bbsse	33	37	24	26	21	1
bbtas	5	5	5	5	5	0
beecount	19	19	14	14	12	1
cse	40	66	36	34	31	1
dk14	16	27	10	12	11	1
dk15	15	16	12	9	7	1
dk16	15	34	12	11	9	1
dk17	5	12	5	5	5	0
dk27	3	5	4	4	4	0
dk512	10	10	9	9	9	0
donfile	31	31	24	21	19	1
ex1	70	74	53	44	37	2
ex2	9	9	8	9	8	1
ex3	9	9	9	9	9	0
ex4	15	13	12	11	10	1
ex5	9	9	9	10	10	0
ex6	24	36	22	22	20	1
ex7	4	5	4	6	4	1
keyb	43	61	40	38	36	1
kirkman	42	58	39	37	31	2
lion	2	5	2	4	4	0
lion9	6	11	5	6	5	0
mark1	23	23	20	20	18	1
mc	4	7	4	6	4	0
modulo12	7	7	7	7	7	0
opus	28	28	22	25	23	1
planet	131	131	88	87	72	2
planet1	131	131	88	87	72	2
pma	94	94	86	80	70	2
s1	65	99	61	61	51	2
s1488	124	131	108	96	83	2
s1494	126	132	110	94	82	2
s1a	49	81	43	47	38	2
s208	12	31	10	11	9	2
s27	6	18	6	8	7	1
s386	26	39	22	22	20	1
s420	10	31	9	10	9	4
s510	48	48	32	31	28	4
s8	9	9	9	12	9	1
s820	88	82	68	59	54	4
s832	80	79	62	61	50	4
sand	132	132	114	108	97	3
shiftreg	2	6	2	6	4	0
sse	33	37	30	29	27	1
styr	93	120	81	79	65	2
tma	45	39	39	36	31	2
Total	1808	2104	1489	1441	1248
Percentage, %	144.87	168.59	119.31	115.46	100.00
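
The percentage row of Table 8 normalizes each column total to the total for the $P_{C}$ FSM. The following minimal sketch reproduces it from the totals in the table; the names `TOTALS` and `relative_percent` are illustrative.

```python
# Column totals from Table 8 (LUT counts summed over all benchmarks).
TOTALS = {"Auto": 1808, "One-Hot": 2104, "JEDI": 1489, "P_E": 1441, "P_C": 1248}


def relative_percent(totals, baseline="P_C"):
    """Express every total as a percentage of the baseline total."""
    base = totals[baseline]
    return {name: round(100.0 * value / base, 2) for name, value in totals.items()}


if __name__ == "__main__":
    for name, pct in relative_percent(TOTALS).items():
        print(f"{name}: {pct:.2f}")
```

Subtracting 100 from each value gives the excess over the $P_{C}$ FSM, which ranges from 15.46 percent for the $P_{E}$ FSM to 68.59 percent for One-Hot.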

Table 9. Experimental results (LUT counts for benchmarks of category 0).

Benchmark	Auto	One-Hot	JEDI	$P_{E}$ FSM	$P_{C}$ FSM
bbtas	5	5	5	5	5
dk17	5	12	5	5	5
dk27	3	5	4	4	4
dk512	10	10	9	9	9
ex3	9	9	9	9	9
ex5	9	9	9	10	10
lion	2	5	2	4	4
lion9	6	11	5	6	5
mc	4	7	4	6	4
modulo12	7	7	7	7	7
shiftreg	2	6	2	6	4
Total	62	86	61	71	66
Percentage, %	93.94	130.30	92.42	107.58	100.00

Table 10. Experimental results (LUT counts for benchmarks of category 1).

Benchmark	Auto	One-Hot	JEDI	$P_{E}$ FSM	$P_{C}$ FSM
bbara	17	17	10	13	11
bbsse	33	37	24	26	21
beecount	19	19	14	14	12
cse	40	66	36	34	31
dk14	16	27	10	12	11
dk15	15	16	12	9	7
dk16	15	34	12	11	9
donfile	31	31	24	21	19
ex2	9	9	8	9	8
ex4	15	13	12	11	10
ex6	24	36	22	22	20
ex7	4	5	4	6	4
keyb	43	61	40	38	36
mark1	23	23	20	20	18
opus	28	28	22	25	23
s27	6	18	6	8	7
s386	26	39	22	22	20
s8	9	9	9	12	9
sse	33	37	30	29	27
Total	406	525	337	342	303
Percentage, %	133.99	173.27	111.22	112.87	100.00

Table 11. Experimental results (LUT counts for benchmarks of categories 2–4).

Benchmark	Auto	One-Hot	JEDI	$P_{E}$ FSM	$P_{C}$ FSM
ex1	70	74	53	44	37
kirkman	42	58	39	37	31
planet	131	131	88	87	72
planet1	131	131	88	87	72
pma	94	94	86	80	70
s1	65	99	61	61	51
s1488	124	131	108	96	83
s1494	126	132	110	94	82
s1a	49	81	43	47	38
s208	12	31	10	11	9
styr	93	120	81	79	65
tma	45	39	39	36	31
sand	132	132	114	108	97
s420	10	31	9	10	9
s510	48	48	32	31	28
s820	88	82	68	59	54
s832	80	79	62	61	50
Total	1340	1493	1091	1028	879
Percentage, %	152.45	169.85	124.12	116.95	100.00

Table 12. Experimental results (latency for all benchmarks, ns).

Benchmark	Auto	One-Hot	JEDI	$P_{E}$ FSM	$P_{C}$ FSM	Category
bbara	5.171	5.171	4.712	4.690	4.583	1
bbsse	6.367	5.913	5.484	4.731	4.607	1
bbtas	4.898	4.898	4.852	4.991	4.989	0
beecount	6.002	6.002	5.338	5.216	5.022	1
cse	6.829	6.111	5.614	5.323	5.197	1
dk14	5.218	5.792	5.159	5.029	4.911	1
dk15	5.194	5.395	5.132	4.976	4.854	1
dk16	5.892	5.721	5.073	4.906	4.786	1
dk17	5.018	5.988	5.015	5.003	5.024	0
dk27	4.854	4.953	4.898	5.085	5.095	0
dk512	5.095	5.095	5.006	4.804	4.800	0
donfile	5.434	5.435	4.910	4.803	4.670	1
ex1	6.625	7.155	5.654	5.353	5.235	2
ex2	5.036	5.036	4.997	4.960	4.774	1
ex3	5.132	5.132	5.108	4.972	4.968	0
ex4	5.526	5.627	5.186	5.068	4.972	1
ex5	5.548	5.548	5.520	5.494	5.490	0
ex6	5.897	6.105	5.663	5.309	5.071	1
ex7	4.999	4.979	4.985	4.789	4.714	1
keyb	6.392	6.970	5.937	5.406	5.128	1
kirkman	7.073	6.494	6.382	5.645	5.489	2
lion	4.940	4.902	4.942	4.996	5.000	0
lion9	4.871	5.399	4.845	4.828	4.825	0
mark1	6.158	6.158	5.676	5.334	5.188	1
mc	5.085	5.116	5.079	5.099	5.085	0
modulo12	4.831	4.831	4.828	4.805	4.801	0
opus	6.017	6.017	5.608	5.453	5.174	1
planet	7.535	7.535	6.364	5.830	5.657	2
planet1	7.535	7.535	6.364	5.830	5.657	2
pma	6.841	6.841	5.888	5.561	5.430	2
s1	6.830	7.361	6.363	5.832	5.679	2
s1488	7.220	7.579	6.362	5.737	5.586	2
s1494	6.694	6.861	6.085	5.812	5.642	2
s1a	6.520	5.669	5.911	5.419	5.286	2
s208	5.736	5.667	5.594	5.397	5.252	2
s27	5.032	5.222	5.022	4.889	4.807	1
s386	5.947	5.765	5.582	5.295	5.191	1
s420	8.781	8.587	8.529	7.501	7.035	4
s510	8.500	8.500	8.452	7.259	6.793	4
s8	5.555	5.588	5.518	5.164	4.950	1
s820	8.929	8.837	8.578	7.241	6.791	4
s832	8.642	8.832	8.419	7.031	6.523	4
sand	8.623	8.623	7.885	7.085	6.303	3
shiftreg	3.807	3.794	3.620	3.896	3.900	0
sse	6.367	5.913	5.726	5.393	5.194	1
styr	7.267	7.697	6.866	5.806	5.633	2
tma	6.102	6.766	6.092	5.695	5.550	2
Total	288.57	291.11	270.82	254.74	247.31
Percentage, %	116.68	117.71	109.51	103.01	100.00

Table 13. Experimental results (latency for benchmarks of category 0, ns).

Benchmark	Auto	One-Hot	JEDI	$P_{E}$ FSM	$P_{C}$ FSM
bbtas	4.898	4.898	4.852	4.991	4.989
dk17	5.018	5.988	5.015	5.003	5.024
dk27	4.854	4.953	4.898	5.085	5.095
dk512	5.095	5.095	5.006	4.804	4.800
ex3	5.132	5.132	5.108	4.972	4.968
ex5	5.548	5.548	5.520	5.494	5.490
lion	4.940	4.902	4.942	4.996	5.000
lion9	4.871	5.399	4.845	4.828	4.825
mc	5.085	5.116	5.079	5.099	5.085
modulo12	4.831	4.831	4.828	4.805	4.801
shiftreg	3.807	3.794	3.620	3.896	3.900
Total	54.08	55.66	53.71	53.97	53.98
Percentage, %	100.19	103.11	99.51	99.99	100.00

Table 14. Experimental results (latency for benchmarks of category 1, ns).

Benchmark	Auto	One-Hot	JEDI	$P_{E}$ FSM	$P_{C}$ FSM
bbara	5.171	5.171	4.712	4.690	4.583
bbsse	6.367	5.913	5.484	4.731	4.607
beecount	6.002	6.002	5.338	5.216	5.022
cse	6.829	6.111	5.614	5.323	5.197
dk14	5.218	5.792	5.159	5.029	4.911
dk15	5.194	5.395	5.132	4.976	4.854
dk16	5.892	5.721	5.073	4.906	4.786
donfile	5.434	5.435	4.910	4.803	4.670
ex2	5.036	5.036	4.997	4.960	4.774
ex4	5.526	5.627	5.186	5.068	4.972
ex6	5.897	6.105	5.663	5.309	5.071
ex7	4.999	4.979	4.985	4.789	4.714
keyb	6.392	6.970	5.937	5.406	5.128
mark1	6.158	6.158	5.676	5.334	5.188
opus	6.017	6.017	5.608	5.453	5.174
s27	5.032	5.222	5.022	4.889	4.807
s386	5.947	5.765	5.582	5.295	5.191
s8	5.555	5.588	5.518	5.164	4.950
sse	6.367	5.913	5.726	5.393	5.194
Total	109.03	108.92	101.32	96.74	93.79
Percentage, %	116.25	116.13	108.03	103.14	100.00

Table 15. Experimental results (latency for benchmarks of categories 2–4, ns).

Benchmark	Auto	One-Hot	JEDI	$P_{E}$ FSM	$P_{C}$ FSM
ex1	6.625	7.155	5.654	5.353	5.235
kirkman	7.073	6.494	6.382	5.645	5.489
planet	7.535	7.535	6.364	5.830	5.657
planet1	7.535	7.535	6.364	5.830	5.657
pma	6.841	6.841	5.888	5.561	5.430
s1	6.830	7.361	6.363	5.832	5.679
s1488	7.220	7.579	6.362	5.737	5.586
s1494	6.694	6.861	6.085	5.812	5.642
s1a	6.520	5.669	5.911	5.419	5.286
s208	5.736	5.667	5.594	5.397	5.252
styr	7.267	7.697	6.866	5.806	5.633
tma	6.102	6.766	6.092	5.695	5.550
sand	8.623	8.623	7.885	7.085	6.303
s420	8.781	8.587	8.529	7.501	7.035
s510	8.500	8.500	8.452	7.259	6.793
s820	8.929	8.837	8.578	7.241	6.791
s832	8.642	8.832	8.419	7.031	6.523
Total	125.45	126.54	115.79	104.03	99.54
Percentage, %	126.03	127.12	116.32	104.52	100.00

Table 16. Experimental results (the maximum operating frequency for all benchmarks, MHz).

Benchmark	Auto	One-Hot	JEDI	$P_{E}$ FSM	$P_{C}$ FSM	Category
bbara	193.39	193.39	212.21	252.44	262.22	1
bbsse	157.06	169.12	182.34	238.38	248.07	1
bbtas	204.16	204.16	206.12	200.38	200.45	0
beecount	166.61	166.61	187.32	241.72	249.13	1
cse	146.43	163.64	178.12	247.86	252.43	1
dk14	191.64	172.65	193.85	223.84	233.63	1
dk15	192.53	185.36	194.87	226.97	236.02	1
dk16	169.72	174.79	197.13	253.82	264.93	1
dk17	199.28	167.00	199.39	199.87	199.03	0
dk27	206.02	201.90	204.18	196.65	196.28	0
dk512	196.27	196.27	199.75	208.17	208.32	0
donfile	184.03	184.00	203.65	248.19	258.11	1
ex1	150.94	139.76	176.87	276.82	291.01	2
ex2	198.57	198.57	200.14	241.61	251.45	1
ex3	194.86	194.86	195.76	201.12	201.29	0
ex4	180.96	177.71	192.83	237.31	247.14	1
ex5	180.25	180.25	181.16	182.01	182.14	0
ex6	169.57	163.80	176.59	238.35	248.21	1
ex7	200.04	200.84	200.60	240.83	250.14	1
keyb	156.45	143.47	168.43	224.98	235.01	1
kirkman	141.38	154.00	156.68	227.15	242.19	2
lion	202.43	204.00	202.35	200.18	200.02	0
lion9	205.30	185.22	206.38	207.13	207.24	0
mark1	162.39	162.39	176.18	227.47	237.76	1
mc	196.66	195.47	196.87	196.12	196.65	0
modulo12	207.00	207.00	207.13	208.12	208.31	0
opus	166.20	166.20	178.32	213.40	223.26	1
planet	132.71	132.71	187.14	251.54	266.78	2
planet1	132.71	132.71	187.14	251.54	266.78	2
pma	146.18	146.18	169.83	239.83	254.17	2
s1	146.41	135.85	157.16	221.47	236.09	2
s1488	138.50	131.94	157.18	244.31	259.03	2
s1494	149.39	145.75	164.34	242.05	257.24	2
s1a	153.37	176.40	169.17	214.53	229.17	2
s208	174.34	176.46	178.76	255.28	270.42	2
s27	198.73	191.50	199.13	238.53	248.04	1
s386	168.15	173.46	179.15	218.87	228.63	1
s420	173.88	176.46	177.25	263.32	283.14	4
s510	177.65	177.65	198.32	297.76	317.22	4
s8	180.02	178.95	181.23	213.65	223.04	1
s820	152.00	153.16	176.58	268.10	287.26	4
s832	145.71	153.23	173.78	274.22	293.31	4
sand	115.97	115.97	126.82	221.14	239.65	3
shiftreg	262.67	263.57	276.26	256.69	256.41	0
sse	157.06	169.12	174.63	205.41	215.53	1
styr	137.61	129.92	145.64	232.24	247.53	2
tma	163.88	147.80	164.14	235.59	250.18	2
Total	8127.08	8061.22	8718.87	10,906.96	11,360.06
Percentage, %	71.54	70.96	76.75	96.01	100.00

Table 17. Experimental results (the maximum operating frequency for benchmarks of category 0, MHz).

Benchmark	Auto	One-Hot	JEDI	$P_{E}$ FSM	$P_{C}$ FSM
bbtas	204.16	204.16	206.12	200.38	200.45
dk17	199.28	167.00	199.39	199.87	199.03
dk27	206.02	201.90	204.18	196.65	196.28
dk512	196.27	196.27	199.75	208.17	208.32
ex3	194.86	194.86	195.76	201.12	201.29
ex5	180.25	180.25	181.16	182.01	182.14
lion	202.43	204.00	202.35	200.18	200.02
lion9	205.30	185.22	206.38	207.13	207.24
mc	196.66	195.47	196.87	196.12	196.65
modulo12	207.00	207.00	207.13	208.12	208.31
shiftreg	262.67	263.57	276.26	256.69	256.41
Total	2254.90	2199.70	2275.35	2256.44	2256.14
Percentage, %	99.95	97.50	100.85	100.01	100.00

Table 18. Experimental results (the maximum operating frequency for benchmarks of category 1, MHz).

Benchmark	Auto	One-Hot	JEDI	$P_{E}$ FSM	$P_{C}$ FSM
bbara	193.39	193.39	212.21	252.44	262.22
bbsse	157.06	169.12	182.34	238.38	248.07
beecount	166.61	166.61	187.32	241.72	249.13
cse	146.43	163.64	178.12	247.86	252.43
dk14	191.64	172.65	193.85	223.84	233.63
dk15	192.53	185.36	194.87	226.97	236.02
dk16	169.72	174.79	197.13	253.82	264.93
donfile	184.03	184.00	203.65	248.19	258.11
ex2	198.57	198.57	200.14	241.61	251.45
ex4	180.96	177.71	192.83	237.31	247.14
ex6	169.57	163.80	176.59	238.35	248.21
ex7	200.04	200.84	200.60	240.83	250.14
keyb	156.45	143.47	168.43	224.98	235.01
mark1	162.39	162.39	176.18	227.47	237.76
opus	166.20	166.20	178.32	213.40	223.26
s27	198.73	191.50	199.13	238.53	248.04
s386	168.15	173.46	179.15	218.87	228.63
s8	180.02	178.95	181.23	213.65	223.04
sse	157.06	169.12	174.63	205.41	215.53
Total	3339.55	3335.57	3576.72	4433.63	4612.75
Percentage, %	72.40	72.31	77.54	96.12	100.00

Table 19. Experimental results (the maximum operating frequency for benchmarks of categories 2–4, MHz).

Benchmark	Auto	One-Hot	JEDI	$P_{E}$ FSM	$P_{C}$ FSM
ex1	150.94	139.76	176.87	276.82	291.01
kirkman	141.38	154.00	156.68	227.15	242.19
planet	132.71	132.71	187.14	251.54	266.78
planet1	132.71	132.71	187.14	251.54	266.78
pma	146.18	146.18	169.83	239.83	254.17
s1	146.41	135.85	157.16	221.47	236.09
s1488	138.50	131.94	157.18	244.31	259.03
s1494	149.39	145.75	164.34	242.05	257.24
s1a	153.37	176.40	169.17	214.53	229.17
s208	174.34	176.46	178.76	255.28	270.42
styr	137.61	129.92	145.64	232.24	247.53
tma	163.88	147.80	164.14	235.59	250.18
sand	115.97	115.97	126.82	221.14	239.65
s420	173.88	176.46	177.25	263.32	283.14
s510	177.65	177.65	198.32	297.76	317.22
s820	152.00	153.16	176.58	268.10	287.26
s832	145.71	153.23	173.78	274.22	293.31
Total	2532.63	2525.95	2866.80	4216.89	4491.17
Percentage, %	56.39	56.24	63.83	93.89	100.00

Table 20. Experimental results (the area-time products for all benchmarks, LUTs × ns).

Benchmark	Auto	One-Hot	JEDI	$P_{E}$ FSM	$P_{C}$ FSM	Category
bbara	87.91	87.91	47.12	60.98	50.41	1
bbsse	210.11	218.78	131.62	123.00	96.74	1
bbtas	24.49	24.49	24.26	24.95	24.94	0
beecount	114.04	114.04	74.74	73.02	60.26	1
cse	273.17	403.32	202.11	180.99	161.10	1
dk14	83.49	156.39	51.59	60.35	54.02	1
dk15	77.91	86.32	61.58	44.78	33.98	1
dk16	88.38	194.52	60.87	53.97	43.08	1
dk17	25.09	71.86	25.08	25.02	25.12	0
dk27	14.56	24.76	19.59	20.34	20.38	0
dk512	50.95	50.95	45.06	43.23	43.20	0
donfile	168.45	168.48	117.85	100.87	88.74	1
ex1	463.76	529.48	299.66	235.52	193.71	2
ex2	45.32	45.32	39.97	44.64	38.20	1
ex3	46.19	46.19	45.97	44.75	44.71	0
ex4	82.89	73.15	62.23	55.75	49.72	1
ex5	49.93	49.93	49.68	54.94	54.90	0
ex6	141.53	219.78	124.58	116.80	101.41	1
ex7	20.00	24.90	19.94	28.73	18.86	1
keyb	274.85	425.18	237.49	205.43	184.61	1
kirkman	297.07	376.62	248.91	208.86	170.15	2
lion	9.88	24.51	9.88	19.98	20.00	0
lion9	29.23	59.39	24.23	28.97	24.13	0
mark1	141.63	141.63	113.52	106.68	93.38	1
mc	20.34	35.81	20.32	30.59	20.34	0
modulo12	33.82	33.82	33.80	33.63	33.60	0
opus	168.47	168.47	123.37	136.31	119.01	1
planet	987.11	987.11	560.01	507.17	407.29	2
planet1	987.11	987.11	560.01	507.17	407.29	2
pma	643.04	643.04	506.39	444.86	380.08	2
s1	443.96	728.74	388.14	355.75	289.62	2
s1488	895.31	992.88	687.11	550.74	463.61	2
s1494	843.43	905.66	669.34	546.35	462.65	2
s1a	319.49	459.18	254.18	254.70	200.88	2
s208	68.83	175.68	55.94	59.37	47.26	2
s27	30.19	93.99	30.13	39.11	33.65	1
s386	154.62	224.84	122.80	116.48	103.83	1
s420	87.81	266.19	76.76	75.01	63.32	4
s510	407.99	407.99	270.45	225.03	190.19	4
s8	49.99	50.29	49.66	61.97	44.55	1
s820	785.71	724.64	583.29	427.23	366.70	4
s832	691.38	697.69	521.97	428.91	326.14	4
sand	1138.23	1138.23	898.91	765.20	611.41	3
shiftreg	7.61	22.76	7.24	23.37	15.60	0
sse	210.11	218.78	171.79	156.41	140.24	1
styr	675.82	923.65	556.17	458.66	366.14	2
tma	274.59	263.87	237.60	205.02	172.05	2
Total	12,745.82	14,768.32	9522.93	8371.63	6961.17
Percentage, %	183.10	212.15	136.80	120.26	100.00
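
The area-time products in Table 20 appear to be the LUT counts of Table 8 multiplied by the latencies of Table 12. The sketch below checks this for the benchmark bbara. It is an illustration under that assumption, with hypothetical names, and it allows a small tolerance because the published latencies are rounded to three decimals.

```python
# Values for the benchmark "bbara", copied from Tables 8, 12, and 20.
LUTS = {"Auto": 17, "One-Hot": 17, "JEDI": 10, "P_E": 13, "P_C": 11}
LATENCY_NS = {"Auto": 5.171, "One-Hot": 5.171, "JEDI": 4.712,
              "P_E": 4.690, "P_C": 4.583}
REPORTED = {"Auto": 87.91, "One-Hot": 87.91, "JEDI": 47.12,
            "P_E": 60.98, "P_C": 50.41}


def area_time(luts, latency_ns):
    """Area-time product per method: LUT count times latency (LUTs x ns)."""
    return {method: luts[method] * latency_ns[method] for method in luts}


def max_deviation(computed, reported):
    """Largest absolute difference between computed and reported products."""
    return max(abs(computed[m] - reported[m]) for m in reported)


if __name__ == "__main__":
    products = area_time(LUTS, LATENCY_NS)
    print("max deviation:", round(max_deviation(products, REPORTED), 3))
```

The same check can be repeated for any other row by substituting its values from the three tables.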

Table 21. Experimental results (the area-time products for benchmarks of category 0, LUTs × ns).

Benchmark	Auto	One-Hot	JEDI	$P_{E}$ FSM	$P_{C}$ FSM
bbtas	24.49	24.49	24.26	24.95	24.94
dk17	25.09	71.86	25.08	25.02	25.12
dk27	14.56	24.76	19.59	20.34	20.38
dk512	50.95	50.95	45.06	43.23	43.20
ex3	46.19	46.19	45.97	44.75	44.71
ex5	49.93	49.93	49.68	54.94	54.90
lion	9.88	24.51	9.88	19.98	20.00
lion9	29.23	59.39	24.23	28.97	24.13
mc	20.34	35.81	20.32	30.59	20.34
modulo12	33.82	33.82	33.80	33.63	33.60
shiftreg	7.61	22.76	7.24	23.37	15.60
Total	312.09	444.47	305.10	349.79	326.93
Percentage, %	95.46	135.95	93.32	106.99	100.00

Table 22. Experimental results (the area-time products for benchmarks of category 1, LUTs × ns).

Benchmark	Auto	One-Hot	JEDI	$P_{E}$ FSM	$P_{C}$ FSM
bbara	87.91	87.91	47.12	60.98	50.41
bbsse	210.11	218.78	131.62	123.00	96.74
beecount	114.04	114.04	74.74	73.02	60.26
cse	273.17	403.32	202.11	180.99	161.10
dk14	83.49	156.39	51.59	60.35	54.02
dk15	77.91	86.32	61.58	44.78	33.98
dk16	88.38	194.52	60.87	53.97	43.08
donfile	168.45	168.48	117.85	100.87	88.74
ex2	45.32	45.32	39.97	44.64	38.20
ex4	82.89	73.15	62.23	55.75	49.72
ex6	141.53	219.78	124.58	116.80	101.41
ex7	20.00	24.90	19.94	28.73	18.86
keyb	274.85	425.18	237.49	205.43	184.61
mark1	141.63	141.63	113.52	106.68	93.38
opus	168.47	168.47	123.37	136.31	119.01
s27	30.19	93.99	30.13	39.11	33.65
s386	154.62	224.84	122.80	116.48	103.83
s8	49.99	50.29	49.66	61.97	44.55
sse	210.11	218.78	171.79	156.41	140.24
Total	2423.08	3116.09	1842.98	1766.28	1515.76
Percentage, %	159.86	205.58	121.59	116.53	100.00

Table 23. Experimental results (the area-time products for benchmarks of categories 2–4, LUTs × ns).

Benchmark	Auto	One-Hot	JEDI	$P_{E}$ FSM	$P_{C}$ FSM
ex1	463.76	529.48	299.66	235.52	193.71
kirkman	297.07	376.62	248.91	208.86	170.15
planet	987.11	987.11	560.01	507.17	407.29
planet1	987.11	987.11	560.01	507.17	407.29
pma	643.04	643.04	506.39	444.86	380.08
s1	443.96	728.74	388.14	355.75	289.62
s1488	895.31	992.88	687.11	550.74	463.61
s1494	843.43	905.66	669.34	546.35	462.65
s1a	319.49	459.18	254.18	254.70	200.88
s208	68.83	175.68	55.94	59.37	47.26
styr	675.82	923.65	556.17	458.66	366.14
tma	274.59	263.87	237.60	205.02	172.05
sand	1138.23	1138.23	898.91	765.20	611.41
s420	87.81	266.19	76.76	75.01	63.32
s510	407.99	407.99	270.45	225.03	190.19
s820	785.71	724.64	583.29	427.23	366.70
s832	691.38	697.69	521.97	428.91	326.14
Total	10,010.66	11,207.77	7374.85	6255.56	5118.48
Percentage, %	195.58	218.97	144.08	122.22	100.00

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Cite as: Barkalov, A.; Titarenko, L.; Krzywicki, K. Improving Characteristics of LUT-Based Sequential Blocks for Cyber-Physical Systems. Energies 2022, 15, 2636. https://doi.org/10.3390/en15072636
