Review

Hardware Model Checking Algorithms and Techniques

1
DAUIN, Department of Control and Computer Engineering, Politecnico di Torino, 10129 Turin, Italy
2
CNIT, National Inter-University Consortium for Telecommunications, 10129 Turin, Italy
3
DET, Department of Electronics and Telecommunications, Politecnico di Torino, 10129 Turin, Italy
*
Authors to whom correspondence should be addressed.
Algorithms 2024, 17(6), 253; https://doi.org/10.3390/a17060253
Submission received: 8 May 2024 / Revised: 5 June 2024 / Accepted: 7 June 2024 / Published: 9 June 2024
(This article belongs to the Special Issue Surveys in Algorithm Analysis and Complexity Theory, Part II)

Abstract: Digital systems are nowadays ubiquitous and often comprise an extremely high level of complexity. Guaranteeing the correct behavior of such systems has become an ever more pressing need for manufacturers. The correctness of digital systems can be addressed by resorting to formal verification techniques, such as model checking. Currently, it is usually impossible to determine a priori the best algorithm to use for a given verification task and, thus, portfolio approaches have become the de facto standard in model checking verification suites. This paper describes the most relevant algorithms and techniques at the foundations of bit-level SAT-based model checking.

1. Introduction

Ranging from commodity devices to business- or safety-critical environments, digital systems have proliferated into most aspects of our daily lives. Moreover, thanks to Moore’s law and the advent of high-level synthesis, contemporary hardware designs often comprise an extremely high level of complexity. As a direct consequence, design verification has become one of the most relevant aspects of the design and production flow. In such a scenario, guaranteeing the correct behavior of digital systems, with respect to their specification, is becoming an ever more pressing need for manufacturers [1]. The increasing complexity of hardware designs, together with the demand for short development cycles imposed by ever-evolving markets, makes the delivery of defect-free systems an extremely challenging task. This is particularly true when considering sequential designs, in which the behavior of the system depends not only on the inputs applied at a given time but also on the state the system finds itself in. Modern systems are characterized by the presence of several million elements, be they transistors, logic gates, memory elements or other components. In such a scenario, design verification can represent one of the most time-consuming, yet necessary, activities to be performed, with reports placing it between 40% and 70% of the entire development effort [2]. Oftentimes, the usual testing techniques are not enough to prove the correctness of a whole system in a timely fashion, and further problems arise when taking into account the issue of transferring theoretical results to industrial practice [3].
Traditionally, hardware designs are verified through simulation. In simulation, a model of the system is exercised with a set of stimuli while the resulting behavior is checked against the expected behavior. Simulation is scalable, being applicable to designs of virtually any size, but it is also costly and incomplete. In fact, it is often infeasible to completely cover the behaviors of a realistic design through simulation. The incompleteness of simulation is often aggravated by the increasing complexity of hardware designs, which not only increases the number of behaviors to check but may also introduce corner cases that are difficult to cover. Starting from the late 1980s, formal hardware verification has become an attractive approach to overcome the coverage limitations of simulation. Formal verification is the use of the formal methods of mathematics and logic to prove (or disprove) the correctness of a design with respect to a given formal specification of its expected behavior. During the last two decades, research in the field of formal verification has led to the development of many techniques. Such techniques can be subdivided into three main approaches to formal verification: theorem proving [4,5], equivalence checking [6] and model checking [7,8].
Model checking is the most widely used formal verification approach for sequential hardware designs. In model checking, a model of the design and a formalized representation of its specification are formulated using some mathematically precise and unambiguous language. Then, an algorithmic procedure automatically checks whether or not the model meets the formal specification. Such a procedure exhaustively traverses the modeled behaviors of the system and either confirms that the system behaves correctly or produces a trace that demonstrates a violating behavior (called a counterexample). The main advantages of model checking are its fully automated nature and its ability to produce counterexamples. Usually, the design to be verified is modeled as a transition system, comprising states and transitions between states, and the specification is formalized by writing temporal logic properties. Invariant verification is a particular model checking task in which the system specification is described as an invariant property that must hold true in every reachable state of the model.
From an algorithmic standpoint, in the beginning, model checking was based on explicit state enumeration; thus, its practical application was severely hindered by mere state space representation problems. The introduction of BDD-based representation [9] was the turning point for model checking applicability to industrial-level scenarios. Despite the huge impact of such an approach on model checking, BDD-based representations still suffered from unpredictable memory requirements and, thus, additional paths of research opened up to overcome such limits, such as the introduction of SAT-based approaches.
Today, model checking can exploit explicit state enumeration or symbolic state enumeration, via BDD- or SAT-based representations. Furthermore, solutions based on automatic test pattern generation are also viable. Each of these, in turn, has led to a number of distinct model checking algorithms and techniques.
To date, there is no clear winner among all the available techniques. The most reliable approach is to run several of these algorithms in parallel, exploiting multi-core environments, and to halt verification as soon as one of the engines has found a solution to the verification problem at hand [10,11].
Complementary to the verification phase itself, there is the additional problem of certifying checkers’ results, in order to better support their adoption in complete design cycle scenarios [12,13].
A recent trend in hardware model checking is to exploit the ability of SMT solvers to perform word-level reasoning based on bit-vectors. This trend is demonstrated by the introduction of word-level tracks in recent editions of the Hardware Model Checking Competition [14]. However, word-level model checking shines on problems in which memory is modeled with arrays; bit-level solvers are currently the de facto standard in industry and still prove superior on many word-level designs after bit-blasting [15].
This paper (part of the contents of this paper are a redacted version of the introductory chapters of the PhD theses of two of its authors, Marco Palena [16] and Paolo Pasini [17], which have not been published before beyond diffusion within the Politecnico di Torino and the respective thesis defense committees) describes the most relevant algorithms and techniques explicitly targeting bit-level SAT-based hardware model checking, in a formal and theoretically sound fashion. A description of complementary topics and a broader overview of the vast field of model checking and satisfiability is provided in [18,19]. A description of topics concerning model checking, with additional examples and applications, is provided in [8]. Complementary to previous works, a specific focus on the design and verification of cyber-physical systems is provided in [20].

Outline

The rest of the paper is organized as follows:
  • In Section 2, we provide some preliminary notions with the purpose of presenting the adopted notation and terminology and making this work self-contained. In particular, we first provide some general concepts from the fields of logic and computer science and then discuss in more depth the concepts necessary to contextualize model checking in detail.
  • In Section 3, Section 4, Section 5, Section 6 and Section 7, we present a complete description of the most relevant bit-level SAT-based hardware model checking algorithms.
  • In Section 8, we present a practical evaluation of model checking algorithms over industrial-level benchmarks.
  • In Section 9, we summarize the current work and provide some insights concerning future developments in the field.

2. Background and Notation

In this section, we introduce some notions needed to understand the rest of the paper. In particular, we provide some basic concepts and definitions from the fields of logic and computer science relevant to the topics at hand.

2.1. Mathematical Logic

Model checking aims at verifying whether a finite-state model of a system meets a given specification. In order to solve such a problem algorithmically, both the model of the system and its specification need to be expressed in some precise mathematical language. The fields of mathematical logic and formal language theory provide us with the tools we need to reason about system states in a symbolic and rigorous manner. Mathematical logic is the systematic study of reasoning. By reasoning, we mean the abstract capability to provide arguments supporting the fact that a given statement, called the conclusion, follows from some other statements, called the premises. The steps through which conclusions are reached from premises are termed inferences. Mathematical logic relies on formal languages in order to provide a symbolic representation of abstract concepts, like statements about the model of the system we are interested in reasoning about, that can then be manipulated through formal, mathematical processes. We will rely on common concepts stemming from logic, and we will specifically introduce and define just the ones that are most relevant to this work. We refer the interested reader to [21,22] for a comprehensive exposition of those subjects.

2.2. Formal Languages

A formal language is a mathematical object that describes the syntax of a set of finite strings of symbols. A formal language is a purely syntactic tool and can be completely defined without making any reference to an interpretation for its symbols. In the following, we provide some basic notions from the field of formal language theory (we refer the interested reader to [23] for a more in-depth discussion of the subject).
Definition 1
(Alphabet and Words). An alphabet Σ is any non-empty finite set of symbols. A word over Σ is any finite or infinite string of symbols in Σ. The set of all finite words over Σ is denoted by Σ* (called the Kleene closure of Σ).
Definition 2
(Formal Languages). Given an alphabet Σ, a formal language L over Σ is any set of finite words over Σ, i.e., L ⊆ Σ*.
A formal language L is often described by a set of rules that determine how its words are formed (rules of formation). Many formalisms can be used to describe the rules of formation of a formal language, such as formal grammars, regular expressions and automata [23].
Definition 3
(Well-formed Formulas). Given a formal language L over Σ, a word w ∈ Σ* is said to be a well-formed formula of L, often abbreviated as wff, iff w ∈ L.
Note that a formal language can be identified with the set of its well-formed formulas. Throughout the rest of this paper, we simply use the term formula to refer to a well-formed formula of a given formal language.
Formal languages are entirely syntactic in nature but may be given interpretations that provide meaning to their symbols and sentences. Given a formal language L , from the semantic standpoint, the symbols of L may be subdivided into logical symbols and non-logical symbols.
Definition 4
(Logical Symbols). A symbol of a formal language is said to be a logical symbol if it always has the same meaning, independently from the language interpretation.
Examples of logical symbols in the context of propositional and first-order logic are logical connectives (¬, ∧, ∨, etc.), quantifiers (∀, ∃, etc.) and the equality predicate (=).
Definition 5
(Non-logical Symbols and Signature). A symbol of a formal language is said to be a non-logical symbol if it has a meaning only when one is assigned to it by means of a language interpretation. Given a formal language L , its signature σ is defined as the set of its non-logical symbols.
Note that non-logical symbols may have different meanings under different language interpretations. Examples of non-logical symbols in the context of propositional and first-order logic are variables and symbols for constants, function and relation.
A fundamental problem in formal language theory is that of determining the membership of a given word in a formal language.
Definition 6
(Membership Problem). Given a formal language L and a word w ∈ Σ*, the membership problem is the problem of determining whether w ∈ L, i.e., whether w is a wff of L.
As we show later in Section 2.6, each problem in the important computational class of decision problems can be represented as a membership problem for a formal language. SAT-based model checking algorithms encode state reachability problems, defined on a model of the system, as instances of a particular decision problem known as Boolean Satisfiability (SAT).
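As a toy illustration (our own example, not taken from the paper), consider the formal language of balanced parenthesis strings over the alphabet Σ = {(, )}; deciding whether a given word is a wff of this language is an instance of the membership problem:

```python
def is_member(word: str) -> bool:
    """Decide membership in the language of balanced parentheses
    over the alphabet {'(', ')'}."""
    depth = 0
    for symbol in word:
        if symbol == '(':
            depth += 1
        elif symbol == ')':
            depth -= 1
            if depth < 0:      # a ')' with no matching '('
                return False
        else:
            return False       # symbol outside the alphabet
    return depth == 0          # every '(' must eventually be closed

# "(()())" is a wff of this language; "(()" is not
```

SAT-based model checkers play the same game at a much larger scale: a reachability question about the model is answered by deciding membership of an encoded formula in the language of satisfiable propositional formulas.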

2.3. Logical Systems

A fundamental concept in mathematical logic is that of logical consequence, describing the relationship between statements that hold true when one statement, called the conclusion, logically follows from one or more other statements, called the premises. Historically, the concept of logical consequence has been studied from two different but interrelated perspectives: syntactic and semantic.
Proof theory is the branch of logic that studies logical consequence from the syntactic standpoint. The formalism of formal systems is used to model deductions (a deduction is a kind of inference that is always valid, i.e., its conclusion is always a logical consequence of its premises) as syntactical transformations (inference rules) applied to strings of symbols (formulas). In this context, the concept of logical consequence is related to the ability to derive a proof, that is, a sequence of syntactic transformations that allow us to derive the conclusion from the premises.
Model theory is the branch of logic that studies logical consequence from the semantic standpoint. Semantics is about associating a meaning with the well-formed formulas of some formal language. Formulas are given a meaning by relating their constituents to elements of the domain of discourse by means of an interpretation. Given an interpretation of the symbols of a formal language, each formula can be related to the concept of truth based on whether or not its meaning holds true in the domain of discourse, i.e., the world we are interested in reasoning about. The set of all possible interpretations of a formal language defines a formal semantics for that language. In this context, the concept of logical consequence is related to whether or not a conclusion would be evaluated as true under all the possible interpretations that evaluate the premises as true.
Even if the syntactic and semantic notions of logical consequence have different definitions, the concepts from the two branches of logic are systematically related. We give the following definitions.
Definition 7
(Logical System). Given a formal language L, a logical system (or logic) for L is defined as a pair L =def ⟨S, V⟩, consisting of a formal system S and a formal semantics V for L.
Definition 8
(Soundness and Completeness). Given a formal language L , a logical system L = S , V for L is sound iff every formula that can be derived from S is valid according to V . A logical system L is complete iff every formula that is valid according to V can be derived from S .
Therefore, in a sound and complete logical system the concepts of semantic and syntactic logical consequence coincide. The same is true for the semantic and syntactic versions of validity, consistency, necessity and impossibility.

2.4. Propositional Logic

Propositional logic (or Boolean logic) is a sound and complete logical system concerned with the study of propositions. A proposition is an abstract entity bearing a truth value that is expressed as a statement on a domain of discourse. Boolean logic considers (complex) propositions to be composed of atomic propositions connected by means of logical operators. Atomic propositions are propositions whose structure cannot be further decomposed in terms of logical operators. Later, in Section 2.7, we will use Boolean logic to provide a symbolic representation of hardware-derived state transition systems at the bit level.

2.4.1. Syntax of Propositional Logic

From the syntactic standpoint, the language of propositional logic consists of a set of non-logical symbols called propositional variables and a set of logical symbols, comprising logical connectives (¬, ∧ and ∨) and logical constants (⊤ and ⊥).
Definition 9
(Syntax of Propositional Formulas). Given a set of propositional variables AP, the syntax of propositional formulas is defined inductively as follows:
  • ⊤ and ⊥ are propositional formulas.
  • Each a ∈ AP is a propositional formula.
  • Given a propositional formula F, ¬F is a propositional formula.
  • Given propositional formulas F1 and F2, F1 ∧ F2 is a propositional formula.
  • Given propositional formulas F1 and F2, F1 ∨ F2 is a propositional formula.
Note that ⊥ and ∨ can be omitted from Definition 9 without losing expressiveness as they can be easily derived from ⊤, ¬ and ∧. Parentheses “(” and “)” can be used to improve the readability of formulas. We also provide the following definitions about propositional formulas.
Definition 10
(Atomic Formulas). An atomic formula is a propositional formula consisting of a propositional variable only.
Definition 11
(Support). Given a propositional formula F, we call the support of F, denoted by Vars(F), the set of propositional variables occurring in F. Accordingly, given a set of propositional formulas Φ, we call the support of Φ, denoted by Vars(Φ), the set of propositional variables occurring in at least one formula of Φ:
Vars(Φ) = ⋃_{Fi ∈ Φ} Vars(Fi)

2.4.2. Semantics of Propositional Logic

Intuitively, from the semantic point of view, propositional variables represent atomic propositions, logical constants represent special propositions that are either always true (⊤) or always false (⊥) and logical connectives represent standard Boolean functions such as logical negation (¬), logical conjunction (∧) and logical disjunction (∨).
Each propositional variable is interpreted as an atomic proposition about the domain of discourse which can either be true or false. Such an interpretation is described by means of a truth assignment defined as follows.
Definition 12
(Truth Assignments). Given a set of propositional variables A ⊆ AP, a truth assignment μ : A → B is a function mapping each propositional variable a ∈ A to a truth value in B.
Note that, from the model-theoretic perspective, truth assignments can be seen as equivalent representations of structures along with a definition of truth. Since the only non-logical symbols of propositional logic are propositional variables, a structure for the propositional language is an interpretation of those variables as atomic propositions on the domain of discourse. A definition of truth then maps each atomic proposition to a truth value. The composition of the two is equivalent to a truth assignment.
Definition 13
(Complete or Partial Truth Assignments). Given a propositional formula F, a complete truth assignment for F is a function μ : Vars(F) → B assigning a truth value to each variable of F. A partial truth assignment for F is a function μ : A → B, with A ⊆ Vars(F), assigning a truth value to a subset A of the variables of F.
Each logical connective is given a truth-functional interpretation, mapping it to a specific Boolean function. Logical connectives are therefore thought of as having a certain arity, according to the Boolean function they represent: logical conjunction and disjunction are binary connectives, whereas negation is a unary connective. Under its truth-functional interpretation, each logical connective of arity n corresponds to a Boolean function f : Bⁿ → B mapping the truth values of its operands to the truth value of its result. The truth-functional interpretation of logical connectives is usually described by means of truth tables. Additional logical connectives, representing other standard Boolean functions such as implication (→), bi-implication (↔) and exclusive disjunction (⊕), may be included in the language of propositional logic according to their usual semantics and the following syntactic definitions:
F1 → F2 =def ¬F1 ∨ F2
F1 ↔ F2 =def (¬F1 ∧ ¬F2) ∨ (F1 ∧ F2)
F1 ⊕ F2 =def (¬F1 ∧ F2) ∨ (F1 ∧ ¬F2)
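The truth-functional definitions of →, ↔ and ⊕ can be verified exhaustively over all truth assignments. The sketch below is illustrative Python (encoding ⊤ as True and ⊥ as False is our own convention):

```python
from itertools import product

# Derived connectives, written exactly as in the defining formulas.
def implies(f1, f2):  # F1 → F2  =def  ¬F1 ∨ F2
    return (not f1) or f2

def iff(f1, f2):      # F1 ↔ F2  =def  (¬F1 ∧ ¬F2) ∨ (F1 ∧ F2)
    return ((not f1) and (not f2)) or (f1 and f2)

def xor(f1, f2):      # F1 ⊕ F2  =def  (¬F1 ∧ F2) ∨ (F1 ∧ ¬F2)
    return ((not f1) and f2) or (f1 and (not f2))

# Exhaustive truth-table check against the expected semantics:
# → agrees with the Boolean order, ↔ with equality, ⊕ with inequality.
for f1, f2 in product([False, True], repeat=2):
    assert implies(f1, f2) == (f1 <= f2)
    assert iff(f1, f2) == (f1 == f2)
    assert xor(f1, f2) == (f1 != f2)
```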
As a way of reducing the number of necessary parentheses, the following precedence order between logical operators is usually applied (from highest precedence to lowest precedence): ¬, ∧, ∨, →, ↔ and ⊕.
Each propositional formula, therefore, unambiguously represents a given Boolean function, which can be determined by combining the functions associated with its logical connectives according to the precedence order.
Definition 14
(Semantics of Propositional Logic). Given a set of propositional variables AP, a propositional formula F over Vars(F) and a truth assignment μ over A ⊆ AP with Vars(F) ⊆ A, we give the following inductive definition of when μ satisfies F (or, equivalently, F evaluates to ⊤ under μ), denoted as μ ⊧ F:
  • μ ⊧ ⊤ and μ ⊭ ⊥.
  • For each a ∈ Vars(F), μ ⊧ a iff μ(a) = ⊤.
  • μ ⊧ ¬F iff μ ⊭ F.
  • μ ⊧ F1 ∧ F2 iff μ ⊧ F1 and μ ⊧ F2.
  • μ ⊧ F1 ∨ F2 iff μ ⊧ F1 or μ ⊧ F2.
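Definition 14 translates directly into a recursive evaluator. The sketch below is illustrative Python (the tuple encoding of formulas is our own assumption, not the paper's); it returns whether μ ⊧ F:

```python
# Formulas are nested tuples:
#   ('top',) | ('bot',) | ('var', name) | ('not', F) |
#   ('and', F1, F2) | ('or', F1, F2)
def satisfies(mu, f):
    """Return True iff the truth assignment `mu` (dict: name -> bool)
    satisfies the formula `f`, following the inductive definition."""
    op = f[0]
    if op == 'top':
        return True
    if op == 'bot':
        return False
    if op == 'var':
        return mu[f[1]]                 # mu ⊧ a iff mu(a) = ⊤
    if op == 'not':
        return not satisfies(mu, f[1])
    if op == 'and':
        return satisfies(mu, f[1]) and satisfies(mu, f[2])
    if op == 'or':
        return satisfies(mu, f[1]) or satisfies(mu, f[2])
    raise ValueError(f"unknown connective: {op}")

# μ = {a ↦ ⊤, b ↦ ⊥} satisfies ¬a ∨ ¬b
formula = ('or', ('not', ('var', 'a')), ('not', ('var', 'b')))
assert satisfies({'a': True, 'b': False}, formula)
```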
The semantics of other logical connectives can easily be inferred from their truth tables. We provide the following useful definitions:
Definition 15
((Propositional) Models). Given a propositional formula F, a truth assignment μ : Vars(F) → B is a model (or satisfying assignment) of F iff μ ⊧ F.
Definition 16
((Propositional or Boolean) Satisfiability). A propositional formula F is said to be satisfiable iff it has at least one model, i.e., there exists μ such that μ ⊧ F. Otherwise, F is said to be unsatisfiable.
Definition 17
((Propositional) Consequence). Given a set of propositional formulas Φ and a propositional formula F such that Vars(F) ⊆ Vars(Φ), F is said to be a consequence of Φ, denoted as Φ ⊧ F, iff every model of Φ can be restricted to a model of F.
Definition 18
((Propositional) Equivalence). Given two propositional formulas F and G such that Vars(F) = Vars(G), F is said to be equivalent to G, denoted as F ≡ G, iff every model of F is also a model of G and vice versa.
Definition 19
((Propositional) Validity). An argument from a set of propositional formulas Φ to a conclusion F is valid iff Φ ⊧ F.
Definition 20
((Propositional) Consistency). A set of propositional formulas Φ is consistent iff there is a complete truth assignment μ that is a model for every formula of Φ.
Definition 21
(Valid Propositional Formulas). A propositional formula F is said to be valid (or a tautology), denoted as ⊧ F, iff every truth assignment over its variables is a model for it.
Definition 22
(Strength of Propositional Formulas). Given two propositional formulas F and G, if F ⊧ G then F is said to be stronger than G and, conversely, G is said to be weaker than F. From the semantics of implication (→), it is clear that, if ⊧ F → G, then F ⊧ G and, thus, F is stronger than G.
We also provide the following remarks about notation for models and truth assignment.
Notation 1
(Formulas as Set of Models). Given a propositional formula F, we denote with Mods ( F ) the set of all models of F. Any propositional formula F can be seen as a representation of its set of models Mods ( F ) .
Any truth assignment can be represented as a formula satisfied by it or the set of propositional variables it assigns to ⊤.
Notation 2
(Restriction of Models). Given a propositional formula F, a truth assignment μ : A → B, with Vars(F) ⊆ A, can be restricted to a model of F iff its restriction satisfies F, i.e., μ|Vars(F) ⊧ F.
Notation 3
(Truth Assignments as Formulas). A truth assignment μ over a set of propositional variables A ⊆ AP can be represented as the following propositional formula:
Fμ =def ⋀_{ai ∈ A} li,  where  li = ai iff μ(ai) = ⊤  and  li = ¬ai iff μ(ai) = ⊥
Notation 4
(Truth Assignments as Sets of Propositional Variables). A truth assignment μ over a set of propositional variables A ⊆ AP can be represented by the following set of propositional variables:
Aμ =def {a ∈ A | μ(a) = ⊤}
Vice versa, a set of propositional variables A ⊆ AP induces a truth assignment μA equal to the characteristic function of A, i.e.:
μA(a) = ⊤ if a ∈ A, ⊥ otherwise
Following Notation 4, the satisfaction relation for propositional logic (⊧) can be extended to sets of propositional variables: A ⊧ φ if and only if μA ⊧ φ.
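Notations 3 and 4 are easy to mirror in code. The illustrative Python sketch below (with dictionaries standing in for truth assignments, our own convention) converts a truth assignment μ to its set representation Aμ and back via the characteristic function μA:

```python
def to_set(mu):
    """Notation 4: the set of variables that mu maps to ⊤."""
    return {a for a, value in mu.items() if value}

def from_set(variable_set, variables):
    """Characteristic function of `variable_set` over `variables`,
    i.e., the induced truth assignment μA."""
    return {a: (a in variable_set) for a in variables}

mu = {'a': True, 'b': False, 'c': True}
A_mu = to_set(mu)                          # {'a', 'c'}
assert from_set(A_mu, mu.keys()) == mu     # round-trip recovers μ
```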

2.4.3. Conjunctive Normal Form

We provide the following definitions about propositional formulas.
Definition 23
(Literals, Clauses and Cubes). A literal is either an atomic formula or the negation of an atomic formula. A clause is a disjunction of literals, whereas a cube is a conjunction of literals.
Definition 24
(Conjunctive Normal Form). A formula F is said to be in Conjunctive Normal Form (CNF) if it is a conjunction of clauses.
For instance, with a, b, c and d propositional variables, a is an atomic formula, both ¬a and b are literals, (¬a ∨ b ∨ ¬c) is a clause, (¬a ∧ b ∧ ¬c) is a cube and (¬a ∨ b ∨ ¬c) ∧ (¬b ∨ ¬c ∨ d) is a CNF formula. Given a truth assignment μ, it is easy to see the following:
  • A literal a is satisfied by μ iff μ(a) = ⊤.
  • A literal ¬a is satisfied by μ iff μ(a) = ⊥.
  • A clause C = l0 ∨ … ∨ ln is satisfied by μ iff μ ⊧ li for some 0 ≤ i ≤ n.
  • A cube Q = l0 ∧ … ∧ ln is satisfied by μ iff μ ⊧ li for each 0 ≤ i ≤ n.
  • A CNF formula F = C0 ∧ … ∧ Cn is satisfied by μ iff μ ⊧ Ci for each 0 ≤ i ≤ n.
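These satisfaction conditions translate directly into code. In the illustrative Python sketch below (the signed-string literal encoding is our own convention), a CNF formula is a list of clauses and each clause a list of literals:

```python
# Literals as signed names: 'a' for a, '-a' for ¬a.
def lit_satisfied(mu, lit):
    neg = lit.startswith('-')
    value = mu[lit[1:] if neg else lit]
    return value != neg            # ¬a satisfied iff μ(a) = ⊥

def clause_satisfied(mu, clause):
    # A clause is satisfied iff *some* literal is satisfied.
    return any(lit_satisfied(mu, lit) for lit in clause)

def cnf_satisfied(mu, cnf):
    # A CNF formula is satisfied iff *every* clause is satisfied.
    return all(clause_satisfied(mu, clause) for clause in cnf)

# (¬a ∨ b ∨ ¬c) ∧ (¬b ∨ ¬c ∨ d) under μ = {a↦⊤, b↦⊤, c↦⊥, d↦⊥}
cnf = [['-a', 'b', '-c'], ['-b', '-c', 'd']]
assert cnf_satisfied({'a': True, 'b': True, 'c': False, 'd': False}, cnf)
```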

2.5. Boolean Functions

In this paper, we are interested in verifying the properties of hardware digital systems. Such systems can be thought of as large assemblies of logic gates which, in turn, are simple electronic implementations of Boolean functions. As described in Section 2.4, formulas of propositional logic can be used to represent Boolean functions and reason about them. In this section, we introduce some basic notions about Boolean functions.
Definition 25
(Boolean Function). A Boolean function is a function f : Bⁿ → B, where B = {⊥, ⊤} is the Boolean domain and n ∈ ℕ identifies the arity of the function.
Definition 26
(Functions Monotone in a Variable). Given an n-ary Boolean function f : Bⁿ → B and a value k ∈ ℕ, with 1 ≤ k ≤ n, we say that f is monotone in the variable xk if f|xk=⊥ ⊧ f|xk=⊤. Thus, when f is monotone in xk, changing the value of xk from ⊥ to ⊤ cannot change the value of f from ⊤ to ⊥. The definition of monotonicity in a given variable provided here is also known as unateness or positive monotonicity in that variable.
Definition 27
(Monotone Functions). A Boolean function f is monotone if it is monotone in each of its variables. Thus, changing the value of any of its variables from ⊥ to ⊤ cannot change the value of f from ⊤ to ⊥.
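For small arities, Definitions 26 and 27 can be checked exhaustively: raising any single input from ⊥ to ⊤ must never drop the output from ⊤ to ⊥. An illustrative Python sketch (our own, not from the paper):

```python
from itertools import product

def is_monotone(f, n):
    """Exhaustively check that the n-ary Boolean function f is
    monotone in every variable (feasible only for small n)."""
    for point in product([False, True], repeat=n):
        for k in range(n):
            if not point[k]:
                # Raise input k from ⊥ to ⊤ and compare outputs.
                raised = point[:k] + (True,) + point[k + 1:]
                if f(*point) and not f(*raised):   # ⊤ fell to ⊥
                    return False
    return True

assert is_monotone(lambda a, b: a and b, 2)    # ∧ is monotone
assert is_monotone(lambda a, b: a or b, 2)     # ∨ is monotone
assert not is_monotone(lambda a: not a, 1)     # ¬ is not
```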

2.6. Decision Problems

In Section 2.1, we have presented logic as a tool to formalize and reason about a world of interest, whereas in Section 2.5 we have provided some background knowledge about commonly used models for representing combinational and sequential hardware circuits. Our aim is to use the tool of logic, and in particular Propositional Logic, to provide a formal representation of hardware sequential systems and reason about some of their properties. In particular, we are interested in asking certain questions about the formalized systems and having those questions answered by algorithmic procedures. We are, thus, interested in solving some computational problems arising from those questions, which typically belong to the class of decision problems.
In this section we provide an overview of decision problems and their properties and we present Boolean Satisfiability (SAT), a fundamental decision problem in computer science of particular interest in the context of this paper.
Definition 28
(Decision Problems and Instances). A decision problem P is any arbitrary question admitting a Yes or No answer over an infinite set of inputs. An instance of a decision problem P is a particular input for P. We distinguish between Yes-instances, instances for which P gives a Yes answer, and No-instances, instances for which P gives a No answer.
Traditionally, a decision problem is identified by its set of Yes-instances. Instances of a decision problem are usually encoded as strings over an alphabet Σ . As a consequence, a decision problem P is often defined as a formal language  L ( P ) over that alphabet and can be formulated as an instance of the membership problem for L ( P ) .
Theorem 1
(Decision Problems as Membership Problems). Given a decision problem P, determining whether the answer to a given instance is Yes is equivalent to determining whether an encoding w ∈ Σ* of that instance over an alphabet Σ is in the corresponding language L(P).
In order to solve a decision problem P , we are interested in algorithms that terminate with a correct answer for every input of P , as formalized by the following definitions.
Definition 29
(Soundness and Completeness of an Algorithm). An algorithm for a given decision problem P is sound iff, whenever it returns a Yes answer, the input is a Yes-instance of P. An algorithm for a given decision problem P is complete iff it always terminates and returns a Yes answer whenever the input is a Yes-instance of P.
Unsound algorithms are rarely of practical interest. Incomplete algorithms may be effective in solving only Yes- or No-instances of a decision problem. Ideally, however, we would like to have an algorithm for a decision problem P that is both sound and complete.
Definition 30
(Decision Procedure). A decision procedure is an algorithmic procedure that, given an instance of a decision problem P , always terminates with a correct Yes or No answer.
Definition 31
(Decidability of a Decision Problem). A decision problem for which there exists a decision procedure is said to be decidable.
Decidability is an important property to determine whether a given problem may be tackled algorithmically. Many important problems are undecidable, such as the so-called Entscheidungsproblem asking for the validity of a first-order formula in the general case (proved undecidable by Church and Turing).
Sometimes, decision procedures are also asked to produce certificates as a way to prove that the results of their computation are correct.
Definition 32
(Certificates). A certificate is a string of symbols that can be used to prove the answer of a decision procedure on a given instance. We distinguish Yes-certificates, produced by the algorithm to certify a Yes answer, and No-certificates, produced by the algorithm to certify a No answer.
The typical decision problems we are interested in concern the satisfiability or validity of formulas in some logical system L. The two problems are interchangeable, as a formula φ is satisfiable iff ¬φ is not valid. Focusing on the validity problem, we can formulate it as the membership problem of a formula φ in the language of the logically valid formulas of L (or the language of its theorems, supposing L to be sound and complete).
In what follows, we assume L to be both sound and complete. This is because we are mainly interested in decision problems over propositional logics, which are both sound and complete under the usual semantics and proof systems. Considering logical systems that are both sound and complete, we equate the problem of validity (checking whether a formula is a tautology in L ) to the problem of derivability (checking whether the formula is a theorem in L ).
Propositional (or Boolean) Satisfiability (SAT) is the problem to decide whether a given propositional formula is satisfiable, as defined in Definition 16.
Definition 33
(Boolean Satisfiability Problem). Given a propositional formula φ over the propositional variables V, SAT is the problem to determine whether ∃ μ(V) : μ ⊨ φ. If that is the case, the formula φ is a Yes-instance of the problem and is said to be satisfiable (Sat). The model μ is a Yes-certificate for the problem on φ. Otherwise, the formula φ is a No-instance of the problem and is said to be unsatisfiable (Unsat).
SAT is a fundamental decision problem in computer science, important from both the theoretical and the practical perspective. From the theoretical standpoint, SAT plays a central role in computational complexity theory. From the practical standpoint, SAT can be used as a modeling framework to encode many computationally hard problems.
SAT is known to be a computationally hard problem: it is, in fact, NP-complete [24]. In brief, this means that no algorithm is currently known that solves all instances of SAT in polynomial time. Despite this theoretical limitation, the SAT problem is in many cases tractable in practice, thanks to the fact that its instances often present some kind of structure arising from the application domain they originate from. Currently, state-of-the-art SAT solvers are able to handle SAT instances with up to millions of variables and constraints.
Definition 34
(SAT-Solving Algorithm). A SAT-solving algorithm is a decision procedure to answer the SAT problem.
A SAT solver is software that runs a SAT-solving algorithm, taking a propositional formula (usually in CNF) as input and determining whether or not such a formula can be satisfied by a truth assignment of its variables. SAT-solving algorithms are usually designed as search procedures over the space of truth assignments, following the highly influential DPLL algorithm [25]. To give a better understanding of the subject at hand, Algorithm 1 shows the top-level procedure of DPLL; a brief description follows, as well as a partial example of the exploration of the assignment space.
Figure 1 depicts a search tree enumerating all the truth assignments, one for each leaf node, over three variables, taking CNF = {{¬x, y}, {¬y, ¬z}} as a target formula. Given the visual representation, it is easy to see how looking for a satisfying assignment is equivalent to looking for a path, and hence a leaf node, that satisfies the given CNF. Furthermore, taking into account the tree topology and the leaf values, it is possible to identify simplifying steps that reduce the search space, e.g., neither μ3 nor μ4 can lead to satisfying assignments; thus, it is irrelevant to evaluate the role of variable z given a partial assignment that reaches such a branch.
In order to evaluate the amount of work that the DPLL algorithm has to perform, a common notion used is that of the termination tree, which evaluates the actual search tree considered during exploration. Figure 2 depicts a partial termination tree taking into account the same example introduced in Figure 1. Within a termination tree, one can find conflicts (✗) and valid (✓) assignments, as well as a trace for the conditioned CNF at each level of the search space.
Most modern state-of-the-art SAT solvers employ a SAT-solving algorithm called Conflict-Driven Clause Learning (CDCL) [26]. CDCL is based on clause learning and non-chronological backtracking in order to avoid repeated exploration of regions of the search space that do not lead to a satisfying assignment. During search, CDCL algorithms maintain a trail of literals representing the current partial assignment. Under a given partial assignment, each clause loaded in the solver is either satisfied, a conflict, a unit or unresolved. A clause is satisfied if at least one of its literals is satisfied. If each literal in a clause is falsified, the whole clause is a conflict. If all but one of the literals of a clause are falsified, the clause is a unit. Otherwise, the clause is unresolved. If a clause is a unit, its only non-falsified literal is implied by the current partial assignment; thus, its variable can be assigned so that the literal is satisfied.
Algorithm 1 Top-level procedure of the DPLL algorithm.
Input: C, a set of clauses.
Output: res ∈ {Sat, Unsat}.
procedure DPLL(C)
    while ∃ unit clause {l} ∈ C do
        C ← UnitPropagate(C, l)
    while ∃ pure literal l ∈ C do
        C ← PureLiteralAssign(C, l)
    if C = ∅ then
        return Sat
    if empty clause ∈ C then
        return Unsat
    l ← ChooseLiteral(C)
    return DPLL(C ∧ l) or DPLL(C ∧ ¬l)
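The pseudocode of Algorithm 1 translates almost directly into a small working program. The following Python sketch is our own illustration, not part of the paper: clauses are frozensets of integer literals (a positive integer denotes a variable, a negative one its negation), and the helper names are ours.

```python
def simplify(clauses, l):
    """Condition a CNF on literal l: drop clauses containing l and
    delete the complementary literal -l from the remaining ones."""
    return [c - {-l} for c in clauses if l not in c]

def dpll(clauses, assignment=frozenset()):
    """Minimal DPLL. Returns a satisfying set of literals, or None
    when the formula is Unsat."""
    assignment = set(assignment)
    # Unit propagation: satisfy the literal of any unit clause.
    unit = next((c for c in clauses if len(c) == 1), None)
    while unit is not None:
        (l,) = unit
        assignment.add(l)
        clauses = simplify(clauses, l)
        unit = next((c for c in clauses if len(c) == 1), None)
    # Pure-literal elimination: a literal whose complement never
    # occurs in the remaining clauses can be satisfied safely.
    lits = {l for c in clauses for l in c}
    for l in lits:
        if -l not in lits:
            assignment.add(l)
            clauses = simplify(clauses, l)
    if not clauses:                 # no clause left to satisfy
        return assignment           # Sat
    if frozenset() in clauses:      # empty clause derived: dead branch
        return None                 # Unsat
    l = next(iter(clauses[0]))      # naive branching heuristic
    for lit in (l, -l):
        res = dpll(simplify(clauses, lit), assignment | {lit})
        if res is not None:
            return res
    return None

# The formula of Figure 1, CNF = {{¬x, y}, {¬y, ¬z}}, with x, y, z as 1, 2, 3.
model = dpll([frozenset({-1, 2}), frozenset({-2, -3})])
```

On the example formula, `dpll` returns a (possibly partial) assignment under which every clause contains a satisfied literal.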
The search starts by selecting a variable, using some suitable heuristic, and assigning it a truth value. This process is called a decision. After a decision has been made, some of the clauses loaded in the solver may have become conflicts or unit clauses. The solver looks up unit clauses using efficient data structures, called watch lists, in order to perform Boolean Constraint Propagation (BCP). During BCP, the solver finds every clause that is a unit under the current partial assignment and satisfies its remaining literal. This procedure iterates until either no more unit clauses can be found or the solver has found a conflict. In the first case, the solver proceeds with the next decision. In the second case, the solver analyzes the sequence of assignments that led to the conflict, a process called conflict analysis [26]. The result of such an analysis is a learned clause capturing the causes of the conflict. This clause is added to the current formula in order to prevent future iterations from re-entering the region of the search space that led to the conflict (learning). The algorithm then performs non-chronological backtracking, canceling enough of its latest decisions (even all of them) to leave that region of the space, and resumes search in another region. Eventually, either a conflict is found when no decisions have been made, meaning that the problem at hand is Unsat, or every variable has been assigned without incurring a conflict, meaning that the problem at hand is Sat.
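The classification of clauses under a partial assignment described above can be sketched as follows (the set-of-integer-literals representation and the return convention are ours, not the paper's):

```python
def clause_status(clause, assignment):
    """Classify a clause (set of integer literals) under a partial
    assignment (set of literals taken to be true). Returns one of
    'satisfied', 'conflict', 'unit', 'unresolved'; for a unit clause,
    also returns the literal that BCP would satisfy."""
    if any(l in assignment for l in clause):
        return 'satisfied', None
    free = [l for l in clause if -l not in assignment]
    if not free:
        return 'conflict', None        # every literal is falsified
    if len(free) == 1:
        return 'unit', free[0]         # implied by the assignment
    return 'unresolved', None
```

For instance, under the partial assignment {¬x1, ¬x2}, the clause (x1 ∨ x2 ∨ x3) is a unit implying x3.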
State-of-the-art SAT solvers are sophisticated artifacts of engineering that often include various techniques and heuristics to enhance their performance. Among such techniques are periodically restarting the search in order to better explore the space [27], simplifying the problem at hand either before or during search [28] and periodically trimming the database of learned clauses.
In many applications, SAT solvers are required to answer a sequence of related calls. In order to preserve useful information about the problem at hand, maintained in the state of the solver, many SAT solvers expose an incremental interface. Incremental interfaces enable the user to load a formula in the solver, solve the corresponding SAT problem, then modify the formula and solve the related SAT problem without losing inferences or other useful information about the problem stored in the state. Between incremental calls, the formula can usually be modified by either specifying variable assumptions or adding or removing clauses.
Many modern SAT solvers can be instructed to provide a refutation proof in the case a problem is declared to be Unsat. Such a proof can be thought of as a No-certificate that can be used by an independent checker to assess the correctness of the SAT solver’s answer. Refutation proofs are usually produced as either resolution proofs [29] or clausal proofs [30]. A resolution proof is a series of applications of the resolution rule that derives the empty clause from the clauses of an unsatisfiable propositional formula.
Definition 35
(Binary Resolution Rule). Given two clauses C 1 and C 2 , the (binary) resolution rule is the inference rule:
(C1 ∨ a)    (¬a ∨ C2)
─────────────────────
(C1 ∨ C2)
Modern SAT solvers are capable, without incurring too large an additional cost, of generating a resolution proof from unsatisfiable runs [31]. Clausal proofs can be seen as logs of the sequence of clauses learned by the solver, in the order in which they were learned. A resolution proof can be constructed starting from a clausal proof using BCP [32]. Another common feature of modern SAT solvers is the extraction of unsatisfiability cores. Given an Unsat problem, an unsatisfiability core is a subset of the clauses of the original problem that is still unsatisfiable. The set of original clauses appearing in a resolution proof is such a core.
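The binary resolution rule of Definition 35 is straightforward to implement over clauses represented as sets of literals. The sketch below (our own encoding, with positive integers for variables) also shows a two-step derivation of the empty clause, i.e., a tiny resolution refutation:

```python
def resolve(c1, c2, a):
    """Binary resolution (Definition 35): from (C1 ∨ a) and (¬a ∨ C2),
    with pivot variable a > 0, derive the resolvent (C1 ∨ C2)."""
    assert a in c1 and -a in c2, "pivot must occur with opposite signs"
    return (frozenset(c1) - {a}) | (frozenset(c2) - {-a})

# A two-step refutation of {x}, {¬x ∨ y}, {¬y}, with x = 1 and y = 2:
r1 = resolve({1}, {-1, 2}, 1)   # resolvent {y}
r2 = resolve(r1, {-2}, 2)       # the empty clause: the set is Unsat
```

The three input clauses used by the derivation form an unsatisfiability core of themselves, as every one of them appears in the proof.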

2.7. Model Checking

Model checking is an automated formal verification technique for determining whether a design meets a given specification. The behavior of the design is usually modeled using a transition system. The specification usually consists of a property (or a set of properties) expressed in a temporal logic.
Transition systems and temporal logics are discussed in detail in Section 2.8 and Section 2.9, respectively. Section 2.10 formalizes the model checking problem as well as a particular sub-class of that problem, called invariant verification. In Section 2.11, a way to symbolically represent transition systems is presented, whereas in Section 2.12 the invariant verification problem is reduced to a reachability problem on such a symbolic representation. These concepts form the foundation for the discussion in the remaining sections of this paper, which will each focus on specific model checking algorithms and techniques.

2.8. Transition Systems

Transition systems are often used in computer science as models to describe the behavior of real-world systems. They can be seen as directed graphs, with nodes representing states and edges representing transitions. A state describes some unique configuration of information about a given system at a certain moment of its behavior. Transitions specify how the system can evolve from one state to another.
Definition 36
(Transition System). A transition system (also known in the literature as a Kripke structure [33]) is a tuple TS =def ⟨S, S0, R, AP, L⟩, where S is the set of states, S0 ⊆ S is the set of initial states, R ⊆ S × S is the transition relation, AP is a set of propositional variables representing atomic propositions and L : S → 2^AP is a labeling function.
For each state s ∈ S, the labeling function L(s) denotes the set of atomic propositions that hold in s. A transition system can be either finite or infinite, depending on whether its set of states (and, consequently, of transitions) is finite or infinite. In this paper, we are interested in the model checking of bit-level hardware sequential circuits. Since the behavior of such systems can be modeled using finite transition systems, we restrict the discussion to those models. For this reason, we hereinafter use the term transition system as a shorthand for finite transition system without risk of ambiguity.
Example 1.
The sequential hardware system described in Figure 3a, representing a 2-bit synchronous counter that counts up to 2, can be modeled as the following transition system:
TS = ⟨S, S0, R, AP, L⟩
S = {s0, s1, s2, s3}
S0 = {s0}
R = {(s0, s1), (s1, s2), (s1, s0), (s3, s2)}
AP = {a0, a1}
L = {(s0, ∅), (s1, {a0}), (s2, {a1}), (s3, {a0, a1})}
The states S = {s0, s1, s2, s3} represent the four possible states of the counter, corresponding to the combined digital values of the two flip flops; s0 represents the reset state of the system, in which the output of both flip flops is low; R indicates how the system transitions between states at each clock cycle; and the labeling function L maps each state to a set of atomic propositions in AP = {a0, a1}, indicating that Q0 or Q1 is high, respectively. The transition system is visualized in Figure 3b. We will use this system and its model as a running example in the rest of the section.
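For illustration, the transition system of Example 1 can be written down directly as data. The following Python encoding is our own and simply mirrors the tuple ⟨S, S0, R, AP, L⟩; the two helper functions anticipate Definition 38:

```python
# The tuple TS = <S, S0, R, AP, L> of Example 1 as plain Python data.
S  = {'s0', 's1', 's2', 's3'}
S0 = {'s0'}
R  = {('s0', 's1'), ('s1', 's2'), ('s1', 's0'), ('s3', 's2')}
AP = {'a0', 'a1'}
L  = {'s0': set(), 's1': {'a0'}, 's2': {'a1'}, 's3': {'a0', 'a1'}}

def successors(s):
    """States reachable from s in one transition."""
    return {t for (p, t) in R if p == s}

def predecessors(s):
    """States from which s is reachable in one transition."""
    return {p for (p, t) in R if t == s}
```

For example, the successors of s0 are {s1}, and the predecessors of s2 are {s1, s3}.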
Let TS = ⟨S, S0, R, AP, L⟩ be a transition system; we provide the following useful definitions.
Definition 37
(Transition). A transition in TS is a pair of states (s1, s2) such that (s1, s2) ∈ R.
Definition 38
(Successor and Predecessor States). Given a pair of states ( p , s ) , p is said to be a predecessor of s iff ( p , s ) is a transition in TS . Vice versa, s is said to be a successor of p.
Definition 39
(Paths). A path in TS is any sequence of states π = (s0, s1, …) of arbitrary (even infinite) length |π| such that (si, si+1) is a transition in TS for each 0 ≤ i < |π|. A path π is said to be finite if |π| ∈ ℕ; otherwise, it is said to be infinite. We denote by π^i = (si, si+1, …) the suffix of π starting with the (i+1)-th state of π. We also denote by π[i] the (i+1)-th state in π.
Definition 40
(Initial Paths). An initial path in TS is any path π = (s0, s1, …) such that s0 ∈ S0.
Definition 41
(Exact and Bounded Reachability). A state s ∈ S is reachable in exactly k steps in TS iff there exists a finite initial path π = (s0, …, sk) of length k such that sk = s. A state s ∈ S is reachable within k steps (or reachable bounded by k) in TS iff there is i ≤ k such that s is reachable in exactly i steps in TS.
Definition 42
(Reachability). A state s ∈ S is reachable in TS if it is reachable within an arbitrary (finite) number of steps in TS.
Definition 43
(Sets of Exactly Reachable and Sets of Bounded Reachable States). We denote with R_i^E(TS) the set of states reachable in exactly i steps in TS. We denote with R_i(TS) the set of states reachable within i steps in TS, i.e.,
R_i(TS) =def ⋃_{0 ≤ j ≤ i} R_j^E(TS)
Definition 44
(Reachability Diameter). We define the reachability diameter of TS to be the minimal number d ∈ ℕ of steps required for reaching all reachable states in TS:
d =def min { i ∈ ℕ | R_i(TS) = R_{i+1}(TS) }
Definition 45
(Set of Reachable States). We denote with R(TS) the set of states reachable in TS:
R(TS) =def ⋃_{0 ≤ j ≤ d} R_j(TS)
where d is the reachability diameter of TS.
When considering reachability, we use the term timeframe i to denote what happens after i transitions from the initial states. It is trivial to prove that a finite-state transition system always has a (finite) reachability diameter.
Example 2.
Given the hardware design and its transition system described in Example 1, π = (s0, s1, s2) is both a path and an initial path, the set of reachable states is R(TS) = {s0, s1, s2} and the reachability diameter of the system is d = 2.
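The reachability notions of Definitions 41–45 suggest a simple fixpoint computation. The sketch below (the helper is ours) computes both the set of reachable states and the reachability diameter on the running example:

```python
def reachable(S0, R):
    """Fixpoint computation of the set of reachable states
    (Definition 45) and of the reachability diameter d
    (Definition 44): starting from R_0 = S0, iterate
    R_{i+1} = R_i ∪ successors(R_i); d is the first i for which
    R_{i+1} = R_i."""
    current, d = set(S0), 0
    while True:
        nxt = current | {t for (p, t) in R if p in current}
        if nxt == current:
            return current, d
        current, d = nxt, d + 1

# Running example (Example 2):
R = {('s0', 's1'), ('s1', 's2'), ('s1', 's0'), ('s3', 's2')}
states, d = reachable({'s0'}, R)
```

On the running example, this yields R(TS) = {s0, s1, s2} and d = 2, in agreement with Example 2.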

2.9. Temporal Logics

Temporal logics are extensions of classical logics that are used to represent, and reason about, statements qualified in terms of time. Both propositional and predicate logic can be extended to include temporal modalities. In this paper, we are only interested in Propositional Temporal Logics (PTLs), i.e., temporal logics obtained by augmenting propositional logic with temporal operators, as these are the logics typically used to express the requirements of bit-level sequential circuits. Hereinafter, when referring to those logics, we will omit the prefix “propositional” for conciseness.
Different temporal logics, with different expressiveness, can be defined based on the temporal operators they use. The two temporal logics used the most in practice are Linear Temporal Logic (LTL) and Computation Tree Logic (CTL). LTL adopts a linear-time perspective, in which every moment in time is followed by a single successor moment. A possible behavior in LTL can, therefore, be seen as an infinite, ordered sequence of moments. CTL, instead, adopts a branching-time perspective, in which every moment in time may be followed by many alternative moments representing different courses of execution. A possible behavior in CTL can, therefore, be seen as an infinite, directed tree of moments. Neither of these logics subsumes the other, as some temporal properties can be expressed only in LTL and others only in CTL.
The model-theoretic semantics usually given to temporal logics is defined in terms of paths and states of transition systems.

2.9.1. Linear Temporal Logic

Linear Temporal Logic (LTL) was introduced by Pnueli [34] for the specification and verification of reactive systems. Formulas of LTL are constructed from a set of propositional variables AP using the usual logical connectives (negation ¬, conjunction ∧ and disjunction ∨) and the temporal operators X (“next time”), F (“eventually”), G (“always”) and U (“until”). LTL formulas are interpreted in terms of paths, i.e., sequences of states of a transition system. Their semantics is also extended to states and whole transition systems.
Definition 46
(Syntax of LTL Formulas). Given a set of propositional variables A P , the syntax of LTL formulas is defined inductively as follows.
  • Boolean constants ⊤ and ⊥ are LTL formulas.
  • Each a ∈ AP is an LTL formula.
  • Given an LTL formula ψ, ¬ψ is an LTL formula.
  • Given LTL formulas ψ1 and ψ2, ψ1 ∧ ψ2 is an LTL formula.
  • Given LTL formulas ψ1 and ψ2, ψ1 ∨ ψ2 is an LTL formula.
  • Given an LTL formula ψ, X ψ is an LTL formula.
  • Given an LTL formula ψ, F ψ is an LTL formula.
  • Given an LTL formula ψ, G ψ is an LTL formula.
  • Given LTL formulas ψ1 and ψ2, ψ1 U ψ2 is an LTL formula.
Note that ⊥ and ∨ can be omitted from Definition 46 without losing expressiveness, as they can easily be derived from ⊤, ¬ and ∧. Similarly, F and G can also be omitted, as they can be derived from X and U as F ψ =def ⊤ U ψ and G ψ =def ¬F¬ψ. Furthermore, the syntax can be extended to other common Boolean operators by means of their usual definitions in propositional logic.
Example 3.
Given the hardware design and its transition system described in Example 1, we can model the specification describing how the system will never reach a state in which the output of both flip flops is high as the following property in LTL logic:
ψ = ¬F(a0 ∧ a1)
Intuitively, the semantics of the temporal operators of LTL is the following: X ψ holds at the current moment if ψ holds at the next moment; F ψ holds at the current moment if there is a future moment at which ψ holds; G ψ holds at the current moment if ψ holds at the current moment and at every future moment; ψ 1 U ψ 2 holds at the current moment if there is some future moment at which ψ 2 holds and ψ 1 holds at each moment until that future moment. Formally, the semantics of LTL formulas is defined with respect to paths, states and transition systems, as follows.
Definition 47
(Semantics of LTL Formulas with Respect to Paths). Given a transition system TS =def ⟨S, S0, R, AP, L⟩, let π = (s0, s1, …) be an infinite path in TS and let ψ be an LTL formula over AP. We give the following inductive definition of when π satisfies ψ, denoted as π ⊨ ψ:
  • π ⊨ ⊤ and π ⊭ ⊥.
  • For each p ∈ AP, π ⊨ p iff p ∈ L(π[0]).
  • π ⊨ ¬ψ1 iff π ⊭ ψ1.
  • π ⊨ ψ1 ∧ ψ2 iff π ⊨ ψ1 and π ⊨ ψ2.
  • π ⊨ ψ1 ∨ ψ2 iff π ⊨ ψ1 or π ⊨ ψ2.
  • π ⊨ X ψ1 iff π^1 ⊨ ψ1.
  • π ⊨ F ψ1 iff π^i ⊨ ψ1 for some i ≥ 0.
  • π ⊨ G ψ1 iff π^i ⊨ ψ1 for every i ≥ 0.
  • π ⊨ ψ1 U ψ2 iff π^i ⊨ ψ2 for some i ≥ 0 and π^j ⊨ ψ1 for every 0 ≤ j < i.
Other common Boolean operators have semantics that can easily be inferred from the semantics of ¬, ∧ or ∨.
Definition 48
(Semantics of LTL Formulas with Respect to States). Given a transition system TS =def ⟨S, S0, R, AP, L⟩, let s ∈ S be a state and let ψ be an LTL formula over AP. We say that s satisfies ψ, denoted as s ⊨ ψ, iff π ⊨ ψ for each infinite path π = (s0, s1, …) with s0 = s.
Definition 49
(Semantics of LTL Formulas with Respect to Transition Systems). Given a transition system TS =def ⟨S, S0, R, AP, L⟩, let ψ be an LTL formula over AP. We say that TS satisfies ψ, denoted as TS ⊨ ψ, iff s0 ⊨ ψ for each s0 ∈ S0.
The interpretation of an LTL formula ψ over a state s requires all the computations starting from s to satisfy ψ in order for s to satisfy ψ. In LTL semantics, therefore, there is an implicit universal quantification over all computations when determining whether a state satisfies a formula. Thus, LTL can be used to express properties holding for every path from a state, but it cannot express properties holding only for some of those paths.
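Over a finite transition system, any single infinite path can be represented in ultimately periodic (lasso) form, and on such paths the satisfaction relation of Definition 47 becomes directly computable, because a lasso has only finitely many distinct suffixes. The following sketch is ours (including the tuple encoding of formulas); it evaluates an LTL formula on a lasso given as a list of label sets plus a loop-back index:

```python
def holds(f, labels, loop_start, i=0):
    """Evaluate an LTL formula at position i of an ultimately periodic
    path: labels is the list of label sets L(pi[0]), L(pi[1]), ..., with
    positions loop_start.. repeating forever. Formulas are nested
    tuples: ('ap', a), ('not', f), ('and', f, g), ('or', f, g),
    ('X', f), ('F', f), ('G', f), ('U', f, g)."""
    n = len(labels)

    def succ(j):                     # position holding the next suffix
        return j + 1 if j + 1 < n else loop_start

    def suffixes(j):                 # all distinct suffixes from j
        for _ in range(n):
            yield j
            j = succ(j)

    sub = lambda g, j: holds(g, labels, loop_start, j)
    op = f[0]
    if op == 'ap':
        return f[1] in labels[i]
    if op == 'not':
        return not sub(f[1], i)
    if op == 'and':
        return sub(f[1], i) and sub(f[2], i)
    if op == 'or':
        return sub(f[1], i) or sub(f[2], i)
    if op == 'X':
        return sub(f[1], succ(i))
    if op == 'F':
        return any(sub(f[1], j) for j in suffixes(i))
    if op == 'G':
        return all(sub(f[1], j) for j in suffixes(i))
    if op == 'U':
        for j in suffixes(i):
            if sub(f[2], j):
                return True
            if not sub(f[1], j):
                return False
        return False
    raise ValueError('unknown operator ' + str(op))

# The infinite path (s0 s1)^w of the running example.
path = [set(), {'a0'}]
safe = ('not', ('F', ('and', ('ap', 'a0'), ('ap', 'a1'))))  # not F(a0 and a1)
```

On this path, the property ¬F(a0 ∧ a1) of Example 3 holds, since no suffix reaches a state labeled with both a0 and a1.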

2.9.2. Computation Tree Logic

Computation Tree Logic (CTL) was introduced by Clarke et al. [35] to overcome this semantic limitation of LTL. In CTL, two kinds of formulas are distinguished: state formulas and path formulas. CTL also introduces two new temporal operators, called path quantifiers, which allow the specification of properties for some or all the paths starting from a state. These quantifiers are A (“for all paths”) and E (“there exists a path”). CTL formulas are interpreted in terms of trees of paths, i.e., possible alternative paths of a transition system. Their semantics is also extended to states and whole transition systems.
Definition 50
(Syntax of CTL Formulas). Given a set of propositional variables A P , the syntax of CTL state formulas is defined inductively as follows.
  • Boolean constants ⊤ and ⊥ are CTL state formulas.
  • Each a ∈ AP is a CTL state formula.
  • Given a CTL state formula ψ, ¬ψ is a CTL state formula.
  • Given CTL state formulas ψ1 and ψ2, ψ1 ∧ ψ2 is a CTL state formula.
  • Given CTL state formulas ψ1 and ψ2, ψ1 ∨ ψ2 is a CTL state formula.
  • Given a CTL path formula φ, A φ is a CTL state formula.
  • Given a CTL path formula φ, E φ is a CTL state formula.
The syntax of CTL path formulas is defined inductively as follows:
  • Given a CTL state formula ψ, X ψ is a CTL path formula.
  • Given a CTL state formula ψ, F ψ is a CTL path formula.
  • Given a CTL state formula ψ, G ψ is a CTL path formula.
  • Given CTL state formulas ψ1 and ψ2, ψ1 U ψ2 is a CTL path formula.
Note that Definition 50 can be restricted to exclude ⊤, ∨, F and G , without losing expressiveness in the same way as Definition 46. Syntax can also be expanded to other common Boolean operators as usual.
Example 4.
Given the hardware design and its transition system described in Example 1, we can model the specification describing how the system will never reach a state in which the output of both flip flops is high as the following property in CTL logic:
ψ = ¬EF(a0 ∧ a1)
Intuitively, state formulas describe properties of states, whereas path formulas describe properties of (infinite) paths. Path formulas can be transformed into state formulas by prefixing them with a path quantifier. Note that path quantifiers must immediately precede temporal operators in order for the resulting state formula to be well formed. Intuitively, the semantics of path quantifiers is the following: A φ holds at the current state if φ holds for all paths starting at the current state; E φ holds at the current state if φ holds for some path starting at the current state. Formally, the semantics of CTL formulas is defined with respect to paths, states and transition systems, by means of two satisfaction relations: one for state formulas and the other for path formulas.
Definition 51
(Semantics of CTL Path Formulas with Respect to Paths). Given a transition system TS =def ⟨S, S0, R, AP, L⟩, let π = (s0, s1, …) be an infinite path in TS and let φ be a CTL path formula over AP. We give the following inductive definition of when π satisfies φ, denoted as π ⊨ φ:
  • π ⊨ X ψ iff π[1] ⊨ ψ.
  • π ⊨ F ψ iff π[i] ⊨ ψ for some i ≥ 0.
  • π ⊨ G ψ iff π[i] ⊨ ψ for every i ≥ 0.
  • π ⊨ ψ1 U ψ2 iff π[i] ⊨ ψ2 for some i ≥ 0 and π[j] ⊨ ψ1 for every 0 ≤ j < i.
Definition 52
(Semantics of CTL State Formulas with Respect to States). Given a transition system TS =def ⟨S, S0, R, AP, L⟩, let s ∈ S be a state and let ψ be a CTL state formula over AP. We give the following inductive definition of when s satisfies ψ, denoted as s ⊨ ψ:
  • s ⊨ ⊤ and s ⊭ ⊥.
  • For each p ∈ AP, s ⊨ p iff p ∈ L(s).
  • s ⊨ ¬ψ1 iff s ⊭ ψ1.
  • s ⊨ ψ1 ∧ ψ2 iff s ⊨ ψ1 and s ⊨ ψ2.
  • s ⊨ ψ1 ∨ ψ2 iff s ⊨ ψ1 or s ⊨ ψ2.
  • s ⊨ E φ1 iff π ⊨ φ1 for some infinite path π starting in s.
  • s ⊨ A φ1 iff π ⊨ φ1 for each infinite path π starting in s.
Definition 53
(Semantics of CTL State Formulas with Respect to Transition Systems). Given a transition system TS =def ⟨S, S0, R, AP, L⟩, let ψ be a CTL state formula over AP. We say that TS satisfies ψ, denoted as TS ⊨ ψ, iff s0 ⊨ ψ for each s0 ∈ S0.
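These definitions lead to the classical fixpoint characterizations used by explicit-state CTL model checkers: E X is computed directly from the transition relation, E(ψ1 U ψ2) as a least fixpoint and E G ψ as a greatest fixpoint, with the other operators reducible to these three plus the Boolean connectives (e.g., E F ψ = E(⊤ U ψ)). A minimal sketch follows, using the set-based encoding of the running example (the encoding and names are ours):

```python
def EX(phi, S, R):
    """States with at least one successor satisfying phi (E X phi)."""
    return {s for s in S if any(t in phi for (p, t) in R if p == s)}

def EU(phi1, phi2, S, R):
    """Least fixpoint for E(phi1 U phi2): start from the phi2-states,
    then keep adding phi1-states with a successor already in the set."""
    Z = set(phi2)
    while True:
        Znew = Z | (set(phi1) & EX(Z, S, R))
        if Znew == Z:
            return Z
        Z = Znew

def EG(phi, S, R):
    """Greatest fixpoint for E G phi: start from all phi-states, then
    keep only those with a successor still inside the set."""
    Z = set(phi)
    while True:
        Znew = Z & EX(Z, S, R)
        if Znew == Z:
            return Z
        Z = Znew

# Running example: EF(a0 and a1) = E(true U (a0 and a1)).
S = {'s0', 's1', 's2', 's3'}
R = {('s0', 's1'), ('s1', 's2'), ('s1', 's0'), ('s3', 's2')}
ef_bad = EU(S, {'s3'}, S, R)   # states from which a0 and a1 is reachable
```

Since `ef_bad` does not contain the initial state s0, the system satisfies ¬EF(a0 ∧ a1), in agreement with Example 4.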

2.9.3. Temporal Properties

Definition 54
(Temporal Properties). Given a transition system TS , a temporal property is any property ψ, expressed in a temporal logic, that specifies the admissible (or desired) behaviors of TS .
Temporal properties typically fall under one of two main categories: safety properties and liveness properties, defined informally as follows.
Definition 55
(Safety Properties). Given a transition system TS , a safety property asserts that some (undesired) conditions never happen for TS (“nothing bad ever happens”).
Definition 56
(Liveness Properties). Given a transition system TS , a liveness property asserts that some (desired) conditions will eventually happen for TS (“something good eventually happens”).
Typical examples of safety properties include deadlock freedom and mutual exclusion, whereas a typical example of a liveness property is starvation freedom. Both safety and liveness properties can be expressed using LTL and CTL. The main difference between the two categories of properties is that safety properties can be violated by finite computations of a system, whereas liveness properties can only be violated by infinite computations of a system. In practical applications, safety properties are prevalent. The temporal properties defined in Examples 3 and 4 are safety properties.
We distinguish a kind of safety property of particular interest, called invariant properties, defined as follows:
Definition 57
(Invariant Properties). Given a transition system TS, an invariant property asserts that the condition described by a given propositional formula φ over AP, called an invariant condition, always holds for TS (“some invariant condition φ always holds”).
Invariant properties can be described in LTL as formulas G φ with the invariant condition φ a formula over A P . Similarly, invariant properties in CTL take the form A G φ with the invariant condition φ a formula over A P .
Example 5.
The corresponding invariant properties of the safety properties defined in Examples 3 and 4 are as follows:
ψ = G(¬a0 ∨ ¬a1)
ψ = AG(¬a0 ∨ ¬a1)
in LTL and CTL, respectively.
Property management [36,37] goes beyond the scope of this paper, but it is a topic as relevant as the proper choice of the model checking algorithm to be used in a specific scenario.

2.10. Model Checking Problem

Let TS = ⟨S, S0, R, AP, L⟩ be a transition system; the model checking problem can be formalized as follows:
Definition 58
(Model Checking Problem). Given a transition system TS and a temporal property ψ over AP, the decision problem to check whether TS ⊨ ψ is called a model checking problem. In the case TS ⊭ ψ, the model checking procedure is required to provide a counterexample.
Definition 59
(Counterexample). Given a transition system TS and a temporal property ψ over AP such that TS ⊭ ψ, a counterexample to ψ for TS is any initial path π of TS such that π ⊭ ψ.
Model checking is therefore a decision problem, as defined in Section 2.6, that admits a Yes-answer ( ψ holds for TS ) or a No-answer ( ψ does not hold for TS ).
Definition 60
(Model Checking Algorithm). A model checking algorithm, or model checker, is any decision procedure able to solve a model checking problem.
When a model checking algorithm terminates with a No-answer, it must also emit a No-certificate in the form of a counterexample. Model checking algorithms must systematically traverse all behaviors of the system in order to either confirm that the property holds or provide a counterexample. A model checking algorithm may have either or both of the following qualities.
Definition 61
(Completeness and Soundness (of a Model Checking Algorithm)). A model checking algorithm is said to be complete iff, given a transition system TS and a property ψ, it is able to either provide a counterexample to ψ for TS or prove the absence of counterexamples of any length. A model checking algorithm is said to be sound iff, given a transition system TS and a property ψ, it provides a counterexample to ψ for TS only when TS ⊭ ψ and it proves the absence of counterexamples of any length only when TS ⊨ ψ.
Different classes of model checking problems can be distinguished, depending on the category of temporal properties being checked. In this paper, we are interested only in the model checking of invariant properties, also called invariant verification problems, which we show can be solved via reachability analysis on the transition system under verification.
Definition 62
(Invariant Verification Problem). Given a transition system TS and an invariant property ψ over AP, the invariant verification problem of ψ for TS is the model checking problem to decide whether TS ⊨ ψ or to provide a counterexample.
Definition 63
(Invariant). Given a transition system TS and an invariant property ψ over AP with invariant condition φ, if TS ⊨ ψ, then the propositional formula φ over AP is said to be an invariant of TS.
The invariant verification problem can be reduced to a reachability problem over TS , as follows.
Theorem 2
(Reduction of Invariant Verification to Reachability). Given a transition system TS and an invariant property ψ over AP, let φ over AP be the invariant condition of ψ; then, the invariant verification problem can be reduced to the problem of checking whether a state s ∈ S such that s ⊭ φ is reachable in TS. If that is the case, TS ⊭ ψ and any initial path to s is a counterexample for the invariant verification problem. Otherwise, TS ⊨ ψ. Such a problem is called the reachability problem of ¬φ in TS.
Therefore, the invariant verification problem can be simply solved by means of a procedure that traverses the reachable state space of TS and checks whether or not φ holds for every traversed state. Since the state space is finite, the procedure eventually terminates either providing a counterexample or confirming φ as an invariant of TS . In practice, such a state space is usually too large to be explicitly traversed. State-of-the-art model checking methods rely on a symbolic representation of the system instead.
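Such an explicit traversal, together with the counterexample extraction required by Theorem 2, can be sketched as follows (the breadth-first formulation and the names are ours):

```python
from collections import deque

def check_invariant(S0, R, phi):
    """Invariant verification by explicit reachability (Theorem 2):
    breadth-first traversal of the reachable states, where phi(s)
    evaluates the invariant condition in state s. Returns (True, None)
    if the invariant holds, or (False, path) with path an initial path
    (the counterexample) ending in a state violating phi."""
    queue = deque((s, (s,)) for s in sorted(S0))
    visited = set(S0)
    while queue:
        s, path = queue.popleft()
        if not phi(s):
            return False, path
        for (p, t) in R:
            if p == s and t not in visited:
                visited.add(t)
                queue.append((t, path + (t,)))
    return True, None

# Running example: phi = (not a0) or (not a1) fails only in s3,
# which is unreachable, so the invariant holds.
R = {('s0', 's1'), ('s1', 's2'), ('s1', 's0'), ('s3', 's2')}
ok, cex = check_invariant({'s0'}, R, lambda s: s != 's3')
```

Breadth-first order makes any returned counterexample a shortest one, which is the behavior typically expected of bounded model checkers as well.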

2.11. Symbolic Representation of Transition Systems

The model checking formulation provided in the previous subsection relies on an explicit representation of states and transitions. Such an explicit representation is inadequate for large state spaces. In symbolic model checking, the (explicit) transition system is converted into a more concise, implicit representation based on propositional formulas, called a symbolic representation. Using a symbolic representation, model checking algorithms can traverse the state space of the system more efficiently, by manipulating sets of states and transitions directly, instead of being forced to operate on one state or transition at a time. In addition, since the symbolic representation relies on propositional formulas, symbolic model checking algorithms can leverage well-developed automatic techniques, such as BDDs or SAT solvers, for manipulating such formulas.
In this subsection, we show how to encode a transition system into its symbolic representation and how to formulate the invariant verification problem on such a representation. Given a transition system TS = ⟨S, S₀, R, AP, L⟩, to obtain a symbolic representation of TS, we must define how to encode each of its constituents into propositional formulas.

2.11.1. Symbolic Representation of States

In order to symbolically represent the set of states S of a transition system TS, we introduce a set of propositional variables V, called state variables, with |V| = ⌈log₂ |S|⌉. Given a transition system TS = ⟨S, S₀, R, AP, L⟩ and a set of state variables V, we give the following definitions.
Definition 64
(Encoding Function). An encoding function ϕ : S → 𝔹^|V| over V is an injective function mapping states to complete truth assignments over V.
Definition 65
(Encoding of States). A state s S is encoded as a complete truth assignment μ s = ϕ ( s ) over V according to an encoding function ϕ. The complete truth assignment μ s is called encoding of s over V.
Note that ϕ is bijective if and only if |S| = 2^|V|. Otherwise, some pseudo-state representations are introduced in the symbolic representation of S by the encoding. Such pseudo-states can be disregarded and treated as unreachable states of TS.
Propositional formulas over V are used to represent sets of states. Any propositional formula F_Q over V represents the set of states Q ⊆ S such that, for each state q ∈ Q, its corresponding encoding μ_q over V satisfies F_Q, i.e., μ_q ⊨ F_Q. Given an encoding function ϕ over V, and letting q ∈ S be a state of TS, many logically equivalent propositional formulas over V can be used to represent q. We define one example of such a formula as follows.
Definition 66
(Characteristic Formula for States). The characteristic formula of q is the propositional formula ξ_q over V such that
ξ_q ≝ ⋀_{v_i ∈ V} l_i  where  l_i = v_i iff μ_q(v_i) = ⊤, and l_i = ¬v_i iff μ_q(v_i) = ⊥
where μ_q = ϕ(q).
Note that ξ q is a cube and it is satisfied only by the complete truth assignment μ q encoding q.
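As a quick illustration of Definition 66, the following sketch (hypothetical helper names; assignments modeled as Python dictionaries, enumeration standing in for symbolic reasoning) builds the characteristic cube of a state and checks that exactly one complete assignment satisfies it:

```python
from itertools import product

# Characteristic formula of a state (Definition 66): a cube satisfied only by
# the encoding of that state. Formulas are modeled as predicates over complete
# truth assignments (dicts mapping variable name -> bool).
def characteristic_cube(assignment):
    """Return a predicate that is true exactly on `assignment`."""
    items = tuple(assignment.items())
    return lambda a: all(a[v] == val for v, val in items)

# Encoding mu_s1 over V = {v0, v1} with mu_s1(v0) = T and mu_s1(v1) = F,
# as for state s1 in Example 6.
mu_s1 = {"v0": True, "v1": False}
xi_s1 = characteristic_cube(mu_s1)

# Enumerate all complete assignments over V and keep the models of xi_s1.
models = [dict(zip(("v0", "v1"), bits)) for bits in product((False, True), repeat=2)]
sat = [m for m in models if xi_s1(m)]
```

As expected for a cube over all of V, `sat` contains a single assignment, namely μ_s1.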
Let Q S be a set of states; many logically equivalent propositional formulas over V can be used to represent Q. We define one example of such a formula as follows.
Definition 67
(Characteristic Formula for Set of States). The characteristic formula of Q is the propositional formula ξ Q over V such that
ξ_Q ≝ ⋁_{q ∈ Q} ξ_q
Note that ξ_Q is satisfied only by the complete truth assignments μ_q encoding states q ∈ Q.
With abuse of notation, hereinafter, we make no distinction among the following concepts:
  • A state s ∈ S, any propositional formula F_s that is logically equivalent to its characteristic formula ξ_s and the only model μ_s of such a formula.
  • A set of states Q ⊆ S, any propositional formula F_Q that is logically equivalent to its characteristic formula ξ_Q and the set of models of such a formula Mods(F_Q) = {μ_q | q ∈ Q}.
Initial states S₀ ⊆ S of TS can be represented symbolically by a propositional formula I such that the encodings μ_{s₀} over V of each initial state s₀ ∈ S₀ satisfy I.
Example 6.
Given the hardware design and its transition system described in Example 1, the states S = {s₀, s₁, s₂, s₃} will be encoded using |V| = ⌈log₂ |S|⌉ = 2 state variables V = {v₀, v₁} by the encoding function defined as follows:
ϕ = { ( s 0 , ( 0 , 0 ) ) , ( s 1 , ( 0 , 1 ) ) , ( s 2 , ( 1 , 0 ) ) , ( s 3 , ( 1 , 1 ) ) }
Individual states will be represented symbolically by their characteristic formulas (cubes):
ξ_{s₀} = ¬v₀ ∧ ¬v₁   ξ_{s₁} = v₀ ∧ ¬v₁   ξ_{s₂} = ¬v₀ ∧ v₁   ξ_{s₃} = v₀ ∧ v₁
The set of initial states will be encoded as the following formula:
I = ξ_{s₀} = ¬v₀ ∧ ¬v₁

2.11.2. Symbolic Representation of Transitions

In order to symbolically represent transitions (a binary relation on states), we consider two sets of state variables: one to represent the starting state, called present state, and another to represent the final state, called next state, of a transition. The set V is used to encode present states while its primed counterpart V′ = {v′ | v ∈ V} is used to encode next states. The same primed notation is used to denote the set of next states S′ = {s′ | s ∈ S} and any propositional formula F′ = F[V ← V′] in which the present state variables have been renamed as the next state variables. Furthermore, we use the superscript notation V^i to denote the set of state variables encoding states at timeframe i. The same superscript notation is used accordingly to denote the sets of states S^i after i transitions and any propositional formula F^i = F[V ← V^i] referring to timeframe i. We also use the notation π_{i,j} as a shorthand for a truth assignment over V^i ∪ ⋯ ∪ V^j.
Definition 68
(Next State Encoding Function). Given a transition system TS, a set of present state variables V and a set of next state variables V′, let ϕ be an encoding function over S; we define the next state encoding function ϕ′ as ϕ′ = ϕ[V ← V′].
We give the following definitions.
Definition 69
(Encoding of Transitions). A transition (s, s′) ∈ R is encoded as a complete truth assignment μ_{s,s′} = ϕ(s) ∪ ϕ′(s′) over V ∪ V′. The complete truth assignment μ_{s,s′} is called an encoding of (s, s′) over V ∪ V′.
Propositional formulas over V ∪ V′ are used to represent sets of transitions. Any propositional formula F_Z over V ∪ V′ represents the set of transitions Z ⊆ R such that, for each (z, z′) ∈ Z, its corresponding encoding μ_{z,z′} over V ∪ V′ satisfies F_Z, i.e., μ_{z,z′} ⊨ F_Z. Given the encoding function ϕ and a next state encoding function ϕ′, and letting (z, z′) ∈ R be a transition of TS, many logically equivalent propositional formulas over V ∪ V′ can be used to represent (z, z′). We define one example of such a formula as follows.
Definition 70
(Characteristic Formula for Transitions). The characteristic formula of (z, z′) is the propositional formula ξ_{z,z′} over V ∪ V′ such that
ξ_{z,z′} ≝ ⋀_{v_i ∈ V ∪ V′} l_i  where  l_i = v_i iff μ_{z,z′}(v_i) = ⊤, and l_i = ¬v_i iff μ_{z,z′}(v_i) = ⊥
where μ_{z,z′} = ϕ(z) ∪ ϕ′(z′).
Note that ξ z , z is satisfied only by the complete truth assignment μ z , z encoding ( z , z ) .
Let Z ⊆ R be a set of transitions of TS. Many logically equivalent propositional formulas over V ∪ V′ can be used to represent Z. We define one example of such a formula as follows.
Definition 71
(Characteristic Formula for Sets of Transitions). The characteristic formula of Z is the propositional formula ξ Z over V V such that
ξ_Z ≝ ⋁_{(z,z′) ∈ Z} ξ_{z,z′}
Note that ξ_Z over V ∪ V′ is satisfied only by the complete truth assignments μ_{z,z′} over V ∪ V′ encoding transitions (z, z′) ∈ Z.
With abuse of notation, hereinafter, we make no distinction among the following concepts:
  • A transition (s, s′) ∈ R, any propositional formula F_{s,s′} that is logically equivalent to its characteristic formula ξ_{s,s′} and the only model μ_{s,s′} of such a formula.
  • A set of transitions Z ⊆ R, any propositional formula F_Z that is logically equivalent to its characteristic formula ξ_Z and the set of models of such a formula Mods(F_Z) = {μ_{z,z′} | (z, z′) ∈ Z}.
The transition relation R is, thus, encoded as a propositional formula T over V ∪ V′ such that the encoding μ_{s,s′} over V ∪ V′ of each transition (s, s′) ∈ R satisfies T.
Example 7.
Given the hardware design and its transition system described in Example 1, the transition relation R will be encoded as the propositional formula T over V V :
T = (¬v₀ ∧ ¬v₁ ∧ v₀′ ∧ ¬v₁′) ∨ (v₀ ∧ ¬v₁ ∧ ¬v₀′ ∧ v₁′) ∨ (¬v₀ ∧ v₁ ∧ ¬v₀′ ∧ ¬v₁′) ∨ (v₀ ∧ v₁ ∧ ¬v₀′ ∧ v₁′)

2.11.3. Symbolic Representation of the Labeling Function

In order to symbolically represent the labeling function L and the set of propositional variables AP, first, we need to define the reverse labeling function L⁻¹, mapping each propositional variable a ∈ AP to the set of states in which a holds.
Definition 72
(Reverse Labeling). The reverse labeling function of a transition system TS is the function L⁻¹ : AP → 2^S such that, for each a ∈ AP,
L⁻¹(a) = {s ∈ S | a ∈ L(s)}
Each propositional variable a ∈ AP is thus mapped to the set of states L⁻¹(a) ⊆ S. For each propositional variable a, the set of states L⁻¹(a) can be represented as its characteristic formula ξ_{L⁻¹(a)} over V. The labeling function L is therefore symbolically represented as a set of characteristic formulas ξ_{L⁻¹(a)} over V, one for each a ∈ AP. A propositional variable a ∈ AP is then symbolically represented by a propositional formula encoding the set of states L⁻¹(a) for which a holds.
Definition 73
(Characteristic Formula of Atomic Propositions). The characteristic formula of an atomic proposition represented by the propositional variable a A P is the propositional formula ξ a = ξ L 1 ( a ) .
Note that ξ_a is satisfied only by those states s ∈ L⁻¹(a).
Example 8.
Given the hardware design and its transition system described in Example 1, the atomic propositions { a 0 , a 1 } will be encoded as follows by the reverse labeling function L 1 :
L⁻¹ = {(a₀, {s₁, s₃}), (a₁, {s₂, s₃})}
a₀ = ξ_{s₁} ∨ ξ_{s₃} = (v₀ ∧ ¬v₁) ∨ (v₀ ∧ v₁) = v₀
a₁ = ξ_{s₂} ∨ ξ_{s₃} = (¬v₀ ∧ v₁) ∨ (v₀ ∧ v₁) = v₁

2.11.4. Symbolic Representation of Transition Systems

Given a transition system TS ≝ ⟨S, S₀, R, AP, L⟩, we define a symbolic representation of the whole system as follows.
Definition 74
(Symbolic Transition System). A symbolic transition system encoding TS is the tuple M = V , I , T , where
  • V is a set of propositional state variables, with |V| = ⌈log₂ |S|⌉, encoding the states S of TS.
  • I is a propositional formula over V representing the initial states S₀ of TS.
  • T is a propositional formula over V ∪ V′ representing the transition relation R of TS.
All definitions provided in Section 2.8 about reachability on transition systems naturally extend to their symbolic counterparts.
Example 9.
Given the hardware design and its transition system described in Example 1, the overall encoding of the transition system TS will be the tuple M = V , I , T defined as follows:
V = { v 0 , v 1 }
I = ¬v₀ ∧ ¬v₁
T = (¬v₀ ∧ ¬v₁ ∧ v₀′ ∧ ¬v₁′) ∨ (v₀ ∧ ¬v₁ ∧ ¬v₀′ ∧ v₁′) ∨ (¬v₀ ∧ v₁ ∧ ¬v₀′ ∧ ¬v₁′) ∨ (v₀ ∧ v₁ ∧ ¬v₀′ ∧ v₁′)
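The symbolic transition system M = ⟨V, I, T⟩ of Example 9 can be rendered directly in code. The sketch below (illustrative names; enumeration stands in for BDD/SAT machinery) models I and T as predicates over complete assignments, with primed variables suffixed `p`, and recovers the four explicit transitions as the models of T:

```python
from itertools import product

# Symbolic transition system of Example 9: I over V, T over V u V'.
# Assignments are dicts; primed (next-state) variables carry a "p" suffix.
V = ("v0", "v1")

def init(a):
    """I = !v0 & !v1."""
    return not a["v0"] and not a["v1"]

def trans(a):
    """T as a disjunction of four transition cubes over (v0, v1, v0', v1')."""
    cubes = {
        (False, False, True, False),   # s0 -> s1
        (True, False, False, True),    # s1 -> s2
        (False, True, False, False),   # s2 -> s0
        (True, True, False, True),     # s3 -> s2
    }
    return (a["v0"], a["v1"], a["v0p"], a["v1p"]) in cubes

# Enumerating the models of T over V u V' recovers the explicit transitions.
assigns = [dict(zip(("v0", "v1", "v0p", "v1p"), bits))
           for bits in product((False, True), repeat=4)]
transitions = [a for a in assigns if trans(a)]
```

Each of the 16 complete assignments over V ∪ V′ is a candidate transition; exactly the four encoded by the cubes of T survive.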

2.11.5. Symbolic Representation of Invariant Properties

Given a transition system TS = S , S 0 , R , A P , L and an invariant property ψ with invariant condition φ over A P , we define a symbolic representation of the invariant condition as follows.
Definition 75
(Symbolic Invariant Property). A symbolic invariant property encoding φ is the propositional formula P over V such that
P ≝ φ[a₁ ← ξ_{a₁}, …, aₙ ← ξ_{aₙ}]
for each a_i ∈ Vars(φ), i.e., the formula obtained by substituting each propositional variable a_i of φ with its characteristic formula ξ_{a_i} over V.
Example 10.
Given the hardware design and its transition system described in Example 1, the invariant property ψ defined in Example 5 will have an invariant condition φ encoded as follows:
φ = ¬a₀ ∨ ¬a₁
P = ¬ξ_{a₀} ∨ ¬ξ_{a₁} = ¬v₀ ∨ ¬v₁

2.12. Symbolic Invariant Verification

As shown in Section 2.10, the invariant verification problem can be reduced to a problem of reachability in a transition system. Reachability in a transition system (or equally invariant verification) can, in turn, be reduced to reachability in a corresponding symbolic transition system, as follows.
Theorem 3
(Symbolic Invariant Verification). Given a transition system TS = S , S 0 , R , A P , L and an invariant property ψ with invariant condition φ over A P , the reachability problem of ¬ φ in TS can be reduced to the reachability problem of ¬ P in M where M = V , I , T is a symbolic transition system encoding TS and P a symbolic invariant property encoding φ.
In this paper, we are interested in algorithms for symbolic invariant verification. Therefore, with abuse of notation, we hereinafter use the terms “transition system”, “reachability”, “invariant property” and “invariant verification” to denote their symbolic counterparts.
Notation 5
(Target and Bad States). Given a transition system M = ⟨V, I, T⟩ and an invariant property P, with the notion of target we identify the set of states corresponding to ¬P. States (or sets of states) which are part of the target, or may reach the target, are called bad states.
Given a transition system M = V , I , T and an invariant property P, we provide the following useful definitions for invariant verification algorithms.
Definition 76
(Path Formula). A path formula of length k = j − i, starting from timeframe i and reaching timeframe j, is the propositional formula Π(i, j) over V^i ∪ ⋯ ∪ V^j:
Π(i, j) = ⋀_{h=i}^{j−1} T(V^h, V^{h+1})
Definition 77
(Initial Path Formula). An initial path formula of length k is the following propositional formula:
Π₀(k) = I(V⁰) ∧ Π(0, k)
Intuitively, a path formula Π(i, j) encodes all paths of length k = j − i starting at timeframe i in M. An initial path formula Π₀(k) encodes all paths of length k starting from the initial states in M. An initial path formula Π₀(k) can thus be used to represent the set of states that can be reached from the initial states in exactly k steps in M.
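To make the timeframe renaming of Definitions 76 and 77 concrete, the following sketch instantiates a CNF transition relation at successive timeframes; the clause layout (literals as triples of variable name, primed flag and polarity) is hypothetical, in the spirit of how BMC engines unroll a CNF transition relation:

```python
# Building Pi(i, j) and Pi0(k) by renaming variables per timeframe: a literal
# (name, primed, positive) over V u V' becomes a timeframe-indexed literal
# ("name@h", positive).
def rename(clauses, h):
    """Instantiate clauses over V u V' at timeframes h and h + 1."""
    return [[(f"{name}@{h + primed}", positive) for name, primed, positive in cl]
            for cl in clauses]

def path_formula(t_cnf, i, j):
    """Pi(i, j): conjunction of T(V^h, V^{h+1}) for h in [i, j - 1]."""
    out = []
    for h in range(i, j):
        out += rename(t_cnf, h)
    return out

def initial_path_formula(i_cnf, t_cnf, k):
    """Pi0(k) = I(V^0) & Pi(0, k)."""
    return rename(i_cnf, 0) + path_formula(t_cnf, 0, k)

# A 1-bit toggle: I = !v and T = (v' <-> !v), i.e., (v | v') & (!v | !v').
i_cnf = [[("v", 0, False)]]
t_cnf = [[("v", 0, True), ("v", 1, True)], [("v", 0, False), ("v", 1, False)]]
pi0 = initial_path_formula(i_cnf, t_cnf, 2)
```

`pi0` contains five clauses: one for I at timeframe 0 and two per copy of T.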
Definition 78
(Bad Cone). A bad cone of length k = j − i from timeframe i to timeframe j is the propositional formula Cone(i, j) over V^i ∪ ⋯ ∪ V^j:
Cone(i, j) = Π(i, j) ∧ ⋁_{h=i}^{j} ¬P(V^h)
A bad cone Cone(i, j) encodes all paths starting at timeframe i that can reach the target in at most k = j − i steps and, thus, it represents the set of bad states that are backward reachable in at most k steps from the target.
Using initial path formulas the reachability problem of ¬ P in M can be reduced to Boolean satisfiability as follows.
Theorem 4
(Reduction of Reachability to Satisfiability). Given a transition system M ≝ ⟨V, I, T⟩ and a property P over V, the reachability of ¬P can be reduced to the problem of finding a bound k ≥ 0 such that the propositional formula
Π₀(k) ∧ ¬P(V^k)
is satisfiable.
Definition 79
(Induction). Given a transition system M = V , I , T , let F and G be propositional formulas over V and let initiation, consecution and relative consecution be the conditions defined as follows:
I ⇒ F (Initiation)   F ∧ T ⇒ F′ (Consecution)   G ∧ F ∧ T ⇒ F′ (Relative Consecution)
A formula F is said to be inductive if it satisfies consecution. A formula F is said to be an inductive invariant if it satisfies both initiation and consecution. A formula F is said to be inductive relative to another formula G if it satisfies relative consecution. A formula F is said to be an inductive invariant relative to another formula G if it satisfies both initiation and relative consecution.
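The conditions of Definition 79 can be checked by brute force on a small explicit system. The sketch below (illustrative helper names; real tools instead check the negation of each condition for unsatisfiability with a SAT solver) uses the four-state system of Example 7:

```python
from itertools import product

# Enumeration-based checks of initiation and consecution (Definition 79) on
# the explicit four-state system of Example 7. States are (v0, v1) pairs.
STATES = list(product((False, True), repeat=2))

def init(s):
    """Initial states: only s0 = 00."""
    return s == (False, False)

def trans(s, t):
    """Transition relation of Example 7 (deterministic)."""
    succ = {(False, False): (True, False), (True, False): (False, True),
            (False, True): (False, False), (True, True): (False, True)}
    return succ[s] == t

def initiation(F):
    """I => F: every initial state satisfies F."""
    return all(F(s) for s in STATES if init(s))

def consecution(F):
    """F & T => F': every successor of an F-state satisfies F."""
    return all(F(t) for s in STATES for t in STATES if F(s) and trans(s, t))

F = lambda s: not (s[0] and s[1])   # excludes the unreachable state s3
G = lambda s: s == (False, False)   # holds initially, but is not inductive
```

F satisfies both conditions, hence it is an inductive invariant; G satisfies initiation but not consecution, since the successor of s₀ violates it.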
Definition 80
(Safety). Given a transition system M = ⟨V, I, T⟩ and an invariant property P over V, a propositional formula F over V is said to be safe with respect to P iff F is stronger than P, i.e., F ⇒ P.
Lemma 1
(Inductive Invariants as Overapproximations of Reachable States). Given a transition system M = ⟨V, I, T⟩, an inductive invariant F of M is an overapproximation of the set of reachable states of M, i.e., R(M) ⊆ F.
Intuitively, an inductive invariant expresses a quality of reachable states. Note that R ( M ) can be seen as the strongest inductive invariant of M .
Definition 81
(Inductive Strengthening). Given a transition system M = V , I , T and a property P over V, an inductive invariant F of M is called an inductive strengthening of P iff it is safe with respect to P.
Lemma 2
(Invariant Verification as Search of an Inductive Strengthening). Given a transition system M = V , I , T and a property P over V, if an inductive strengthening F of P can be found in M , then P holds for every reachable state of M and M is said to be safe.
Many state-of-the-art model checking algorithms are based on a search for an inductive strengthening.
Definition 82
(Traces). Given a transition system M = ⟨V, I, T⟩, a trace of length k with respect to M is a sequence F_k = (F₀, …, F_k) in which each F_i is a propositional formula over V, called a frame, such that the following conditions hold:
F₀ = I (Base)   F_i ∧ T ⇒ F′_{i+1} for 0 ≤ i < k (Image Approximation)
A trace F k may also satisfy one or both of the following additional conditions, with P a property over V:
F_i ⇒ F_{i+1} for 0 ≤ i < k (Monotonicity)   F_i ⇒ P for 0 ≤ i < k (Safety)
A trace F_k that satisfies the monotonicity condition is said to be monotonic. A trace F_k that satisfies the safety condition with respect to the property P is said to be safe. In order to provide a clearer understanding of the nature of the various traces, Figure 4 provides a graphical representation as an overview.
The following lemmas hold for traces.
Lemma 3
(Overapproximation of Sets of Exactly Reachable States). Given a transition system M and a trace F_k = (F₀, …, F_k) with respect to M, each frame F_i of F_k, with 0 ≤ i < k, is an overapproximation of the set of states reachable in exactly i steps in M, i.e., R^E_i(M) ⊆ F_i.
Lemma 4
(Overapproximation of Sets of Reachable States). Given a transition system M and a monotonic trace F_k = (F₀, …, F_k) with respect to M, each frame F_i of F_k, with 0 ≤ i < k, is an overapproximation of the set of states reachable within i steps in M, i.e., R_i(M) ⊆ F_i.
Lemma 5
(Safety up to Trace Bound). Given a transition system M, a property P over V and a safe and monotonic trace F_k = (F₀, …, F_k) with respect to P and M, there does not exist any counterexample to P of length at most k − 1 in M.
Lemma 6
(Inductive Invariants/Strengthening in Traces). Given a transition system M and a trace F_k = (F₀, …, F_k) with respect to M, let us define F_{0,i} ≝ ⋁_{j=0}^{i} F_j for each 0 ≤ i < k. If there is F_{0,i} such that F_{i+1} ⇒ F_{0,i}, then F_{0,i} is an inductive invariant of M. If F_k is also safe with respect to a property P, then F_{0,i} is also an inductive strengthening for P.
Lemma 7
(Inductive Invariants/Strengthening in Monotonic Traces). Given a transition system M and a monotonic trace F_k = (F₀, …, F_k) with respect to M, if F_{i+1} ⇒ F_i for some 0 ≤ i < k, then F_i = F_{i+1} is an inductive invariant of M. If F_k is also safe with respect to a property P, then F_i = F_{i+1} is also an inductive strengthening for P.
These properties make traces a very useful data structure to aid the search for an inductive strengthening in M.
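The fixpoint test of Lemma 7 amounts to an inclusion check between consecutive frames. A toy sketch, with frames modeled as explicit state sets rather than formulas (so F_{i+1} ⇒ F_i becomes set inclusion):

```python
# Detecting an inductive invariant inside a monotonic trace (Lemma 7):
# frames are explicit state sets here, so implication becomes inclusion.
def find_inductive_frame(trace):
    """Return the first index i with trace[i+1] included in trace[i], if any."""
    for i in range(len(trace) - 1):
        if trace[i + 1] <= trace[i]:
            return i
    return None

# A monotonic trace that saturates at the third frame.
frames = [{"s0"}, {"s0", "s1"}, {"s0", "s1", "s2"}, {"s0", "s1", "s2"}]
idx = find_inductive_frame(frames)
```

Here `idx` is 2: frame F₂ equals F₃, so F₂ is an inductive invariant of the underlying system.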

3. Bounded Model Checking

Given a transition system M = V , I , T and an invariant property P, bounded model checking (BMC) [38] is an iterative approach whose purpose is to check whether a counterexample to P of length at most k in M exists or to prove its absence. In order to carry this out, BMC simply performs a SAT check on a formula defined as follows.
Definition 83
(BMC Formula). A BMC formula of length k for P in M is the propositional formula BMC(k) over V⁰ ∪ ⋯ ∪ V^k:
BMC(k) ≝ Π₀(k) ∧ ⋁_{i=0}^{k} ¬P(V^i) = I(V⁰) ∧ Cone(0, k)
Intuitively, a BMC formula of length k encodes all initial paths in M of length at most k capable of reaching a bad state in ¬P. If the BMC formula is Sat, then there exists a counterexample to P of length at most k in M. Otherwise, no such counterexample exists.
BMC tools iteratively solve BMC formulas of increasing bound, until either a counterexample is found or some maximum bound is reached. Due to the way it operates, BMC is effective in finding counterexamples but it is not able to prove that P holds in M. To overcome such a limitation, specific techniques are required to support unbounded model checking. The ability to check reachability fix-points and/or to find inductive invariants is thus the main difference, and additional complication, between BMC and UMC techniques, which will be introduced in the following sections.
Algorithm 2 reports the BMC procedure up to a certain bound k_max. First, the algorithm checks whether the property P is satisfied by the initial states (lines 2–3). If an initial state fails to satisfy P, the procedure terminates by returning Fail along with a trivial counterexample. Otherwise, the procedure starts iterating over increasing values of k up to k_max, checking at each iteration whether there is at least one path satisfying the BMC formula of bound k (Definition 83) (lines 6–8). If a satisfying path π_{0,k} is found, the procedure returns Fail along with the path as a counterexample. If the procedure does not find any counterexample of length up to k_max, it returns Success, indicating that P holds up to the given bound.
Algorithm 2 Top-level procedure of the BMC algorithm.
Input:  M = ⟨V, I, T⟩ a transition system; P a property over V; k_max maximum bound.
Output: ⟨res, cex⟩ with res ∈ {Success, Fail}; cex a (possibly empty) initial path representing a counterexample.
   1: procedure BoundedModelChecking(M, P, k_max)
   2:     if ∃s₀ ⊨ I(V) ∧ ¬P(V) then
   3:         return ⟨Fail, (s₀)⟩
   4:     k ← 1
   5:     while k ≤ k_max do
   6:         BMC(k) ← Π₀(k) ∧ ⋁_{i=0}^{k} ¬P(V^i)
   7:         if ∃π_{0,k} ⊨ BMC(k) then
   8:             return ⟨Fail, π_{0,k}⟩
   9:         k ← k + 1
  10:     return ⟨Success, −⟩
As explained above, the search for a counterexample of a given length is inherently incomplete, hence the need for additional techniques to overcome such a limitation. Nevertheless, BMC finds application in several scenarios, due to its relative ease of application, both in hardware and in software model checking. The minimum requirement is being capable of encoding the transition relation of the instance at hand in a suitable way, e.g., a circuital representation. Furthermore, given a proper encoding, it is possible to combine both hardware and software problems in one, thus making BMC applicable to hybrid scenarios in which HW/SW co-design and co-verification take place. Such scenarios include, but are not limited to, RTL implementations to be checked against golden models, the model checking of properties arising from hardware–software interactions and sequential equivalence checking between high-level synthesis and hardware description languages.

Example of Bounded Model Checking

Let M = V , I , T be a transition system described by the following:
V = { v 0 , v 1 }
I = ¬v₀ ∧ ¬v₁
T = (¬v₀ ∧ ¬v₁ ∧ v₀′ ∧ ¬v₁′) ∨ (v₀ ∧ ¬v₁ ∧ ¬v₀′ ∧ v₁′) ∨ (¬v₀ ∧ v₁ ∧ v₀′ ∧ v₁′) ∨ (v₀ ∧ v₁ ∧ ¬v₀′ ∧ ¬v₁′)
and P be an invariant property over V:
P = ¬v₀ ∨ ¬v₁
Figure 5 illustrates the transition system described here, for reference. The initial state, 00, is marked with the leftmost ingoing edge. The only bad state, 11, is highlighted with a hatched filling.
An algorithm based on bounded model checking is capable of identifying a counterexample to the target property in four steps, by iteratively applying the BMC formula described in Definition 83.
In the first step, the algorithm would pose the following query:
(¬v₀ ∧ ¬v₁) ∧ (v₀ ∧ v₁)
which is UNSAT.
In the second step, the algorithm would pose the following query, in which, for the sake of compactness, we denote by T(V^i, V^{i+1}) the aforementioned transition relation instantiated at a given timeframe:
(¬v₀ ∧ ¬v₁) ∧ T(V⁰, V¹) ∧ (v₀¹ ∧ v₁¹)
which is still UNSAT.
In the last step, the algorithm would pose the following query:
(¬v₀ ∧ ¬v₁) ∧ T(V⁰, V¹) ∧ T(V¹, V²) ∧ T(V², V³) ∧ (v₀³ ∧ v₁³)
which is SAT, hence identifying the counterexample to the property, which is violated in state 11.
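Since this example system is deterministic with a single initial state, the whole BMC loop of Algorithm 2 can be mimicked by walking the unique execution path and checking P at each timeframe; a toy sketch (real BMC delegates each BMC(k) check to a SAT solver):

```python
# Brute-force rendition of the BMC loop on the example system: states are
# (v0, v1) pairs, the transition function follows the cubes of T, and the
# property P fails only in state 11. Determinism makes path enumeration
# equivalent to the symbolic satisfiability checks.
INIT = (False, False)

def step(s):
    return {
        (False, False): (True, False),   # 00 -> 10
        (True, False): (False, True),    # 10 -> 01
        (False, True): (True, True),     # 01 -> 11
        (True, True): (False, False),    # 11 -> 00
    }[s]

def prop(s):
    """P = !v0 | !v1."""
    return not (s[0] and s[1])

def bmc(k_max):
    path, s = [INIT], INIT
    for k in range(k_max + 1):
        if not prop(s):                  # BMC(k) is satisfiable
            return "Fail", path
        s = step(s)
        path.append(s)
    return "Success", None

res, cex = bmc(10)
```

`bmc` fails at k = 3 with the counterexample 00 → 10 → 01 → 11, matching the SAT query above.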

4. Temporal Induction

BMC, introduced in Section 3, is an incomplete method: it is capable of finding counterexamples but it is unsuitable for proving correctness. In order to support proofs for unbounded depths, BMC has to be complemented with additional techniques. As an intuition, if the exploration is capable of reaching a bound deep enough to have explored all the behaviors of the model at hand, such a bound can be considered a so-called completeness threshold. Such a threshold can be used as an upper bound for the length of the counterexamples that have to be considered before stating whether a proof holds.
In the context of model checking, SAT-based techniques are particularly suitable for performing induction checks, i.e., checking whether the transition system of the model under verification satisfies a given inductive invariant. The underlying question we are posing is whether a given property holds in the initial states and in all the states reachable from states in which the property is satisfied. The main limitation of such reasoning is that, if the second condition fails, nothing can be said about the property at hand.
On these premises, temporal induction [39], also known as strong induction or k-induction, represents a widely used technique in unbounded model checking. Such a technique is a generalization of the notion of inductive invariants, strengthening the base intuition presented above.
Given a transition system M = ⟨V, I, T⟩ and an invariant formula φ over V, that formula is said to be k-invariant if it is true in the first k steps of M.
Definition 84
(k-invariant formula). 
Π₀(k) ⇒ ⋀_{i=0}^{k} φ(V^i)
A k-invariant formula is a k-inductive invariant if it is (k − 1)-invariant and is inductive after k steps of M. With respect to simple induction, k-induction strengthens the hypothesis of the induction step: the property is assumed to hold in the first k − 1 steps, starting from 0, and is established at step k.
Whenever we have a k-inductive formula φ such that φ ⇒ P, we say that φ is a safe k-inductive invariant with respect to M.
Algorithm 3 reports the top-level procedure for a generic k-induction model checking scheme. At each iteration, the algorithm first checks whether the property P is violated along some initial path of length at most k (lines 4–5). If such a path exists, the procedure terminates by returning Fail along with it as a counterexample (line 6). Otherwise, the procedure checks whether there is a path of k + 1 states satisfying P followed by a state violating it (line 7), restricting the search to loop-free paths (line 8). If no such path exists, the induction step holds and the procedure returns Success (lines 9–10); otherwise, the bound k is increased and the procedure iterates further (line 11).
Algorithm 3 Top-level procedure of the k-induction algorithm.
Input:  M = ⟨V, I, T⟩ a transition system; P a property over V.
Output: ⟨res, cex⟩ with res ∈ {Success, Fail}; cex a (possibly empty) initial path representing a counterexample.
   1: procedure TemporalInductionModelChecking(M, P)
   2:     k ← 0
   3:     while true do
   4:         BASE(k) ← Π₀(k) ∧ ⋁_{i=0}^{k} ¬P(V^i)
   5:         if ∃π_{0,k} ⊨ BASE(k) then
   6:             return ⟨Fail, π_{0,k}⟩
   7:         STEP(k) ← Π(0, k+1) ∧ ⋀_{i=0}^{k} P(V^i) ∧ ¬P(V^{k+1})
   8:         UNIQUE(k) ← ⋀_{0≤i<j≤k} s_i ≠ s_j
   9:         if ∄π ⊨ STEP(k) ∧ UNIQUE(k) then
  10:             return ⟨Success, −⟩
  11:         k ← k + 1
Theorem 5.
Given a transition system M, there exists a safe inductive invariant with respect to M iff there exists a safe k-inductive invariant with respect to M.
Theorem 5 states that k-induction is as complete as 1-induction.
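A sketch of Algorithm 3 by explicit path enumeration on the safe four-state system of Example 7 (helper names are illustrative; real implementations pose BASE, STEP and UNIQUE as SAT queries):

```python
from itertools import product

# k-induction by explicit path enumeration on the four-state system of
# Example 7 (a safe system: state 11 is unreachable). States are (v0, v1)
# pairs; TRANS maps each state to its set of successors.
STATES = list(product((False, True), repeat=2))
INIT = {(False, False)}
TRANS = {(False, False): {(True, False)}, (True, False): {(False, True)},
         (False, True): {(False, False)}, (True, True): {(False, True)}}

def prop(s):
    """P = !v0 | !v1: only state 11 is bad."""
    return not (s[0] and s[1])

def paths(length, starts):
    """All state sequences with `length` transitions starting in `starts`."""
    acc = [[s] for s in starts]
    for _ in range(length):
        acc = [p + [t] for p in acc for t in TRANS[p[-1]]]
    return acc

def k_induction(k_max=10):
    for k in range(k_max + 1):
        # BASE(k): some initial path reaching !P within k steps?
        if any(not prop(s) for p in paths(k, INIT) for s in p):
            return "Fail"
        # STEP(k) & UNIQUE(k): a loop-free path of k+1 P-states then a !P state?
        step_cex = any(all(prop(s) for s in p[:-1]) and not prop(p[-1])
                       and len(set(p[:-1])) == len(p[:-1])
                       for p in paths(k + 1, STATES))
        if not step_cex:
            return "Success"
    return "Undef"
```

On this system the property is already inductive, so the procedure returns Success at the very first bound; on the unsafe variant used in the BMC example it would instead return Fail once the base case reaches bound 3.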

5. Interpolation-Based Model Checking

Craig’s interpolation theorem is a fundamental result in mathematical logic concerning the relationship between model theory and proof theory. Interpolation, despite its heavy theoretical foundation, has found many practical applications in different areas of computer science. The original formulation of the theorem introduced by Craig [40] was given in the context of first-order logic. Variants of the theorem hold for other logical systems as well, including propositional logic, which is the context we will focus on hereinafter, as it is the one typically encountered in model checking.
Theorem 6
(Propositional Craig’s Interpolation Theorem). Given two propositional formulas A and B, if A ∧ B is unsatisfiable, then there is a propositional formula I, called interpolant between A and B, such that
  • A ⇒ I is valid;
  • I ∧ B is unsatisfiable;
  • Vars(I) ⊆ Vars(A) ∩ Vars(B).
Intuitively, I is an abstraction of A from the standpoint of B. The interpolant summarizes and translates, into the common language between the two formulas, the reasons why A and B are inconsistent. For ease of notation, we denote with I = Itp(A, B) the procedure that derives a Craig interpolant starting from a pair of inconsistent formulas A and B.
Interpolants can be derived from refutation proofs of unsatisfiable SAT-solving runs. Given an unsatisfiable formula A B , most modern SAT solvers are capable of generating a refutation proof, as described in Section 2.6, either in resolution-based or clausal form. Given a resolution proof, an interpolant I = Itp(A, B) can be derived in the form of an AND/OR combinational circuit in polynomial time and space with respect to the size of the proof. Different algorithms for generating interpolants from resolution proofs have been proposed, such as the ones by Huang [41], Krajícek [42], Pudlák [43] and McMillan [44]. We refer to those algorithms as interpolation systems. Interpolants can also be derived as CNF formulas, instead of circuits, from either resolution proofs [45] or clausal proofs [46]. Although interpolants may be quite large in size, different approaches exist to compact them, also taking into account the relation between the size and strength of the interpolants themselves [47].
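As a conceptual illustration, the strongest interpolant between A and B is the existential projection of A onto the variables it shares with B; the brute-force sketch below computes it by enumeration (real interpolation systems instead extract I from a resolution proof of A ∧ B, as discussed above; all helper names are illustrative):

```python
from itertools import product

# Strongest Craig interpolant by existential projection: quantify away the
# variables of A that B does not mention, by enumerating the models of A and
# keeping only their restriction to the shared variables.
def strongest_interpolant(A, vars_a, vars_b):
    shared = [v for v in vars_a if v in vars_b]
    models = set()
    for bits in product((False, True), repeat=len(vars_a)):
        asg = dict(zip(vars_a, bits))
        if A(asg):
            models.add(tuple(asg[v] for v in shared))
    def itp(asg):
        return tuple(asg[v] for v in shared) in models
    return itp, shared

# A = x & (x -> y) and B = !y & (y | z) are jointly unsatisfiable; the only
# shared variable is y, and the projection of A yields I = y.
A = lambda a: a["x"] and (not a["x"] or a["y"])
B = lambda a: (not a["y"]) and (a["y"] or a["z"])
itp, shared = strongest_interpolant(A, ["x", "y"], ["y", "z"])
```

Here `itp` behaves as the formula y: it is implied by A, it mentions only shared variables, and it is inconsistent with B.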
In the context of model checking, if A represents a set of reachable states and B represents a set of bad states, then the interpolant I = Itp(A, B) can be deemed as a safe overapproximation of A with respect to B. In turn, such overapproximations can be used to detect a reachability fix-point.

5.1. McMillan’s Interpolation Algorithm

McMillan [44] introduced the first complete algorithm for symbolic model checking based on Craig’s interpolation. In the literature, such an algorithm is called standard interpolation or just ITP for short. The algorithm computes Craig’s interpolants to overapproximate reachable states of the system. The interpolants, as mentioned in the previous section, are computed from refutation proofs of unsatisfiable BMC runs.
The algorithm comprises two nested loops. The outer loop is presented in procedure ItpModelChecking (Algorithm 4) whereas the inner loop is presented in procedure ApproxForwardTraversal (Algorithm 5). At each outer loop iteration, ApproxForwardTraversal is invoked to perform an overapproximated forward traversal of the reachable states while preserving safety with respect to a backward unrolling from the target (bad cone). ApproxForwardTraversal can be seen as computing a safe monotonic trace. Note that such a trace is not explicitly maintained. Only its final frame is kept at each iteration and used as base for computing the next one.
We describe Algorithm 5 first. The procedure ApproxForwardTraversal operates a forward traversal in which interpolation is used as an overapproximated image operator. At each iteration, the procedure checks a BMC formula of fixed length k, composed of two parts:
  • A ≝ R(V⁰) ∧ T(V⁰, V¹)
  • B ≝ Cone(1, k) = Π(1, k) ∧ ⋁_{i=1}^{k} ¬P(V^i)
where R is a set of overapproximated forward reachable states. In such a scenario, A represents the image of the set of states at the current traversal step, whereas B represents the set of backward reachable bad states, reachable in at most k 1 transitions from the target.
Algorithm 4 Top-level procedure of McMillan’s interpolation algorithm. This iterates overapproximated forward traversals of reachable states while keeping safety with respect to a bad cone of increasing depth from the target.
Input:  M = ⟨V, I, T⟩ a transition system; P a property over V.
Output: ⟨res, cex⟩ with res ∈ {Success, Fail}; cex a (possibly empty) initial path representing a counterexample.
   1: procedure ItpModelChecking(M, P)
   2:     if ∃s₀ ⊨ I(V) ∧ ¬P(V) then
   3:         return ⟨Fail, (s₀)⟩
   4:     k ← 1
   5:     while true do
   6:         ⟨res, cex⟩ ← ApproxForwardTraversal(M, P, k)
   7:         if res is Unreach then
   8:             return ⟨Success, −⟩
   9:         else if res is Reach then
  10:             return ⟨Fail, cex⟩
  11:         k ← k + 1
Algorithm 5 Inner procedure of McMillan’s interpolation algorithm. This operates an overapproximated forward traversal of the reachable state space while keeping safety with respect to a bad cone of fixed depth from the target.
Input: M = ⟨V, I, T⟩ a transition system; P a property over V; k bound of a backward unrolling from the target.
Output: 〈res, cex〉 with res ∈ {Reach, Unreach, Undef}; cex a (possibly empty) initial path representing a counterexample.
 1: procedure ApproxForwardTraversal(M, P, k)
 2:     R ← I
 3:     if ∃ π_{0,k} ⊨ R(V_0) ∧ T(V_0, V_1) ∧ Cone(1, k) then
 4:         return 〈Reach, π_{0,k}〉
 5:     while ⊤ do
 6:         A ← R(V_0) ∧ T(V_0, V_1)
 7:         B ← Cone(1, k)
 8:         if ∃ π_{0,k} ⊨ A ∧ B then
 9:             return 〈Undef, −〉
10:         else
11:             I ← Itp(A, B)
12:             if ∄ s ⊨ I ∧ ¬R then
13:                 return 〈Unreach, −〉
14:             R ← R ∨ I
At the first iteration, R is the set of initial states I (line 2) and is therefore not overapproximated. In that case, the formula A ∧ B is exactly a BMC formula of length k. If such a formula is satisfiable, the algorithm has found a counterexample π_{0,k} of length at most k (lines 3–4). Then, both the nested and the top-level procedures terminate, returning the counterexample found. Otherwise, the algorithm starts traversing the reachable state space, computing overapproximated images by means of interpolation (lines 5–14). After composing A and B (lines 6–7), the algorithm checks whether A ∧ B is satisfiable (line 8). If that is the case, a (possibly spurious) counterexample π_{0,k} is found, i.e., a path of length at most k from the overapproximated set R to a bad state in Cone(1, k). Since R is an overapproximation of the reachable states, the algorithm cannot determine whether π_{0,k} is a real counterexample or a spurious one. Therefore, the procedure aborts the current forward traversal and returns control to the top-level procedure. Otherwise, A ∧ B is Unsat and an interpolant I can be derived from its refutation proof (line 11). Such an interpolant represents a safe overapproximation of the states reachable from R in one transition with respect to the bad cone Cone(1, k), as illustrated in Figure 6. Therefore, no counterexample can be reached from I in k − 1 or fewer transitions, i.e., I is safe with respect to P.
Since the interpolant is an overapproximation of the image of R, it is treated as a candidate inductive invariant. The algorithm checks whether I ⇒ R holds (i.e., whether I ∧ ¬R is unsatisfiable). In that case, R is an inductive invariant for M and, since R is safe with respect to P, it is also an inductive strengthening for P. Otherwise, a new set of overapproximated forward reachable states is computed as R ∨ I and the algorithm iterates once again. The result of the nested procedure, i.e., the sequence of R composed at each iteration, can be seen as a safe monotonic trace. Monotonicity stems from having R initialized with I (line 2) and from the fact that each consecutive R is the disjunction of the previous one and of an overapproximation of the states reachable from the previous one (line 14). Safety with respect to P stems from the fact that I was proved to be safe (Algorithm 4, lines 2–4), taking into account that each interpolant I used to construct R does not intersect Cone(1, k). From Lemma 7 follows the detection of an inductive strengthening for P from that trace. The sequence of interpolants, instead, can be seen as a non-monotonic safe trace.
Considering Algorithm 4, the initial states I are first checked to be safe (lines 2–3). If said check fails, there exists a trivial counterexample consisting of a single initial state. The procedure can then halt and return the counterexample found. Otherwise, the bad cone bound k is initialized to 1 (line 4) and the procedure starts iterating forward overapproximated traversals of the reachable states, while keeping safety with respect to a bad cone of increasing depth k from the target (lines 5–11). Increasing the value of k lets the algorithm find real counterexamples and generate more precise overapproximations in subsequent iterations. As the bad cone from the target unwinds, a few states in the overapproximated images computed at previous iterations might be reached by the unrolling and are therefore excluded from newer images. During each outer loop iteration, the inner procedure restarts the forward traversal from scratch from the initial states. As k increases, the algorithm is guaranteed to find a bound k at which the computed interpolants are precise enough to produce an inductive strengthening if P holds for the system, or to find a real counterexample otherwise [44].

5.2. Example of Interpolation-Based Model Checking

Let M = ⟨V, I, T⟩ be a transition system described by the following:
V = {v_0, v_1, v_2}
I = ¬v_0 ∧ ¬v_1 ∧ ¬v_2
T = (¬v_0 ∧ ¬v_1 ∧ ¬v_2 → v_0′ ∧ ¬v_1′ ∧ ¬v_2′) ∧ (v_0 ∧ ¬v_1 ∧ ¬v_2 → ¬v_0′ ∧ ¬v_1′ ∧ v_2′) ∧ (¬v_0 ∧ ¬v_1 ∧ v_2 → v_0′ ∧ ¬v_1′ ∧ v_2′) ∧ (v_0 ∧ ¬v_1 ∧ v_2 → ¬v_0′ ∧ ¬v_1′ ∧ ¬v_2′) ∧ (¬v_0 ∧ v_1 ∧ ¬v_2 → v_0′ ∧ v_1′ ∧ ¬v_2′) ∧ (v_0 ∧ v_1 ∧ ¬v_2 → v_0′ ∧ v_1′ ∧ v_2′) ∧ (v_0 ∧ v_1 ∧ v_2 → v_0′ ∧ v_1′ ∧ v_2′) ∧ (¬v_0 ∧ v_1 ∧ v_2 → v_0′ ∧ v_1′ ∧ v_2′)
and P be an invariant property over V:
P = ¬v_0 ∨ ¬v_1 ∨ ¬v_2
Figure 7 illustrates different steps of the procedure reported in Algorithms 4 and 5 over M. The ItpModelChecking procedure starts by checking whether or not the initial states I are safe. In the case of the system at hand, the single initial state 000 satisfies P; therefore, the procedure continues by invoking ApproxForwardTraversal with k = 1. During the first iteration of the traversal procedure, R is set to the initial states I, A represents the image of I and B corresponds to the negation of the property ¬P. Suppose that the interpolant I extracted at line 11 of Algorithm 5 is the one described in Figure 7a. I overapproximates the image of R, comprising both states that are actually reachable from I, like 001, and states that are not reachable from I, like 010 and 011. Figure 7b describes the situation after the current set of reachable states R has been updated by disjoining it with the interpolant I. During the second iteration of the traversal procedure, an intersection between the image of the current set of reachable states A and the bad states B will be found (state 111, as indicated by both the solid and the dash–dotted line enclosing it in Figure 7b). The traversal procedure will then halt reporting an Undef result: this is the case in which a spurious counterexample to the property has been found, due to the interpolant I being too loose an overapproximation of the states actually reachable from I, thus including a state (011) that is backward reachable from a bad state but not forward reachable from the initial states. The ItpModelChecking procedure will then resume by increasing k and invoking the traversal procedure once again with an enlarged cone of bad states. Figure 7c describes the situation during the first step of the new traversal, in which R and A are left unchanged whereas B is enlarged to include the pre-image of ¬P (states 011 and 110).
Suppose that a new interpolant I is extracted that includes only states 000, 001, 100 and 101; then, after updating the current set of reachable states R and performing a second traversal step, a fix-point will be reached as illustrated in Figure 7d. The overall model checking procedure will end with a Success result.
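The overall two-loop scheme can be mimicked on this example with explicit state sets. In the sketch below, which is an illustrative assumption rather than a faithful implementation, the exact image operator stands in for the interpolant Itp(A, B) (the tightest admissible overapproximation), so the Undef branch can never actually fire here; a real interpolant may be looser and trigger it, as in the walkthrough above.

```python
# Explicit-state sketch of the ItpModelChecking / ApproxForwardTraversal
# interplay on the example system. The exact image stands in for the
# interpolant, so no spurious Undef outcome occurs in this best case.
T = {"000": "100", "100": "001", "001": "101", "101": "000",
     "010": "110", "110": "111", "111": "111", "011": "111"}
INIT, BAD = {"000"}, {"111"}

def image(states):
    return {T[s] for s in states}

def preimage(states):
    return {s for s, t in T.items() if t in states}

def cone(k):
    c = set(BAD)
    for _ in range(k):
        c |= preimage(c)
    return c

def approx_forward_traversal(k):
    R = set(INIT)
    if image(R) & cone(k):       # BMC-style check of length k
        return "Reach"
    while True:
        I = image(R)             # stand-in for the interpolant Itp(A, B)
        if I & cone(k):
            return "Undef"       # possibly spurious counterexample
        if I <= R:
            return "Unreach"     # fixpoint: R is an inductive invariant
        R |= I

assert not (INIT & BAD)          # the initial state is safe
k = 1
while (res := approx_forward_traversal(k)) == "Undef":
    k += 1
print(res, k)
```

Since 111 is not forward reachable from 000, the traversal closes on the reachable set {000, 100, 001, 101} and reports Unreach, matching the Success outcome of the example.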

6. IC3

IC3 [48] is a SAT-based algorithm for symbolic invariant verification. Given a transition system M = ⟨V, I, T⟩ and an invariant property P over V to be checked, the target of IC3 is finding an inductive strengthening of P for M.
In order to carry this out, IC3 maintains two main data structures: a trace and a proof-obligation queue.
The trace F_k = (F_0, …, F_k) is both monotonic and safe with respect to the property P. Each frame F_i, with 0 ≤ i < k, is a safe overapproximation of the set of states reachable in at most i steps in M (see Lemmas 4 and 5). The purpose of the algorithm is to iteratively refine such an F_k in order to satisfy the condition F_{i+1} ⇒ F_i for some 0 ≤ i < k, thus finding an inductive strengthening of P according to Lemma 6.
IC3 also maintains a second data structure called the proof-obligation queue, which is used to collect sets of states in F_k that may reach a violation of the property in some number of steps. Those sets of states are processed by the algorithm according to a given priority and, for each of them, IC3 either finds a backward path to the initial states or learns a new inductive lemma that can be used to refine F_k so as to exclude such states from the overapproximation. In the former case, the algorithm has found a counterexample to P. In the latter case, the algorithm continues its search for an inductive strengthening of P over a tighter approximation of the reachable state sets.
IC3 needs to solve SAT calls at different stages during its run. Such SAT queries are peculiar [49], in the sense that they are very frequent but involve only a single instance of the transition relation T. By performing many local reachability checks, IC3 achieves better control over the precision of the computed overapproximations.
Given F_k, each frame F_i is represented by a set of clauses, denoted by clauses(F_i), to enable efficient syntactic checks for frame equality. The base condition of the trace is fulfilled by initializing the first frame with I. Monotonicity is maintained syntactically by enforcing the condition clauses(F_{i+1}) ⊆ clauses(F_i). On the other hand, image approximation and safety are guaranteed explicitly by the algorithm's operations.
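The clause-set representation and the syntactic monotonicity invariant can be sketched as follows; the integer-literal encoding (a negative integer denotes a negated variable) is an illustrative choice, not mandated by IC3.

```python
# Frames as clause sets: a learned clause is added to every frame up to its
# level, so clauses(F_{i+1}) ⊆ clauses(F_i) holds by construction.
# Clauses are frozensets of integer literals (negative = negated variable).
frames = [set(), set(), set()]               # F_0, F_1, F_2

def add_clause(frames, clause, j):
    """Learn `clause` at level j: add it to F_1 .. F_j."""
    for i in range(1, j + 1):
        frames[i].add(frozenset(clause))

def monotonic(frames):
    """Syntactic monotonicity: each frame contains the next frame's clauses."""
    return all(frames[i + 1] <= frames[i] for i in range(1, len(frames) - 1))

add_clause(frames, [-1, 2], 2)               # (¬v1 ∨ v2) learned at level 2
add_clause(frames, [-3], 1)                  # (¬v3) learned at level 1
print(monotonic(frames), [len(f) for f in frames])
```

Representing frames as clause sets also makes the termination test F_i = F_{i+1} a cheap syntactic set comparison.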
The main procedure of IC3, described in Algorithm 6, is composed of an initialization phase followed by two nested iterations. During initialization, the algorithm sets up the trace and ensures that the initial states are safe (lines 2–5). If that is not the case, a counterexample of length zero, comprising only the initial state s violating the property, is found and the procedure outputs a failure. During each outer iteration (lines 6–17), the algorithm tries to prove that P is satisfied up to k steps in M , for increasing values of k. Inner iterations of the algorithm refine the trace F k computed so far by adding new relative inductive clauses to some of its frames (lines 7–11). The algorithm iterates until either an inductive strengthening of the property is generated (line 16) or a counterexample to the property is found (line 11).
Algorithm 6 Top-level procedure of the IC3 algorithm: Ic3 ( M , P).
Input: M = ⟨V, I, T⟩ a transition system; P a property over V.
Output: 〈res, cex〉 with res ∈ {Success, Fail} and cex a (possibly empty) initial path representing a counterexample.
 1: procedure Ic3(M, P)
 2:     k ← 0
 3:     F_0 ← I
 4:     if ∃ s ⊨ F_0 ∧ ¬P then
 5:         return 〈Fail, (s)〉
 6:     repeat
 7:         while ∃ s ⊨ F_k ∧ ¬P do
 8:             q ← Extend(s)
 9:             〈res, cex〉 ← BlockCube(q, Q, F_k)
10:             if res = Fail then
11:                 return 〈Fail, cex〉
12:         F_{k+1} ← ⊤
13:         k ← k + 1
14:         Propagate(F_k)
15:         if F_i = F_{i+1} for some 0 ≤ i < k then
16:             return 〈Success, −〉
17:     until forever
At a given outer iteration k, the algorithm has already computed a monotonic and safe trace. From Lemma 5, it follows that P is satisfied up to k − 1 steps in M. IC3 then tries to prove that P is satisfied up to k steps as well, by enumerating states of F_k that violate P and trying to block them in F_k.
Definition 85
(Cube Blocking). Blocking a state (or, more generally, a cube) q in a frame F k means to prove q unreachable within k steps in M and, consequently, to refine F k so that q is excluded from its overapproximation.
To enumerate each state of F k that violates P (line 7), the algorithm looks for states s in F k ¬ P . Such states are called bad states and can be found as satisfying assignments for the following SAT query:
SAT?(F_k ∧ ¬P)    (57)
If a bad state s can be found (i.e., Query (57) is SAT), the algorithm tries to block it in F k . To increase performance of the algorithm, as suggested in [50], the bad state s found is first extended to a bad cube q by removing, if possible, some of its literals. Figure 8a visualizes this step. The Extend(s) procedure (line 8), not reported here, performs this operation via either ternary simulations [50] or other SAT-based procedures [51]. The resulting cube q is stronger than s and it still violates P; it is thus called a bad cube. The algorithm then tries to block the bad cube q in frame k rather than s. It is shown in [50] that extending bad states into bad cubes before blocking them dramatically improves IC3 performance. The bad cube q is blocked in F k calling the BlockCube(q, Q, F k ) procedure (line 9), described in Algorithm 7.
When no further bad states can be found, the conditions in Definition 82 hold for k + 1 and IC3 can safely move to the next outer iteration, i.e., trying to prove that P is satisfied up to k + 1 steps. Before moving to the next iteration, a new empty frame F_{k+1} is created (line 12). Initially, clauses(F_{k+1}) = ∅, so that F_{k+1} = ⊤. Note that ⊤ is a valid overapproximation of the set of states reachable within k + 1 steps in M.
After creating a new frame, a phase called clause propagation takes place (line 14). During such a phase, IC3 tries to refine every frame F_i, with 0 < i ≤ k, by checking whether some of its clauses can be pushed forward to the following frame. Possibly, clause propagation refines the outermost frame F_k so that an inductive invariant is reached. The propagation phase can lead to two adjacent frames becoming equivalent (see Figure 8g). If that happens, the algorithm has found an inductive strengthening of P for M (equal to those frames). Therefore, following Lemma 6, property P holds for every reachable state of M and IC3 returns a successful result (line 16). Procedure Propagate(F_k), described and discussed later, handles the clause propagation phase.
The purpose of procedure BlockCube(q, Q, F k ) is to refine the trace F k in order to block a bad cube q in F k . To preserve image approximation, prior to blocking a cube in a certain frame, IC3 has to recursively block predecessors of that cube in the preceding frames. To keep track of the states (or cubes) that must be blocked in certain frames, IC3 uses the formalism of proof obligations.
Definition 86
(Proof Obligation). Given a cube q and a frame F_j, a proof obligation is a pair (q, j) formalizing the fact that q must be blocked in F_j.
Given a proof obligation  ( q , j ) , the cube q can either represent a set of bad states or a set of states that can reach a bad state in some number of transitions. The index j indicates the position in the trace where q must be proved unreachable, or else the property fails.
Definition 87
(Discharging Proof Obligations). A proof obligation (q, j) is said to be discharged when q becomes blocked in F_j.
Algorithm 7 Cube blocking procedure of IC3: BlockCube(q, Q, F_k).
Input: q a bad cube in F_k; Q the proof-obligation queue; F_k the trace.
Output: 〈res, cex〉 with res ∈ {Success, Fail} and cex a (possibly empty) initial path representing a counterexample.
 1: procedure BlockCube(q, Q, F_k)
 2:     Add proof obligation (q, k) to Q
 3:     while Q is not empty do
 4:         Extract proof obligation (q, j) with minimum j from Q
 5:         if j > k or q ∧ F_j is Unsat then continue
 6:         if j = 0 then return 〈Fail, ReconstructCex(q)〉
 7:         if ∃ s, s′ ⊨ F_{j−1} ∧ ¬q ∧ T ∧ q′ then
 8:             p ← Extend(s)
 9:             Add proof obligations (p, j − 1) and (q, j) to Q
10:         else
11:             c ← Generalize(j, q, F_k)
12:             F_i ← F_i ∧ c for 0 < i ≤ j
13:             Add proof obligation (q, j + 1) to Q
14:     return 〈Success, −〉
In order to discharge a proof obligation, new ones may have to be recursively discharged. This can be carried out through a recursive implementation of the cube blocking procedure. However, in practice, handling proof obligations using a priority queue Q proved to be more efficient [50] and it is thus the commonly used approach in most state-of-the-art IC3 implementations. While blocking a cube, proof obligations ( q , j ) are extracted from Q and discharged for increasing values of j, ensuring that every predecessor of a bad cube q will be blocked in F j ( j < k ) before q will be blocked in F k . In the BlockCube(q, Q, F k ) procedure, described in Algorithm 7, the queue of proof obligations is initialized with ( q , k ) , encoding the fact that q must be blocked in F k  (line 2). Then, proof obligations are iteratively extracted from the queue and discharged (lines 3–14).
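The priority-queue discipline described above (always extract the obligation with minimum frame index) can be sketched with the standard library heap; the tuple encoding of obligations is an illustrative assumption.

```python
# Proof obligations kept in a priority queue keyed on the frame index, so
# that (q, j) entries are extracted for increasing j, as in BlockCube.
# Cubes are modeled as sorted tuples of integer literals.
import heapq

queue = []

def push(j, cube):
    """Enqueue proof obligation (cube, j); heapq orders by j first."""
    heapq.heappush(queue, (j, tuple(sorted(cube))))

def pop_min():
    """Extract the pending obligation with minimum frame index."""
    j, cube = heapq.heappop(queue)
    return j, cube

push(3, [1, -2])
push(1, [2, 3])
push(2, [-1])
order = [pop_min()[0] for _ in range(3)]
print(order)  # [1, 2, 3]
```

Extracting by minimum j guarantees that all predecessors of a cube are handled in earlier frames before the cube itself is reconsidered.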
Prior to discharging a proof obligation ( q , j ) , IC3 checks whether that proof obligation still needs to be discharged. It is in fact possible for an enqueued proof obligation to become discharged as a result of some previous proof-obligation discharging. To perform this check, the algorithm tests whether q is still included in F j  (line 5). This can be performed by posing the following SAT query:
SAT?(F_j ∧ q)    (58)
If q is in F j (i.e., Query (58) is SAT), the proof obligation ( q , j ) still needs to be discharged. Otherwise, q has already been blocked in F j and the procedure can move on to the next iteration.
If the proof obligation ( q , j ) still needs to be discharged, then IC3 checks whether F j is the initial frame (line 6). If so, the states represented by q are initial states that can reach a violation of property P. Thus, a counterexample to P can be constructed by following the chain of proof obligations that led to ( q , 0 ) . This is performed by means of the procedure ReconstructCex(q), not described here. In that case, the procedure terminates with a failure and returns the counterexample found.
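One simple way to realize the counterexample reconstruction is to let each proof obligation record the obligation that spawned it and walk the parent chain from the obligation that reached frame 0; the Obligation class and its fields below are hypothetical, since the text does not fix an implementation.

```python
# Sketch of counterexample reconstruction via parent links between proof
# obligations. The Obligation class and field names are hypothetical.
class Obligation:
    def __init__(self, cube, j, parent=None):
        self.cube, self.j, self.parent = cube, j, parent

def reconstruct_cex(ob):
    """Follow parent links, yielding a path from an initial state to the target."""
    path = []
    while ob is not None:
        path.append(ob.cube)
        ob = ob.parent
    return path

bad = Obligation("111", 2)               # the original bad cube
mid = Obligation("011", 1, parent=bad)   # its CTI, one frame earlier
init = Obligation("000", 0, parent=mid)  # an initial state was reached
print(reconstruct_cex(init))  # ['000', '011', '111']
```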
To discharge a proof obligation (q, j), i.e., to block a cube q in F_j, IC3 tries to derive a clause c such that c ⊆ ¬q and c is inductive relative to F_{j−1}. Figure 8b visualizes this step. The initiation condition of induction (I ⇒ ¬q) holds by construction, whereas the algorithm must check whether or not the relative consecution condition (F_{j−1} ∧ ¬q ∧ T ⇒ ¬q′) holds. This can be performed by proving that the following SAT query is unsatisfiable (line 7):
SAT?(F_{j−1} ∧ ¬q ∧ T ∧ q′)    (59)
If relative consecution holds (i.e., Query (59) is UNSAT), then the clause ¬q is inductive relative to F_{j−1} and can be used to refine F_j, ruling out q (lines 10–13). To pursue a stronger refinement of F_j, the inductive clause found undergoes a process called inductive generalization (line 11) prior to being added to the frame. Inductive generalization is carried out by the Generalize(j, q, F_k) procedure, described in Algorithm 8, which tries to minimize the number of literals in the clause c = ¬q while maintaining its inductiveness relative to F_{j−1}.
The resulting clause is added not only to F_j, but also to every frame F_i, 0 < i < j (line 12). Doing so discharges the proof obligation (q, j), ruling out q from every F_i with 0 < i ≤ j. Since the sets F_i with i > j are larger than F_j, q may still be present in one of them and (q, j + 1) may become a new proof obligation. To address this issue, Algorithm 7 adds (q, j + 1) to the proof-obligation queue (line 13). Figure 8e,f visualize this step.
Otherwise, if relative consecution does not hold (see Figure 8c), there is a predecessor s of q in F_{j−1} ∧ ¬q. Such predecessors are called counterexamples to inductiveness (CTIs). To preserve image approximation, before blocking a cube q in a frame F_j, every CTI of q must be blocked in F_{j−1} (Figure 8d). Therefore, the CTI s is first extended into a cube p (line 8), and then both proof obligations (p, j − 1) and (q, j) are added to the queue (line 9).
The Generalize(j, q, F_k) procedure (Algorithm 8) performs inductive generalization, which is a fundamental step of the algorithm. During inductive generalization, given a clause ¬q inductive relative to F_{j−1}, IC3 tries to compute a sub-clause c of ¬q such that c is still inductive relative to F_{j−1}. Acting this way, and using c to refine the frames, the algorithm can block not only the original bad cube q but potentially also other states, which in turn may lead to faster convergence. Inductive generalization works by dropping literals from the input clause while maintaining relative inductiveness with respect to F_{j−1}. The procedure computes a minimal inductive sub-clause, i.e., a sub-clause that is inductive relative to F_{j−1} and from which no further literals can be dropped without forgoing inductiveness [52]. In the general case, though, finding a minimal inductive sub-clause is often inefficient [48] and, thus, most implementations of IC3 rely on an approximate version of such a procedure. Approximate inductive generalization is significantly less expensive, yet still able to drop a reasonable number of literals.
Algorithm 8 Iterative inductive generalization procedure: Generalize(j, q, F_k).
Input: j a frame index; q a cube such that F_{j−1} ∧ ¬q ∧ T ⇒ ¬q′; F_k the trace.
Output: c a clause such that c ⊆ ¬q and F_{j−1} ∧ c ∧ T ⇒ c′.
 1: procedure Generalize(j, q, F_k)
 2:     c ← ¬q
 3:     for all literals l in c do
 4:         t ← c ∖ l
 5:         if ∄ s, s′ ⊨ F_{j−1} ∧ t ∧ T ∧ ¬t′ then
 6:             if ∄ s ⊨ I ∧ ¬t then
 7:                 c ← t
 8:     return c
In the Generalize(j, q, F_k) procedure, a clause c, initialized with ¬q (line 2), represents the current inductive sub-clause. For each literal of c, the candidate clause t is obtained by discarding that literal from c (line 4). Dropping literals from a relative inductive clause can violate both the initiation and the relative consecution conditions of induction. The candidate clause t must thus be checked for inductiveness relative to F_{j−1}. IC3 checks whether the relative consecution condition (F_{j−1} ∧ t ∧ T ⇒ t′) keeps holding for t by posing the following SAT query:
SAT?(F_{j−1} ∧ t ∧ T ∧ ¬t′)    (60)
If the relative consecution condition holds (i.e., Query (60) is UNSAT), the algorithm needs to prove the initiation condition of induction (I ⇒ t). This check (line 6) can be performed either syntactically or semantically. If I can be described as a cube, it is enough to check whether at least one of the literals of I appears, with the same polarity, in t. In such a case, ¬t does not intersect I and the initiation condition holds. Otherwise, the initiation of induction must be checked explicitly by posing the following SAT query:
SAT?(I ∧ ¬t)    (61)
If Query (61) is UNSAT, the initiation condition holds for t. If both the relative consecution and the initiation conditions still hold for the candidate clause t, the current inductive sub-clause c can be updated with t (line 7).
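The literal-dropping loop can be demonstrated end to end on a tiny explicit-state system, with the two induction checks done by brute-force enumeration instead of the SAT queries above; the 8-state system, the frame F_{j−1} and the clause encoding (pairs of variable index and polarity) are all hypothetical choices for illustration.

```python
# Literal-dropping generalization in the spirit of Generalize, with the
# consecution and initiation checks performed by enumeration over a toy
# 8-state system. A clause is a set of (variable index, polarity) literals.
from itertools import product

T = {"000": "100", "100": "001", "001": "101", "101": "000",
     "010": "110", "110": "111", "111": "111", "011": "111"}
INIT = {"000"}

def sat_clause(state, clause):
    """True iff `state` (a bit string) satisfies at least one literal."""
    return any(state[i] == ("1" if pos else "0") for i, pos in clause)

def consecution(F, clause):
    """F ∧ c ∧ T ⇒ c′: successors of F-states satisfying c also satisfy c."""
    return all(sat_clause(T[s], clause) for s in F if sat_clause(s, clause))

def initiation(clause):
    """I ⇒ c: every initial state satisfies the clause."""
    return all(sat_clause(s, clause) for s in INIT)

def generalize(F, clause):
    c = set(clause)
    for lit in list(c):
        t = c - {lit}                    # drop one literal
        if t and initiation(t) and consecution(F, t):
            c = t                        # keep the stronger sub-clause
    return c

F_prev = {"000", "100", "001", "101"}            # F_{j-1}, an overapproximation
neg_q = {(0, True), (1, False), (2, False)}      # ¬q for the bad cube q = 011
print(generalize(F_prev, neg_q))
```

Here the three-literal clause shrinks to the single literal ¬v_1, which blocks not only 011 but every state with v_1 = 1, illustrating how generalization can speed up convergence.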
The Propagate(F_k) procedure (Algorithm 9) manages the propagation phase. For every clause c of each frame F_j, with 0 ≤ j ≤ k − 1, it checks whether c can be pushed forward to F_{j+1}. This is performed by checking whether c is inductive relative to F_j (line 4). The procedure must check whether the relative consecution F_j ∧ c ∧ T ⇒ c′ holds, by answering the following SAT query:
SAT?(F_j ∧ c ∧ T ∧ ¬c′)    (62)
If relative consecution holds (i.e., Query (62) is UNSAT), then c is inductive relative to F_j and can be pushed forward to F_{j+1}. Otherwise, c cannot be pushed forward and the algorithm moves to the next iteration.
Algorithm 9 Clause propagation procedure: Propagate ( F k ).
Input: F_k the current trace.
 1: procedure Propagate(F_k)
 2:     for j = 0 to k − 1 do
 3:         for all c ∈ F_j do
 4:             if ∄ s, s′ ⊨ F_j ∧ c ∧ T ∧ ¬c′ then
 5:                 F_j ← F_j ∖ {c}
 6:                 F_{j+1} ← F_{j+1} ∪ {c}

Property Directed Reachability

Property Directed Reachability (PDR) is a variant implementation of IC3 proposed by Een et al. in [50]. PDR differs from the original IC3 implementation by Bradley [48] as follows:
  • The trace is represented as sets of blocked cubes rather than learned clauses. Furthermore, to avoid duplication, PDR only stores a cube in the last frame where it holds. PDR also adds a special frame F_∞ which holds cubes that have been proved unreachable from the initial states in any number of transitions.
  • PDR uses a slightly modified trace semantics with respect to IC3, not requiring the last frame of the trace to entail the property. Een et al. show that this behavior can be emulated by preprocessing the input system using a one-step target enlargement of P. This results in a small performance gain, simplifies the implementation and has the extra benefit that F_∞ is a proper invariant, which can be used to strengthen other proof engines or be exploited for synthesis. Furthermore, in principle, the same target-enlargement idea can be generalized and applied for any number k ≥ 1 of steps.
  • PDR proposes the use of ternary simulation as a method to shrink proof obligations before blocking them (implemented as the Extend procedure in Algorithm 7). In [50], ternary simulation is shown to have a big impact on performance.
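A minimal sketch of the ternary-simulation idea follows: re-simulate the circuit with one cube variable set to the unknown value X, and drop the corresponding literal if the "bad" output is still a definite 1. The two-input netlist and the property encoded by it are hypothetical, chosen only to show the mechanism.

```python
# Three-valued (ternary) simulation sketch of the Extend step. Gates are
# evaluated over {0, 1, X}; a literal is dropped when setting its variable
# to X still forces the bad output to 1. The netlist is hypothetical.
X = "X"

def t_and(a, b):
    """AND gate over {0, 1, X}."""
    if a == 0 or b == 0:
        return 0
    return 1 if (a == 1 and b == 1) else X

def bad_output(v):
    # ¬P for a hypothetical property P = ¬(v0 ∧ v1); v2 is irrelevant.
    return t_and(v["v0"], v["v1"])

cube = {"v0": 1, "v1": 1, "v2": 1}       # concrete bad state found by SAT
for var in list(cube):
    trial = dict(cube)
    trial[var] = X                       # forget this variable's value
    if bad_output(trial) == 1:           # output still forced: drop literal
        del cube[var]
print(cube)
```

The irrelevant literal on v2 is removed, so one ternary-simulation pass turns the single bad state into a bad cube covering two states, which is the effect [50] reports as decisive for performance.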

7. IGR

IGR [53], Interpolation with Guided Refinement, is a model checking algorithm that can be seen as a variant of standard interpolation. It incorporates, within an interpolation scheme, explicit trace computation and refinement, image and cone simplification under observability don't cares, and guided cone unwinding/rewinding.
Its main purpose is to improve standard interpolation with support for the incremental computation of reachable state sets and dynamic tuning of the backward unrolling from the target. Incremental data structures are used to enable the reuse of previously computed overapproximations and to make interpolant-based algorithms better suited for the verification of multiple properties or for integration with abstraction/refinement approaches. Furthermore, maintaining overapproximated reachable state information in an incremental data structure allows the algorithm to dynamically adjust the bound of the BMC formulas checked during each traversal step, to better control the precision of the computed image overapproximations.
In order to better describe the proposed variation, it is first necessary to introduce the foundations on which it is built, before characterizing the overall algorithm.

7.1. Incremental State Sets in Interpolation

In order to enable the reuse of previously computed interpolants, in IGR, a trace of overapproximations to reachable states is maintained and incrementally refined. Since interpolants are safe image overapproximations with respect to the property under verification, the trace maintained is safe as well.
At the i-th iteration of standard ITP's inner loop, an overapproximation of the states reachable in i steps in the system is computed by extracting an interpolant from the refutation proof of a BMC formula. Such an interpolant is then discarded at the end of the iteration. Furthermore, when a spurious counterexample is found, the current forward traversal is interrupted, the backward cone from the target is unwound by one step and the forward traversal of reachable states restarts once again from the initial states. One of the focus points of IGR is to keep track of the overapproximations computed during each run of ApproxForwardTraversal, in order to enable their reuse in further iterations of the outer loop. It is thus necessary to extend the standard interpolation algorithm to keep track of a trace of reachable states. As the bound k of the cone increases, stronger overapproximations are computed at each traversed time frame and used to refine the trace.
A trace F_k = (F_0, …, F_k) is used to keep track of previously computed overapproximations. From Lemma 3, it follows that each time frame F_i of F_k, with 0 ≤ i < k, is an overapproximation of the set of states that are reachable in exactly i steps. The trace is constructed so that it is safe with respect to P.
The proposed ITP variant, informally called IncrItpModelChecking, is sketched in Algorithms 10 and 11. The differences between the variant and standard interpolation are the following:
  • The trace is initialized with F_0 = I before starting the first forward traversal (Algorithm 10, line 4).
  • Each time the forward traversal reaches the end of the current trace, a new frame F_{i+1} is instantiated equal to ⊤ and added to the trace (Algorithm 11, lines 7–8).
  • Every time a new interpolant, overapproximating the states reachable in i + 1 steps, is computed, the corresponding frame F_{i+1} in the trace is refined (Algorithm 11, line 15).
Refinement is a strengthening step carried out by conjoining the previous set with a new term.
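The incremental trace discipline can be sketched with explicit state sets standing in for formulas, so that conjunction becomes intersection; the frame contents below are hypothetical interpolants, used only to show the refinement mechanics.

```python
# Explicit-state sketch of IGR's incremental trace: frames start at ⊤ and
# each newly extracted interpolant strengthens its frame by conjunction,
# which on state sets is intersection. Values are hypothetical.
UNIVERSE = {f"{a}{b}{c}" for a in "01" for b in "01" for c in "01"}
trace = [{"000"}]                        # F_0 = I

def ensure_frame(trace, i):
    """Extend the trace with frames initialized to ⊤ up to index i."""
    while len(trace) <= i:
        trace.append(set(UNIVERSE))

def refine(trace, i, interpolant):
    """F_i ← F_i ∧ I: strengthen the frame with a new interpolant."""
    trace[i] &= interpolant

ensure_frame(trace, 1)
refine(trace, 1, {"100", "001", "011"})  # interpolant from the k = 1 traversal
refine(trace, 1, {"100", "001"})         # stronger interpolant at a larger k
print(trace[1])
```

Because each refinement only intersects, a frame never loses information learned at smaller bounds, which is exactly what permits the reuse across outer-loop iterations described above.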
Algorithm 10 Top-level procedure of the proposed ITP variant that keeps track of the computed interpolants using a trace.
Input: M = ⟨V, I, T⟩ a transition system; P a property over V.
Output: 〈res, cex〉 with res ∈ {Success, Fail}; cex a (possibly empty) initial path representing a counterexample.
 1: procedure IncrItpModelChecking(M, P)
 2:     if ∃ s_0 : s_0 ⊨ I(V) ∧ ¬P(V) then
 3:         return 〈Fail, (s_0)〉
 4:     F_k[0] ← I
 5:     k ← 1
 6:     while true do
 7:         〈res, cex〉 ← IncrApproxForwardTraversal(M, P, F_k, k)
 8:         if res is Unreach then
 9:             return 〈Success, −〉
10:         else if res is Reach then
11:             return 〈Fail, cex〉
12:         k ← k + 1
Algorithm 11 Inner procedure that keeps track of the computed interpolants using a trace.
Input: M = ⟨V, I, T⟩ a transition system; P a property over V; F_k a trace; k bound of a backward unrolling from the target.
Output: 〈res, cex〉 with res ∈ {Reach, Unreach, Undef}; cex a (possibly empty) initial path representing a counterexample.
 1: procedure IncrApproxForwardTraversal(M, P, F_k, k)
 2:     R ← F_0
 3:     if ∃ π_{0,k} ⊨ F_0(V_0) ∧ T(V_0, V_1) ∧ Cone(1, k) then
 4:         return 〈Reach, π_{0,k}〉
 5:     i ← 0
 6:     while ⊤ do
 7:         if i = |F_k| then
 8:             F_k[i + 1] ← ⊤
 9:         A ← F_i(V_0) ∧ T(V_0, V_1)
10:         B ← Cone(1, k)
11:         if ∃ π_{0,k} ⊨ A ∧ B then
12:             return 〈Undef, −〉
13:         else
14:             I ← Itp(A, B)
15:             F_{i+1} ← F_{i+1} ∧ I
16:             if ∄ s ⊨ F_{i+1} ∧ ¬R then
17:                 return 〈Unreach, −〉
18:             R ← R ∨ F_{i+1}
19:             i ← i + 1

7.2. Frames and Cone Simplification

In order to keep cones and overapproximations of reachable states contained in size, the algorithms exploit an optimization that aims at simplifying the representation of frames and/or bad cones through ad hoc redundancy removal, exploiting the general notion of redundancy removal under observability don't cares. In order to understand the approach, let us denote simplification under a care set as the function Simplify(F, C), where F is a formula over V to be simplified and C is another formula over V to be used as a care set for the simplification of F. The care set is defined with respect to a reference formula G over V ∪ W in which F appears as a subformula, as follows.
Definition 88
(Care Set and Don’t Care Set). Given a propositional formula F over V and another formula G over V W such that F is a subformula of G, we define the care set C F G of F with respect to G as the set of assignments over V under which the value of F affects the value of G. A care set for F with respect to G can be represented by the formula:
C F G = G G [ F ¬ F ]
The complement of C F G is called the don’t care set of F with respect to G. It represents the set of assignments over V under which the value of F does not affect the value of G.
Simplification of a formula F under care set C F G with respect to a reference formula G can entail the application of any number of equivalence-preserving or strengthening transformations over F, as long as the following constraint is preserved:
G G [ F S IMPLIFY ( F , C F G ) ]
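On small formulas, the definitions above can be checked directly by enumeration. In the following Python sketch (illustrative only; formulas are modeled as Python predicates over assignment dictionaries, and all names are ours), the care set is computed as the set of assignments under which flipping the value of the subformula F flips the value of G, and a simplification agreeing with F on that care set is verified to preserve G:

```python
from itertools import product

def assignments(vars_):
    """Enumerate all Boolean assignments over the given variables."""
    for bits in product([False, True], repeat=len(vars_)):
        yield dict(zip(vars_, bits))

def care_set(G, F, vars_):
    """Care set of F w.r.t. G: assignments where flipping F's value flips G.
    G takes (assignment, f_value); F takes an assignment."""
    return {tuple(sorted(mu.items()))
            for mu in assignments(vars_)
            if G(mu, F(mu)) != G(mu, not F(mu))}

# Toy reference formula G = (a /\ F) \/ c, with subformula F = a \/ b.
F = lambda mu: mu["a"] or mu["b"]
G = lambda mu, f: (mu["a"] and f) or mu["c"]

cs = care_set(G, F, ["a", "b", "c"])
# F affects G exactly when a holds and c does not (b is free): 2 assignments.
assert cs == {tuple(sorted({"a": True, "b": b, "c": False}.items()))
              for b in (False, True)}

# Replacing F by the constant True agrees with F on the care set, so G is
# preserved on every assignment, as the simplification constraint requires.
F_simplified = lambda mu: True
assert all(G(mu, F(mu)) == G(mu, F_simplified(mu))
           for mu in assignments(["a", "b", "c"]))
```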
Computation of care sets and don't care sets can be costly [54]. Considering a conjunction F = A ∧ B as a reference formula, the following lemma describes two straightforward ways to obtain care sets for either one of the conjoined formulas. We focus on B, since the same dual reasoning applies to A.
Lemma 8
(Care Sets for Conjunctive Reference Formulas). Given A and B, two propositional formulas over V, and given F = A ∧ B, then A is a care set for B with respect to F. Given C, a propositional formula over V, such that A ⇒ C, then C is a care set for B with respect to F.
Figure 9a,b illustrate the simplification of a formula B with respect to a reference formula F = A ∧ B under care sets as defined by Lemma 8, in terms of sets of assignments. Given two propositional formulas A = a and B = (a → b), the knowledge that any assignment satisfying their conjunction F = A ∧ B must be a satisfying assignment of either one can be used to simplify the other. Using A as a care set for B, since F = a ∧ (a → b) is satisfied only by assignments satisfying A (i.e., assignments μ such that μ(a) = ⊤), it is possible to simplify B through the injection of a constant ⊤ prior to conjoining it to A, obtaining ⊤ → b ≡ b. The resulting formula after simplification A ∧ Simplify(B, A) = a ∧ b is syntactically more compact than the equivalent A ∧ B = a ∧ (a → b).
Despite the fact that many redundancy removal techniques could be used to perform simplification under a care set, e.g., latch correspondence, signal correspondence and equivalence to constants, most of them are too expensive to be performed at each forward traversal iteration within an ITP scheme. For the sake of having a fast operator, Simplify is limited to the removal of equivalences between state variables, i.e., latch correspondences. Given a target formula B and a care set A, the Simplify operator identifies pairs of state variables (v_1, v_2) such that A ⇒ (v_1 ↔ v_2), and B is then simplified as B[v_1 ← v_2].
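A brute-force, enumeration-based sketch of this restricted Simplify operator follows (illustrative only; an actual implementation would detect latch correspondences via SAT queries or simulation rather than by enumerating models). Correspondence pairs are extracted from the care set A and substituted into B, and the result is checked to be equivalent to B under A, as Lemma 8 guarantees:

```python
from itertools import combinations, product

def assignments(vars_):
    """Enumerate all Boolean assignments over the given variables."""
    for bits in product([False, True], repeat=len(vars_)):
        yield dict(zip(vars_, bits))

def latch_correspondences(A, vars_):
    """Pairs (v1, v2) such that A implies v1 <-> v2 (checked by enumeration)."""
    models = [mu for mu in assignments(vars_) if A(mu)]
    return [(v1, v2) for v1, v2 in combinations(vars_, 2)
            if all(mu[v1] == mu[v2] for mu in models)]

def simplify(B, corrs):
    """Rewrite B substituting v2 with v1 for each correspondence (v1, v2)."""
    def B_simplified(mu):
        nu = dict(mu)
        for v1, v2 in corrs:
            nu[v2] = nu[v1]
        return B(nu)
    return B_simplified

# Care set A forces x2 == x1; the substitution removes x2 from B.
V = ["x1", "x2", "x3"]
A = lambda mu: mu["x1"] == mu["x2"]          # A implies x1 <-> x2
B = lambda mu: mu["x2"] or mu["x3"]
B2 = simplify(B, latch_correspondences(A, V))

# A /\ B and A /\ Simplify(B, A) are equivalent, as Lemma 8 guarantees.
assert all((A(mu) and B(mu)) == (A(mu) and B2(mu)) for mu in assignments(V))
```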
Simplification under a care set can be used on frames during refinement steps (Algorithm 11, line 17), or to simplify the bad cone Cone(1, k) prior to checking its intersection with the image of F_i (Algorithm 11, line 13). During forward overapproximated traversal, when checking whether the image of the current set of reachable states F_i ∧ T intersects with the bad cone Cone(1, k), it is possible to simplify Cone(1, k) using any overapproximation of the states reachable in the next k transitions that is already in the trace. Each frame F_j in F_k, with i < j < i + k, can be used as a care set to simplify Cone(1, k). With TraceSimplify(Cone(1, k), F_k, i, k) we identify the function applying Simplify(Cone(1, k), F_j) for each i < j < i + k, i.e., the function applying latch correspondence substitution at each intermediate transition relation boundary in Cone(1, k). In this way, reachability information computed during previous iterations of the algorithm is used to simplify the formula before injecting it into the SAT solver.

7.3. Cone Unwinding and Rewinding

In standard interpolation, when finding a spurious counterexample, the current forward traversal is restarted from the initial states after the bad cone has been expanded by one step. IGR, instead, dynamically unwinds or rewinds the cone during forward traversal, in order to guarantee the refinement of some previously computed overapproximation of reachable states. The depth of the cone is therefore guided by the frames in the trace so that it can lead to a strengthening refinement for some of them.
Supposing that, at a given point during the execution of IncrItpModelChecking, the algorithm has computed a trace F k , two approaches can be pursued with the goal of potentially expanding and/or refining F k :
Cone Unwinding. 
When the forward traversal hits the cone, it starts a new traversal at an intermediate step in order to guarantee the refinement of the trace.
Cone Rewinding. 
When the forward traversal hits the cone, it continues the traversal with iteratively smaller cones in order to refine and expand the trace.
Overall, guided cone unwinding/rewinding allows IGR to dynamically tune the unrolling from the target and therefore to have better control over the precision of the computed overapproximations (interpolants). In this respect, standard interpolation is too rigid, as overapproximations are always strengthened by expanding the cone by one and restarting the traversal from scratch.

7.3.1. Cone Unwinding

At a given iteration i of the forward traversal, with k the bound of the cone, if the formula
F_i(V_0) ∧ T(V_0, V_1) ∧ Cone(1, k)
is Sat, then a possibly spurious counterexample is found. In such a scenario, standard interpolation would unwind the bad cone by one step and then start a new traversal from the initial states. When step i is then reached once again in the traversal, the overapproximation F_i could have been strengthened enough to exclude the previously found spurious counterexample. If that is not the case, the algorithm restarts the traversal again, incrementing k until either the spurious counterexample is excluded from the overapproximation or the counterexample is confirmed to be a concrete one. With cone unwinding, the cone is instead unwound by the minimum depth necessary to strengthen a frame F_j, with 0 < j ≤ i, in order to eliminate the spurious counterexample directly.
Upon encountering a spurious counterexample, the BMC-like problem described in Equation (65) is Sat. Therefore, the algorithm starts an iterative process checking BMC problems of fixed bound in which the cone is unwound by one step and the algorithm moves to the previous frame. Starting from frame F i and cone C o n e ( 1 , k ) , at each j-th iteration it considers the BMC-like problem:
F_{i−j}(V_0) ∧ T(V_0, V_1) ∧ Cone(1, k + j)
with 0 < j ≤ i. At each iteration it swaps an overapproximated image in F_{i−j} with an exact image in Cone(1, k + j). If the BMC-like problem in Equation (66) is Sat at a given j, then F_{i−j} is not strong enough to refute the counterexample and the process iterates. Eventually, either it finds Unsat, thus refuting the spurious counterexample, or it reaches the initial states, thus confirming the counterexample as a concrete one. In fact, for j = i we have exactly bmc(i + k). Assuming that it can find a j, with 0 < j < i, such that Equation (66) is Unsat, it can restart the forward traversal from F_{i−j} with Cone(1, k + j). This is guaranteed to generate an interpolant and therefore refine the current trace.
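The unwinding loop can be mimicked on a toy explicit-state system (our illustrative stand-in: the Sat check on F_{i−j} ∧ T ∧ Cone(1, k + j) becomes a bounded reachability query over an enumerated transition relation, and frames are plain sets of states):

```python
def image(states, T):
    """Exact one-step image of a set of states under transition relation T."""
    return {t for s in states for t in T.get(s, set())}

def reaches_bad_within(states, k, T, bad):
    """Stand-in for the Sat check on F(V_0) /\ T /\ Cone(1, k): can some
    state in `states` reach a bad state in 1..k transitions?"""
    frontier = set(states)
    for _ in range(k):
        frontier = image(frontier, T)
        if frontier & bad:
            return True
    return False

def unwind(trace, i_hit, k_hit, T, bad):
    """Cone unwinding: swap overapproximated images for exact ones until the
    spurious counterexample is refuted or the initial states are reached."""
    i, k = i_hit, k_hit
    while i >= 0 and reaches_bad_within(trace[i], k, T, bad):
        i, k = i - 1, k + 1
    if i < 0:
        return ("Reach", None, None)   # counterexample confirmed concrete
    return ("Undef", i, k)             # restart traversal from F_i, Cone(1, k)

# Toy system: 0 -> 1 -> 2 -> 1, plus an unreachable bad self-loop at state 3.
T = {0: {1}, 1: {2}, 2: {1}, 3: {3}}
bad = {3}
trace = [{0}, {1, 2, 3}]   # F_1 is too weak: it contains the bad state 3

res, i, k = unwind(trace, i_hit=1, k_hit=1, T=T, bad=bad)
assert (res, i, k) == ("Undef", 0, 2)  # restart from F_0 with a deeper cone
```

Here the spurious hit at F_1 with a cone of depth 1 is refuted one step earlier: the exact check from F_0 with a cone of depth 2 is Unsat, so the traversal can restart from there and refine the trace.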

7.3.2. Cone Rewinding

When, at a given step i, the forward traversal hits the cone of depth k, rather than starting a new traversal with an unwound cone, the algorithm can optionally continue the current traversal with a sequence of iterations, called refinement sequence. During refinement, each iteration uses a cone of decreasing depth for which the safety of the current reachable state overapproximation is guaranteed.
Assume that, at a given step i, F_i(V_0) ∧ T(V_0, V_1) ∧ Cone(1, k) is Sat. Since F_i was proved to be safe with respect to Cone(1, k) at iteration i − 1, there does not exist any counterexample of length at most k starting from i, and hence it is guaranteed that
F_{i+j}(V_0) ∧ T(V_0, V_1) ∧ Cone(1, k − j − 1)
is Unsat for each 0 ≤ j < k − 1. A new interpolant I_j can be generated from each of such calls to be used to refine F_{i+j+1}. Alternatively, one could compute an interpolation sequence directly from the BMC call in Equation (67) and use it to refine the trace.
To summarize, once a cone of depth k is hit at step i of the forward traversal, the algorithm can compute a sequence of k interpolants, each representing an overapproximation of states reachable in i + j + 1 transitions, with 0 ≤ j < k − 1, that can be used to refine, or generate, the frames (F_{i+1}, …, F_{i+k}) of the trace. The main purpose of such a refinement sequence is to let future traversals operate over more precise overapproximations of the reachable states.
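The refinement sequence can likewise be sketched in explicit-state form (illustrative only; the exact one-step image stands in for the interpolant, which in the real algorithm is any overapproximation of that image extracted from the refutation of the shrinking-cone check):

```python
def image(states, T):
    """Exact one-step image; stands in for the interpolant I_j, which is any
    overapproximation of the image disjoint from the shallower cone."""
    return {t for s in states for t in T.get(s, set())}

def reaches_bad_within(states, k, T, bad):
    """Bounded reachability stand-in for F(V_0) /\ T /\ Cone(1, k)."""
    frontier = set(states)
    for _ in range(k):
        frontier = image(frontier, T)
        if frontier & bad:
            return True
    return False

def refinement_sequence(trace, i, k, T, bad):
    """Cone rewinding: after a hit at step i with cone depth k, strengthen
    frames F_{i+1}..F_{i+k-1} through a sequence of shrinking-cone checks."""
    for j in range(k - 1):
        # Guaranteed Unsat: F_{i+j} /\ T /\ Cone(1, k - j - 1).
        assert not reaches_bad_within(trace[i + j], k - j - 1, T, bad)
        I_j = image(trace[i + j], T)          # stand-in interpolant
        trace[i + j + 1] &= I_j               # refine the next frame
    return trace

# Chain 0 -> 1 -> 2 -> 3 -> 4 with bad = {4}; frames F_2, F_3 start too weak.
T = {s: {s + 1} for s in range(4)}
bad = {4}
trace = [{0}, {1}, {0, 2}, {1, 3}]

assert reaches_bad_within(trace[1], 3, T, bad)   # cone hit at i = 1, k = 3
refinement_sequence(trace, i=1, k=3, T=T, bad=bad)
assert trace[2] == {2} and trace[3] == {3}       # frames strengthened
```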
On such premises, the overall IGR algorithm can now be presented.
The top-level procedure of IGR is reported in Algorithm 12. The algorithm first checks the safety of the initial states (lines 2–3) and initializes the trace F_k (line 4). The indexes i_hit and k_hit are also initialized (lines 5–6); they are used to keep track of the traversal step and cone depth at which a cone was hit during the previous traversal, if any. After initialization, the outer loop starts iterating overapproximated forward traversals (lines 7–13). Procedure Unwind is invoked first to seek the best frame at which to start the next traversal (line 8), performing cone unwinding (as described in Section 7.3.1) starting from step i_hit and cone depth k_hit, until either a concrete counterexample or a frame that could be refined by computing a new interpolant is found. In the first case, Unwind returns a Reach result and a counterexample; the algorithm thus terminates with Fail, producing the counterexample as output (lines 9–10). In the second case, Unwind returns an Undef result, a step i and a cone depth k to be used for the next traversal. Note that, during the first iteration, i_hit and k_hit are initialized to zero and one, respectively; therefore, Unwind simply checks whether the initial states can reach the target in one transition. If no concrete counterexample is found, the algorithm starts a new forward overapproximated traversal by calling the IgrApproxForwardTraversal routine (line 11). Upon termination of such a procedure, if the result is Unreach, then an inductive invariant has been found during traversal and the algorithm terminates with Success (lines 12–13). Otherwise, the cone was hit and the traversal procedure returns an Undef result together with the step i_hit and cone depth k_hit at which that occurred. In such a scenario, a spurious counterexample was found during traversal and the algorithm iterates to perform a new traversal.
On these premises, we can summarize the overall scheme of IgrModelChecking as follows:
  • Iteratively choose a starting frame F_i and a cone Cone(1, k) to be unwound with guidance throughout the overapproximated trace F_k.
  • Initiate a forward traversal from F_i with Cone(1, k), aiming at refining F_k and filtering out the latest spurious counterexample found within F_{i_hit}.
The two sub-tasks are performed by Unwind and IgrApproxForwardTraversal, respectively.
Algorithm 12 Top-level procedure of IGR.
Input:  M = V , I , T a transition system; P a property over V.
Output: res , cex with res ∈ {Success, Fail}; cex a (possibly empty) initial path representing a counterexample.
 1:  procedure IgrModelChecking(M, P)
 2:      if ∃s_0 : s_0 ⊨ I(V) ∧ ¬P(V) then
 3:          return 〈Fail, (s_0)〉
 4:      F_k[0] ← I
 5:      i_hit ← 0
 6:      k_hit ← 1
 7:      while ⊤ do
 8:          〈res, cex, i, k〉 ← Unwind(M, P, F_k, i_hit, k_hit)
 9:          if res is Reach then
10:              return 〈Fail, cex〉
11:          〈res, i_hit, k_hit〉 ← IgrApproxForwardTraversal(M, P, F_k, i, k)
12:          if res is Unreach then
13:              return 〈Success, −〉
Procedure IgrApproxForwardTraversal, described in Algorithm 13, performs an overapproximated forward traversal of reachable states. The procedure starts from a given frame in the trace while preserving safety with respect to a cone of a given depth. It first computes the current overapproximated set of reachable states at step i by disjoining the first i frames (line 2). During each iteration of the traversal loop (lines 6–29), the algorithm proceeds in two different ways, depending on whether or not cone rewinding (see Section 7.3.2) has been triggered. If it has not been triggered, the algorithm performs a traversal step using a cone of bound k. Otherwise, the procedure decreases the bound of the cone at each iteration to perform rewinding (lines 9–10). The procedure then performs cone simplification (line 12), as described in Section 7.2, and checks whether the current set of overapproximated reachable states R hits the cone (line 13), during each traversal step. In case of a hit, the algorithm saves the current step and cone bound in i_hit and k_hit, respectively, and triggers a refinement sequence (lines 14–16). Otherwise, a new overapproximated image is computed through interpolation (line 18) and the current frame F_i is refined and simplified (line 19) as described in Section 7.2. The algorithm then checks whether the overapproximation is an inductive invariant, returning Unreach if that is the case (lines 20–21). If no inductive invariant has been found, the new set of overapproximated forward reachable states to be used for the next iteration is computed as R ∨ F_{i+1} (line 22). Each time the forward traversal reaches the end of the current trace, a new frame F_{i+1} is instantiated equal to ⊤ and added to the trace (lines 7–8). At the end of each iteration, if a given depth threshold D has been reached, the algorithm forces rewinding (lines 24–27).
When rewinding is triggered, either after finding a spurious counterexample or by force, the algorithm continues the traversal decreasing the cone bound at each iteration. When the cone has been completely rewound, the algorithm terminates returning Undef alongside the step and cone bound at which either a spurious counterexample was found (lines 15–16) or rewinding has been forced (lines 26–27).
The aforementioned D threshold heuristically controls the activation of cone rewinding. Whenever D = 0, rewinding is always active, so the approach obtains a minimal refinement while mimicking the effect of interpolation sequences [55]. High values of D keep the value of k constant until a hit occurs, resulting in a scheme far closer to standard interpolation. Empirically, small values tend to perform better at small sequential depths, as they enact more, relatively inexpensive, refinement steps.
Algorithm 13 Overapproximated forward traversal procedure of IGR.
Input:  M = V , I , T a transition system; P a property over V; F k a trace; i start step of the traversal; k bound of a backward unrolling from the target.
Output: res , i h i t , k h i t with res ∈ {Unreach, Undef}; i h i t the step (if any) at which the cone is hit during traversal; k h i t the depth of the cone hit.
 1:  procedure IgrApproxForwardTraversal(M, P, F_k, i, k)
 2:      R ← ⋁_{j=0}^{i} F_j
 3:      rewind ← ⊥
 4:      i_hit ← i
 5:      k_hit ← k
 6:      while ⊤ do
 7:          if i = |F_k| then
 8:              F_k[i + 1] ← ⊤
 9:          if rewind ∧ k > 0 then
10:              k ← k − 1
11:          A ← F_i(V_0) ∧ T(V_0, V_1)
12:          B ← TraceSimplify(Cone(1, k), F_k, i + 1, k)
13:          if π_{0,k} ⊨ A ∧ B then
14:              rewind ← ⊤
15:              i_hit ← i
16:              k_hit ← k
17:          else
18:              I ← Itp(A, B)
19:              F_{i+1} ← Simplify(F_{i+1}, I) ∧ I
20:              if ∄s : s ⊨ F_{i+1} ∧ ¬R then
21:                  return 〈Unreach, −, −〉
22:              R ← R ∨ F_{i+1}
23:              i ← i + 1
24:              if ¬rewind ∧ i > D then
25:                  rewind ← ⊤
26:                  i_hit ← i
27:                  k_hit ← k
28:              else if rewind ∧ k = 0 then
29:                  return 〈Undef, i_hit, k_hit〉
At each iteration, the Unwind procedure described in Algorithm 14 computes i and k, starting from the values i_hit and k_hit relative to the previous spurious counterexample. Following the strategy described in Section 7.3.1, the cone bound k is extended while i is decremented (lines 5–6), until either the algorithm encounters an Unsat BMC check or the initial states are reached (line 4). In the former case, Unwind has found a frame from which starting a traversal is expected to lead to a refinement of F_k and to the filtering out of the last spurious counterexample found. The procedure then returns an Undef result alongside the corresponding values for i and k (line 9). In the latter case, the procedure has detected an actual counterexample as a side effect, and returns a Reach result alongside the counterexample (line 8).
Algorithm 14 Unwinding procedure of IGR.
Input:  M = 〈V, I, T〉 a transition system; P a property over V; F_k a trace; i_hit the step at which the cone was hit during traversal; k_hit the depth of the cone hit.
Output: 〈res, cex, i, k〉 with res ∈ {Undef, Reach}; cex a (possibly empty) initial path representing a counterexample; i the step from which to resume forward traversal; k the cone depth to use in the resumed traversal.
 1:  procedure Unwind(M, P, F_k, i_hit, k_hit)
 2:      i ← i_hit
 3:      k ← k_hit
 4:      while i ≥ 0 ∧ π_{0,k} ⊨ F_i(V_0) ∧ T(V_0, V_1) ∧ Cone(1, k) do
 5:          i ← i − 1
 6:          k ← k + 1
 7:      if i < 0 then
 8:          return 〈Reach, π_{0,k}, −, −〉
 9:      return 〈Undef, −, i, k〉

8. Comparative Analysis and Experimental Evaluation

In order to compare and evaluate the various model checking algorithms, we present data gathered using our own implementation of said techniques on top of the PdTRAV tool [56], a state-of-the-art academic model checking tool.
The benchmark set considered was derived from past HWMCC suites [14,57], from the 2014 edition until the most recent one. The set includes hardware- as well as software-derived verification problems, mostly stemming from industrial-level verification instances (such as IBM and Intel benchmarks).
Experiments were run on an Intel Core i7-1370, with 16 CPUs running at 3.4 GHz, hosting an Ubuntu 22.04 LTS Linux distribution. All the experiments were run with a time limit of 3600 s and a memory limit of 32 GB.
Experimental results are reported in Table 1 and Table 2, which share the same structure. Both tables highlight solved instances, split between SAT and UNSAT (Result), for each main engine (Engine) of our verification suite. For each group of results, we provide the average number of memory elements (avg_l), input variables (avg_i) and gates (avg_g) of the benchmarks belonging to each set, to give some insight into the size of the instances at hand. As summary statistics, we also provide the average solving time, both in terms of CPU time (avg_t) and wall-clock time (avg_r), and the memory requirements (avg_m) for each group. Table 2 provides a subset of the results of Table 1, filtering out easier instances, i.e., instances requiring 60 s or less to be solved.
In the tables, one can find the SAT-based model checking techniques which are the focus of this paper, such as IC3 in its PDR variant, BMC, induction-based approaches exploiting lemmas (LMS), interpolation and IGR, alongside solutions obtained through auxiliary techniques, such as BDD-based reachability, logic synthesis, simulation and combinatorial techniques for circuit manipulation and checking, which are beyond the scope of the current work. It is worth remembering that, for the sake of efficiency, the portfolio-based approach awards the solution of a specific instance to the first engine capable of solving the corresponding verification task. In such a scenario, after (at least) one engine manages to solve a problem, all the concurrently running tasks are halted, in order to spare resources.
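The first-engine-wins orchestration just described can be sketched with a minimal, illustrative Python runner (our own mock-up, not PdTRAV's actual implementation): each mock engine runs in its own thread, the first conclusive verdict is taken, and the remaining engines are stopped cooperatively via a shared event.

```python
import queue
import threading

def run_portfolio(engines, problem):
    """Run each engine in its own thread; the first conclusive result wins
    and all other engines are stopped cooperatively, to spare resources."""
    results = queue.Queue()
    stop = threading.Event()

    def worker(engine):
        res = engine(problem, stop)
        if res is not None:
            results.put((engine.__name__, res))

    threads = [threading.Thread(target=worker, args=(e,)) for e in engines]
    for t in threads:
        t.start()
    winner, verdict = results.get()      # first engine to conclude
    stop.set()                           # halt the concurrent tasks
    for t in threads:
        t.join()
    return winner, verdict

# Mock engines: one concludes at once, one would run much longer.
def fast_bmc(problem, stop):
    return "SAT"                          # e.g., finds a counterexample

def slow_itp(problem, stop):
    while not stop.wait(0.01):            # poll for cancellation
        pass
    return None                           # cancelled: no verdict

winner, verdict = run_portfolio([fast_bmc, slow_itp], problem=None)
assert (winner, verdict) == ("fast_bmc", "SAT")
```

Real portfolios typically use separate processes with hard time and memory limits rather than cooperative threads, but the award-to-first-solver logic is the same.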
As mentioned before, there is currently no definitive winning strategy when it comes to bit-level hardware model checking, so portfolio approaches are the de facto standard in the field. Since its introduction, IC3/PDR has proven itself time and time again one of the most proficient strategies in the state of the art, as the results provided here also show. Nevertheless, its contribution needs to be complemented with interpolation-based strategies in order to provide better coverage. This is far more evident when shifting the focus to harder-to-solve instances, for which the subset of benchmarks solved by interpolation-based strategies stays basically the same, whereas the number solved by IC3 significantly decreases. IC3 proves itself a high-performing engine, with a significant edge on easier and mid-tier instances. Interpolation-based techniques, in turn, complement it by providing orthogonal approaches. The same consideration holds for BMC-based approaches, which focus on falsification with the aim of finding an actual counterexample to the property under verification. Their targets stay basically unchanged, focusing on instances that require deeper unrollings of the corresponding transition relation, which other engines may struggle to tackle.
When taking into account BDD-based approaches, which are not directly the focus of this paper, one can note both similarities and differences with their SAT-based counterparts. Both try to address problems encoded in Boolean form but, whereas SAT checkers just need to find a single satisfying assignment for the formula under verification, BDD-based ones need to encode a function describing all the satisfying assignments. Once such an encoding is available, BDDs are capable of performing tasks which are not supported by SAT-based solutions, such as optimizing with respect to a cost function, model counting, or supporting variable quantification, at the cost, though, of having to build such a data structure in the first place. In general, SAT checkers relying on some variant of the DPLL algorithm outperform BDD-based solutions; nevertheless, there are niche scenarios in which BDDs are still the winning strategy, such as specific instances of equivalence and parity checking.
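The functional gap described above can be illustrated with a toy sketch (ours; plain enumeration stands in for a BDD, which would represent the same set of models symbolically and far more compactly): a SAT-style check only needs one satisfying assignment, while a representation of all models additionally supports queries such as model counting.

```python
from itertools import product

def models(formula, vars_):
    """Enumerate satisfying assignments: a stand-in for a BDD's ability to
    represent *all* models; a SAT solver only needs to find one."""
    for bits in product([False, True], repeat=len(vars_)):
        mu = dict(zip(vars_, bits))
        if formula(mu):
            yield mu

# Toy formula (a \/ b) /\ ~(a /\ c) over variables a, b, c.
f = lambda mu: (mu["a"] or mu["b"]) and not (mu["a"] and mu["c"])

one_model = next(models(f, ["a", "b", "c"]))   # all a SAT check needs
all_models = list(models(f, ["a", "b", "c"]))  # BDD-style: the whole set
assert f(one_model)
assert len(all_models) == 4                    # model counting
```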
Complementary to all of the techniques referenced above, logic synthesis, simulation and combinatorial approaches tackle verification problems from a more circuital and structural point of view. They tend to be applicable to smaller instances, or to instances with very specific patterns and a large number of equivalent elements, which can be collapsed into a few equivalence classes and/or abstracted away from the actual model under verification.

Practical Considerations Concerning Model Checking Algorithms

The algorithms we have described, falling into the SAT-based and symbolic model checking category, are usually quite resilient to the state-space explosion problem, which can be problematic when dealing with large designs comprising millions, or more, variables and constraints. In such a scenario, though, a trade-off has to be taken into account, which lies in the inherent time complexity, as SAT problems are known to be NP-complete.
On the one hand, worst-case complexity is indeed exponential in time; on the other hand, by exploiting proper modeling of systems, representation of data and simplification steps, SAT-based model checking algorithms are applicable to real-world scenarios.
One of the major issues in tuning resources for verification runs is that it is almost impossible, in the general case, to foresee the appropriate amount of time and memory to dedicate to a given task. The experimental evaluation proposed above follows the same setup as model checking competitions, to provide comparable results against a well-defined baseline; at the same time, though, such a setup is nothing more than an executive choice on the matter.
It is not uncommon for industrial-level instances and more complex benchmarks to require timeouts in the order of days, or more, in order to hopefully obtain usable results. Let us also point out once more that there is absolutely no guarantee that any of the engines will actually be able to solve a given instance, regardless of the time spent on the problem. On these premises, there are situations in which the aim is to go as deep as possible in the model exploration, trying to verify as much of the behavior as possible without any claim of completeness, such as in pure BMC-based runs, where engines are left to run until resource depletion, either in terms of time or memory.
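Such a run-until-depletion scheme can be sketched as follows (an illustrative, explicit-state stand-in for a real BMC engine, which would instead check SAT instances of increasing unrolling depth): the loop deepens the exploration until a counterexample is found or a wall-clock budget expires, reporting only the deepest bound checked when inconclusive.

```python
import time

def bmc_until_depletion(I, T, bad, time_budget):
    """Deepening exploration: check ever longer unrollings until either a
    counterexample is found or the time budget runs out. No completeness
    claim: exhausting the budget only reports the deepest bound checked."""
    deadline = time.monotonic() + time_budget
    frontier, k = set(I), 0
    while time.monotonic() < deadline:
        if frontier & bad:
            return ("cex", k)             # property falsified at depth k
        frontier = {t for s in frontier for t in T.get(s, set())}
        if not frontier:                  # toy model fully explored
            return ("safe", k)
        k += 1
    return ("unknown", k)                 # budget depleted at depth k

# Chain 0 -> 1 -> 2 -> 3 with the bad state 3 reached after three steps.
T = {0: {1}, 1: {2}, 2: {3}}
assert bmc_until_depletion({0}, T, bad={3}, time_budget=1.0) == ("cex", 3)
```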
Because of this, orthogonal to the research and development of new techniques, a vast amount of research targets improving the existing base of knowledge in order to make current techniques better performing and more widely applicable. Such an effort may be directed, among other things, at improving formalisms, encoding styles and rules for the models [58], improving SAT-solver management strategies [49] or introducing different strategies for task management, exploiting partial results in a collaborative fashion among engines in order to increase their chances of success [36,37].
A far more in-depth analysis of practical solutions and techniques to improve the applicability and scalability of SAT-based verification is presented in [59].

9. Conclusions

This paper provides an in-depth overview of the most common SAT-based algorithms usually applied in bit-level hardware model checking scenarios. In the last two decades, SAT-based model checking brought about a leap in the performance and scalability of symbolic model checking algorithms. SAT-based model checking tools have managed to almost supplant BDD-based ones in both industry and academia. Model checking algorithms based on SAT solving, however, still suffer from non-negligible scalability issues when confronted with the complexity of many industrial-scale designs. Improving the performance and scalability of such algorithms is a very challenging, yet very active, research path.
This paper strives to provide both a thorough overview of the background required to understand the topics at hand, and a complete description of the aforementioned algorithms and their intricacies. Based on the current state of the art, a selection of the most relevant methods has been identified and described, taking into account both bounded and unbounded model checking, with a focus on safety properties. For the sake of generality, this work provides a description of each algorithm from a theoretical standpoint, trying to avoid details which would turn out to be too implementation oriented in favour of the soundness and comprehensiveness of their description and discussion.
As mentioned in the introductory section, as of today, there is no single winning strategy applicable to the broad range of verification problems one may face. It has thus become almost mandatory to tackle verification tasks using a portfolio of techniques, running in a concurrent/collaborative fashion, in order to maximize the chance of success. Furthermore, additional factors, such as model pre-processing and simplification, property management, engine orchestration, etc., play a role as important as the actual model checking algorithm of choice in determining the success of the verification procedure. This is the main reason why the current research path in this field usually reaches across several different directions, targeting different aspects of the verification chain, rather than focusing on a single specific step. Nevertheless, it is absolutely crucial to develop a firm and sound understanding of the available techniques in order to properly orchestrate verification efforts.
For the future, the current path of research is aimed at supporting hierarchical verification for top-down design, embedding support for verification in the whole design chain, starting from specification languages. Currently, there is a need to be able to write a verifiable high-level design that can then be refined and implemented, while guaranteeing that it is still consistent with its abstraction. The main aim is for properties that have been verified at a given step of the design chain to hold at any successive level of refinement. Such a desideratum leads to hybrid and top-down/bottom-up design methodologies, with emphasis on verification as grounding for correctness, as well as introducing the need for some sort of certificate, to guarantee the correctness of each step, transformation and refinement involved in the process. Alongside hierarchical verification, the integration of model checking within design becomes even more pressing, with model checkers acting as auxiliary tools within the design chain, to support and validate correctness. Despite its potential usefulness, the adoption of model checking in industrial scenarios is still an ongoing process, with funding and methodology changes as the main limiting factors, in conjunction with the absence of a definitively best approach to tackling verification problems. Because of this, we can foresee that, for the coming years, portfolio-based approaches, exploiting divide-and-conquer strategies among different verification engines, will still be the most consistent way to address such tasks.

Author Contributions

Conceptualization, P.E.C., G.C., M.P. and P.P.; methodology, M.P. and P.P.; software, G.C.; validation, M.P. and P.P.; formal analysis, P.E.C. and G.C.; investigation, G.C., M.P. and P.P.; resources, P.E.C. and G.C.; data curation, M.P. and P.P.; writing—original draft preparation, M.P. and P.P.; writing—review and editing, P.E.C., G.C., M.P. and P.P.; visualization, M.P. and P.P.; supervision, P.E.C. and G.C.; project administration, P.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Acknowledgments

This work was supported in part by Partenariato Esteso-RESTART “RESearch and innovation on future Telecommunications systems and networks, to make Italy more smART”-PE00000001.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Abbreviations

The following abbreviations are used in this manuscript:
BMC: Bounded Model Checking
CNF: Conjunctive Normal Form
CTI: Counterexample To Induction
HW: Hardware
HWMCC: Hardware Model Checking Competition
ITP: Interpolant
IC3: Incremental Construction of Inductive Clauses for Indubitable Correctness
IGR: Interpolation with Guided Refinement
MC: Model Checking
PDR: Property Directed Reachability
RTL: Register Transfer Level
UMC: Unbounded Model Checking
WFF: Well-Formed Formula

References

1. Foster, H. 2022 Wilson Research Group FPGA Functional Verification Trends. Available online: https://www.innofour.com/static/default/files/documents/pdf/fpga-trend-report_2022-wilson-research-verification-study_hfoster.pdf (accessed on 6 June 2024).
2. Rashinkar, P.; Paterson, P.; Singh, L. System-on-a-Chip Verification: Methodology and Techniques; Springer: Berlin/Heidelberg, Germany, 2013.
3. Kaufmann, M.; Moore, J.S. Some key research problems in automated theorem proving for hardware and software verification. Math. J. Span. R. Acad. Sci. 2004, 98, 181–195.
4. Boyer, R.S.; Moore, J.S. A Computational Logic Handbook, 1st ed.; Academic Press Professional, Inc.: San Diego, CA, USA, 1988.
5. Harrison, J. Theorem Proving for Verification (Invited Tutorial). In Proceedings of the 20th International Conference on Computer Aided Verification, CAV, Princeton, NJ, USA, 7–14 July 2008; Volume 5123, pp. 11–18.
6. Kuehlmann, A.; Somenzi, F.; Hsu, C.J.; Bustan, D. Equivalence Checking. In Electronic Design Automation for IC Implementation, Circuit Design, and Process Technology; CRC Press: Boca Raton, FL, USA, 2016; pp. 77–108.
7. Clarke, E.M.; Emerson, E.A. Design and synthesis of synchronization skeletons using branching time temporal logic. In Logics of Programs; Kozen, D., Ed.; Springer: Berlin/Heidelberg, Germany, 1981; Volume 131, pp. 52–71.
8. Clarke, E.M.; Grumberg, O.; Peled, D.A.; Veith, H. Model Checking, 2nd ed.; MIT Press: Cambridge, MA, USA, 2018.
9. Bryant, R.E. Graph-Based Algorithms for Boolean Function Manipulation. IEEE Trans. Comput. 1986, 35, 677–691.
10. Mishchenko, A. ABC: A System for Sequential Synthesis and Verification. Available online: http://people.eecs.berkeley.edu/~alanmi/abc/ (accessed on 6 June 2024).
11. Cabodi, G.; Nocco, S.; Quer, S. PdTRAV: Politecnico di Torino Reachability Analysis & Verification. Available online: https://github.com/polito-fmgroup/pdtools (accessed on 6 June 2024).
12. Griggio, A.; Roveri, M.; Tonetta, S. Certifying proofs for SAT-based model checking. Form. Methods Syst. Des. 2021, 57, 178–210.
13. Yu, E.; Froleyks, N.; Biere, A.; Heljanko, K. Towards Compositional Hardware Model Checking Certification. In Proceedings of the Formal Methods in Computer-Aided Design, FMCAD 2023, Ames, IA, USA, 24–27 October 2023; pp. 1–11.
14. Biere, A.; Jussila, T. The Model Checking Competition. Available online: http://fmv.jku.at/hwmcc (accessed on 8 June 2024).
15. Biere, A. Tutorial on Word-Level Model Checking. In Proceedings of the 2020 Formal Methods in Computer Aided Design (FMCAD), Online, 21–24 September 2020; p. 1.
16. Palena, M. Exploiting Boolean Satisfiability Solvers for High Performance Bit-Level Model Checking. Ph.D. Thesis, Politecnico di Torino, Turin, Italy, 2017.
17. Pasini, P. Improving Bit-Level Model Checking Algorithms for Scalability through Circuit-Based Reasoning. Ph.D. Thesis, Politecnico di Torino, Turin, Italy, 2017.
18. Clarke, E.M.; Henzinger, T.A.; Veith, H.; Bloem, R. Handbook of Model Checking, 1st ed.; Springer: Berlin/Heidelberg, Germany, 2018.
19. Biere, A.; Heule, M.; van Maaren, H.; Walsh, T. (Eds.) Handbook of Satisfiability, 2nd ed.; Frontiers in Artificial Intelligence and Applications; IOS Press: Amsterdam, The Netherlands, 2021; Volume 336.
20. Seshia, S.A.; Hu, S.; Li, W.; Zhu, Q. Design Automation of Cyber-Physical Systems: Challenges, Advances, and Opportunities. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 2017, 36, 1421–1434.
21. Barwise, J. (Ed.) Handbook of Mathematical Logic; Elsevier: Amsterdam, The Netherlands, 1977.
22. Hunter, G. Metalogic: An Introduction to the Metatheory of Standard First Order Logic, 1st ed.; Macmillan Student Editions; University of California Press: Berkeley, CA, USA, 1973.
23. Hopcroft, J.E.; Motwani, R.; Ullman, J.D. Introduction to Automata Theory, Languages, and Computation, 3rd ed.; Addison-Wesley Longman Publishing Co., Inc.: Boston, MA, USA, 2006.
24. Cook, S.A. The Complexity of Theorem-proving Procedures. In Third Annual ACM Symposium on Theory of Computing; Harrison, M.A., Banerji, R.B., Ullman, J.D., Eds.; ACM: New York, NY, USA, 1971; pp. 151–158.
25. Davis, M.; Logemann, G.; Loveland, D. A Machine Program for Theorem-proving. Commun. ACM 1962, 5, 394–397.
26. Marques Silva, J.P.; Sakallah, K.A. Grasp—A New Search Algorithm for Satisfiability. In The Best of ICCAD: 20 Years of Excellence in Computer-Aided Design, San Jose, CA, USA, 10–14 November 1996; pp. 73–89.
27. Huang, J. The Effect of Restarts on the Efficiency of Clause Learning. In Proceedings of the 20th International Joint Conference on Artificial Intelligence, IJCAI, Hyderabad, India, 6–12 January 2007; pp. 2318–2323.
28. Biere, A. Preprocessing and Inprocessing Techniques in SAT. In Proceedings of the 7th International Haifa Verification Conference on Hardware and Software: Verification and Testing, HVC, Haifa, Israel, 6–8 December 2011; p. 1.
29. Beame, P.; Kautz, H.; Sabharwal, A. Towards Understanding and Harnessing the Potential of Clause Learning. J. Artif. Intell. Res. 2004, 22, 319–351.
30. Goldberg, E.; Novikov, Y. Verification of Proofs of Unsatisfiability for CNF Formulas. In Proceedings of the 2003 Design, Automation and Test in Europe Conference, DATE, Munich, Germany, 3–7 March 2003; p. 10886.
31. Zhang, L.; Malik, S. Validating SAT Solvers Using an Independent Resolution-Based Checker: Practical Implementations and Other Applications. In Proceedings of the 2003 Design, Automation and Test in Europe Conference, DATE, Munich, Germany, 3–7 March 2003; pp. 10880–10885.
32. Wetzler, N.; Heule, M.J.H.; Hunt, W.A. DRAT-trim: Efficient Checking and Trimming Using Expressive Clausal Proofs. In Proceedings of the Theory and Applications of Satisfiability Testing—SAT 2014: 17th International Conference, Vienna, Austria, 14–17 July 2014; Volume 8561, pp. 422–429.
33. Kripke, S.A. A completeness theorem in modal logic. J. Symb. Log. 1959, 24, 1–14.
34. Pnueli, A. The Temporal Logic of Programs. In Proceedings of the 18th Annual Symposium on Foundations of Computer Science, FOCS, Providence, RI, USA, 31 October–2 November 1977; pp. 46–57.
35. Clarke, E.M.; Emerson, E.A.; Sistla, A.P. Automatic Verification of Finite-state Concurrent Systems Using Temporal Logic Specifications. ACM Trans. Program. Lang. Syst. 1986, 8, 244–263.
36. Cabodi, G.; Camurati, P.E.; Loiacono, C.; Palena, M.; Pasini, P.; Patti, D.; Quer, S. To split or to group: From divide-and-conquer to sub-task sharing for verifying multiple properties in model checking. Int. J. Softw. Tools Technol. Transf. 2018, 20, 313–325.
37. Dureja, R.; Baumgartner, J.; Kanzelman, R.; Williams, M.; Rozier, K.Y. Accelerating Parallel Verification via Complementary Property Partitioning and Strategy Exploration. In Proceedings of the 2020 Formal Methods in Computer Aided Design (FMCAD), Online, 21–24 September 2020; pp. 16–25.
38. Biere, A.; Cimatti, A.; Clarke, E.; Zhu, Y. Symbolic Model Checking without BDDs. In Proceedings of the 5th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS, Amsterdam, The Netherlands, 22–28 March 1999; Volume 1579, pp. 193–207.
39. Sheeran, M.; Singh, S.; Stålmarck, G. Checking Safety Properties Using Induction and a SAT-Solver. In Proceedings of the 3rd International Conference on Formal Methods in Computer-Aided Design, FMCAD, Austin, TX, USA, 1–3 November 2000; Volume 1954, pp. 127–144.
40. Craig, W. Three Uses of the Herbrand-Gentzen Theorem in Relating Model Theory and Proof Theory. J. Symb. Log. 1957, 22, 269–285.
41. Huang, G. Constructing Craig Interpolation Formulas. In Proceedings of the First Annual International Conference on Computing and Combinatorics, COCOON, Xi'an, China, 24–26 August 1995; Volume 959, pp. 181–190.
42. Krajíček, J. Interpolation Theorems, Lower Bounds for Proof Systems, and Independence Results for Bounded Arithmetic. J. Symb. Log. 1997, 62, 457–486.
43. Pudlák, P. Lower Bounds for Resolution and Cutting Plane Proofs and Monotone Computations. J. Symb. Log. 1997, 62, 981–998.
44. McMillan, K.L. Interpolation and SAT-Based Model Checking. In Proceedings of the 15th International Conference on Computer Aided Verification, CAV, Boulder, CO, USA, 8–12 July 2003; Volume 2725, pp. 1–13.
45. Vizel, Y.; Nadel, A.; Ryvchin, V. Efficient generation of small interpolants in CNF. Form. Methods Syst. Des. 2015, 47, 51–74.
46. Gurfinkel, A.; Vizel, Y. DRUPing for Interpolants. In Proceedings of the 14th International Conference on Formal Methods in Computer-Aided Design, FMCAD, Lausanne, Switzerland, 21–24 October 2014; pp. 99–106.
47. Cabodi, G.; Camurati, P.E.; Palena, M.; Pasini, P.; Vendraminetto, D. Reducing Interpolant Circuit Size Through SAT-Based Weakening. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 2020, 39, 1524–1531.
48. Bradley, A.R. SAT-Based Model Checking without Unrolling. In Proceedings of the 12th International Conference on Verification, Model Checking, and Abstract Interpretation, VMCAI, Austin, TX, USA, 23–25 January 2011; Volume 6538, pp. 70–87.
49. Cabodi, G.; Camurati, P.E.; Mishchenko, A.; Palena, M.; Pasini, P. SAT solver management strategies in IC3: An experimental approach. Form. Methods Syst. Des. 2017, 50, 39–74.
50. Eén, N.; Mishchenko, A.; Brayton, R. Efficient Implementation of Property Directed Reachability. In Proceedings of the 11th International Conference on Formal Methods in Computer-Aided Design, FMCAD, Austin, TX, USA, 30 October–2 November 2011; pp. 125–134.
51. Chockler, H.; Ivrii, A.; Matsliah, A.; Moran, S.; Nevo, Z. Incremental Formal Verification of Hardware. In Proceedings of the 11th International Conference on Formal Methods in Computer-Aided Design, FMCAD, Austin, TX, USA, 30 October–2 November 2011; pp. 135–143.
52. Bradley, A.R.; Manna, Z. Checking Safety by Inductive Generalization of Counterexamples to Induction. In Proceedings of the 7th International Conference on Formal Methods in Computer-Aided Design, FMCAD, Austin, TX, USA, 11–14 November 2007; pp. 173–180.
53. Cabodi, G.; Camurati, P.E.; Palena, M.; Pasini, P. Interpolation with guided refinement: Revisiting incrementality in SAT-based unbounded model checking. Form. Methods Syst. Des. 2022, 60, 117–146.
54. Mishchenko, A.; Brayton, R.K. SAT-Based Complete Don't-Care Computation for Network Optimization. In Proceedings of the 2005 Design, Automation and Test in Europe Conference, DATE, Munich, Germany, 7–11 March 2005; pp. 412–417.
55. Cabodi, G.; Nocco, S.; Quer, S. Interpolation sequences revisited. In Proceedings of the Design, Automation and Test in Europe, DATE 2011, Grenoble, France, 14–18 March 2011; pp. 316–322.
56. Cabodi, G.; Nocco, S.; Quer, S. Benchmarking a model checker for algorithmic improvements and tuning for performance. Form. Methods Syst. Des. 2011, 39, 205–227.
57. Cabodi, G.; Loiacono, C.; Palena, M.; Pasini, P.; Patti, D.; Quer, S.; Vendraminetto, D.; Biere, A.; Heljanko, K.; Baumgartner, J. Hardware Model Checking Competition 2014: An Analysis and Comparison of Solvers and Benchmarks. J. Satisf. Boolean Model. Comput. 2016, 9, 135–172.
58. Tseitin, G.S. On the Complexity of Derivation in Propositional Calculus. In Automation of Reasoning 2: Classical Papers on Computational Logic 1967–1970; Springer: Berlin/Heidelberg, Germany, 1983; pp. 466–483.
59. Ganai, M.K.; Gupta, A. SAT-Based Scalable Formal Verification Solutions; Series on Integrated Circuits and Systems; Springer: Berlin/Heidelberg, Germany, 2007.
Figure 1. Search tree enumerating all possible assignments for three variables x, y and z with respect to CNF = {{¬x, y}, {¬y, ¬z}}. Solid (dashed) edges correspond to true (false) assignments.
Figure 2. Partial termination tree associated with the example of Figure 1, where each node is labeled with the corresponding CNF. Crosses denote a contradiction within a given path, whereas check marks denote admissible assignments.
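The search space depicted in Figures 1 and 2 is small enough to enumerate exhaustively. The following Python sketch (illustrative only; the clause encoding is ours, not taken from any verification tool) lists the satisfying assignments of CNF = {{¬x, y}, {¬y, ¬z}}:

```python
from itertools import product

# Toy CNF from Figures 1 and 2: (!x | y) & (!y | !z) over variables x, y, z.
# A clause is a list of (variable, polarity) pairs; polarity True denotes a
# positive literal, False a negated one.
CNF = [[("x", False), ("y", True)], [("y", False), ("z", False)]]
VARS = ["x", "y", "z"]

def satisfies(assignment, cnf):
    # A CNF is satisfied iff every clause contains at least one true literal.
    return all(any(assignment[v] == pol for v, pol in clause) for clause in cnf)

# Enumerate all 2^3 leaves of the search tree of Figure 1.
models = []
for bits in product([False, True], repeat=len(VARS)):
    assignment = dict(zip(VARS, bits))
    if satisfies(assignment, CNF):
        models.append(assignment)

print(len(models))  # 4 satisfying assignments
```

A real SAT solver, of course, avoids this exponential enumeration by pruning the tree through unit propagation and conflict-driven learning.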
Figure 3. (a) A simple sequential hardware design implementing a 2-bit counter from 0 to 2 using D-type flip-flops. (b) The model of the counter as a transition system.
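The counter of Figure 3 can be explored as an explicit transition system. The sketch below assumes the counter wraps from 2 back to 0 (the caption only states that it counts from 0 to 2); a breadth-first traversal then computes the reachable states, showing that the fourth encoding (binary 11) is unreachable:

```python
from collections import deque

# Assumed next-state function of the 2-bit counter: 0 -> 1 -> 2 -> 0.
def next_state(s):
    return (s + 1) % 3

def reachable(init):
    # Breadth-first traversal of the transition system.
    seen, frontier = {init}, deque([init])
    while frontier:
        t = next_state(frontier.popleft())
        if t not in seen:
            seen.add(t)
            frontier.append(t)
    return seen

# State 3 (binary 11) exists in the 2-bit encoding but is never reached.
print(sorted(reachable(0)))  # [0, 1, 2]
```

Symbolic model checkers perform this same reachability computation on sets of states represented implicitly, e.g., as BDDs or CNF formulas, rather than by explicit enumeration.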
Figure 4. Different types of traces with respect to a given M =def ⟨V, I, T⟩.
Figure 5. Visualization of states in a transition system M. States are labeled with their corresponding truth-value assignments to the state variables V.
Figure 6. Interpolation as an overapproximated image operator. The interpolant I is an overapproximation of the image of R that does not intersect Cone(1, k).
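The role played by the interpolant in Figure 6 can be mimicked on an explicit-state toy example. In the sketch below (all sets and names are illustrative), an admissible interpolant is any set that contains the exact image of R while remaining disjoint from the backward cone of the bad states:

```python
# Explicit-state sketch of Figure 6: the interpolant acts as an
# overapproximated image operator.
T = {(0, 1), (1, 2), (2, 0)}   # transition relation of a toy system
BAD = {3}                      # bad states (here unreachable)

def image(S):
    # Exact image operator: successors of states in S.
    return {t for (s, t) in T if s in S}

def backward_cone(bad, k):
    # States that can reach a bad state within k steps.
    cone = set(bad)
    for _ in range(k):
        cone |= {s for (s, t) in T if t in cone}
    return cone

R = {0}
img = image(R)                 # exact image of R: {1}
interpolant = {1, 2}           # overapproximation adding the extra state 2
assert img <= interpolant                          # I overapproximates Img(R)
assert not (interpolant & backward_cone(BAD, 2))   # I misses Cone(1, k)
print(sorted(interpolant))  # [1, 2]
```

In the actual procedure, the interpolant is not chosen by hand: it is extracted from the resolution refutation of an unsatisfiable BMC formula, which guarantees both properties by construction.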
Figure 7. Different steps of McMillan's interpolation-based model checking procedure, visualized as groupings of states in a transition system M. States are labeled with their corresponding truth-value assignments to the state variables V. States enclosed by solid lines represent the set B of bad states, or states that are backward reachable from a bad state. States enclosed by dotted lines represent the current set R of states that are reachable from the initial states. States enclosed by dash–dotted lines represent the set A of states in the image of R. States enclosed by dashed lines represent the interpolant I that overapproximates A. (a) depicts the initial scenario, in which I overapproximates the image of R, including both reachable and unreachable states. (b) depicts the situation after the current set of reachable states R has been updated with the disjunction of the interpolant I, and the subsequent traversal has led to a spurious counterexample. (c) depicts a new traversal after overapproximation refinement. (d) depicts the convergence fix-point of the algorithm.
Figure 8. Visual representation of different steps of the IC3 procedure. In (a), a bad cube s that violates P is found in F_k and extended to q. (b) shows the beginning of the cube-blocking procedure for q, in which the clause ¬q is checked for induction relative to F_{k−1}. (c) shows the case in which relative induction of ¬q does not hold: a predecessor cube s is found to be a CTI and extended into a cube p that needs to be blocked in F_{k−1}. (d) shows the cube-blocking procedure for p, in which the clause ¬p is checked for induction relative to F_{k−2}. In (e), ¬p is found to be inductive relative to F_{k−2}; thus, ¬p undergoes inductive generalization and the result is used to refine the trace. In (f), after refining the trace, ¬q becomes inductive relative to F_{k−1}; the proof obligation of q can thus be discharged by generalizing ¬q and refining the trace once more. In (g), a fix-point is found by detecting F_k = F_{k+1} after propagation.
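The core query behind the steps of Figure 8 is relative induction. On an explicit-state toy model (illustrative only; IC3 performs this check with SAT queries on symbolic representations), checking whether a clause ¬q is inductive relative to a frame F amounts to checking that no state of F outside q has a successor inside q:

```python
# Explicit-state sketch of IC3's core query from Figure 8: is the clause
# !q inductive relative to a frame F, i.e. is  F & !q & T -> !q'  valid?
# Over explicit states: no state of F outside q may have a successor in q.
def relatively_inductive(q, F, T):
    return all(t not in q for (s, t) in T if s in F and s not in q)

T = {(0, 1), (1, 2), (2, 0)}   # toy transition relation
F = {0, 1, 2}                  # current frame (overapproximation of reachable states)

print(relatively_inductive({3}, F, T))  # True: !q can strengthen the frame
print(relatively_inductive({2}, F, T))  # False: state 1 is a predecessor (a CTI)
```

When the check fails, the offending predecessor is exactly the counterexample to induction (CTI) that IC3 turns into a new proof obligation at the previous frame.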
Figure 9. Examples of simplification under a care set. (a) Simplification of B using A as the care set: the Simplify procedure simplifies B without affecting its conjunction F with A. (b) Simplification of B using C as the care set, with A ⊆ C: the Simplify procedure simplifies B without affecting its conjunction F with A.
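The care-set simplification of Figure 9 can be illustrated with minterm sets (a deliberately simplified model; practical implementations such as [54] work on circuit representations): B may be rewritten freely outside the care set, as long as its conjunction with the care set is preserved.

```python
# Minterm-set sketch of simplification under a care set (Figure 9).
# Functions are represented as sets of satisfying minterms; every name
# and value below is illustrative.
A = {0, 1, 2, 3}        # care set
B = {1, 3, 5}           # function to simplify
B_simplified = {1, 3}   # differs from B only at minterm 5, which lies outside A

# The conjunction with the care set is preserved by the simplification.
assert (A & B) == (A & B_simplified)
print(sorted(B_simplified))  # [1, 3]
```

The freedom to change B on the don't-care region (everything outside the care set) is what allows the simplified function to have a smaller representation.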
Table 1. Number of instances solved by the different engines on the full benchmark set.
Engine | Result | Solutions | avg_i | avg_l | avg_g | avg_t [s] | avg_r [s] | avg_m [MB]
BDD | SAT | 12 | 16.25 | 307.33 | 9458.17 | 1014.53 | 175.71 | 1518.29
BDD | UNS | 91 | 83.81 | 222.68 | 2723.90 | 1959.57 | 353.31 | 1921.69
BMC | SAT | 185 | 5811.45 | 16,011.02 | 201,773.13 | 1409.58 | 330.53 | 3314.58
COM | SAT | 1 | 122.00 | 381.00 | 2526.00 | 0.59 | 0.69 | 58.60
COM | UNS | 121 | 6500.72 | 24,608.60 | 215,393.50 | 28.80 | 29.16 | 462.13
IGR | UNS | 32 | 1367.50 | 1463.75 | 42,964.44 | 3359.04 | 657.87 | 4796.34
ITP | UNS | 86 | 2728.55 | 6079.29 | 74,085.99 | 3294.79 | 670.03 | 3923.90
LMS | UNS | 19 | 47.26 | 584.32 | 3687.63 | 731.97 | 124.60 | 1464.49
PDR | SAT | 43 | 579.37 | 1188.88 | 18,866.86 | 496.82 | 95.26 | 1482.05
PDR | UNS | 233 | 558.95 | 3599.55 | 36,061.80 | 462.56 | 97.19 | 1792.21
RED | UNS | 2 | 203.00 | 271.00 | 4840.00 | 1.54 | 2.64 | 113.35
SIM | SAT | 36 | 1011.22 | 4596.92 | 45,924.17 | 224.19 | 78.73 | 1742.41
SYN | UNS | 91 | 596.62 | 819.20 | 14,008.29 | 10.36 | 7.69 | 395.29
Table 2. Number of instances solved by the different engines on the filtered benchmark set, after discarding easy-to-solve instances.
Engine | Result | Solutions | avg_i | avg_l | avg_g | avg_t [s] | avg_r [s] | avg_m [MB]
BDD | SAT | 10 | 18.30 | 334.50 | 10,029.30 | 1217.21 | 210.41 | 1698.32
BDD | UNS | 46 | 77.20 | 315.65 | 2711.20 | 3869.45 | 695.41 | 2972.38
BMC | SAT | 157 | 6580.46 | 17,892.90 | 229,695.41 | 1657.67 | 387.19 | 3698.95
COM | UNS | 17 | 651.53 | 8537.94 | 71,562.18 | 151.05 | 148.73 | 258.09
IGR | UNS | 32 | 1367.50 | 1463.75 | 42,964.44 | 3359.04 | 657.87 | 4796.34
ITP | UNS | 81 | 2850.21 | 6407.47 | 78,112.98 | 3495.62 | 710.60 | 4096.33
LMS | UNS | 14 | 48.29 | 674.14 | 4114.29 | 986.12 | 166.85 | 1645.11
PDR | SAT | 14 | 1216.57 | 1452.29 | 37,274.07 | 1493.70 | 274.34 | 2255.86
PDR | UNS | 98 | 1060.56 | 6139.47 | 63,953.65 | 1075.83 | 220.33 | 2591.29
SIM | SAT | 24 | 853.25 | 5563.62 | 52,609.83 | 323.82 | 112.88 | 2034.62
SYN | UNS | 5 | 4104.20 | 2679.60 | 76,255.80 | 150.76 | 103.28 | 840.24