1. Introduction
The satisfiability problem (SAT) is a central problem in theoretical computer science: deciding whether a given formula in conjunctive normal form (CNF) is satisfiable. The k-SAT problem is the satisfiability problem in which every clause has exactly k distinct variables; it was proved to be NP-complete for k ≥ 3 in [1]. SAT is therefore expected to be computationally hard. However, modern SAT solvers such as MiniSat [2], Glucose [3], and Maple [4] can efficiently solve some formulas with millions of variables. Conflict-driven clause learning (CDCL) is an important technique behind the efficiency of these solvers. Yet how these solvers can be so successful has remained elusive. In order to analyze and improve SAT solvers, several random SAT models have been proposed.
A natural measure of the solution space is the number of solutions. Unique k-SAT denotes the promise search problem of k-SAT in which the number of solutions is either 0 or 1. One might expect harder instances to have fewer solutions. However, Calabro and Paturi [5] proved that the exponential complexity of deciding whether a k-CNF formula has a solution is the same as that of deciding whether it has exactly one solution, both when it is promised and when it is not promised that the input formula has a solution. The study of uniquely satisfiable SAT instances is therefore of considerable interest.
The (k,s)-SAT problem denotes the family of satisfiability problems restricted to CNF formulas with exactly k distinct variables per clause and at most s occurrences of each variable. Regular (k,s)-SAT is the special case of (k,s)-SAT in which each variable occurs in exactly s clauses. Polynomial-time reductions have shown that some SAT problems with regular structures are NP-complete, such as the (3,4)-SAT problem in [6] and the regular (3,4)-SAT problem in [7]. Experimental results and theoretical analysis of the random k-SAT problem in [8,9,10,11] showed that the constrained density of a CNF formula is an important parameter affecting its satisfiability and solving difficulty. There is a phase transition point for the random k-SAT problem such that
- (i)
random k-CNF instances with constrained density below this point are satisfiable with high probability;
- (ii)
random k-CNF instances with constrained density above this point are unsatisfiable with high probability.
However, every regular (k,s)-CNF formula has a fixed constrained density s/k (the clause-to-variable ratio); for example, a regular (3,4)-CNF formula has density 4/3. The constrained density of a regular (3,4)-CNF formula is much smaller than the SAT-UNSAT phase transition point of the random 3-SAT problem reported in [12]. Judged by density alone, a random regular (3,4)-CNF formula would be expected to be satisfiable with high probability, yet the regular (3,4)-SAT problem is NP-complete. The constrained density alone is therefore not enough to describe the structural features of a CNF formula.
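The fixed density of a regular formula follows from counting literal occurrences in two ways. The following is a minimal sketch of that count, assuming a regular (k,s)-CNF formula with n variables; the function names are illustrative and do not come from the paper.

```python
from fractions import Fraction


def regular_density(k: int, s: int) -> Fraction:
    """Clause-to-variable ratio m/n of a regular (k,s)-CNF formula."""
    # n variables with s occurrences each give n*s literal occurrences;
    # m clauses with k literals each give m*k. Equating both yields m/n = s/k.
    return Fraction(s, k)


def clause_count(n: int, k: int, s: int) -> int:
    """Number of clauses of a regular (k,s)-CNF formula with n variables."""
    total_occurrences = n * s
    if total_occurrences % k != 0:
        raise ValueError("n*s must be divisible by k for such a formula to exist")
    return total_occurrences // k


print(regular_density(3, 4))  # 4/3
```

For instance, a regular (3,4)-CNF formula with 6 variables must have exactly 8 clauses, whatever its literals are.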
In [13,14], M. Wahlström introduced the notion of an (a,b)-variable to classify the variables in a CNF formula, and designed two algorithms for solving CNF formulas with at most d occurrences per variable. Here, an (a,b)-variable is a variable that occurs positively in a clauses and negatively in b clauses. In [15], Johannsen, Razgon and Wahlström presented an algorithm for solving CNF formulas in which each literal occurs at most d times. Their results demonstrate that CNF formulas with restrictions on the number of positive or negative occurrences of each variable have their own characteristics.
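To make the (a,b)-classification concrete, the sketch below computes the pair (a, b) for every variable of a formula. It assumes a DIMACS-style encoding (a clause is a list of nonzero integers, with a negative integer denoting a negated variable); this encoding and the function name are assumptions for illustration, not part of the cited work.

```python
from collections import defaultdict


def occurrence_profile(clauses):
    """Map each variable to (a, b): its positive and negative occurrence counts."""
    counts = defaultdict(lambda: [0, 0])
    for clause in clauses:
        for lit in clause:
            # Index 0 counts positive occurrences, index 1 negative ones.
            counts[abs(lit)][0 if lit > 0 else 1] += 1
    return {var: (a, b) for var, (a, b) in counts.items()}


example = [[1, -2, 3], [-1, 2, 3], [1, 2, -3]]
print(occurrence_profile(example))  # {1: (2, 1), 2: (2, 1), 3: (2, 1)}
```

In this example every variable is a (2,1)-variable.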
To further study SAT problems with regular structures, we introduced the d-regular (k,s)-CNF formula in [16,17]. A regular (k,s)-CNF formula requires that each clause contains exactly k variables and each variable occurs in exactly s clauses. A d-regular (k,s)-CNF formula additionally requires that, for each variable, the absolute value of the difference between its positive and negative occurrences is at most a nonnegative integer d. In this paper, we investigate the conditions under which uniquely satisfiable d-regular (k,s)-SAT instances exist, and present a method to construct a uniquely satisfiable d-regular (k,s)-CNF formula. We also give a parsimonious reduction from k-CNF to d-regular (k,s)-CNF, and further explain why the constrained density is not enough to describe the structural features of a CNF formula.
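The definition of a d-regular (k,s)-CNF formula can be checked mechanically. The following sketch is one possible checker under the assumption of a DIMACS-style encoding (signed integers for literals); the function name and encoding are illustrative choices rather than anything prescribed by the paper.

```python
from collections import Counter


def is_d_regular(clauses, k, s, d):
    """Check whether `clauses` form a d-regular (k,s)-CNF formula."""
    pos, neg = Counter(), Counter()
    for clause in clauses:
        variables = {abs(lit) for lit in clause}
        # Each clause must contain exactly k distinct variables.
        if len(clause) != k or len(variables) != k:
            return False
        for lit in clause:
            (pos if lit > 0 else neg)[abs(lit)] += 1
    for var in set(pos) | set(neg):
        # Exactly s occurrences, with positive/negative imbalance at most d.
        if pos[var] + neg[var] != s or abs(pos[var] - neg[var]) > d:
            return False
    return True


balanced = [[1, 2], [-1, -2]]   # each variable: one positive, one negative
skewed = [[1, 2], [1, -2]]      # variable 1 occurs only positively
print(is_d_regular(balanced, k=2, s=2, d=0))  # True
print(is_d_regular(skewed, k=2, s=2, d=0))    # False
print(is_d_regular(skewed, k=2, s=2, d=2))    # True
```

The second and third calls show how the same regular (2,2)-CNF formula passes or fails depending only on the bound d.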
2. Related Works
Unique SAT is the promise version of SAT, in which a given CNF formula has 0 or 1 solution. Valiant and Vazirani in [18] gave a randomized polynomial-time reduction from SAT to Unique SAT, and showed that deciding whether a CNF formula has zero or one solution is essentially as difficult as SAT in general. Calabro et al. in [19] proved that Unique k-SAT is no easier than k-SAT, not just for polynomial-time algorithms but also for super-polynomial-time algorithms. Calabro and Paturi in [5] pointed out that it does not matter whether there is a promise that a formula has a solution. Matthews in [20] studied the complexity of UNIQUE-(k,s)-SAT and proved that
for
, where
is the minimal value of
s such that uniquely satisfiable (k,s)-CNF formulas exist and
represents the maximal value of
s such that all (k,s)-CNF formulas are satisfiable. The exact values of
are only known for
and
, because
,
were shown in [
21]. In [
22,
23,
24,
25], the upper and lower bounds for
,
are described as follows
Encoding a practical problem into a CNF formula is a common way to solve it. Such CNF formulas often have special structures and properties, so it is important to design random SAT models that resemble real instances. Markström in [26] proposed a method for constructing SAT instances based on Eulerian graphs, and discussed how a solver can try to avoid at least some of the pitfalls presented by these instances. Giraldez-Cru and Levy in [27] proposed a new model for generating random SAT instances with community structure, and showed that modern solvers do actually exploit this community structure. In [28], they presented a random SAT instance generator based on the notion of locality, and showed that CDCL SAT solvers take advantage of both popularity and similarity. The works [29,30] showed that SAT instances with fewer solutions tend to be harder for stochastic local search methods. In [31], Žnidarič gave an experimental evaluation of uniquely satisfiable 3-SAT instances obtained by simply filtering randomly generated formulas.
In this paper, we investigate uniquely satisfiable d-regular (k,s)-SAT instances, and show that , for , and for . Here denotes the minimal value of s such that uniquely satisfiable d-regular (k,s)-CNF formulas exist, and denotes the maximal value of s such that all d-regular (k,s)-CNF formulas are satisfiable. We demonstrate that for and , there is a uniquely satisfiable d-regular ()-CNF formula. We also reveal that for , if a d-regular ()-CNF formula is unsatisfiable, then . Finally, for , and , we give a parsimonious reduction from a k-CNF formula to a d-regular ()-CNF formula. Constructing uniquely satisfiable d-regular ()-CNF formulas from an unsatisfiable d-regular ()-CNF formula is a key component of our reduction.
3. Notations
A literal is a boolean variable x or a negated boolean variable ¬x. x is called a positive literal, and ¬x is called a negative literal. A clause C is a disjunction of literals, or . A formula F in conjunctive normal form is a conjunction of clauses, or . denotes the set of boolean variables occurring in a formula F, and refers to the number of variables occurring in F. denotes the number of clauses of F, and () refers to the number of positive (negative) occurrences of a variable x in F. () denotes the number of positive (negative) literals in F, and () refers to the number of positive (negative) occurrences of all variables of the variable set X in F.
A truth assignment is a function which assigns to each boolean variable v a unique value . A CNF formula F is satisfiable, if a truth assignment with exists. Such a truth assignment is called a satisfying assignment. We divide boolean variables in these formulas into forced variables or unforced variables. If every satisfying assignment of a formula sets a variable to the same value, we call it a forced variable. Otherwise, the variable is regarded as an unforced variable.
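The distinction between forced and unforced variables can be illustrated by exhaustive enumeration on small formulas. This is only a brute-force sketch (exponential in the number of variables), assuming a DIMACS-style encoding with signed integers; the helper names are hypothetical.

```python
from itertools import product


def satisfying_assignments(clauses):
    """Enumerate all satisfying assignments as dicts {variable: bool}."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    models = []
    for values in product([False, True], repeat=len(variables)):
        tau = dict(zip(variables, values))
        # A clause is satisfied if some literal agrees with tau.
        if all(any(tau[abs(lit)] == (lit > 0) for lit in clause) for clause in clauses):
            models.append(tau)
    return models


def forced_variables(clauses):
    """Return {variable: value} for variables fixed by every satisfying assignment.

    For an unsatisfiable formula this sketch returns an empty dict.
    """
    models = satisfying_assignments(clauses)
    if not models:
        return {}
    first = models[0]
    return {v: val for v, val in first.items() if all(m[v] == val for m in models)}


print(forced_variables([[1], [2, 3]]))   # {1: True}
print(forced_variables([[1], [-1, 2]]))  # {1: True, 2: True}
```

In the second formula every variable is forced, so the formula has exactly one satisfying assignment.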
If the formulas and are either both satisfiable or both unsatisfiable, they are called SAT-equivalent. That is, is satisfiable if and only if is satisfiable. A formula is called a disjoint copy of a CNF formula F if is a copy of F and their variable sets are disjoint. A uniquely satisfiable d-regular (k,s)-CNF formula is a d-regular (k,s)-CNF formula with exactly one solution. A CNF formula F is a minimal unsatisfiable (MU) formula if F is unsatisfiable and F − {C} is satisfiable for any clause C of F. For a given unsatisfiable formula F, a minimal unsatisfiable formula can be obtained by removing some clauses from F.
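The last remark, that a minimal unsatisfiable formula can be obtained by removing clauses, can be sketched as a simple deletion loop. The version below uses a brute-force satisfiability test, so it is only meant for tiny examples; the function names and the signed-integer encoding are assumptions for illustration, and a real implementation would call a SAT solver instead.

```python
from itertools import product


def is_satisfiable(clauses):
    """Brute-force satisfiability test (exponential; small inputs only)."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    for values in product([False, True], repeat=len(variables)):
        tau = dict(zip(variables, values))
        if all(any(tau[abs(lit)] == (lit > 0) for lit in clause) for clause in clauses):
            return True
    return False


def shrink_to_mu(clauses):
    """Remove clauses one at a time while the formula stays unsatisfiable."""
    if is_satisfiable(clauses):
        raise ValueError("input formula must be unsatisfiable")
    core = list(clauses)
    i = 0
    while i < len(core):
        candidate = core[:i] + core[i + 1:]
        if not is_satisfiable(candidate):
            core = candidate  # clause i is redundant: drop it
        else:
            i += 1            # clause i is needed: keep it
    return core


print(shrink_to_mu([[1], [2], [-1]]))  # [[1], [-1]]
```

Every clause kept by the loop was needed at the moment it was tested, and it stays needed as the formula shrinks further, so the result is minimal unsatisfiable.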
Definition 1. For each , is defined as the maximal value of s such that all (k,s)-CNF formulas are satisfiable, is defined as the maximal value of s such that all d-regular (k,s)-CNF formulas are satisfiable, is defined as the minimal value of s such that uniquely satisfiable (k,s)-CNF formulas exist, and is defined as the minimal value of s such that uniquely satisfiable d-regular (k,s)-CNF formulas exist.
Definition 2. A k-CNF formula F is called a k-forced-once d-regular (k,s)-CNF formula if
- (i)
there exist k variables that only occur once;
- (ii)
except for the k variables, every variable occurs in exactly s clauses, and the absolute value of the difference between positive and negative occurrences of every variable is no more than the nonnegative integer d.
- (iii)
F is satisfiable and for any truth assignment τ satisfying F, it holds that
We can represent a CNF formula as a matrix. Each variable corresponds to a row of the matrix and each clause corresponds to a column of the matrix. For each variable , if its positive (resp., negative) literal is in the clause , then (resp., ); otherwise, 0.
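A small sketch of this matrix representation is given below. It assumes that a positive occurrence is recorded as +1 and a negative occurrence as -1 (the exact entry values are not legible in the text above, so this is an assumption), and that clauses are encoded as lists of signed integers over variables numbered from 1.

```python
def representation_matrix(clauses, num_vars):
    """Build a num_vars x len(clauses) matrix: rows are variables, columns are clauses."""
    matrix = [[0] * len(clauses) for _ in range(num_vars)]
    for j, clause in enumerate(clauses):
        for lit in clause:
            # Assumed convention: +1 for a positive literal, -1 for a negative one.
            matrix[abs(lit) - 1][j] = 1 if lit > 0 else -1
    return matrix


# (x1 or not x2) and (x2 or x3)
for row in representation_matrix([[1, -2], [2, 3]], num_vars=3):
    print(row)
# [1, 0]
# [-1, 1]
# [0, 1]
```

With this convention, the number of nonzero entries in a row is the number of occurrences of the variable, and the row sum is the difference between its positive and negative occurrences.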
Let F be a CNF formula with 15 variables and 25 clauses. The representation matrix of the formula F is
Clearly, F is a 3-forced-once 0-regular (3,6)-CNF formula. Each of the three variables occurs in exactly one clause in F and is forced to be .
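The size of this example is consistent with Definition 2: k variables occur once, the remaining variables occur s times each, and every clause holds k literals. A minimal sketch of that count follows; the function name is illustrative.

```python
def forced_once_clause_count(num_vars: int, k: int, s: int) -> int:
    """Number of clauses of a k-forced-once d-regular (k,s)-CNF formula."""
    # k variables contribute one occurrence each; the others contribute s each.
    total_literals = k * 1 + (num_vars - k) * s
    if total_literals % k != 0:
        raise ValueError("the total number of literal occurrences must be divisible by k")
    return total_literals // k


print(forced_once_clause_count(num_vars=15, k=3, s=6))  # 25
```

Here 3 + 12 · 6 = 75 literal occurrences are spread over clauses of width 3, which gives the 25 clauses of the example.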
Definition 3. In the context of SAT, a reduction M is called parsimonious if, for every formula x, x and M(x) have the same number of satisfying assignments.
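The effect of parsimony can be seen on a toy example by counting models before and after a transformation. The two transformations below are hypothetical illustrations, not the reduction of this paper: adding a fresh variable that is forced by a unit clause preserves the number of satisfying assignments, whereas adding a fresh unconstrained variable doubles it.

```python
from itertools import product


def count_models(clauses, num_vars):
    """Count satisfying assignments over variables 1..num_vars by brute force."""
    count = 0
    for values in product([False, True], repeat=num_vars):
        if all(any(values[abs(lit) - 1] == (lit > 0) for lit in clause) for clause in clauses):
            count += 1
    return count


def add_forced_variable(clauses, num_vars):
    """Toy parsimonious map: the fresh variable is fixed by a unit clause."""
    fresh = num_vars + 1
    return clauses + [[fresh]], fresh


def add_free_variable(clauses, num_vars):
    """Toy non-parsimonious map: the fresh variable is left unconstrained."""
    fresh = num_vars + 1
    return clauses + [[fresh, -fresh]], fresh


formula = [[1, 2]]
print(count_models(formula, 2))                        # 3
print(count_models(*add_forced_variable(formula, 2)))  # 3
print(count_models(*add_free_variable(formula, 2)))    # 6
```

This is why constructions that introduce new variables need every new variable to be forced, or tied to an existing one, if the number of solutions is to be preserved.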
Lemma 1 ([32]). Let ()-CNF be a class of satisfiable formulas, then all ()-CNF formulas are satisfiable for any nonnegative integer r ( denotes the integral part of x).
Lemma 2 ([17]). If the representation matrix of a formula F is then the formula is satisfiable and every satisfying assignment forces all variables to the same value.
4. Uniquely Satisfiable d-Regular (k,s)-CNF Formula
The d-regular (k,s)-CNF formula has stronger regularity constraints than the regular (k,s)-CNF formula. It limits the absolute value of the difference between positive and negative occurrences of each variable. The uniquely satisfiable d-regular (k,s)-CNF formula refers to a d-regular (k,s)-CNF formula with only one solution. We investigate the existence conditions of the uniquely satisfiable d-regular (k,s)-CNF formula.
Theorem 1. For all , and .
Proof. Because denotes the maximal value of s such that all d-regular ()-CNF formulas are satisfiable, we usually construct an unsatisfiable d-regular ()-CNF formula to find the upper bound of .
Let . Because denotes the minimal value of s such that uniquely satisfiable d-regular ()-CNF formulas exist, there must be a uniquely satisfiable d-regular ()-CNF formula F. Obviously, by adding a clause to F which is violated by the unique satisfying assignment, the formula F can become an unsatisfiable formula. Suppose the formula F has n variables. We give two methods to construct unsatisfiable instances.
Method 1: We introduce new variables and add new clauses to F, including at least one clause violated by the unique satisfying assignment. Let each original variable occur twice in the new clauses (once positively and once negatively), and let each new variable occur times in the new clauses (with nearly equal numbers of positive and negative occurrences). Then each variable occurs times in F and the absolute value of the difference between positive and negative occurrences of each variable is no more than d. Therefore, F is turned into an unsatisfiable d-regular ()-CNF formula. It can be seen that .
Method 2: We introduce new variables and add new clauses to F, including at least one clause violated by the unique satisfying assignment. Let each original variable occur once in the new clauses, and let each new variable occur times in the new clauses (with nearly equal numbers of positive and negative occurrences). Then each variable occurs times in F and the absolute value of the difference between positive and negative occurrences of each variable is no more than . Therefore, F is turned into an unsatisfiable ()-regular (k,)-CNF formula. It can be seen that . □
Lemma 3. If and s are two nonnegative integers such that an unsatisfiable d-regular ()-CNF formula exists, there exists a k-forced-once d-regular ()-CNF formula.
Proof. Let be an unsatisfiable d-regular ()-CNF formula. Obviously, the number of positive occurrences and negative occurrences of every variable in are all no more than . By removing some clauses of , a minimal unsatisfiable ()-CNF formula can be obtained. It is easy to get that, the number of positive occurrences and negative occurrences of every variable in are all no more than .
Let , where is the unsatisfiable ()-CNF formula obtained by removing some clauses of , and is a conjunction of the removed clauses. Suppose contains clauses and literals. Let be the clause set of and be the clause set of . A variable y of and a clause c containing are randomly selected. Define , with , where x is a new extra variable that does not occur in . Define . Clearly, the variable x is forced to be .
Let be disjoint copies of the formula with the variable of being renamed as in , and be disjoint copies of the formula , for . In addition, we ensure that every variable occurring both in and is renamed as a same new variable in and , respectively, for .
Introduce a new boolean variable set which does not occur in , . The k-CNF formula is constructed using , the literals of and the variables of Z, for . And it shall meet the following limits.
- (i)
Every variable of Z occurs positively in clauses and negatively in clauses;
- (ii)
All literals of and occur exactly once in , ;
- (iii)
Every clause of must have at least one positive occurrence of any one of Z.
Define .
Obviously, condition (i) and (ii) of Definition 2 hold in (note from the unsatisfiability of ). is satisfiable and forces the variable to be . Because every variable of Z does not occur in , is satisfiable (let the value of every variable of Z be ) without affecting . So it can be concluded that is satisfiable and forces to be . , and only contain x and all literals of . Except for x, every variable of , and occur in s clause, and meet the d-regularity ( is a d-regular ()-CNF formula). Hence, , and meet these requirements (by the definition of disjoint copy). Every variable of Z occurs positively in clauses and negatively in clauses. Thus, occur only once in . Except for the k variables, every variable occurs in exactly s clauses, and the absolute value of the difference between positive and negative occurrences of every variable is no more than d. Therefore, we claim that is a k-forced-once d-regular ()-CNF formula.
Next, we will assess the feasibility of the construction of . If an unsatisfiable d-regular ()-CNF formula exists, should be easily constructed. The number of literals of is , and the number of positive occurrences of the variables of Z in is . The number of clauses of is . For , we obtain , . For , we obtain . As a result, the number of positive occurrences of Z in is greater than the number of clauses of . The construction of is almost random (first give each clause a positive literal of Z, then arrange the other literals randomly). Therefore, can be constructed in polynomial time. □
Lemma 4. For , and , we can transform a k-forced-once d-regular ()-CNF formula with n unforced variables into a -forced-once d-regular ()-CNF formula with n unforced variables.
Proof. Let
be a
k-forced-once
d-regular (
)-CNF formula with
n unforced variables, and
denote
k forced variables that only occur once. That is,
are forced to be
. Let
where every
is a fresh variable.
We construct a k-CNF formula with the variable set and the variable set , for , , which meets the following restrictions.
- (i)
every variable of
X and
Y occurs in exactly
clauses of
,
- (ii)
Every clause of must have at least one positive occurrence of any one of these variables.
Define .
Obviously, and are forced to be for (this ensures that is satisfiable). In these forced variables, and occur exactly in one clause of . Except for the variables, every variable occurs in exactly s clauses, and the absolute value of the difference between positive and negative occurrences of every variable is at most d. The number of unforced variables in is still n. So is a -forced-once d-regular ()-CNF formula with n unforced variables.
Next, we will prove that the construction of is feasible. We focus on whether condition (ii) can be satisfied.
The variables of
 consist of two parts:
X and
Y. The variable set
Y has
variables. The variable set
X has
variables. Every variable of
X and
Y occurs in exactly
clauses of
. Obviously, the number of literals of
is
. The number of clauses of
is
When
, all literals in
 are positive literals and necessarily satisfy condition (ii). When
, the number of positive occurrences of the variables in
is
So . That indicates that the number of positive literals is more than that of clauses. That is, we can arrange a positive literal for every clause of , then randomly arrange other literals. Hence, can be constructed. □
Theorem 2. For and and , there exists a uniquely satisfiable d-regular ()-CNF formula.
Proof. We will show a way to construct a uniquely satisfiable d-regular ()-CNF formula.
By Lemma 3 and Lemma 4, for , , and , we can construct a -forced-once d-regular ()-CNF formula . It is assumed that the forced variables which occur only once are . Without loss of generality, we assume that forcing n of the unforced variables to be can turn into a uniquely satisfiable formula. Let denote the n unforced variables. Let . The construction of a uniquely satisfiable d-regular ()-CNF formula proceeds in four steps, described as follows.
Step 1 Divide the variables arbitrarily into t variable sets of size . Some variables of forced to be are added, so that every variable set contains exactly variables (a variable forced to be can be transformed to a variable forced to be by flipping all occurrences of the variable). The variables are arbitrarily divided into variable sets . Moreover, it should be guaranteed that any one of has variables, any one of has variables and includes the rest. When m is appropriately chosen, the partition is feasible. Now assume contains r variables.
Step 2 For each , we will construct a formula using the variable sets and .
For simplicity, let , , , and . For each , we introduce a new boolean variable set which does not occur in and perform the following steps to construct .
- (i)
Let replace any one of positive occurrences of , and replace any one of negative occurrences of in , for . If does not occur as a positive literal, then we let replace one of other negative occurrences of in and flip all occurrences of in the following formulas . If does not occur as a negative literal, then we perform similar operations.
- (ii)
Define . The new formula with all substitutions performed on is denoted as .
Step 3 We will make up the gap of the number of occurrences of every variable. Using the variables in sets X and , we construct a formula that satisfies the following conditions.
- (i)
For , each in the variable set occurs in exactly clause of and .
- (ii)
For , each in the variable set occurs in exactly clauses of and .
- (iii)
Each variable
x in
occurs in exactly
clauses of
,
- (iv)
Each variable
x in
occurs in exactly
clauses of
and
- (v)
Every clause of must have at least one positive occurrence of any one of the variables.
Step 4 Let .
Clearly, is a d-regular ()-CNF formula. All variables in the set X are forced to be . Hence, is forced to be by . By Lemma 2, and are forced to be the same value. Given that, every variable in Y and Z is forced to be , too. Because all variables in X and Z are forced to be , is apparently satisfiable. Thus, it can be concluded that has only forced variables and the unique solution. That is, is a uniquely satisfiable d-regular ()-CNF formula.
Next, we will discuss the feasibility of constructing . We focus on the formula . For , the number of positive literals should be more than that of clauses.
The variable set
Z generates
literals in
. The variable set
X generates
literals in
. The number of clauses of
is
Every variable of
Z generates
positive literals in
, and every variable of
generates
positive literals in
. About the number of positive literals of
, there are two situations. When
, the number of positive literals of
is
When
, the number of positive literals of
is
Since and , we get . To construct , we first place a positive literal in every clause, then arrange the other literals randomly. Thus, can be constructed in polynomial time.
and can obviously be constructed in polynomial time. Therefore, we can construct a uniquely satisfiable d-regular ()-CNF formula in polynomial time. □
In the previous proof, we constructed a uniquely satisfiable d-regular ()-CNF formula by using a ()k-forced-once d-regular ()-CNF formula . The parameter m determines the number of forced variables of that occur only once. If we let m be one more than required, then retains k forced variables that occur only once. Therefore, we get the following lemma.
Lemma 5. For , and , there exists a k-forced-once d-regular ()-CNF formula Ψ where every variable is forced.
Lemma 6. For , if a d-regular ()-CNF formula is unsatisfiable then .
Proof. By
in [
22], if a (7,
s)-CNF formula is unsatisfiable, then
. That is, for any integer
, if a
d-regular (7,
s)-CNF formula is unsatisfiable, then
. It implies that for
, if a
d-regular (
)-CNF formula is unsatisfiable, we can obtain that
.
By
in [
22], all (8,24)-CNF formulas are satisfiable. That is, for any integer
, if a
d-regular (8,
s)-CNF formula is unsatisfiable, then
. As for
, if a
d-regular (
)-CNF formula is unsatisfiable, we get
again.
Using Lemma 1, all ()-CNF formulas are satisfiable for any nonnegative integer r. That is to say, if a ()-CNF formula is unsatisfiable, then for any nonnegative integer r. For , we obtain that for if a ()-CNF formula is unsatisfiable, then . □
Theorem 3. For all and , there exist uniquely satisfiable d-regular ()-CNF formulas.
Proof. By the definition of , if and , there exists an unsatisfiable d-regular ()-CNF formula. Using Lemma 6, we get . By Theorem 2, we obtain that there exist uniquely satisfiable d-regular ()-CNF formulas. □
By Theorem 3, for
, we get
. Matthews in [
20] showed that
. Using Theorem 2, it is easy to achieve
.
Theorem 4. For all , .
Proof. Let
d be an infinite integer. That is, every (k,s)-CNF formula is a d-regular (k,s)-CNF formula and every d-regular (k,s)-CNF formula is a (k,s)-CNF formula. It holds that
and
. Using Theorem 2, for an infinite integer
d,
and
, there exists a uniquely satisfiable
d-regular (
)-CNF formula. Obviously, a uniquely satisfiable
d-regular (
)-CNF formula must be a uniquely satisfiable (
)-CNF formula. In other words, for
and
, there exists a uniquely satisfiable (
)-CNF formula. By
in [
20], we obtain that
,
. □
Corollary 1. For and , there exists a k-forced-once d-regular ()-CNF formula Ψ that has exactly one satisfying assignment.
Proof. The statement follows directly from Lemmas 5 and 6. □
5. A Parsimonious Polynomial Time Reduction
In [
20], Matthews presented a parsimonious reduction from SAT to (
)-SAT for any
and
. We will parsimoniously transform a k-CNF formula into a d-regular (k,s)-CNF formula.
Theorem 5. For any constants , and , there exists a parsimonious polynomial time reduction from k-CNF to d-regular ()-CNF.
Proof. Let be an arbitrary k-CNF formula. Suppose that contains m clauses. Obviously, contains literals . We will construct a d-regular ()-CNF formula that is SAT-equivalent with the formula and has the same number of solutions. Based on Lemma 5, we first construct a k-forced-once d-regular ()-CNF formula where every variable is forced. Assume that the k forced variables that occur only once are .
The reduction method has five steps, which are described as follows.
Step 1 We introduce a new boolean variable set
to replace
literals in
in order to construct a new formula
.
Here, is the jth literal of the ith clause of .
Step 2 Let be disjoint copies of the formula with the variables of being renamed as in . All of are renumbered and form a variable set . Let .
Step 3 Let , and . Here and if replaces a variable v in , then will point to the next variable in Z that replaces v (if is the last variable in Z that replaces v, then will point to the first variable in Z that replaces v). The variables in Z are sorted by their subscripts.
Step 4 We construct a k-CNF formula with two variable sets X and Z, satisfying the following conditions.
- (i)
Every variable
of the variable set
Z occurs in exactly
clauses of
, and if
occurs negatively in
,
- (ii)
For
, every variable
of
X occurs in exactly
clauses of the formula
, and
- (iii)
For
, every variable
of the variable set
X occurs in exactly
clauses of the formula
, and
- (iv)
Every clause of must have at least one positive occurrence of any one of the variable set X.
Step 5 We construct the formula .
Obviously, every variable of occurs in exactly s clauses, and the absolute value of the difference between positive and negative occurrences of every variable of is at most d. Therefore, is a d-regular ()-CNF formula. Next, we will evaluate the feasibility of constructing , the SAT-equivalence of and , and the parsimony of the reduction.
First, we focus on the feasibility of . The formulas apparently can be constructed in polynomial time. With respect to the formula , we need to consider the condition iv. That is to say, the number of positive occurrences of X in should be more than the number of clauses of .
The variables of
 consist of two parts:
X and
Z. The variable set
X generates
literals in
. The variable set
Z generates
literals in
. The number of clauses of
For
,
and
. So,
For
,
and
. So,
,
. The number of positive occurrences of
X in
For
and
, we get
Obviously, the number of positive literals of X is more than the number of clauses in . To construct , we first place a positive literal in every clause, then arrange the other literals randomly. Thus, the formula can be constructed in polynomial time.
Second, we will prove that the formula is satisfiable if and only if the formula is satisfiable.
It is assumed that
is satisfied by a truth assignment
on
and
is satisfied by a truth assignment
on
for
. Because
forces the variable
to be
,
must be
. A truth assignment
is defined by
Obviously, the truth assignment satisfies the formulas . Every clause of has at least one positive occurrence of a variable of X. As a result, also satisfies the formula . The formula is a conjunction of . Thus, satisfies the formula .
It is assumed that
is satisfied by a truth assignment
over
. Obviously, the truth assignment
can satisfy these formulas
. For
, the truth assignment
can satisfy these formulas
. Because
forces the variable
to be
,
We substitute Equation (
1) into
, and simplify
. The simplified
 contains structures similar to the one described in Lemma 2. According to Lemma 2, if
and
replace the same variable of
,
. Therefore, we define a truth assignment
on
by
Obviously, the truth assignment can satisfy the formula , and the formula is satisfiable.
Therefore, is SAT-equivalent with .
Finally, we will explain why the polynomial-time reduction is parsimonious. If is satisfiable, all variables in X are forced to be . Due to the formula , all variables of Z that replaced the same variable of are forced to take the same value in every satisfying assignment. Thus, introducing the new variable set Z does not change the number of satisfying assignments. Because has only one solution, does not influence the number of satisfying assignments either. Therefore, has as many satisfying assignments as the formula . □