Comparative Analysis of Deterministic and Nondeterministic Decision Trees for Decision Tables from Closed Classes

Ostonov, Azimkhon; Moshkov, Mikhail

doi:10.3390/e26060519


by Azimkhon Ostonov * and Mikhail Moshkov

Computer, Electrical and Mathematical Sciences & Engineering Division and Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia

* Author to whom correspondence should be addressed.

Entropy 2024, 26(6), 519; https://doi.org/10.3390/e26060519

Submission received: 13 May 2024 / Revised: 14 June 2024 / Accepted: 15 June 2024 / Published: 17 June 2024

(This article belongs to the Section Information Theory, Probability and Statistics)


Abstract:

In this paper, we consider classes of decision tables with many-valued decisions closed under operations of the removal of columns, the changing of decisions, the permutation of columns, and the duplication of columns. We study relationships among three parameters of these tables: the complexity of a decision table (if we consider the depth of the decision trees, then the complexity of a decision table is the number of columns in it), the minimum complexity of a deterministic decision tree, and the minimum complexity of a nondeterministic decision tree. We consider the rough classification of functions characterizing these relationships and enumerate all seven possible types of relationships.

Keywords:

closed classes of decision tables; deterministic decision trees; nondeterministic decision trees

1. Introduction

In this paper, we consider closed classes of decision tables with many-valued decisions and study the relationships among three parameters of these tables: the complexity of a decision table (if we consider the depth of decision trees, then the complexity of a decision table is the number of columns in it), the minimum complexity of a deterministic decision tree, and the minimum complexity of a nondeterministic decision tree.

A decision table with many-valued decisions is a rectangular table in which columns are labeled with attributes, rows are pairwise different, and each row is labeled with a nonempty, finite set of decisions. Rows are interpreted as tuples of values of the attributes. For a given row, it is required to find a decision from the set of decisions attached to the row. To this end, we can use the following queries: we can choose an attribute and ask what is the value of this attribute in the considered row. We study two types of algorithms based on these queries: deterministic and nondeterministic decision trees. One can interpret nondeterministic decision trees for a decision table as a way to represent an arbitrary system of true decision rules for this table that covers all rows. We consider in some sense arbitrary complexity measures that characterize the time complexity of decision trees. Among them, we distinguish so-called limited complexity measures, for example, the depth of decision trees.

Decision tables with many-valued decisions often appear in data analysis, where they are known as multilabel decision tables [1,2,3]. Moreover, decision tables with many-valued decisions are common in such areas as combinatorial optimization, computational geometry, and fault diagnosis, where they are used to represent and explore problems.

Decision trees [4,5,6,7] and decision rule systems [8,9,10,11,12] are widely used as classifiers, as a means for knowledge representation, and as algorithms for solving various problems of combinatorial optimization, fault diagnosis, etc. Decision trees and rules are among the most interpretable models in data analysis [13].

The depth of deterministic and nondeterministic decision trees for computing Boolean functions (variables of a function are considered as attributes) has been studied quite intensively [14,15,16]. Note that the minimum depth of a nondeterministic decision tree for a Boolean function is equal to its certificate complexity [17].

We study classes of decision tables with many-valued decisions closed under four operations: the removal of columns, the changing of decisions, the permutation of columns, and the duplication of columns. The most natural examples of such classes are closed classes of decision tables generated by information systems [18]. An information system consists of a set of objects (universe) and a set of attributes (functions) defined on the universe and with values from a finite set. A problem over an information system is specified by a finite number of attributes that divide the universe into nonempty domains in which these attributes have fixed values. A nonempty finite set of decisions is attached to each domain. For a given object from the universe, it is required to find a decision from the set attached to the domain containing this object.

A decision table with many-valued decisions corresponds to this problem in a natural way: the columns of this table are labeled with the considered attributes, and the rows correspond to domains and are labeled with sets of decisions attached to domains. The set of decision tables corresponding to problems over an information system forms a closed class generated by this system. Note that the family of all closed classes is essentially wider than the family of closed classes generated by information systems. In particular, the union of two closed classes generated by two information systems is a closed class. However, generally, there is not an information system that generates this class.

Various classes of objects that are closed under different operations have been intensively studied. Among them, in particular, are classes of Boolean functions closed under the operation of superposition [19], minor-closed classes of graphs [20], classes of read-once Boolean functions closed under the removal of variables and the renaming of variables, languages closed under taking factors, etc. Decision tables represent an interesting mathematical object deserving mathematical research, particularly regarding the study of closed classes of decision tables.

This paper continues the study of closed classes of decision tables that started with [21] and was frozen for various reasons for many years. In [21], we studied the dependence of the minimum depth of deterministic decision trees and the depth of deterministic decision trees constructed by a greedy algorithm on the number of attributes (columns) for conventional decision tables from classes closed under operations of the removal of columns and the changing of decisions.

In the present paper, we study so-called t pairs $(C, ψ)$, where $C$ is a class of decision tables closed under the considered four operations, and $ψ$ is a complexity measure for this class. The t pair is called limited if $ψ$ is a limited complexity measure. For any decision table $T \in C$, we have three parameters:

$ψ^{i} (T)$ —The complexity of the decision table T. This parameter is equal to the complexity of a deterministic decision tree for the table T, which sequentially computes the values of all attributes attached to columns of T.
$ψ^{d} (T)$ —The minimum complexity of a deterministic decision tree for the table T.
$ψ^{a} (T)$ —The minimum complexity of a nondeterministic decision tree for the table T.

We investigate the relationships between any two such parameters for decision tables from $C$. Let us consider, for example, the parameters $ψ^{i} (T)$ and $ψ^{d} (T)$. Let $n \in N$. We study relations of the kind $ψ^{i} (T) \leq n \Rightarrow ψ^{d} (T) \leq u$, which are true for any table $T \in C$. The minimum value of u is the most interesting for us. This value (if it exists) is equal to

$$U_{C ψ}^{d i} (n) = \max \{ψ^{d} (T) : T \in C, ψ^{i} (T) \leq n\} .$$

We also study relations of the kind $ψ^{i} (T) \geq n \Rightarrow ψ^{d} (T) \geq l$. In this case, the maximum value of l is the most interesting for us. This value (if it exists) is equal to

$$L_{C ψ}^{d i} (n) = \min \{ψ^{d} (T) : T \in C, ψ^{i} (T) \geq n\} .$$

The two functions $U_{C ψ}^{d i}$ and $L_{C ψ}^{d i}$ describe how the behavior of the parameter $ψ^{d} (T)$ depends on the behavior of the parameter $ψ^{i} (T)$ for tables from $C$.

There are 18 similar functions for all ordered pairs of parameters $ψ^{i} (T)$, $ψ^{d} (T)$, and $ψ^{a} (T)$: an upper function and a lower function for each of the nine ordered pairs. These 18 functions describe well the relationships among the considered parameters. It would be very interesting to point out the 18-tuples of these functions for all t pairs and all limited t pairs. However, this is a very difficult problem.

In this paper, instead of functions, we study types of functions. With any partial function $f : N \to N$, we associate its type from the set $\{α, β, γ, δ, ϵ\}$. For example, if the function f has an infinite domain, and it is bounded from above, then its type is equal to $α$. If the function f has an infinite domain, is not bounded from above, and the inequality $f (n) \geq n$ holds only for a finite number of $n \in N$, then its type is equal to $β$. Thus, we enumerate the 18-tuples of the types of functions. These tuples are represented in tables called the types of t pairs. We prove that there are only seven realizable types of t pairs and only five realizable types of limited t pairs.

First, we study 9-tuples of the types of functions $U_{C ψ}^{b c}$, $b, c \in \{i, d, a\}$. These tuples are represented in tables called upper types of t pairs. We enumerate all the realizable upper types of t pairs and limited t pairs. After that, we extend the results obtained for the upper types of t pairs to the case of the types of t pairs. We also define the notion of a union of two t pairs and study how the upper type of the resulting t pair depends on the upper types of the initial t pairs.

The obtained results allow us to point out cases where the complexity of deterministic and nondeterministic decision trees is essentially less than the complexity of the decision table (see Section 2.3). This finding may prove useful in related applications.

This paper is based on the work of [22], in which similar results were obtained for classes of problems over information systems. We have generalized proofs from [22] to the case of decision tables from closed classes and use some results from this paper to prove the existence of t pairs and limited t pairs with given upper types.

In our previous work [7], we considered functions characterizing the growth in the worst case of the minimum complexity of deterministic and nondeterministic decision trees with the growth of the complexity of the set of attributes attached to columns of the conventional decision table and also obtained preliminary results on the behavior of the function characterizing the relationship between the former two parameters. In the current work, we mainly focus on the rough classification of types.

The paper consists of eight sections. In Section 2, the basic definitions are considered. In Section 3, we provide the main results related to the types of t pairs and limited t pairs. In Sections 4, 5, and 6, we study the upper types of t pairs and limited t pairs. Section 7 contains proofs of the main results, and Section 8 provides short conclusions.

2. Basic Definitions

2.1. Decision Tables and Closed Classes

Let $N = \{0, 1, 2, \dots\}$ be the set of non-negative integers. For any $k \in N \setminus \{0, 1\}$, let $E_{k} = \{0, 1, \dots, k - 1\}$. The set of nonempty finite subsets of the set $N$ will be denoted by $P (N)$. Let F be a nonempty set of attributes (more precisely, of attribute names).

Definition 1.

We now define the set of decision tables $M_{k} (F)$. An arbitrary decision table T from this set is a rectangular table with $n \in N \setminus \{0\}$ columns labeled with attributes $f_{1}, \dots, f_{n} \in F$, where any two columns labeled with the same attribute are equal. The rows of this table are pairwise different and are filled in with numbers from $E_{k}$. Each row is interpreted as a tuple of values of attributes $f_{1}, \dots, f_{n}$. To each row in the table, a set from $P (N)$ is attached, which is interpreted as a set of decisions for this row.

Example 1.

Three decision tables $T_{1}$, $T_{2}$, and $T_{3}$ from the set $M_{2} (F_{0})$, where $F_{0} = \{f_{1}, f_{2}, f_{3}\}$, are shown in Figure 1.

We associate the following problem with the table T: for a given row of T, we should find a decision from the set of decisions attached to this row. To this end, we can use queries about the values of the attributes for this row.

We denote as $At (T)$ the set $\{f_{1}, \dots, f_{n}\}$ of attributes attached to the columns of T. We denote as $Π (T)$ the intersection of the sets of decisions attached to the rows of T and as $Δ (T)$ the set of rows of the table T. Decisions from $Π (T)$ are called common decisions for T. The table T will be called degenerate if $Δ (T) = \emptyset$ or if $Π (T) \neq \emptyset$. We denote as $M_{k}^{c} (F)$ the set of degenerate decision tables from $M_{k} (F)$.
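The notions of common decisions and degenerate tables can be illustrated with a minimal Python sketch. The encoding of a table as a pair (column labels, list of rows with decision sets) and the names `common_decisions`, `is_degenerate`, `t_shared`, and `t_split` are assumptions made for illustration only; they do not come from the paper.

```python
from typing import FrozenSet, List, Tuple

Row = Tuple[int, ...]
# Hypothetical encoding: (column labels, list of (row, decision set)).
Table = Tuple[Tuple[str, ...], List[Tuple[Row, FrozenSet[int]]]]


def common_decisions(table: Table) -> FrozenSet[int]:
    """Return Pi(T): the decisions shared by every row of the table."""
    _, rows = table
    if not rows:
        # Assumption: for a table without rows we report no common decisions.
        return frozenset()
    shared = set(rows[0][1])
    for _, decisions in rows[1:]:
        shared &= decisions
    return frozenset(shared)


def is_degenerate(table: Table) -> bool:
    """T is degenerate if it has no rows or has a common decision."""
    _, rows = table
    return not rows or bool(common_decisions(table))


# Two rows that share the decision 2, so the table is degenerate.
t_shared = (("f1", "f2"), [((0, 0), frozenset({1, 2})), ((0, 1), frozenset({2, 3}))])
# Two rows with disjoint decision sets, so the table is not degenerate.
t_split = (("f1",), [((0,), frozenset({1})), ((1,), frozenset({2}))])
```

For a degenerate table with at least one row, any common decision can be returned without asking a single query.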

Example 2.

Two degenerate decision tables, $D_{1}$ and $D_{2}$, are shown in Figure 2.

Definition 2.

A subtable of the table T is a table obtained from T through the removal of some of its rows. Let $Θ (T) = \{(f, δ) : f \in At (T), δ \in E_{k}\}$ and $Θ^{*} (T)$ be the set of all finite words in the alphabet $Θ (T)$, including the empty word λ. Let $α \in Θ^{*} (T)$. We now define a subtable $T α$ of the table T. If $α = λ$, then $T α = T$. Let $α = (f_{i_{1}}, δ_{1}) \dots (f_{i_{m}}, δ_{m})$. Then, $T α$ consists of all the rows of T that, in the intersection with columns $f_{i_{1}}, \dots, f_{i_{m}}$, have values $δ_{1}, \dots, δ_{m}$, respectively.
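As an illustration of Definition 2, the following Python sketch filters the rows of a table by a word of (attribute, value) pairs. The table encoding and the names `subtable` and `table` are hypothetical and introduced only for this example.

```python
def subtable(table, word):
    """Return T alpha for a word alpha given as a list of (attribute, value) pairs."""
    attrs, rows = table
    kept = list(rows)
    for attribute, value in word:
        # Columns with the same label are equal (Definition 1), so the
        # first column carrying this label is enough.
        j = attrs.index(attribute)
        kept = [(row, decisions) for row, decisions in kept if row[j] == value]
    return (attrs, kept)


# Hypothetical table: column labels, then (row, decision set) pairs.
table = (
    ("f1", "f2"),
    [((0, 0), {1}), ((0, 1), {2}), ((1, 1), {3})],
)
```

The empty word leaves the table unchanged, and a word that no row satisfies yields a subtable without rows.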

Example 3.

Two subtables of the tables $T_{1}$ and $T_{2}$ (depicted in Figure 1) are shown in Figure 3.

We now define four operations on the set $M_{k} (F)$ of decision tables:

Definition 3.

Removal of columns: We can remove an arbitrary column in a table T with at least two columns. As a result, the obtained table can have groups of equal rows. We keep only the first row in each such group.

Definition 4.

Changing of decisions: In a given table T, we can change in an arbitrary way sets of decisions attached to rows.

Definition 5.

Permutation of columns: We can swap any two columns in a table T, including the attached attribute names.

Definition 6.

Duplication of columns: For any column in a table T, we can add its duplicate next to that column.

Definitions 5 and 6 characterize the two most natural examples of operations applied to information systems. Definitions 3 and 4 allow us to say that we cover important classes of information systems (see Section 2.4).
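The four operations of Definitions 3 to 6 can be sketched in Python as follows. The table encoding (column labels plus a list of rows with decision sets) and all function names are assumptions made for illustration; column positions are 0-based.

```python
def remove_column(table, j):
    """Definition 3: remove column j and keep the first row of each group of equal rows."""
    attrs, rows = table
    if len(attrs) < 2:
        raise ValueError("a column can be removed only from a table with at least two columns")
    seen, new_rows = set(), []
    for row, decisions in rows:
        short = row[:j] + row[j + 1:]
        if short not in seen:
            seen.add(short)
            new_rows.append((short, decisions))
    return (attrs[:j] + attrs[j + 1:], new_rows)


def change_decisions(table, new_decisions):
    """Definition 4: attach the set new_decisions(row) to every row."""
    attrs, rows = table
    return (attrs, [(row, frozenset(new_decisions(row))) for row, _ in rows])


def swap_columns(table, a, b):
    """Definition 5: swap columns a and b together with their attribute names."""
    attrs, rows = table

    def swap(items):
        items = list(items)
        items[a], items[b] = items[b], items[a]
        return tuple(items)

    return (swap(attrs), [(swap(row), decisions) for row, decisions in rows])


def duplicate_column(table, j):
    """Definition 6: insert a copy of column j right after it."""
    attrs, rows = table

    def dup(items):
        return items[:j + 1] + (items[j],) + items[j + 1:]

    return (dup(attrs), [(dup(row), decisions) for row, decisions in rows])


table = (
    ("f1", "f2"),
    [((0, 0), frozenset({1})), ((0, 1), frozenset({2})), ((1, 1), frozenset({3}))],
)
```

Only the removal of columns can change the number of rows; the other three operations preserve it.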

Example 4.

Decision tables $T_{1}'$, $T_{2}'$, $T_{1}''$, and $T_{2}''$ depicted in Figure 4 are obtained from decision tables $T_{1}$ and $T_{2}$ shown in Figure 1 by the operations of the changing of decisions, the removal of columns, the permutation of columns, and the duplication of columns, respectively.

Definition 7.

Let $T \in M_{k} (F)$. The closure of the table T is the set that contains all the tables that can be obtained from T by the operations of the removal of columns, the changing of decisions, the permutation of columns, and the duplication of columns, and only such tables. We denote the closure of the table T by $[T]$. It is clear that $T \in [T]$.

Definition 8.

Let $C \subseteq M_{k} (F)$. The closure $[C]$ of the set $C$ is defined in the following way: $[C] = \bigcup_{T \in C} [T]$. We say that $C$ is a closed class if $C = [C]$. In particular, the empty set of tables is a closed class.

Example 5.

We now consider a closed class $C_{0}$ of decision tables from the set $M_{2} (\{f_{1}, f_{2}\})$, which is equal to $[Q]$, where the decision table Q is depicted in Figure 5. The closed class $C_{0}$ contains all the tables depicted in Figure 6 and all the tables that can be obtained from them by the operations of the duplication of columns and the permutation of columns.

If $C_{1}$ and $C_{2}$ are closed classes belonging to $M_{k} (F)$, then $C_{1} \cup C_{2}$ is also a closed class. We can consider closed classes $C_{1}$ and $C_{2}$ belonging to different sets of decision tables. Let $C_{1} \subseteq M_{k_{1}} (F_{1})$ and $C_{2} \subseteq M_{k_{2}} (F_{2})$. Then, $C_{1} \cup C_{2}$ is a closed class, and $C_{1} \cup C_{2} \subseteq M_{\max (k_{1}, k_{2})} (F_{1} \cup F_{2})$.

2.2. Deterministic and Nondeterministic Decision Trees

A finite directed tree with the root is a finite directed tree in which exactly one node has no entering edges. This node is called the root. Nodes of the tree that have no outgoing edges are called terminal nodes. Nodes that are neither the root nor terminal nodes are called worker nodes. A complete path in a finite directed tree with the root is any sequence of nodes and edges $ξ = v_{0}, d_{0}, \dots, v_{m}, d_{m}, v_{m + 1}$ that starts from the root node and ends with a terminal node, where $d_{i}$ is the edge outgoing from the node $v_{i}$ and entering the node $v_{i + 1}$, $i = 0, \dots, m$.

Definition 9.

A decision tree over the set of decision tables $M_{k} (F)$ is a labeled finite directed tree with the root with at least two nodes (the root and a terminal node) possessing the following properties:

•: The root and the edges outgoing from the root are not labeled.
•: Each worker node is labeled with an attribute from the set F.
•: Each edge outgoing from a worker node is labeled with a number from $E_{k}$ .
•: Each terminal node is labeled with a number from $N$ .

We denote as $T_{k} (F)$ the set of decision trees over the set of decision tables $M_{k} (F)$.

Definition 10.

A decision tree from $T_{k} (F)$ is called deterministic if it satisfies the following conditions:

•: Exactly one edge leaves the root.
•: The edges outgoing from each worker node are labeled with pairwise different numbers.

Let $Γ$ be a decision tree from $T_{k} (F)$. Denote as $At (Γ)$ the set of attributes attached to the worker nodes of $Γ$. Set $Θ (Γ) = \{(f, δ) : f \in At (Γ), δ \in E_{k}\}$. Denote as $Θ^{*} (Γ)$ the set of all finite words in the alphabet $Θ (Γ)$, including the empty word $λ$. We associate a word $π (ξ)$ with an arbitrary complete path $ξ = v_{0}, d_{0}, \dots, v_{m}, d_{m}, v_{m + 1}$ in $Γ$. If $m = 0$, then $π (ξ) = λ$. Let $m > 0$ and, for $i = 1, \dots, m$, let the node $v_{i}$ be labeled with an attribute $f_{j_{i}}$ and the edge $d_{i}$ be labeled with the number $δ_{i}$. Then, $π (ξ) = (f_{j_{1}}, δ_{1}) \dots (f_{j_{m}}, δ_{m})$. We denote as $τ (ξ)$ the number attached to the terminal node of the path $ξ$. We denote as $Path (Γ)$ the set of complete paths in the tree $Γ$.

Definition 11.

Let $T \in M_{k} (F)$. A nondeterministic decision tree for the table T is a decision tree Γ over $M_{k} (F)$ satisfying the following conditions:

$At (Γ) \subseteq At (T) .$
$\bigcup_{ξ \in Path (Γ)} Δ (T π (ξ)) = Δ (T) .$
For any row $r \in Δ (T)$ and any complete path $ξ \in Path (Γ)$ , if $r \in Δ (T π (ξ))$ , then $τ (ξ)$ belongs to the set of decisions attached to the row r.
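The three conditions of Definition 11 depend only on the complete paths of the tree, so they can be checked on a list of pairs (word, terminal label). The sketch below is an illustrative assumption: the tree is represented by its paths rather than by nodes and edges, and the names `is_nondeterministic_tree_for`, `table`, and `paths` are hypothetical.

```python
def is_nondeterministic_tree_for(table, paths):
    """Check the conditions of Definition 11 for a tree given by its complete paths.

    table: (column labels, list of (row, decision set)).
    paths: list of (word, decision), where word is a tuple of (attribute, value)
           pairs, i.e. pi(xi), and decision is tau(xi).
    """
    attrs, rows = table
    covered = set()
    for word, decision in paths:
        # Condition 1: only attributes of the table may be used.
        if any(attribute not in attrs for attribute, _ in word):
            return False
        for row, decisions in rows:
            if all(row[attrs.index(attribute)] == value for attribute, value in word):
                # Condition 3: the terminal label must be a decision of every row
                # that belongs to the subtable of this path.
                if decision not in decisions:
                    return False
                covered.add(row)
    # Condition 2: every row belongs to the subtable of at least one path.
    return covered == {row for row, _ in rows}


table = (
    ("f1", "f2"),
    [((0, 0), {1}), ((0, 1), {1, 2}), ((1, 1), {2})],
)
# Two paths of length one; the row (0, 1) is covered by both of them.
paths = [((("f1", 0),), 1), ((("f2", 1),), 2)]
```

In this reading each complete path is a true decision rule for the table, and the set of paths covers all rows.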

Example 6.

Nondeterministic decision trees $Γ_{1}$ and $Γ_{2}$ for decision tables $T_{1}$ and $T_{2}$ shown in Figure 1 are depicted in Figure 7.

Definition 12.

A deterministic decision tree for the table T is a deterministic decision tree over $M_{k} (F)$ that is a nondeterministic decision tree for the table T.

Example 7.

Deterministic decision trees $Γ_{1}'$ and $Γ_{2}'$ for decision tables $T_{1}$ and $T_{2}$ shown in Figure 1 are depicted in Figure 8.

2.3. Complexity Measures

Denote as $F^{*}$ the set of all finite words over the alphabet F, including the empty word $λ$.

Definition 13.

A complexity measure over the set of decision tables $M_{k} (F)$ is any mapping $ψ : F^{*} \to N$.

Definition 14.

The complexity measure ψ will be called limited if it possesses the following properties:

(a): $ψ (α_{1} α_{2}) \leq ψ (α_{1}) + ψ (α_{2})$ for any $α_{1}, α_{2} \in F^{*}$ .
(b): $ψ (α_{1} α_{2} α_{3}) \geq ψ (α_{1} α_{3})$ for any $α_{1}, α_{2}, α_{3} \in F^{*}$ .
(c): For any $α \in F^{*}$ , the inequality $ψ (α) \geq | α |$ holds, where $| α |$ is the length of α.

We extend an arbitrary complexity measure $ψ$ onto the set $T_{k} (F)$ in the following way. Let $Γ \in T_{k} (F)$. Then, $ψ (Γ) = \max \{ψ (φ (ξ)) : ξ \in Path (Γ)\}$, where $φ (ξ) = λ$ if $π (ξ) = λ$ and $φ (ξ) = f_{1} \dots f_{m}$ if $π (ξ) = (f_{1}, δ_{1}) \dots (f_{m}, δ_{m})$. The value $ψ (Γ)$ will be called the complexity of the decision tree Γ.

We now consider an example of a complexity measure. Let $w : F \to N \setminus \{0\}$. We define the function $ψ^{w} : F^{*} \to N$ in the following way: $ψ^{w} (α) = 0$ if $α = λ$ and $ψ^{w} (α) = \sum_{i = 1}^{m} w (f_{i})$ if $α = f_{1} \dots f_{m}$. The function $ψ^{w}$ is a limited complexity measure over $M_{k} (F)$, and it is called a weighted depth. If $w \equiv 1$, then the function $ψ^{w}$ is called the depth and is denoted by h.
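The weighted depth and its extension to decision trees can be sketched in a few lines of Python. The representation of a tree by the attribute words $φ (ξ)$ of its complete paths, the weight values, and the names `weighted_depth`, `depth`, `tree_complexity`, `weight`, and `path_words` are assumptions made for this example.

```python
def weighted_depth(word, weight):
    """psi^w(alpha): the sum of the weights of the attributes in the word alpha."""
    return sum(weight[f] for f in word)


def depth(word):
    """h(alpha): the weighted depth with all weights equal to 1, i.e. the word length."""
    return len(word)


def tree_complexity(path_words, psi):
    """psi(Gamma): the maximum of psi over the words phi(xi) of all complete paths."""
    return max(psi(word) for word in path_words)


# Hypothetical weights; every weight is a positive integer, as the definition requires.
weight = {"f1": 1, "f2": 3}
# Attribute words of the complete paths of some tree.
path_words = [("f1",), ("f1", "f2")]
```

Because every weight is at least 1, the value `weighted_depth(word, weight)` is never smaller than `len(word)`, which is property (c) of a limited complexity measure.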

Let $ψ$ be a complexity measure over $M_{k} (F)$ and T be a decision table from $M_{k} (F)$, in which columns are labeled with attributes $f_{1}, \dots, f_{n}$. The value $ψ^{i} (T) = ψ (f_{1} \dots f_{n})$ is called the complexity of the decision table T. We denote by $ψ^{d} (T)$ the minimum complexity of a deterministic decision tree for the table T. We denote by $ψ^{a} (T)$ the minimum complexity of a nondeterministic decision tree for the table T.
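For the depth h and small tables, the three parameters can be computed by brute force. The sketch below rests on two assumptions that are not spelled out in the text above: the minimum depth of a deterministic tree is computed by the usual recursion over non-constant columns, and the minimum depth of a nondeterministic tree is taken as the maximum over rows of the smallest number of columns whose values in that row select a subtable with a common decision (the rule-system reading of nondeterministic trees). All names are hypothetical.

```python
from itertools import combinations


def common(rows):
    """Decisions shared by all given (row, decision set) pairs; empty if there are no rows."""
    sets = [set(decisions) for _, decisions in rows]
    return set.intersection(*sets) if sets else set()


def h_i(table):
    """Depth-complexity of the table: the number of its columns."""
    return len(table[0])


def h_d(table):
    """Minimum depth of a deterministic decision tree (exponential-time recursion)."""
    attrs, rows = table
    if not rows or common(rows):
        return 0  # degenerate table: answer without queries
    best = None
    for j in range(len(attrs)):
        values = {row[j] for row, _ in rows}
        if len(values) < 2:
            continue  # a constant column gives no information
        worst = max(h_d((attrs, [(r, d) for r, d in rows if r[j] == v])) for v in values)
        if best is None or 1 + worst < best:
            best = 1 + worst
    return best


def h_a(table):
    """Minimum depth of a nondeterministic decision tree, via shortest true rules."""
    attrs, rows = table
    result = 0
    for row, _ in rows:
        for size in range(len(attrs) + 1):
            if any(
                common([(s, d) for s, d in rows if all(s[j] == row[j] for j in cols)])
                for cols in combinations(range(len(attrs)), size)
            ):
                result = max(result, size)
                break
    return result


# Hypothetical table: each row is identified by the single column where it has value 1.
table = (
    ("f1", "f2", "f3"),
    [((1, 0, 0), {1}), ((0, 1, 0), {2}), ((0, 0, 1), {3})],
)
```

On this table the three values are 3, 2, and 1, respectively, which gives a small instance where the nondeterministic depth is strictly below the deterministic depth, which in turn is strictly below the number of columns.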

2.4. Information Systems

Let A be a nonempty set and F be a nonempty set of functions from A to $E_{k}$.

Definition 15.

Functions from F are called attributes, and the pair $U = (A, F)$ is called an information system.

Definition 16.

A problem over U is any $(n + 1)$-tuple $z = (ν, f_{1}, \dots, f_{n})$, where $n \in N \setminus \{0\}$, $ν : E_{k}^{n} \to P (N)$, and $f_{1}, \dots, f_{n} \in F$.

The problem z can be interpreted as a problem of searching for at least one number from the set $z (a) = ν (f_{1} (a), \dots, f_{n} (a))$ for a given $a \in A$. We denote as $Probl (U)$ the set of problems over the information system U.

We associate a decision table $T (z) \in M_{k} (F)$ with the problem z. This table has n columns labeled with attributes $f_{1}, \dots, f_{n}$. A tuple $\bar{δ} = (δ_{1}, \dots, δ_{n}) \in E_{k}^{n}$ is a row of the table $T (z)$ if and only if the system of equations $\{f_{1} (x) = δ_{1}, \dots, f_{n} (x) = δ_{n}\}$ has a solution from the set A. This row is labeled with the set of decisions $ν (\bar{δ})$. Let $Tab (U) = \{T (z) : z \in Probl (U)\}$. One can show that the set $Tab (U)$ is a closed class of decision tables.
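The construction of $T (z)$ can be illustrated for a finite universe, where the solvability of the system of equations is checked by enumerating the objects. The finite universe, the threshold attributes, the mapping ν, and the names `table_of_problem`, `universe`, `attributes`, and `table` are assumptions chosen only for this sketch.

```python
def table_of_problem(universe, attributes, nu):
    """Build T(z) for z = (nu, f_1, ..., f_n) over a finite universe.

    attributes: list of (name, function) pairs.
    nu: maps a tuple of attribute values to a nonempty set of decisions.
    """
    names = tuple(name for name, _ in attributes)
    rows = {}
    for a in universe:
        # A tuple is a row exactly when some object of the universe realizes it.
        delta = tuple(f(a) for _, f in attributes)
        rows[delta] = frozenset(nu(delta))
    return (names, sorted(rows.items(), key=lambda item: item[0]))


# Hypothetical information system: objects 0..3 and three threshold attributes.
universe = range(4)
attributes = [
    ("f1", lambda a: int(a >= 1)),
    ("f2", lambda a: int(a >= 2)),
    ("f3", lambda a: int(a >= 3)),
]
# Hypothetical decisions: the number of thresholds that the object passes.
table = table_of_problem(universe, attributes, lambda delta: {sum(delta)})
```

Only four of the eight tuples from $E_{2}^{3}$ are realized by an object, so the resulting table has four rows.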

Closed classes of decision tables based on information systems are the most natural examples of closed classes. However, the notion of a closed class is essentially wider. In particular, the union $Tab (U_{1}) \cup Tab (U_{2})$, where $U_{1}$ and $U_{2}$ are information systems, is a closed class, but generally, we cannot find an information system U such that $Tab (U) = Tab (U_{1}) \cup Tab (U_{2})$.

2.5. Types of T Pairs

First, we define the notion of a t pair.

Definition 17.

A pair $(C, ψ)$, where $C$ is a closed class of decision tables from $M_{k} (F)$, and ψ is a complexity measure over $M_{k} (F)$, is called a test pair (or t pair for short). If ψ is a limited complexity measure, then the t pair $(C, ψ)$ will be called a limited t pair.

Let $(C, ψ)$ be a t pair. We have three parameters $ψ^{i} (T)$, $ψ^{d} (T)$, and $ψ^{a} (T)$ for any decision table $T \in C$. We now define functions that describe the relationships among these parameters. Let $b, c \in \{i, d, a\}$.

Definition 18.

We define the partial functions $U_{C ψ}^{b c} : N \to N$ and $L_{C ψ}^{b c} : N \to N$ as

$$\begin{matrix} U_{C ψ}^{b c} (n) = \max \{ψ^{b} (T) : T \in C, ψ^{c} (T) \leq n\}, \\ L_{C ψ}^{b c} (n) = \min \{ψ^{b} (T) : T \in C, ψ^{c} (T) \geq n\} . \end{matrix}$$

If the value $U_{C ψ}^{b c} (n)$ is defined, then it is the unimprovable upper bound on the values $ψ^{b} (T)$ for tables $T \in C$ satisfying $ψ^{c} (T) \leq n$. If the value $L_{C ψ}^{b c} (n)$ is defined, then it is the unimprovable lower bound on the values $ψ^{b} (T)$ for tables $T \in C$ satisfying $ψ^{c} (T) \geq n$.
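The two functions of Definition 18 can be evaluated on a finite sample of tables. This is only an illustration: a nonempty closed class is infinite (columns can be duplicated without limit), so a finite sample cannot detect the case where the maximum does not exist. The names `upper`, `lower`, and `pairs`, and the sample values, are hypothetical.

```python
def upper(pairs, n):
    """U(n) on a finite sample: max of psi^b over tables with psi^c <= n, or None."""
    values = [b for b, c in pairs if c <= n]
    return max(values) if values else None


def lower(pairs, n):
    """L(n) on a finite sample: min of psi^b over tables with psi^c >= n, or None."""
    values = [b for b, c in pairs if c >= n]
    return min(values) if values else None


# Hypothetical sample of (psi^d(T), psi^i(T)) values for four tables.
pairs = [(0, 1), (1, 1), (2, 3), (1, 2)]
```

Returning `None` models the points where the partial function has no value because no table of the sample satisfies the condition.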

Let g be a partial function from $N$ to $N$. We denote as $Dom (g)$ the domain of g. Denote $Dom^{+} (g) = \{n : n \in Dom (g), g (n) \geq n\}$ and $Dom^{-} (g) = \{n : n \in Dom (g), g (n) \leq n\}$.

Definition 19.

We now define the value $typ (g) \in \{α, β, γ, δ, ϵ\}$, which is called the type of g:

If $Dom (g)$ is an infinite set and g is bounded from above, then $typ (g) = α$ .
If $Dom (g)$ is an infinite set, $Dom^{+} (g)$ is a finite set, and g is unbounded from above, then $typ (g) = β$ .
If both sets $Dom^{+} (g)$ and $Dom^{-} (g)$ are infinite, then $typ (g) = γ$ .
If $Dom (g)$ is an infinite set and $Dom^{-} (g)$ is a finite set, then $typ (g) = δ$ .
If $Dom (g)$ is a finite set, then $typ (g) = ϵ$ .

Example 8.

One can show that $typ (1) = α$, $typ (⌈ \log_{2} n ⌉) = β$, $typ (n) = γ$, $typ (n^{2}) = δ$, and $typ (\frac{1}{⌊ 1 / n ⌋}) = ϵ$.
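The sets $Dom^{+} (g)$ and $Dom^{-} (g)$ for the functions of Example 8 can be inspected on a finite window of arguments. This is a numerical illustration only, not a proof: the type of a function depends on its behavior on all of $N$. The window size and the names `dom_plus`, `dom_minus`, `ceil_log2`, and `inverse_floor` are assumptions of this sketch; an undefined value is modeled by `None`.

```python
import math


def dom_plus(g, limit):
    """Arguments n < limit where g is defined and g(n) >= n."""
    return [n for n in range(limit) if g(n) is not None and g(n) >= n]


def dom_minus(g, limit):
    """Arguments n < limit where g is defined and g(n) <= n."""
    return [n for n in range(limit) if g(n) is not None and g(n) <= n]


def ceil_log2(n):
    """Ceiling of log2(n); undefined at n = 0."""
    return math.ceil(math.log2(n)) if n >= 1 else None


def inverse_floor(n):
    """1 / floor(1 / n); defined only where floor(1 / n) is nonzero, i.e. at n = 1."""
    if n == 0 or 1 // n == 0:
        return None
    return 1 // (1 // n)
```

On the window `range(64)`, the logarithm never reaches its argument, the square stays at or below its argument only at 0 and 1, and the last function is defined at a single point, which matches the types β, δ, and ϵ stated in Example 8.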

Definition 20.

We now define the table $typ (C, ψ)$, which is called the type of the t pair $(C, ψ)$. This is a table with three rows and three columns, in which the rows from top to bottom and the columns from left to right are labeled with the indices $i, d, a$. The pair $typ (L_{C ψ}^{b c}) \, typ (U_{C ψ}^{b c})$ is in the intersection of the row with index $b \in \{i, d, a\}$ and the column with index $c \in \{i, d, a\}$.

3. Main Results

The main problem investigated in this paper is finding all the types of t pairs and limited t pairs. The solution to this problem describes all the possible (in terms of functions $U_{C ψ}^{b c}, L_{C ψ}^{b c}$ and types, $b, c \in \{i, d, a\}$) relationships among the complexity of decision tables, the minimum complexity of the nondeterministic decision trees for them, and the minimum complexity of the deterministic decision trees for these tables. We now define seven tables:

Theorem 1.

For any t pair $(C, ψ)$, the relation $typ (C, ψ) \in \{T_{1}, T_{2}, T_{3}, T_{4}, T_{5}, T_{6}, T_{7}\}$ holds. For any $i \in \{1, 2, 3, 4, 5, 6, 7\}$, there exists a t pair $(C, ψ)$ such that $typ (C, ψ) = T_{i}$.

Theorem 2.

For any limited t pair $(C, ψ)$, the relation $typ (C, ψ) \in \{T_{2}, T_{3}, T_{5}, T_{6}, T_{7}\}$ holds. For any $i \in \{2, 3, 5, 6, 7\}$, there exists a limited t pair $(C, h)$ such that $typ (C, h) = T_{i}$.

4. Possible Upper Types of T Pairs

We begin our study by considering the upper type of t pair, which is a simpler object than the type of t pair.

Definition 21.

Let $(C, ψ)$ be a t pair. We now define the table $typ_{u} (C, ψ)$, which will be called the upper type of the t pair $(C, ψ)$. This is a table with three rows and three columns, in which the rows from top to bottom and the columns from left to right are labeled with the indices $i, d, a$. The value $typ (U_{C ψ}^{b c})$ is in the intersection of the row with index $b \in \{i, d, a\}$ and the column with index $c \in \{i, d, a\}$.

In this section, all possible upper types of t pairs are enumerated. We now define seven tables:

Proposition 1.

For any t pair $(C, ψ)$, the relation $typ_{u} (C, ψ) \in \{t_{1}, t_{2}, t_{3}, t_{4}, t_{5}, t_{6}, t_{7}\}$ holds.

Proposition 2.

For any limited t pair $(C, ψ)$, the relation $typ_{u} (C, ψ) \in \{t_{2}, t_{3}, t_{5}, t_{6}, t_{7}\}$ holds.

We divide the proofs of the propositions into a sequence of lemmas.

Lemma 1.

Let T be a decision table from a set of decision tables $M_{k} (F)$, and let ψ be a complexity measure over $M_{k} (F)$. Then, the inequalities $ψ^{a} (T) \leq ψ^{d} (T) \leq ψ^{i} (T)$ hold.

Proof.

Let the columns of table T be labeled with the attributes $f_{1}, \dots, f_{n}$. It is not difficult to construct a deterministic decision tree $Γ_{0}$ for table T, which sequentially computes the values of attributes $f_{1}, \dots, f_{n}$. Evidently, $ψ (Γ_{0}) = ψ^{i} (T)$. Therefore, $ψ^{d} (T) \leq ψ^{i} (T)$. If a decision tree $Γ$ is a deterministic decision tree for T, then $Γ$ is a nondeterministic decision tree for T. Therefore, $ψ^{a} (T) \leq ψ^{d} (T)$. □

Let $(C, ψ)$ be a t pair, $n \in N$, and $b, c \in \{i, d, a\}$. The notation $U_{C ψ}^{b c} (n) = \infty$ means that the set $X = \{ψ^{b} (T) : T \in C, ψ^{c} (T) \leq n\}$ is infinite. The notation $U_{C ψ}^{b c} (n) = \emptyset$ means that the set X is empty. Evidently, if $U_{C ψ}^{b c} (n) = \infty$, then $U_{C ψ}^{b c} (n + 1) = \infty$. It is not difficult to prove the following statement.

Lemma 2.

Let $(C, ψ)$ be a t pair, and $b, c \in \{i, d, a\}$. Then, we have the following:

(a) If there exists $n \in N$ such that $U_{C ψ}^{b c} (n) = \infty$, then $typ (U_{C ψ}^{b c}) = ϵ$.

(b) If there is no $n \in N$ such that $U_{C ψ}^{b c} (n) = \infty$, then $Dom (U_{C ψ}^{b c}) = \{n : n \in N, n \geq n_{0}\}$, where $n_{0} = \min \{ψ^{c} (T) : T \in C\}$.

Let $(C, ψ)$ be a t pair, and $b, c, e, f \in \{i, d, a\}$. The notation $U_{C ψ}^{b c} ◃ U_{C ψ}^{e f}$ means that, for any $n \in N$, the following statements hold:

(a) If the value $U_{C ψ}^{b c} (n)$ is defined, then either $U_{C ψ}^{e f} (n) = \infty$ or the value $U_{C ψ}^{e f} (n)$ is defined, and the inequality $U_{C ψ}^{b c} (n) \leq U_{C ψ}^{e f} (n)$ holds.

(b) If $U_{C ψ}^{b c} (n) = \infty$, then $U_{C ψ}^{e f} (n) = \infty$.

Let ⪯ be a linear order on the set $\{α, β, γ, δ, ϵ\}$ such that $α ⪯ β ⪯ γ ⪯ δ ⪯ ϵ$.

Lemma 3.

Let $(C, ψ)$ be a t pair. Then, $typ (U_{C ψ}^{b i}) ⪯ typ (U_{C ψ}^{b d}) ⪯ typ (U_{C ψ}^{b a})$ and $typ (U_{C ψ}^{a b}) ⪯ typ (U_{C ψ}^{d b}) ⪯ typ (U_{C ψ}^{i b})$ for any $b \in \{i, d, a\}$.

Proof.

From the definition of the functions $U_{C ψ}^{b c}$, $b, c \in \{i, d, a\}$, and from Lemma 1, it follows that $U_{C ψ}^{b i} ◃ U_{C ψ}^{b d} ◃ U_{C ψ}^{b a}$ and $U_{C ψ}^{a b} ◃ U_{C ψ}^{d b} ◃ U_{C ψ}^{i b}$ for any $b \in \{i, d, a\}$. Using these relations and Lemma 2, we obtain the statement of the lemma. □

Lemma 4.

Let $(C, ψ)$ be a t pair, and $b, c \in \{i, d, a\}$. Then, we have the following:

(a) $typ (U_{C ψ}^{b c}) = α$ if and only if the function $ψ^{b}$ is bounded from above on the closed class $C$.

(b) If the function $ψ^{b}$ is unbounded from above on $C$, then $typ (U_{C ψ}^{b b}) = γ$.

Proof.

The statement (a) is obvious. For (b), let the function $ψ^{b}$ be unbounded from above on $C$. One can show that in this case the equality $U_{C ψ}^{b b} (n) = n$ holds for infinitely many $n \in N$. Therefore, $typ (U_{C ψ}^{b b}) = γ$. □

Corollary 1.

Let $(C, ψ)$ be a t pair, and $b \in \{i, d, a\}$. Then, $typ (U_{C ψ}^{b b}) \in \{α, γ\}$.

Lemma 5.

Let $(C, ψ)$ be a t pair, and $typ (U_{C ψ}^{i i}) \neq α$. Then, $typ (U_{C ψ}^{i d}) = typ (U_{C ψ}^{i a}) = ϵ$.

Proof.

Using Lemma 4, we conclude that the function $ψ^{i}$ is unbounded from above on $C$. Let $m \in N$. Then, there exists a decision table $T \in C$ for which the inequality $ψ^{i} (T) \geq m$ holds. Let us consider a degenerate decision table $T' \in C$ obtained from T by replacing the sets of decisions attached to the rows by the set $\{0\}$. It is clear that $ψ^{i} (T') \geq m$. Let $Γ$ be a decision tree that consists of the root, the terminal node labeled with 0, and the edge connecting these two nodes. One can show that $Γ$ is a deterministic decision tree for the table $T'$. Therefore, $ψ^{a} (T') \leq ψ^{d} (T') \leq ψ (Γ) = ψ (λ)$. Taking into account that m is an arbitrary number from $N$, we obtain $U_{C ψ}^{i d} (ψ (λ)) = \infty$ and $U_{C ψ}^{i a} (ψ (λ)) = \infty$. Using Lemma 2, we conclude that $typ (U_{C ψ}^{i d}) = typ (U_{C ψ}^{i a}) = ϵ$. □

Example 9.

Let us consider a t pair

(C_{0}, h)

, where

C_{0}

is a closed class described in Example 5. It is clear that the function

h^{i}

is unbounded from above on

C_{0}

, and the functions

h^{a}

and

h^{d}

are bounded from above on

C_{0}

. Using Lemma 4, we obtain that

typ (U_{C_{0} h}^{a b}) = typ (U_{C_{0} h}^{d b}) = α

for any

b \in {i, d, a}

, and

typ (U_{C_{0} h}^{i i}) = γ

. Using Lemma 5,

typ (U_{C_{0} h}^{i d}) = typ (U_{C_{0} h}^{i a}) = ϵ

. Therefore,

{typ}_{u} (C_{0}, h) = t_{2}

.

Lemma 6.

Let

(C, ψ)

be a t pair. Then,

typ (U_{C ψ}^{a i}) \in {α, γ}

.

Proof.

Using Lemma 3 and Corollary 1, we obtain

typ (U_{C ψ}^{a i}) \in {α, β, γ}

. Using Lemma 2,

Dom (U_{C ψ}^{a i}) = {n : n \in N, n \geq n_{0}}

for some

n_{0} \in N

. Set

D = Dom (U_{C ψ}^{a i})

. Assume that

typ (U_{C ψ}^{a i}) = β

. Then, there exists

m \in D

such that

U_{C ψ}^{a i} (n) < n

for any

n \in D, n > m

. Let us prove by induction on n that, for any decision table T from

C

, if

ψ^{i} (T) \leq n

, then

ψ^{a} (T) \leq m_{0}

, where

m_{0} = max {m, ψ (λ)}

. Using Lemma 1, we conclude that the considered statement holds under the condition

n \leq m

. Let it hold for some

n, n \geq m

. Let us show that this statement holds for

n + 1

too. Let

T \in C

,

ψ^{i} (T) \leq n + 1

, and let the columns of the table T be labeled with the attributes

f_{i_{1}}, \dots, f_{i_{k}}

. Since

n + 1 > m

, we obtain

ψ^{a} (T) \leq n

. Let

Γ

be a nondeterministic decision tree for the table T, and

ψ (Γ) = ψ^{a} (T)

. Assume that in

Γ

, there exists a complete path

ξ

in which there are no worker nodes. In this case, a decision tree that consists of the root, the terminal node labeled with

τ (ξ)

, and the edge connecting these two nodes is a nondeterministic decision tree for the table T. Therefore,

ψ^{a} (T) \leq ψ (λ) \leq m_{0}

. Assume now that each complete path in the decision tree

Γ

contains a worker node. Let

ξ \in Path (Γ), Δ (T π (ξ)) \neq ⌀

,

ξ = v_{0}, d_{0}, \dots, v_{p}, d_{p}, v_{p + 1}

and, for

i = 1, \dots, p

, the node

v_{i}

is labeled with the attribute

f_{i}

, and the edge

d_{i}

is labeled with the number

δ_{i}

. Let the decision table

T^{'}

be obtained from the decision table T using the operations of the permutation of columns and the duplication of columns so that its columns are labeled with attributes

f_{1}, \dots, f_{p}, f_{i_{1}}, \dots, f_{i_{k}}

. We obtain the decision table

T^{''}

from

T^{'}

by removal of the last k columns. Let us denote as

T_{ξ}

the decision table obtained from

T^{''}

by changing the set of decisions corresponding to the row

(δ_{1}, \dots, δ_{p})

with

{τ (ξ)}

and for the remaining rows with

{τ (ξ) + 1}

. It is clear that

ψ^{i} (T_{ξ}) \leq n

. Using the inductive hypothesis, we conclude that there exists a nondeterministic decision tree

Γ_{ξ}

for the table

T_{ξ}

such that

ψ (Γ_{ξ}) \leq m_{0}

. We denote as

{\tilde{Γ}}_{ξ}

a tree obtained from

Γ_{ξ}

by the removal of all the nodes and edges that satisfy the following condition: there is no complete path

ξ^{'}

in

Γ_{ξ}

that contains this node or edge and for which

τ (ξ^{'}) = τ (ξ)

. Let

{ξ : ξ \in Path (Γ), Δ (T π (ξ)) \neq ⌀} = \{ξ_{1}, \dots, ξ_{r}\}

. Let us identify the roots of the trees

{\tilde{Γ}}_{ξ_{1}}, \dots, {\tilde{Γ}}_{ξ_{r}}

. We denote as G the obtained tree. It is not difficult to show that G is a nondeterministic decision tree for the table T, and

ψ (G) \leq m_{0}

. Thus, the considered statement holds. Using Lemma 4, we conclude that

typ (U_{C ψ}^{a i}) = α

. The obtained contradiction shows that

typ (U_{C ψ}^{a i}) \in {α, γ}

. □

Let T be a decision table from

M_{k} (F)

. We now give the definitions of the parameters

N (T)

and

M (T)

of the table T.

Definition 22.

We denote as

N (T)

the number of rows in the table T.

Definition 23.

Let the columns of table T be labeled with the attributes

f_{1}, \dots, f_{n} \in F

. We now define the parameter

M (T)

. If table T is degenerate, then

M (T) = 0

. Let T now be a nondegenerate table, and

\bar{δ} = (δ_{1}, \dots, δ_{n}) \in E_{k}^{n}

. Then,

M (T, \bar{δ})

is the minimum natural m such that there exist attributes

f_{i_{1}}, \dots, f_{i_{m}} \in A t (T)

for which

T (f_{i_{1}}, δ_{i_{1}}) \dots (f_{i_{m}}, δ_{i_{m}})

is a degenerate table. We denote

M (T) = max {M (T, \bar{δ}) : \bar{δ} \in E_{k}^{n}}

.
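Definitions 22 and 23 can be made concrete with a small brute-force sketch. It assumes that a table is degenerate when it is empty or when all of its rows share a common decision (the definition is given earlier in the paper and is restated here only as an assumption). The representation of a table as a list of (row, decision set) pairs and the names `is_degenerate`, `m_delta`, `M`, and `N` are illustrative and do not come from the paper.

```python
from itertools import combinations, product

def is_degenerate(rows):
    # Assumed definition: the table is empty or all rows share a decision.
    if not rows:
        return True
    return bool(set.intersection(*(set(d) for _, d in rows)))

def m_delta(rows, delta):
    # Smallest m such that fixing some m columns to the values taken
    # from delta leaves a degenerate subtable.
    n = len(delta)
    for m in range(n + 1):
        for cols in combinations(range(n), m):
            sub = [(r, d) for r, d in rows
                   if all(r[j] == delta[j] for j in cols)]
            if is_degenerate(sub):
                return m
    raise ValueError("rows must be distinct with nonempty decision sets")

def M(rows, k):
    # M(T) = 0 for a degenerate table, otherwise the maximum of
    # M(T, delta) over all tuples delta from E_k^n.
    if is_degenerate(rows):
        return 0
    n = len(rows[0][0])
    return max(m_delta(rows, delta) for delta in product(range(k), repeat=n))

def N(rows):
    # N(T) is the number of rows.
    return len(rows)

xor_table = [((0, 0), {0}), ((0, 1), {1}), ((1, 0), {1}), ((1, 1), {0})]
first_col = [((0, 0), {0}), ((0, 1), {0}), ((1, 0), {1}), ((1, 1), {1})]
print(N(xor_table), M(xor_table, 2), M(first_col, 2))  # prints: 4 2 1
```

In the first table both columns must be fixed before a common decision appears, while in the second table fixing the first column is enough.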

The following statement follows immediately from Theorem 3.5 [23].

Lemma 7.

Let T be a nonempty decision table from

M_{k} (F)

in which each row is labeled with a set containing only one decision. Then,

h^{d} (T) \leq M (T) {log}_{2} N (T) .
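The bound of Lemma 7 can be checked numerically on small tables. The sketch below computes the minimum depth of a deterministic decision tree by exhaustive recursion and compares it with M(T) log_2 N(T). It assumes that a table is degenerate when it is empty or has a common decision, and that a deterministic tree of depth 0 suffices exactly for degenerate tables. The function names are illustrative, and the exhaustive search is only feasible for very small tables.

```python
import math
from itertools import combinations, product

def degenerate(rows):
    # Assumed definition: empty table or a decision common to all rows.
    return not rows or bool(set.intersection(*(set(d) for _, d in rows)))

def min_depth(rows, k):
    # Minimum depth h^d of a deterministic decision tree, by brute force.
    if degenerate(rows):
        return 0
    best = None
    for j in range(len(rows[0][0])):
        if len({r[j] for r, _ in rows}) < 2:
            continue  # a constant column cannot split the table
        depth = 1 + max(
            min_depth([(r, d) for r, d in rows if r[j] == v], k)
            for v in range(k)
        )
        best = depth if best is None else min(best, depth)
    return best

def M(rows, k):
    # Parameter M(T) from Definition 23, computed by exhaustive search.
    if degenerate(rows):
        return 0
    n = len(rows[0][0])
    def m_delta(delta):
        for m in range(n + 1):
            for cols in combinations(range(n), m):
                sub = [(r, d) for r, d in rows
                       if all(r[j] == delta[j] for j in cols)]
                if degenerate(sub):
                    return m
    return max(m_delta(delta) for delta in product(range(k), repeat=n))

# Each row carries exactly one decision, as Lemma 7 requires.
xor_table = [((0, 0), {0}), ((0, 1), {1}), ((1, 0), {1}), ((1, 1), {0})]
depth = min_depth(xor_table, 2)
bound = M(xor_table, 2) * math.log2(len(xor_table))
print(depth, bound)  # prints: 2 4.0
```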

Lemma 8.

Let

(C, ψ)

be a limited t pair, and

typ (U_{C ψ}^{a i}) = α

. Then,

typ (U_{C ψ}^{d i}) \in {α, β}

.

Proof.

Using Lemma 4, we conclude that there exists

r \in N

such that the inequality

ψ^{a} (T) \leq r

holds for any table

T \in C

.

Let T be a nonempty table from

C

in which the columns are labeled with the attributes

f_{1}, \dots, f_{n}

and

\bar{δ} = (δ_{1}, \dots, δ_{n}) \in E_{k}^{n}

. We now show that there exist attributes

f_{i_{1}}, \dots, f_{i_{m}} \in A t (T)

such that the subtable

T (\bar{δ}) = T (f_{1}, δ_{1}) \dots (f_{n}, δ_{n})

is equal to the subtable

T (f_{i_{1}}, δ_{i_{1}}) \dots (f_{i_{m}}, δ_{i_{m}})

, and

m \leq r

if

\bar{δ}

is a row of T, and

m \leq r + 1

if

\bar{δ}

is not a row of T.

Let

\bar{δ}

be a row of T. Let us change the set of decisions attached to the row

\bar{δ}

with the set

{1}

and for the remaining rows of T with the set

{0}

. We denote the obtained table as

T^{'}

. It is clear that

T^{'} \in C

. Taking into account that

ψ^{a} (T^{'}) \leq r

and the complexity measure

ψ

has the property (c), it is not difficult to show that there exist attributes

f_{i_{1}}, \dots, f_{i_{m}} \in A t (T^{'}) = A t (T)

such that

m \leq r

, and

T^{'} (f_{i_{1}}, δ_{i_{1}}) \dots (f_{i_{m}}, δ_{i_{m}})

contains only the row

\bar{δ}

. From here, it follows that

T (\bar{δ}) = T (f_{i_{1}}, δ_{i_{1}}) \dots (f_{i_{m}}, δ_{i_{m}})

.

Let

\bar{δ}

not be a row of T. Let us show that there exist attributes

f_{i_{1}}, \dots, f_{i_{m}} \in A t (T)

such that

m \leq r + 1

, and the subtable

T (f_{i_{1}}, δ_{i_{1}}) \dots (f_{i_{m}}, δ_{i_{m}})

is empty. If

T (f_{1}, δ_{1})

is empty, then the considered statement holds. Otherwise, there exists

q \in {1, \dots, n - 1}

such that the subtable

T (f_{1}, δ_{1}) \dots (f_{q}, δ_{q})

is nonempty, but the subtable

T (f_{1}, δ_{1}) \dots (f_{q + 1}, δ_{q + 1})

is empty. We denote as

T^{'}

the table obtained from T by the removal of the attributes

f_{q + 1}, \dots, f_{n}

. It is clear that

T^{'} \in C

, and

(δ_{1}, \dots, δ_{q})

is a row of

T^{'}

. According to what has been proven above, there exist attributes

f_{i_{1}}, \dots, f_{i_{p}} \in {f_{1}, \dots, f_{q}}

such that

T^{'} (f_{i_{1}}, δ_{i_{1}}) \dots (f_{i_{p}}, δ_{i_{p}}) = T^{'} (f_{1}, δ_{1}) \dots (f_{q}, δ_{q})

and

p \leq r

. Using this fact, one can show that

T (f_{i_{1}}, δ_{i_{1}}) \dots (f_{i_{p}}, δ_{i_{p}}) (f_{q + 1}, δ_{q + 1})

is empty and is equal to

T (\bar{δ})

.

Let

T_{1} \in C

. We denote as

T_{2}

the decision table obtained from

T_{1}

by the removal of all the columns in which all the numbers are equal. Let the columns of

T_{2}

be labeled with attributes

f_{1}, \dots, f_{n}

. We now consider the decision table

T_{3}

, which is obtained from

T_{2}

by changing the decisions so that the decision set attached to each row of table

T_{3}

contains only one decision and, for any two non-equal rows, the corresponding decisions are different. It is clear that

T_{3} \in C

. It is not difficult to show that

ψ^{d} (T_{1}) \leq ψ^{d} (T_{2}) \leq ψ^{d} (T_{3})

.

We now show that the inequality

ψ (f) \leq r

holds for any attribute

f \in A t (T_{3})

. Let us denote as

T^{'}

the decision table obtained from

T_{3}

by the removal of all the columns except the column labeled with the attribute f. If there is more than one column in

T_{3}

, which is labeled with the attribute f, then we keep only one of them. Let the decision table

T_{f}

be obtained from

T^{'}

by changing the set of decisions for each row

(δ)

with the set of decisions

{δ}

. It is clear that

T_{f} \in C

. Let

Γ

be a nondeterministic decision tree for the table

T_{f}

, and

ψ (Γ) = ψ^{a} (T_{f}) \leq r

. Since the column f contains different numbers, we have

f \in A t (Γ)

. Using the property (b) of the complexity measure

ψ

, we obtain

ψ (Γ) \geq ψ (f)

. Consequently,

ψ (f) \leq r

.

Taking into account that, for any

\bar{δ} \in Δ (T_{3})

, there exist attributes

f_{i_{1}}, \dots, f_{i_{m}} \in {f_{1}, \dots, f_{n}}

such that

m \leq r

, and

T_{3} (f_{i_{1}}, δ_{i_{1}}) \dots (f_{i_{m}}, δ_{i_{m}})

contains only the row

\bar{δ}

, it is not difficult to show that

N (T_{3}) \leq n^{r} \cdot k^{r} .

(1)

According to what has been proven above, for any

\bar{δ} \in E_{k}^{n}

, there exist attributes

f_{i_{1}}, \dots, f_{i_{m}} \in {f_{1}, \dots, f_{n}}

such that

m \leq r + 1

, and

T_{3} (f_{i_{1}}, δ_{i_{1}}) \dots (f_{i_{m}}, δ_{i_{m}}) = T_{3} (f_{1}, δ_{1}) \dots (f_{n}, δ_{n})

. Taking into account this equality, one can show that

M (T_{3}) \leq r + 1 .

(2)

Using Lemma 7, as well as inequalities (1) and (2), we conclude that there exists a deterministic decision tree

Γ

for the table

T_{3}

with

h (Γ) \leq M (T_{3}) {log}_{2} N (T_{3}) \leq {(r + 1)}^{2} {log}_{2} (k n)

. Taking into account that

ψ (f) \leq r

for any attribute

f \in A t (T_{3})

and that the complexity measure

ψ

has the property (a), we obtain

ψ^{d} (T_{3}) \leq {(r + 1)}^{3} {log}_{2} (k n) .

Consequently,

ψ^{d} (T_{1}) \leq {(r + 1)}^{3} {log}_{2} (k n)

. Taking into account that the complexity measure

ψ

has the property (c), we obtain

ψ^{i} (T_{1}) \geq n

. Since

T_{1}

is an arbitrary decision table from

C

, we have that

{Dom}^{+} (U_{C ψ}^{d i})

is a finite set. Therefore,

typ (U_{C ψ}^{d i}) \neq γ

. Using Lemma 3 and Corollary 1, we obtain

typ (U_{C ψ}^{d i}) \in {α, β}

. □
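The last step of the proof rests on the fact that the bound (r + 1)^3 log_2 (k n) grows only logarithmically in n, so it falls below n for all sufficiently large n. The sketch below scans a finite range and reports the last n at which the bound is still at least n. The values of r, k, and the scan limit are illustrative assumptions, and a finite scan illustrates the behavior but does not replace the argument in the proof.

```python
import math

def depth_bound(r, k, n):
    # Upper bound on psi^d(T_1) obtained in the proof of Lemma 8.
    return (r + 1) ** 3 * math.log2(k * n)

def last_n_not_below_bound(r, k, limit):
    # Largest n in [1, limit] with depth_bound(r, k, n) >= n, or 0 if none.
    hits = [n for n in range(1, limit + 1) if depth_bound(r, k, n) >= n]
    return max(hits, default=0)

r, k, limit = 1, 2, 10_000   # illustrative values, not taken from the paper
n_star = last_n_not_below_bound(r, k, limit)
print(n_star)
```

Beyond `n_star`, every scanned n satisfies the strict inequality, which matches the conclusion that Dom^+ of the function is finite.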

Proof of Proposition 1.

Let

(C, ψ)

be a t pair. Using Corollary 1, we conclude that

typ (U_{C ψ}^{i i}) \in {α, γ}

. Using Corollary 1 and Lemma 3, we obtain

typ (U_{C ψ}^{d i}) \in {α, β, γ}

. From Lemma 6, it follows that

typ (U_{C ψ}^{a i}) \in {α, γ}

. Then, we have the following:

(a) Let

typ (U_{C ψ}^{i i}) = α

. Using Lemmas 3 and 4, we obtain

{typ}_{u} (C, ψ) = t_{1}

.

(b) Let

typ (U_{C ψ}^{i i}) = γ

and

typ (U_{C ψ}^{d i}) = α

. Using Lemmas 3, 4, and 5, we obtain

{typ}_{u} (C, ψ) = t_{2}

.

(c) Let

typ (U_{C ψ}^{i i}) = γ

and

typ (U_{C ψ}^{d i}) = β

. From Lemma 5, it follows that

typ (U_{C ψ}^{i d}) = typ (U_{C ψ}^{i a}) = ϵ

. Using Lemmas 3 and 6, we obtain

typ (U_{C ψ}^{a i}) = α

. From this equality and from Lemma 4, it follows that

typ (U_{C ψ}^{a d}) = typ (U_{C ψ}^{a a}) = α

. Using the equality

typ (U_{C ψ}^{d i}) = β

, Lemma 3, and Corollary 1, we obtain

typ (U_{C ψ}^{d d}) = γ

. From the equalities

typ (U_{C ψ}^{d d}) = γ, typ (U_{C ψ}^{a a}) = α

and from Lemmas 2 and 4, it follows that

typ (U_{C ψ}^{d a}) = ϵ

. Thus,

{typ}_{u} (C, ψ) = t_{3}

.

(d) Let

typ (U_{C ψ}^{i i}) = typ (U_{C ψ}^{d i}) = γ

and

typ (U_{C ψ}^{a i}) = α

. Using Lemma 5, we obtain

typ (U_{C ψ}^{i d}) = typ (U_{C ψ}^{i a}) = ϵ

. From Lemma 4, it follows that

typ (U_{C ψ}^{a d}) = typ (U_{C ψ}^{a a}) = α

. Using Lemma 3 and Corollary 1, we obtain

typ (U_{C ψ}^{d d}) = γ

. From this equality, the equality

typ (U_{C ψ}^{a a}) = α

, and from Lemmas 2 and 4, it follows that

typ (U_{C ψ}^{d a}) = ϵ

. Thus,

{typ}_{u} (C, ψ) = t_{4}

.

(e) Let

typ (U_{C ψ}^{i i}) = typ (U_{C ψ}^{d i}) = typ (U_{C ψ}^{a i}) = γ

. Using Lemma 5, we conclude that

typ (U_{C ψ}^{i d}) = typ (U_{C ψ}^{i a}) = ϵ

. Using Lemma 3 and Corollary 1, we obtain

typ (U_{C ψ}^{d d}) = typ (U_{C ψ}^{a d}) = typ (U_{C ψ}^{a a}) = γ

. Using Lemma 3, we obtain

typ (U_{C ψ}^{d a}) \in {γ, δ, ϵ}

. Therefore,

{typ}_{u} (C, ψ) \in {t_{5}, t_{6}, t_{7}}

. □

Proof of Proposition 2.

Let

(C, ψ)

be a limited t pair. Taking into account that the complexity measure

ψ

has the property (c) and using Lemma 4, we obtain

typ (U_{C ψ}^{i i}) \neq α

. Therefore,

{typ}_{u} (C, ψ) \neq t_{1}

. Using Lemma 8, we obtain

{typ}_{u} (C, ψ) \neq t_{4}

. From these relations and Proposition 1, it follows that the statement of the proposition holds. □

5. Realizable Upper Types of T Pairs

In this section, all realizable upper types of t pairs are enumerated.

Proposition 3.

For any

i \in {1, 2, 3, 4, 5, 6, 7}

, there exists a t pair

(C, ψ)

such that

{typ}_{u} (C, ψ) = t_{i} .

Proposition 4.

For any

i \in {2, 3, 5, 6, 7}

, there exists a limited t pair

(C, h)

such that

{typ}_{u} (C, h) = t_{i} .

The proofs of these propositions are based on the results obtained for information systems [22].

Let

U = (A, F)

be an information system, where the attributes from F have values from

E_{k}

, and

ψ

is a complexity measure over U [22]. Note that

ψ

is also a complexity measure over the set of decision tables

M_{k} (F)

. Let

z = (ν, f_{1}, \dots, f_{n})

be a problem over U. In [22], three parameters of the problem z were defined:

ψ_{U}^{i} (z) = ψ (f_{1} \dots f_{n})

was called the complexity of the description of the problem z,

ψ_{U}^{d} (z)

was called the minimum complexity of a decision tree with attributes from the set

{f_{1}, \dots, f_{n}}

, which solves the problem z deterministically, and

ψ_{U}^{a} (z)

was called the minimum complexity of a decision tree with attributes from the set

{f_{1}, \dots, f_{n}}

, which solves the problem z nondeterministically.

Let

b, c \in {i, d, a}

. In [22], the partial function

U_{U ψ}^{b c} : N \to N

was defined as follows:

U_{U ψ}^{b c} (n) = max {ψ_{U}^{b} (z) : z \in P r o b l (U), ψ_{U}^{c} (z) \leq n} .

The table

{typ}_{l u} (U, ψ)

for the pair

(U, ψ)

was defined in [22] as follows: this is a table with three rows and three columns, in which the rows from top to bottom and the columns from left to right are labeled with the indices

i, d, a

. The value

typ (U_{U ψ}^{b c})

is in the intersection of the row with the index

b \in {i, d, a}

and the column with the index

c \in {i, d, a}

.

We now prove the following proposition:

Proposition 5.

Let U be an information system and ψ be a complexity measure over U. Then,

{typ}_{l u} (U, ψ) = {typ}_{u} (T a b (U), ψ) .

Proof.

Let

z = (ν, f_{1}, \dots, f_{n})

be a problem over U and

T (z)

be the decision table corresponding to this problem. It is easy to see that

ψ_{U}^{i} (z) = ψ^{i} (T (z))

. One can show that the set of decision trees solving the problem z nondeterministically and using only the attributes from the set

{f_{1}, \dots, f_{n}}

(see corresponding definitions in [22]) is equal to the set of nondeterministic decision trees for the table

T (z)

. From here, it follows that

ψ_{U}^{a} (z) = ψ^{a} (T (z))

and

ψ_{U}^{d} (z) = ψ^{d} (T (z))

. Using these equalities, we can show that

{typ}_{l u} (U, ψ) = {typ}_{u} (T a b (U), ψ)

. □

This proposition allows us to transfer the results obtained for information systems in [22] to the case of closed classes of decision tables. Before each of the following seven lemmas, we define a pair

(U, ψ)

, where U is an information system, and

ψ

is a complexity measure over U.

Let us define a pair

(U_{1}, π)

as follows:

U_{1} = (N, F_{1})

, where

F_{1} = {f}

,

f \equiv 0

, and

π \equiv 0

.

Lemma 9.

{typ}_{u} (T a b (U_{1}), π) = t_{1}

.

Proof.

From Lemma 4.1 [22], it follows that

{typ}_{l u} (U_{1}, π) = t_{1}

. Using Proposition 5, we obtain

{typ}_{u} (T a b (U_{1}), π) = t_{1}

. □

Let us define a pair

(U_{2}, h)

as follows:

U_{2} = (N, F_{2})

, where

F_{2} = F_{1}

.

Lemma 10.

{typ}_{u} (T a b (U_{2}), h) = t_{2}

.

Proof.

From Lemma 4.2 [22], it follows that

{typ}_{l u} (U_{2}, h) = t_{2}

. Using Proposition 5, we obtain

{typ}_{u} (T a b (U_{2}), h) = t_{2}

. □

Let us define a pair

(U_{3}, h)

as follows:

U_{3} = (N, F_{3})

, where

F_{3} = {l_{i} : i \in N ∖ {0}}

and, for any

i \in N ∖ {0}, j \in N

, if

j \leq i

, then

l_{i} (j) = 0

, and if

j > i

, then

l_{i} (j) = 1

.

Lemma 11.

{typ}_{u} (T a b (U_{3}), h) = t_{3}

.

Proof.

From Lemma 4.3 [22], it follows that

{typ}_{l u} (U_{3}, h) = t_{3}

. Using Proposition 5, we obtain

{typ}_{u} (T a b (U_{3}), h) = t_{3}

. □

Let us define a pair

(U_{4}, μ)

as follows:

U_{4} = (N, F_{4})

, where

F_{4} = F_{3}

,

μ (λ) = 0

,

μ (l_{i_{1}} \dots l_{i_{m}}) = 1

if

m = 1

or

m = 2

and

i_{1} > i_{2}

, and

μ (l_{i_{1}} \dots l_{i_{m}}) = max {i_{1}, \dots, i_{m}}

in other cases.

Lemma 12.

{typ}_{u} (T a b (U_{4}), μ) = t_{4}

.

Proof.

From Lemma 4.4 [22], it follows that

{typ}_{l u} (U_{4}, μ) = t_{4}

. Using Proposition 5, we obtain

{typ}_{u} (T a b (U_{4}), μ) = t_{4}

. □

Let us define a pair

(U_{5}, h)

as follows:

U_{5} = (N, F_{5})

, where

F_{5} = {f_{i} : i \in N ∖ {0}}

and, for any

i \in N ∖ {0}, j \in N

, if

i = j

, then

f_{i} (j) = 1

, and if

i \neq j

, then

f_{i} (j) = 0

.
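To see how the attribute families F_3 and F_5 differ, the tuples of values that finitely many attributes take on the elements of N can be listed. The sketch below does this for l_1, l_2, l_3 and for f_1, f_2, f_3. It assumes that the rows of a decision table built from such attributes are exactly these realized tuples, which is how the construction in [22] is read here. The helper names and the finite universe 0, ..., 10 are illustrative.

```python
def l(i):
    # Attribute l_i from F_3: 0 for j <= i and 1 for j > i.
    return lambda j: 0 if j <= i else 1

def f(i):
    # Attribute f_i from F_5: 1 for j = i and 0 otherwise.
    return lambda j: 1 if i == j else 0

def realized_rows(attrs, universe):
    # Distinct tuples of attribute values over the given elements.
    return sorted({tuple(a(x) for a in attrs) for x in universe})

universe = range(11)   # enough elements to realize every tuple for i <= 3
rows_l = realized_rows([l(1), l(2), l(3)], universe)
rows_f = realized_rows([f(1), f(2), f(3)], universe)
print(rows_l)  # prints: [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)]
print(rows_f)  # prints: [(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0)]
```

The threshold attributes produce a chain of rows, whereas the indicator attributes produce the zero row and the unit rows.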

Lemma 13.

{typ}_{u} (T a b (U_{5}), h) = t_{5}

.

Proof.

From Lemma 4.5 [22], it follows that

{typ}_{l u} (U_{5}, h) = t_{5}

. Using Proposition 5, we obtain

{typ}_{u} (T a b (U_{5}), h) = t_{5}

. □

Let us define a pair

(U_{6}, h)

as follows:

U_{6} = (N, F_{6})

, where

F_{6} = F_{5} \cup G

,

G = {g_{2 i + 1} : i \in N}

and, for any

i \in N, j \in N

, if

j \in {2 i + 1, 2 i + 2}

, then

g_{2 i + 1} (j) = 1

, and if

j \notin {2 i + 1, 2 i + 2}

, then

g_{2 i + 1} (j) = 0

.

Lemma 14.

{typ}_{u} (T a b (U_{6}), h) = t_{6}

.

Proof.

From Lemma 4.6 [22], it follows that

{typ}_{l u} (U_{6}, h) = t_{6}

. Using Proposition 5, we obtain

{typ}_{u} (T a b (U_{6}), h) = t_{6}

. □

Let us define a pair

(U_{7}, h)

as follows:

U_{7} = (N, F_{7})

, where

F_{7} = F_{3} \cup F_{5}

.

Lemma 15.

{typ}_{u} (T a b (U_{7}), h) = t_{7}

.

Proof.

From Lemma 4.7 [22], it follows that

{typ}_{l u} (U_{7}, h) = t_{7}

. Using Proposition 5, we obtain

{typ}_{u} (T a b (U_{7}), h) = t_{7}

. □

Proof of Proposition 3.

The statement of the proposition follows from Lemmas 9–15. □

Proof of Proposition 4.

The statement of the proposition follows from Lemmas 10, 11, 13, 14, and 15. □

6. Union of T Pairs

In this section, we define a union of two t pairs, which is also a t pair, and study its upper type. Let

τ_{1} = (C_{1}, ψ_{1})

and

τ_{2} = (C_{2}, ψ_{2})

be t pairs, where

C_{1} \subseteq M_{k_{1}} (F_{1})

, and

C_{2} \subseteq M_{k_{2}} (F_{2})

. These two t pairs are called compatible if

F_{1} \cap F_{2} = ⌀

and

ψ_{1} (λ) = ψ_{2} (λ)

. We now define a t pair

τ = (C, ψ)

, which is called a union of compatible t pairs

τ_{1}

and

τ_{2}

.

Definition 24.

The closed class

C

in τ is defined as follows:

C = C_{1} \cup C_{2} \subseteq M_{max (k_{1}, k_{2})} (F_{1} \cup F_{2})

. The complexity measure ψ in τ is defined for any word

α \in {(F_{1} \cup F_{2})}^{*}

in the following way: if

α \in F_{1}^{*}

, then

ψ (α) = ψ_{1} (α)

; if

α \in F_{2}^{*}

, then

ψ (α) = ψ_{2} (α)

; if α contains letters from both

F_{1}

and

F_{2}

, then

ψ (α)

can have an arbitrary value from

N

. In particular, if

ψ_{1} = ψ_{2} = h

, then as ψ we can use the depth h.

We now consider the upper type of t pair

τ = (C, ψ)

. We denote as

max^{˜}

the function maximum for the linear order

α ⪯ β ⪯ γ ⪯ δ ⪯ ϵ

.

Theorem 3.

The equality

typ (U_{C ψ}^{b c}) = max^{˜} (typ (U_{C_{1} ψ_{1}}^{b c}), typ (U_{C_{2} ψ_{2}}^{b c}))

holds for any

b, c \in {i, d, a}

, except for the case that

b c = d a

and

typ (U_{C_{1} ψ_{1}}^{d a}) = typ (U_{C_{2} ψ_{2}}^{d a}) = γ

. In the last case,

typ (U_{C ψ}^{d a}) \in {γ, δ}

.
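The statement of Theorem 3 can be read as an entrywise rule on tables of types. The sketch below encodes a table as a dictionary keyed by the index pairs bc and returns, for each pair, the set of types that the theorem allows for the union. The encoding is an illustrative assumption. The two sample tables are read off from the text: `T2` from Example 9, and `T5` from case (e) in the proof of Proposition 1 with the d a entry equal to γ.

```python
ORDER = "αβγδε"   # the linear order α ⪯ β ⪯ γ ⪯ δ ⪯ ϵ, written with ε

def tmax(x, y):
    # Maximum with respect to the linear order on types.
    return max(x, y, key=ORDER.index)

def union_types(t1, t2):
    # Possible types of U^{bc} for the union, following Theorem 3.
    out = {}
    for bc in t1:
        if bc == "da" and t1[bc] == t2[bc] == "γ":
            out[bc] = {"γ", "δ"}          # the exceptional case
        else:
            out[bc] = {tmax(t1[bc], t2[bc])}
    return out

T2 = {"ii": "γ", "id": "ε", "ia": "ε",
      "di": "α", "dd": "α", "da": "α",
      "ai": "α", "ad": "α", "aa": "α"}
T5 = {"ii": "γ", "id": "ε", "ia": "ε",
      "di": "γ", "dd": "γ", "da": "γ",
      "ai": "γ", "ad": "γ", "aa": "γ"}

print(union_types(T5, T5)["da"] == {"γ", "δ"})  # prints: True
print(union_types(T2, T5)["da"])                # prints: {'γ'}
```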

Proof.

Let

n \in N

and

b, c \in {i, d, a}

. We now define the value

M = max_{̲} (U_{1}, U_{2})

, where

U_{1} = U_{C_{1} ψ_{1}}^{b c} (n)

, and

U_{2} = U_{C_{2} ψ_{2}}^{b c} (n)

. Both

U_{1}

and

U_{2}

have values from the set

{⌀, \infty} \cup N

(see the definitions before Lemma 2). If

U_{1} = U_{2} = ⌀

, then

M = ⌀

. If one of

U_{1}, U_{2}

is equal to ⌀ and another one is equal to a number

m \in N

, then

M = m

. If

U_{1}, U_{2} \in N

, then

M = max (U_{1}, U_{2})

. If at least one of

U_{1}, U_{2}

is equal to ∞, then

M = \infty

.
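The four rules defining M amount to a maximum on an extended set of values. A minimal sketch follows, assuming that the undefined value ⌀ is represented by `None` and ∞ by `math.inf`. The function name `umax` is illustrative.

```python
import math

def umax(u1, u2):
    # Extended maximum: None stands for an undefined value (⌀),
    # math.inf stands for ∞, and natural numbers are ordinary ints.
    if u1 is None:
        return u2          # covers both "⌀, ⌀" and "⌀, value"
    if u2 is None:
        return u1
    return max(u1, u2)     # math.inf dominates every natural number

print(umax(None, None), umax(None, 3), umax(2, 5), umax(4, math.inf))
# prints: None 3 5 inf
```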

The following equality follows from the definition of the partial function

U_{C ψ}^{b c} (n)

, where

n \in N

, and

b, c \in {i, d, a}

:

U_{C ψ}^{b c} (n) = max_{̲} (U_{C_{1} ψ_{1}}^{b c} (n), U_{C_{2} ψ_{2}}^{b c} (n))

. Later in the proof, we will use this equality without special mention. From this equality, we obtain

typ (U_{C_{1} ψ_{1}}^{b c}) ⪯ typ (U_{C ψ}^{b c})

and

typ (U_{C_{2} ψ_{2}}^{b c}) ⪯ typ (U_{C ψ}^{b c})

. We now consider two different cases separately: (1)

typ (U_{C_{1} ψ_{1}}^{b c}) = typ (U_{C_{2} ψ_{2}}^{b c})

and (2)

typ (U_{C_{1} ψ_{1}}^{b c}) \neq typ (U_{C_{2} ψ_{2}}^{b c})

. Thus, we have the following:

(1) Let

typ (U_{C_{1} ψ_{1}}^{b c}) = typ (U_{C_{2} ψ_{2}}^{b c})

.

(a) Let

typ (U_{C_{1} ψ_{1}}^{b c}) = typ (U_{C_{2} ψ_{2}}^{b c}) = α

. Since the functions

U_{C_{1} ψ_{1}}^{b c}

and

U_{C_{2} ψ_{2}}^{b c}

are both bounded from above, we obtain that the function

U_{C ψ}^{b c} = max_{̲} (U_{C_{1} ψ_{1}}^{b c}, U_{C_{2} ψ_{2}}^{b c})

is also bounded from above. From this, it follows that

typ (U_{C ψ}^{b c}) = max^{˜} (typ (U_{C_{1} ψ_{1}}^{b c}), typ (U_{C_{2} ψ_{2}}^{b c})) = α

.

(b) Let

typ (U_{C_{1} ψ_{1}}^{b c}) = typ (U_{C_{2} ψ_{2}}^{b c}) = β

. From the fact that

D o m^{+} (U_{C_{1} ψ_{1}}^{b c})

and

D o m^{+} (U_{C_{2} ψ_{2}}^{b c})

are both finite, we obtain that

D o m^{+} (U_{C ψ}^{b c})

is also finite. Similarly, one can show that

U_{C ψ}^{b c}

is unbounded from above. From here, it follows that

typ (U_{C ψ}^{b c}) = max^{˜} (typ (U_{C_{1} ψ_{1}}^{b c}), typ (U_{C_{2} ψ_{2}}^{b c})) = β

.

(c) Let

typ (U_{C_{1} ψ_{1}}^{b c}) = typ (U_{C_{2} ψ_{2}}^{b c}) = γ

. From here, it follows that the function

ψ^{b}

is unbounded from above on

C

. From Proposition 1, it follows that

b c

belongs to the set

{i i, d i, d d, d a, a i, a d, a a}

. Let

c = b

. Using Lemma 4, we obtain

typ (U_{C ψ}^{b b}) = γ

. Let

b c \in {d i, a i, a d}

. Using Lemma 3 and the inequalities

typ (U_{C_{1} ψ_{1}}^{b c}) ⪯ typ (U_{C ψ}^{b c})

and

typ (U_{C_{2} ψ_{2}}^{b c}) ⪯ typ (U_{C ψ}^{b c})

, we obtain

typ (U_{C ψ}^{b c}) = γ

. The only case left is when

b c = d a

. Since there is no

n \in N

for which

U_{C_{1} ψ_{1}}^{b c} (n) = \infty

or

U_{C_{2} ψ_{2}}^{b c} (n) = \infty

, according to Lemma 2, we obtain that

D o m (U_{C ψ}^{b c})

is an infinite set. Therefore,

typ (U_{C ψ}^{b c}) \neq ϵ

, and hence,

typ (U_{C ψ}^{b c}) \in {γ, δ}

. From Proposition 6, it follows that both cases are possible.

(d) Let

typ (U_{C_{1} ψ_{1}}^{b c}) = typ (U_{C_{2} ψ_{2}}^{b c}) = δ

. From here, it follows that there is no

n \in N

for which

U_{C_{1} ψ_{1}}^{b c} (n) = \infty

or

U_{C_{2} ψ_{2}}^{b c} (n) = \infty

. Using Lemma 2, we conclude that

D o m (U_{C ψ}^{b c})

is an infinite set. From the fact that

D o m^{-} (U_{C_{1} ψ_{1}}^{b c})

and

D o m^{-} (U_{C_{2} ψ_{2}}^{b c})

are both finite, we obtain that

D o m^{-} (U_{C ψ}^{b c})

is also finite. Therefore,

typ (U_{C ψ}^{b c}) = max^{˜} (typ (U_{C_{1} ψ_{1}}^{b c}), typ (U_{C_{2} ψ_{2}}^{b c})) = δ

.

(e) Let

typ (U_{C_{1} ψ_{1}}^{b c}) = typ (U_{C_{2} ψ_{2}}^{b c}) = ϵ

. Since both

D o m (U_{C_{1} ψ_{1}}^{b c})

and

D o m (U_{C_{2} ψ_{2}}^{b c})

are finite sets, we obtain that

D o m (U_{C ψ}^{b c})

is also a finite set. Therefore,

typ (U_{C ψ}^{b c}) = max^{˜} (typ (U_{C_{1} ψ_{1}}^{b c}), typ (U_{C_{2} ψ_{2}}^{b c})) = ϵ

.

(2) Let

typ (U_{C_{1} ψ_{1}}^{b c}) \neq typ (U_{C_{2} ψ_{2}}^{b c})

. Denote

f = U_{C_{1} ψ_{1}}^{b c}

and

g = U_{C_{2} ψ_{2}}^{b c}

. Let

typ (f) ⪯ typ (g)

. We now consider a number of cases.

(a) Let

typ (g) = ϵ

. From here, it follows that

D o m (g)

is a finite set. Taking into account this fact, we obtain that

D o m (max_{̲} (f, g))

is also a finite set. Therefore,

typ (max_{̲} (f, g)) = max^{˜} (typ (f), typ (g)) = ϵ

. Later, we assume that

typ (g) \neq ϵ

.

(b) Let

typ (f) = α

. Then, both f and g are nondecreasing functions, f is bounded from above, and g is unbounded from above. From here, it follows that there exists

n_{0} \in N

such that

f (n) < g (n)

for any

n \in N, n \geq n_{0}

. Using this fact, we conclude that

max_{̲} (f (n), g (n)) = g (n)

for

n \geq n_{0}

. Therefore,

typ (max_{̲} (f, g)) = max^{˜} (typ (f), typ (g)) = typ (g)

. Later, we will assume that

typ (f) \neq α

. It means we should only consider the pairs

(typ (f), typ (g)) \in {(β, δ), (β, γ), (γ, δ)}

.

(c) Let

typ (f) = β, typ (g) = δ

. From here, it follows that

D o m^{-} (f), D o m^{+} (g)

are both infinite sets, and

D o m^{+} (f), D o m^{-} (g)

are both finite sets. Taking into account that both f and g are nondecreasing functions, we obtain that there exists

n_{0} \in N

such that

f (n) < g (n)

for any

n \in N, n \geq n_{0}

. Therefore,

typ (max_{̲} (f, g)) = max^{˜} (typ (f), typ (g)) = typ (g) = δ

.

(d) Let

typ (f) = β, typ (g) = γ

. Then,

D o m^{+} (max_{̲} (f, g))

is an infinite set. Taking into account that

D o m^{-} (g)

is an infinite set and that

D o m^{+} (f)

is a finite set, we obtain that

D o m^{-} (max_{̲} (f, g))

is also an infinite set. Therefore,

typ (max_{̲} (f, g)) = max^{˜} (typ (f), typ (g)) = typ (g) = γ

.

(e) Let

typ (f) = γ, typ (g) = δ

. From here, it follows that

D o m^{+} (max_{̲} (f, g))

is an infinite set, and

D o m^{-} (max_{̲} (f, g))

is a finite set. Therefore,

typ (max_{̲} (f, g)) = max^{˜} (typ (f), typ (g)) = typ (g) = δ

. □

The next statement follows immediately from Proposition 1 and Theorem 3.

Corollary 2.

Let

τ_{1}

and

τ_{2}

be compatible t pairs, and let τ be a union of these t pairs. Then, the possible values of

{typ}_{u} (τ)

are in the table shown in Figure 9 in the intersection of the row labeled with

{typ}_{u} (τ_{1})

and the column labeled with

{typ}_{u} (τ_{2})

.

To finalize the study of unions of t pairs, we prove the following statement:

Proposition 6.

(a) There exist compatible t pairs

τ_{1}^{1}

and

τ_{2}^{1}

and their union

τ^{1}

such that

{typ}_{u} (τ_{1}^{1}) = {typ}_{u} (τ_{2}^{1}) = {typ}_{u} (τ^{1}) = t_{5}

.

(b) There exist compatible t pairs

τ_{1}^{2}

and

τ_{2}^{2}

and their union

τ^{2}

such that

{typ}_{u} (τ_{1}^{2}) = {typ}_{u} (τ_{2}^{2}) = t_{5}

and

{typ}_{u} (τ^{2}) = t_{6}

.

Proof.

For

i \in N

, we denote

F_{i} = {a_{i}, b_{i}, c_{i}}

, and we denote as

G_{i}

the decision table depicted in Figure 10. We study the t pair

(T_{i}, ψ_{i})

, where

T_{i}

is the closed class of decision tables from

M_{2} (F_{i})

, which is equal to

[G_{i}]

, and

ψ_{i}

is a complexity measure over

M_{2} (F_{i})

defined in the following way:

ψ_{i} (λ) = 0, ψ_{i} (a_{i}) = ψ_{i} (b_{i}) = ψ_{i} (c_{i}) = i

and

ψ_{i} (α) = i + 1

if

α \in F_{i}^{*}

and

| α | \geq 2

.

We now study the function

U_{T_{i} ψ_{i}}^{d a}

. Since the operations of the duplication of columns and the permutation of columns do not change the minimum complexity of the deterministic and nondeterministic decision trees, we only consider the operations of the changing of decisions and the removal of columns.

Using these operations, the decision tables from

T_{i}

can be obtained from

G_{i}

in three ways: (a) only through the changing of decisions, (b) by removing one column and through the changing of decisions, and (c) by removing two columns and through the changing of decisions. Figure 11 demonstrates examples of the decision tables from

T_{i}

for each case. Without loss of generality, we can restrict ourselves to considering these three tables:

H_{1}

,

H_{2}

, and

H_{3}

.

We consequently have the following:

(a) There are three different cases for the table

H_{1}

: (i) the sets of decisions

d_{1}, d_{2}, d_{3}

are pairwise disjoint, (ii) there are

l, t \in {1, 2, 3}

such that

l \neq t, d_{l} \cap d_{t} \neq ⌀

and

d_{1} \cap d_{2} \cap d_{3} = ⌀

, and (iii)

d_{1} \cap d_{2} \cap d_{3} \neq ⌀

. In the first case,

ψ_{i}^{a} (H_{1}) = i

and

ψ_{i}^{d} (H_{1}) = i + 1

. In the second case,

ψ_{i}^{a} (H_{1}) = i

and

ψ_{i}^{d} (H_{1}) = i

. In the third case,

ψ_{i}^{a} (H_{1}) = 0

and

ψ_{i}^{d} (H_{1}) = 0

.

(b) There are three different cases for the table

H_{2}

: (i) the sets of decisions

d_{4}, d_{5}, d_{6}

are pairwise disjoint, (ii) there are

l, t \in {4, 5, 6}

such that

l \neq t, d_{l} \cap d_{t} \neq ⌀

and

d_{4} \cap d_{5} \cap d_{6} = ⌀

, and (iii)

d_{4} \cap d_{5} \cap d_{6} \neq ⌀

. In the first case,

ψ_{i}^{a} (H_{2}) = i + 1

, and

ψ_{i}^{d} (H_{2}) = i + 1

. In the second case, we have either

ψ_{i}^{a} (H_{2}) = ψ_{i}^{d} (H_{2}) = i + 1

or

ψ_{i}^{a} (H_{2}) = ψ_{i}^{d} (H_{2}) = i

depending on the intersecting decision sets. In the third case,

ψ_{i}^{a} (H_{2}) = 0

, and

ψ_{i}^{d} (H_{2}) = 0

.

(c) There are two different cases for the table

H_{3}

: (i)

d_{7} \cap d_{8} = ⌀

and (ii)

d_{7} \cap d_{8} \neq ⌀

. In the first case,

ψ_{i}^{a} (H_{3}) = i

, and

ψ_{i}^{d} (H_{3}) = i

. In the second case,

ψ_{i}^{a} (H_{3}) = 0

, and

ψ_{i}^{d} (H_{3}) = 0

.

As a result, we obtain that, for any

n \in N

,

U_{T_{i} ψ_{i}}^{d a} (n) = \begin{cases} 0, & n < i, \\ i + 1, & n \geq i . \end{cases}

(3)
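Formula (3) can be cross-checked against the case analysis for H_1, H_2, and H_3. The sketch below collects the pairs of values of ψ_i^a and ψ_i^d listed in cases (a), (b), and (c) and takes the maximum of ψ_i^d over the pairs with ψ_i^a at most n. The function names are illustrative, and the check covers only the pairs stated in the text.

```python
def case_pairs(i):
    # Pairs (psi^a, psi^d) listed in the text for H_1, H_2, and H_3.
    return [
        (i, i + 1), (i, i), (0, 0),          # H_1, cases (i)-(iii)
        (i + 1, i + 1), (i, i), (0, 0),      # H_2, cases (i)-(iii)
        (i, i), (0, 0),                      # H_3, cases (i)-(ii)
    ]

def U_da(i, n):
    # Maximum psi^d over the listed tables with psi^a <= n.
    return max(d for a, d in case_pairs(i) if a <= n)

def formula_3(i, n):
    # Right-hand side of (3).
    return 0 if n < i else i + 1

ok = all(U_da(i, n) == formula_3(i, n) for i in range(6) for n in range(12))
print(ok)  # prints: True
```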

Let K be an infinite subset of the set

N

. Denote

F_{K} = \cup_{i \in K} F_{i}

and

T_{K} = \cup_{i \in K} [G_{i}]

. It is clear that

T_{K}

is a closed class of decision tables from

M_{2} (F_{K})

. We now define a complexity measure

ψ_{K}

over

M_{2} (F_{K})

. Let

α \in F_{K}^{*}

. If

α \in F_{i}^{*}

for some

i \in K

, then

ψ_{K} (α) = ψ_{i} (α)

. If

α

contains letters from both

F_{i}

and

F_{j}

, and if

i \neq j

, then

ψ_{K} (α) = 0

.

Let

K = {n_{j} : j \in N}

and

n_{j} < n_{j + 1}

for any

j \in N

. We define a function

φ_{K} : N \to N

as follows. Let

n \in N

. If

n < n_{0}

, then

φ_{K} (n) = 0

Let

n_{j} \leq n < n_{j + 1}

for some

j \in N

. Then,

φ_{K} (n) = n_{j} + 1

. Using (3), one can show that, for any

n \in N

,

U_{T_{K} ψ_{K}}^{d a} (n) = φ_{K} (n) .

Using this equality, one can prove that

typ (U_{T_{K} ψ_{K}}^{d a}) = γ

if the set

N ∖ K

is infinite and that

typ (U_{T_{K} ψ_{K}}^{d a}) = δ

if the set

N ∖ K

is finite.
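The dependence of the type on the set N ∖ K can be illustrated on a finite window. The sketch below takes the pointwise maximum of formula (3) over the indices i from K and counts the arguments n with value at least n and those with value at most n. It assumes that Dom^+ and Dom^- collect exactly these arguments, which is how the definitions given before Lemma 2 are read here. Counts on a finite window only suggest which of the two sets is infinite; they do not prove it.

```python
def U_K(K, n):
    # Pointwise maximum of formula (3) over the indices i in K.
    return max(0 if n < i else i + 1 for i in K)

def window_counts(K, limit):
    # Sizes of {n < limit : U(n) >= n} and {n < limit : U(n) <= n}.
    plus = sum(1 for n in range(limit) if U_K(K, n) >= n)
    minus = sum(1 for n in range(limit) if U_K(K, n) <= n)
    return plus, minus

limit = 300
all_n = range(limit)           # K = N, so N ∖ K is empty
thirds = range(0, limit, 3)    # K = {3j}, so N ∖ K is infinite
print(window_counts(all_n, limit), window_counts(thirds, limit))
# prints: (300, 0) (200, 200)
```

Truncating K at the window limit does not change the values inside the window, since indices i greater than n contribute 0 at n. For K = N no argument in the window has value at most n, in line with type δ, while for K = {3j} both counts grow with the window, in line with type γ.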

Denote

K_{1}^{1} = {3 j : j \in N}

,

K_{2}^{1} = {3 j + 1 : j \in N}

and

K^{1} = K_{1}^{1} \cup K_{2}^{1}

. Denote

τ_{1}^{1} = (T_{K_{1}^{1}}, ψ_{K_{1}^{1}})

,

τ_{2}^{1} = (T_{K_{2}^{1}}, ψ_{K_{2}^{1}})

, and

τ^{1} = (T_{K^{1}}, ψ_{K^{1}})

. One can show that the t pairs

τ_{1}^{1}

and

τ_{2}^{1}

are compatible and that

τ^{1}

is a union of

τ_{1}^{1}

and

τ_{2}^{1}

. It is easy to prove that

typ (U_{T_{K_{1}^{1}} ψ_{K_{1}^{1}}}^{d a}) = typ (U_{T_{K_{2}^{1}} ψ_{K_{2}^{1}}}^{d a}) = typ (U_{T_{K^{1}} ψ_{K^{1}}}^{d a}) = γ

. Using Proposition 2, we obtain

{typ}_{u} (τ_{1}^{1}) = {typ}_{u} (τ_{2}^{1}) = {typ}_{u} (τ^{1}) = t_{5}

.

Denote

K_{1}^{2} = {2 j : j \in N}

,

K_{2}^{2} = {2 j + 1 : j \in N}

and

K^{2} = K_{1}^{2} \cup K_{2}^{2} = N

. Denote

τ_{1}^{2} = (T_{K_{1}^{2}}, ψ_{K_{1}^{2}})

,

τ_{2}^{2} = (T_{K_{2}^{2}}, ψ_{K_{2}^{2}})

, and

τ^{2} = (T_{K^{2}}, ψ_{K^{2}})

. One can show that the t pairs

τ_{1}^{2}

and

τ_{2}^{2}

are compatible and that

τ^{2}

is a union of

τ_{1}^{2}

and

τ_{2}^{2}

. It is easy to prove that

typ (U_{T_{K_{1}^{2}} ψ_{K_{1}^{2}}}^{d a}) = typ (U_{T_{K_{2}^{2}} ψ_{K_{2}^{2}}}^{d a}) = γ

and

typ (U_{T_{K^{2}} ψ_{K^{2}}}^{d a}) = δ

. Using Proposition 2, we obtain

{typ}_{u} (τ_{1}^{2}) = {typ}_{u} (τ_{2}^{2}) = t_{5}

and

{typ}_{u} (τ^{2}) = t_{6}

. □
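The two constructions differ only in whether the complement of the index set is infinite or finite. The following Python sketch inspects the index sets on a finite window of N; the window size `LIMIT` is an arbitrary illustrative choice, and a finite check of this kind only illustrates the pattern and does not prove that a complement is infinite.

```python
LIMIT = 30  # finite window of N = {0, 1, 2, ...}; the real sets are infinite
window = set(range(LIMIT))

# K_1^1 = {3j}, K_2^1 = {3j + 1}, K^1 is their union
K1_1 = {n for n in window if n % 3 == 0}
K1_2 = {n for n in window if n % 3 == 1}
K1 = K1_1 | K1_2

# K_1^2 = {2j}, K_2^2 = {2j + 1}, K^2 is their union (all of N)
K2_1 = {n for n in window if n % 2 == 0}
K2_2 = {n for n in window if n % 2 == 1}
K2 = K2_1 | K2_2

missing_from_K1 = sorted(window - K1)  # the numbers 3j + 2 in the window
missing_from_K2 = sorted(window - K2)  # empty: K^2 covers the window
```

In the window, K^1 misses every number of the form 3j + 2, matching the case where N ∖ K is infinite, while K^2 misses nothing, matching the case where N ∖ K is finite.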

7. Proofs of Theorems 1 and 2

First, we consider some auxiliary statements.

Definition 25.

Let us define a function

ρ : {α, β, γ, δ, ϵ} \to {α, β, γ, δ, ϵ}

as follows:

ρ (α) = ϵ, ρ (β) = δ, ρ (γ) = γ, ρ (δ) = β, ρ (ϵ) = α

.
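The function ρ simply reverses the order α, β, γ, δ, ϵ, so it is an involution whose only fixed point is γ. A small Python check, with the type symbols encoded by their names (an illustrative encoding, not the paper's notation):

```python
# rho from Definition 25, with type symbols represented by their names.
rho = {
    "alpha": "epsilon",
    "beta": "delta",
    "gamma": "gamma",
    "delta": "beta",
    "epsilon": "alpha",
}

# Applying rho twice returns every symbol to itself.
is_involution = all(rho[rho[t]] == t for t in rho)

# gamma is the only symbol that rho leaves unchanged.
fixed_points = [t for t in rho if rho[t] == t]
```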

Proposition 7

(Proposition 5.1 [22]). Let X be a nonempty set,

f : X \to N, g : X \to N, U^{f g} (n) = max {f (x) : x \in X, g (x) \leq n}

, and

L^{g f} (n) = min {g (x) : x \in X, f (x) \geq n}

for any

n \in N

. Then,

typ (L^{g f}) = ρ (typ (U^{f g}))

.
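The two functions in Proposition 7 can be evaluated directly when X is finite. The Python sketch below does this for toy data; the helper names `upper` and `lower`, the toy functions, and the choice to return `None` when no element of X satisfies the constraint are all assumptions of this illustration. The sketch does not compute the types of the functions.

```python
from typing import Callable, Iterable, List, Optional, TypeVar

T = TypeVar("T")


def upper(f: Callable[[T], int], g: Callable[[T], int],
          xs: Iterable[T], n: int) -> Optional[int]:
    """U^{fg}(n) = max{f(x) : x in X, g(x) <= n}, or None if no x qualifies."""
    values = [f(x) for x in xs if g(x) <= n]
    return max(values) if values else None


def lower(f: Callable[[T], int], g: Callable[[T], int],
          xs: Iterable[T], n: int) -> Optional[int]:
    """L^{gf}(n) = min{g(x) : x in X, f(x) >= n}, or None if no x qualifies."""
    values = [g(x) for x in xs if f(x) >= n]
    return min(values) if values else None


# Hypothetical toy data: X = {0, ..., 9}, f(x) = x // 2, g(x) = x.
xs: List[int] = list(range(10))
f = lambda x: x // 2
g = lambda x: x

upper_values = [upper(f, g, xs, n) for n in range(5)]  # [0, 0, 1, 1, 2]
lower_values = [lower(f, g, xs, n) for n in range(6)]  # [0, 2, 4, 6, 8, None]
```

On this toy data the upper bound grows like n / 2 while the lower bound grows like 2n, which illustrates the duality between the two functions. The last lower value is `None` only because the finite toy set has no element with f(x) ≥ 5.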

Using Proposition 7, we obtain the following statement:

Proposition 8.

Let

(C, ψ)

be a t pair, and

b, c \in {i, d, a}

. Then,

typ (L_{C ψ}^{c b}) = ρ (typ (U_{C ψ}^{b c}))

.

Corollary 3.

Let

(C, ψ)

be a t pair, and

i \in {1, \dots, 7}

. Then,

{typ}_{u} (C, ψ) = t_{i}

if and only if

typ (C, ψ) = T_{i}

.

Proof of Theorem 1.

The statement of the theorem follows from Propositions 1 and 3 and from Corollary 3. □

Proof of Theorem 2.

The statement of the theorem follows from Propositions 2 and 4 and from Corollary 3. □

8. Conclusions

This paper is devoted to a comparative analysis of the deterministic and nondeterministic decision tree complexity for decision tables from closed classes. It is a qualitative study: we have considered a finite number of types of behavior of the functions characterizing the relationships among different parameters of decision tables. In this paper, we have enumerated all the realizable types of t pairs and limited t pairs. We have also defined the notion of a union of two t pairs and studied how the upper type of the resulting t pair depends on the upper types of the initial t pairs. The obtained results allow us to point out cases where the complexity of deterministic and nondeterministic decision trees is essentially less than the complexity of the decision table. Future publications will be devoted to quantitative research: we will study the lower and upper bounds on the considered functions.

Author Contributions

Conceptualization, A.O. and M.M.; methodology, A.O. and M.M.; validation, A.O.; formal analysis, A.O. and M.M.; investigation, A.O.; resources, A.O. and M.M.; writing—original draft preparation, A.O. and M.M.; writing—review and editing, A.O. and M.M.; visualization, A.O.; supervision, M.M.; funding acquisition, M.M. All authors have read and agreed to the published version of the manuscript.

Funding

The research was funded by King Abdullah University of Science and Technology.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Acknowledgments

The research reported in this publication was supported by King Abdullah University of Science and Technology (KAUST). The authors are grateful to the anonymous reviewers for their useful remarks and suggestions.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Boutell, M.R.; Luo, J.; Shen, X.; Brown, C.M. Learning multi-label scene classification. Pattern Recognit. 2004, 37, 1757–1771.
2. Vens, C.; Struyf, J.; Schietgat, L.; Dzeroski, S.; Blockeel, H. Decision trees for hierarchical multi-label classification. Mach. Learn. 2008, 73, 185–214.
3. Zhou, Z.; Zhang, M.; Huang, S.; Li, Y. Multi-instance multi-label learning. Artif. Intell. 2012, 176, 2291–2320.
4. Breiman, L.; Friedman, J.H.; Olshen, R.A.; Stone, C.J. Classification and Regression Trees; Wadsworth and Brooks: Monterey, CA, USA, 1984.
5. Quinlan, J.R. C4.5: Programs for Machine Learning; Morgan Kaufmann: Burlington, MA, USA, 1993.
6. Rokach, L.; Maimon, O. Data Mining with Decision Trees—Theory and Applications; Series in Machine Perception and Artificial Intelligence; World Scientific: Singapore, 2007; Volume 69.
7. Ostonov, A.; Moshkov, M. On Complexity of Deterministic and Nondeterministic Decision Trees for Conventional Decision Tables from Closed Classes. Entropy 2023, 25, 1411.
8. Boros, E.; Hammer, P.L.; Ibaraki, T.; Kogan, A. Logical analysis of numerical data. Math. Program. 1997, 79, 163–190.
9. Boros, E.; Hammer, P.L.; Ibaraki, T.; Kogan, A.; Mayoraz, E.; Muchnik, I.B. An Implementation of Logical Analysis of Data. IEEE Trans. Knowl. Data Eng. 2000, 12, 292–306.
10. Fürnkranz, J.; Gamberger, D.; Lavrac, N. Foundations of Rule Learning; Cognitive Technologies; Springer: Berlin/Heidelberg, Germany, 2012.
11. Pawlak, Z. Rough Sets—Theoretical Aspects of Reasoning about Data; Theory and Decision Library: Series D; Kluwer: Alphen aan den Rijn, The Netherlands, 1991; Volume 9.
12. Pawlak, Z.; Skowron, A. Rudiments of rough sets. Inf. Sci. 2007, 177, 3–27.
13. Molnar, C. Interpretable Machine Learning. A Guide for Making Black Box Models Explainable, 2nd ed.; Independent Publishers: Chicago, IL, USA, 2022; Available online: https://christophm.github.io/interpretable-ml-book/ (accessed on 1 May 2024).
14. Blum, M.; Impagliazzo, R. Generic Oracles and Oracle Classes (Extended Abstract). In Proceedings of the 28th Annual Symposium on Foundations of Computer Science, Los Angeles, CA, USA, 27–29 October 1987; IEEE Computer Society: Washington, DC, USA, 1987; pp. 118–126.
15. Hartmanis, J.; Hemachandra, L.A. One-way functions, robustness, and the non-isomorphism of NP-complete sets. In Proceedings of the Second Annual Conference on Structure in Complexity Theory, Ithaca, NY, USA, 16–19 June 1987; IEEE Computer Society: Washington, DC, USA, 1987.
16. Tardos, G. Query complexity, or why is it difficult to separate NP^A∩coNP^A from P^A by random oracles A? Combinatorica 1989, 9, 385–392.
17. Buhrman, H.; de Wolf, R. Complexity measures and decision tree complexity: A survey. Theor. Comput. Sci. 2002, 288, 21–43.
18. Pawlak, Z. Information systems theoretical foundations. Inf. Syst. 1981, 6, 205–218.
19. Post, E. Two-Valued Iterative Systems of Mathematical Logic; Annals of Mathematics Studies; Princeton University Press: Princeton, NJ, USA, 1941; Volume 5.
20. Robertson, N.; Seymour, P.D. Graph Minors. XX. Wagner’s conjecture. J. Comb. Theory, Ser. B 2004, 92, 325–357.
21. Moshkov, M. On depth of conditional tests for tables from closed classes. In Combinatorial-Algebraic and Probabilistic Methods of Discrete Analysis; Markov, A.A., Ed.; Gorky University Press: Gorky, Russia, 1989; pp. 78–86. (In Russian)
22. Moshkov, M. Comparative Analysis of Deterministic and Nondeterministic Decision Tree Complexity. Local Approach. Trans. Rough Sets 2005, 4, 125–143.
23. Moshkov, M. Time Complexity of Decision Trees. Trans. Rough Sets 2005, 3, 244–459.

Figure 1. Decision tables

T_{1}

,

T_{2}

, and

T_{3}

.

Figure 2. Degenerate decision tables

D_{1}

and

D_{2}

.

Figure 3. Subtables

T_{1} (f_{1}, 1)

and

T_{2} (f_{1}, 0) (f_{2}, 0) (f_{3}, 0)

of tables

T_{1}

and

T_{2}

shown in Figure 1.

Figure 4. Decision tables

T_{1}^{'}

,

T_{2}^{'}

,

T_{1}^{''}

, and

T_{2}^{''}

obtained from tables

T_{1}

and

T_{2}

shown in Figure 1 by operations of changing the decisions, removal of columns, permutation of columns, and duplication of columns, respectively.

Figure 5. Decision table Q.

Figure 6. Decision tables from closed class

C_{0}

, where

d_{1}, \dots, d_{7} \in P (N)

.

Figure 7. Nondeterministic decision trees

Γ_{1}

and

Γ_{2}

for decision tables

T_{1}

and

T_{2}

depicted in Figure 1.

Figure 8. Deterministic decision trees

Γ_{1}^{'}

and

Γ_{2}^{'}

for decision tables

T_{1}

and

T_{2}

depicted in Figure 1.

Figure 9. Possible upper types of a union of two compatible t pairs.

Figure 10. Decision table

G_{i}

.

Figure 11. Decision tables from closed class

T_{i}

, where

d_{1}, \dots, d_{8} \in P (N)

.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Ostonov, A.; Moshkov, M. Comparative Analysis of Deterministic and Nondeterministic Decision Trees for Decision Tables from Closed Classes. Entropy 2024, 26, 519. https://doi.org/10.3390/e26060519
