*3.2. Decoding*

We use a maximum a posteriori (MAP) method to reconstruct the signal **x** in the NTGT framework.

$$
\hat{\mathbf{x}} = \arg\max\_{\mathbf{x}} P(\mathbf{x}|\mathbf{y}, \mathbf{A})\tag{6}
$$

The a posteriori probability in (6) is expanded as follows:

$$\begin{aligned} P(\mathbf{x}|\mathbf{y},\mathbf{A}) &= \frac{P(\mathbf{x},\mathbf{y},\mathbf{A})}{P(\mathbf{y},\mathbf{A})} \\ &\propto P(\mathbf{x},\mathbf{y},\mathbf{A}) \\ &= \sum\_{\mathbf{e}} P(\mathbf{x},\mathbf{y},\mathbf{A},\mathbf{e}) \\ &= \sum\_{\mathbf{e}} P(\mathbf{x})P(\mathbf{A})P(\mathbf{e})P(\mathbf{y}|\mathbf{x},\mathbf{A},\mathbf{e}) \end{aligned} \tag{7}$$

The last line of (7) is obtained from the independence of **x**, **A**, and **e**, while the conditional probability *P*(**y**|**x**, **A**, **e**) is an indicator function that satisfies the following condition:

$$P(\mathbf{y}|\mathbf{x}, \mathbf{A}, \mathbf{e}) = \begin{cases} 1 & \text{if } \mathbf{y} = \mathbf{z} \oplus \mathbf{e}, \\ 0 & \text{if } \mathbf{y} \neq \mathbf{z} \oplus \mathbf{e}, \end{cases} \tag{8}$$

We define an error event if $\hat{\mathbf{x}}$ from (6) is not the same as the true realization of **x**. In other words, the probability of an error is expressed as $P\_E = \Pr\{\hat{\mathbf{x}} \neq \mathbf{x}\}$.
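For concreteness, the MAP rule in (6)–(8) can be made explicit: given **y** and **A**, each candidate **x** determines **z** through the threshold rule, and the unique error pattern is **e** = **z** ⊕ **y**, so the posterior reduces to *P*(**x**)*P*(**e**). The following Python sketch brute-forces this search on a toy instance; the parameter names (`delta`, `eta`, `T`) and the enumeration over all 2^*N* candidates are illustrative assumptions, practical only for small *N*.

```python
import itertools
import numpy as np

def map_decode(y, A, T, delta, eta):
    """Brute-force MAP decoding sketch for a small NTGT instance.
    delta: prior probability that a sample is defective, cf. (1)
    eta:   probability that a test result is flipped, cf. (4)
    T:     detection threshold of the noiseless test rule, cf. (3)"""
    M, N = A.shape
    best_x, best_score = None, -np.inf
    for bits in itertools.product([0, 1], repeat=N):   # all candidate signals x
        x = np.array(bits)
        z = (A @ x >= T).astype(int)                   # noiseless threshold outcomes
        e = np.bitwise_xor(z, y)                       # unique e with y = z XOR e, cf. (8)
        k, d = int(x.sum()), int(e.sum())
        # log P(x) + log P(e); the common factor P(A) in (7) cancels out
        score = (k * np.log(delta) + (N - k) * np.log(1 - delta)
                 + d * np.log(eta) + (M - d) * np.log(1 - eta))
        if score > best_score:
            best_x, best_score = x, score
    return best_x
```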

#### *3.3. Bounds for Group Testing Schemes*

Now consider the number of tests needed for successful decoding in the conventional GT models. The number of tests required to identify *K* defective samples out of *N* total samples by an adaptive GT algorithm with perfect reconstruction is denoted by *m*(*N*, *K*). For the non-adaptive case, the number of tests is denoted by $\bar{m}(N, K)$. The number of tests *N* required for individual testing is greater than $\bar{m}(N, K)$. Adaptive GT models require a number of tests less than or equal to that of non-adaptive GT models, $m(N, K) \le \bar{m}(N, K)$, because they can use the results of previous tests when designing the next ones. Even if the number of defective samples is one, at least one test must be performed, $1 \le m(N, K)$. Therefore, the number of tests has a wide range as follows:

$$1 \le m(N, K) \le \bar{m}(N, K) \le N \tag{9}$$

From an information-theoretic bound, the minimum number of tests *M* for a GT framework with sample space $\mathcal{S}$ is obtained as [3],

$$M \ge \log\_2 |\mathcal{S}|\tag{10}$$

where $\mathcal{S}$ denotes the sample space. In addition, an information-theoretic performance result is available even for a GT framework that allows a small error probability. It is expressed as an upper bound on the probability of successful decoding in terms of the number of tests performed. A GT algorithm obeys the following bound on the success probability $P\_s$ for decoding the defective samples [25]:

$$P\_s \le \frac{M}{\log\_2 \binom{N}{K}}\tag{11}$$

Over the past half century, many studies on GT models have been performed; among them, well-known and important GT algorithms are introduced next. The first to be considered is the binary splitting algorithm [3]. This algorithm solves GT problems efficiently and is applicable to adaptive GT models. It remains widely used for GT problems because of its simplicity and good performance. The number of tests required to reconstruct defective samples using the binary splitting algorithm is known through the following bounds:

$$M = \begin{cases} N & \text{if } N \le 2K - 1, \\ (\log\_2 \sigma + 2)K + p - 1 & \text{otherwise}, \end{cases} \tag{12}$$

where *σ* is the number of samples included in one test, and *p* is a uniquely determined nonnegative integer satisfying *p* < *K*.
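As an illustration of the adaptive idea behind binary splitting, the following Python sketch repeatedly isolates one defective item by halving a group that tests positive. It is a minimal didactic version, not the exact generalized variant whose test count is characterized in (12); `is_defective` is an assumed oracle modeling one pooled test.

```python
def binary_splitting(is_defective, N):
    """Adaptive binary-splitting sketch: repeatedly isolate one defective
    by halving a positive group. is_defective(group) models one pooled
    test and returns True if the group contains at least one defective."""
    remaining = list(range(N))
    found, tests = [], 0

    while remaining:
        tests += 1
        if not is_defective(remaining):       # whole remaining pool tests negative
            break
        group = remaining
        while len(group) > 1:                 # binary search inside a positive group
            half = group[:len(group) // 2]
            tests += 1
            group = half if is_defective(half) else group[len(group) // 2:]
        found.append(group[0])
        remaining = [i for i in remaining if i != group[0]]

    return found, tests

# toy run with a hypothetical defective set {3, 17}
defectives = {3, 17}
print(binary_splitting(lambda g: any(i in defectives for i in g), N=32))
```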

Next, the definite defectives algorithm [26] is considered. This algorithm is suitable for non-adaptive GT models because the unknown input signal can be reconstructed from all of the test results at the same time through an iterative process. The definite defectives algorithm is attractive in that it can eliminate false negatives that may occur during the reconstruction process. As a result, it is particularly useful in applications that are sensitive to false negatives or where they must not be present. For given *N* and *K*, the definite defectives algorithm has the following lower bound on the number of tests *M* required to identify the defective samples when an error rate of *σ* is allowed,

$$M \ge (1 - \sigma) \log\_2 \binom{N}{K} \tag{13}$$

It can be observed that (11) and (13) coincide in the case of perfect reconstruction of the defective samples: setting $P\_s = 1$ in (11) and $\sigma = 0$ in (13) both give $M \ge \log\_2 \binom{N}{K}$.
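As a quick numerical illustration of the bounds (10), (11), and (13), the following Python snippet evaluates them when the sample space $\mathcal{S}$ is taken as the set of all size-*K* defective subsets; the values of *N*, *K*, *M*, and *σ* are arbitrary examples, not values from the paper.

```python
from math import comb, log2

N, K = 100, 5                 # hypothetical problem size
M = 20                        # hypothetical number of tests
sigma = 0.05                  # allowed error rate in (13)

log_space = log2(comb(N, K))  # log2 |S| when S is all size-K defective sets

print("counting bound (10):  M >=", round(log_space, 2))
print("success bound (11):   P_s <=", round(min(1.0, M / log_space), 3))
print("DD bound (13):        M >=", round((1 - sigma) * log_space, 2))
```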

#### **4. Necessary Condition for Complete Recovery**

#### *4.1. Lower Bound*

In this section, we derive a necessary condition on the number of tests required to identify the defective samples in the NTGT model. We obtain the necessary condition using Fano's inequality [27] from information theory. Fano's inequality is mainly exploited in channel coding theory and describes the connection between error probability and entropy. In addition, in [28], the authors reviewed GT problems comprehensively and in depth from an information theory perspective. The lower bound on the probability of an error is obtained by applying Fano's inequality. From this lower bound, we are led to the necessary condition on the number of tests needed to find all defective samples in the NTGT model. We first state Fano's inequality before deriving the necessary condition.

**Theorem 1** (Fano's inequality [27])**.** *Suppose there are random variables A and B of finite size. If the decoding function* Φ *that finds A by considering B is used, the following inequality holds:*

$$1 + P(\Phi(B) \neq A) \log\_2 |A| \ge H(A|B) \tag{14}$$

*where P*(Φ(*B*) ≠ *A*) *is the probability of an error for the decoding function* Φ*, and the conditional entropy H*(*A*|*B*) *is defined as follows:*

$$H(A|B) = -\sum\_{a \in A} \sum\_{\beta \in B} P\_{AB}(a, \beta) \log P\_{A|B}\left(a|\beta\right) \tag{15}$$

*where $P\_{AB}$ and $P\_{A|B}$ are the joint probability and the conditional probability, respectively.*
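The following Python sketch numerically verifies (14)–(15) on a small, hypothetical joint distribution, using the MAP decoder Φ(*B*) = arg max*<sub>a</sub>* *P*(*a*|*b*); the toy probability table is an assumption for illustration only.

```python
import numpy as np

def fano_check(P_ab):
    """Check Fano's inequality (14) for a small joint pmf P_ab over finite
    alphabets A (rows) and B (columns), with the MAP decoder Phi(b)."""
    P_b = P_ab.sum(axis=0)
    P_a_given_b = P_ab / P_b                          # columns are P(a|b)
    # conditional entropy H(A|B) as in (15)
    H_cond = -np.sum(P_ab * np.log2(np.where(P_ab > 0, P_a_given_b, 1.0)))
    # error probability of the MAP decoder Phi(B) = argmax_a P(a|b)
    P_err = 1.0 - sum(P_ab[np.argmax(P_a_given_b[:, b]), b]
                      for b in range(P_ab.shape[1]))
    lhs = 1 + P_err * np.log2(P_ab.shape[0])          # left side of (14)
    return lhs, H_cond, lhs >= H_cond

# toy joint distribution (hypothetical numbers, sums to 1)
P_ab = np.array([[0.30, 0.05],
                 [0.05, 0.30],
                 [0.10, 0.20]])
print(fano_check(P_ab))
```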

In the NTGT problem, we are able to obtain a lower bound on the probability of an error. This lower bound shows the minimum number of tests required to reconstruct the unknown signal, regardless of which decoding function is used. Our lower bound is a variant of the result obtained in [8]. Compared to [8], this work obtains the lower bound while taking the measurement noise into account; however, the overall derivations are similar because both rely on Fano's inequality.

**Theorem 2** (Lower bound)**.** *For any decoding function with the unknown sample signal defined in* (1) *and the measurement noise defined in* (4)*, a necessary condition for the probability of error $P\_E$ to be less than an arbitrarily small positive value ρ, i.e., $P\_E < \rho$, is that*

$$\frac{NH(\delta) - M + MH(\eta) - 1}{N} < \rho \tag{16}$$

*where H*(·) *is the entropy function.*


**Proof of Theorem 2.** Let $\hat{\mathbf{x}}$ be the estimate of **x** found using the decoding function. Viewing the process as a Markov chain, we can write **x** → (**y**, **A**) → $\hat{\mathbf{x}}$. Then, the following inequality is satisfied,

$$H(\mathbf{x}|\mathbf{y}, \mathbf{A}) \le H(\mathbf{x}|\mathbf{\hat{x}}) \tag{17}$$

Further, from Fano's inequality described in (14), the conditional entropy is bounded

$$H(\mathbf{x}|\mathbf{y}, \mathbf{A}) \le 1 + P\_E \log\_2 \left( 2^N - 1 \right) \tag{18}$$

Then, the probability of error is bounded in terms of the conditional entropy and the total number of samples *N*,

$$P\_E \ge \frac{H(\mathbf{x}|\mathbf{y}, \mathbf{A}) - 1}{N} \tag{19}$$

We now need to handle the conditional entropy $H(\mathbf{x}|\mathbf{y}, \mathbf{A})$. Let us expand this conditional entropy in more detail:

$$\begin{array}{rcl}H(\mathbf{x}|\mathbf{y},\mathbf{A})&=&H(\mathbf{x})-I(\mathbf{x};\mathbf{y},\mathbf{A})\\&=&H(\mathbf{x})-(I(\mathbf{x};\mathbf{A})+I(\mathbf{x};\mathbf{y}|\mathbf{A})) \\&\overset{(a)}{=}&H(\mathbf{x})-(H(\mathbf{y}|\mathbf{A})-H(\mathbf{y}|\mathbf{A},\mathbf{x})) \end{array} \tag{20}$$

where *I*(·) denotes mutual information, and equality (a) comes from the fact that **x** and **A** are independent of each other. Note that the smaller the term on the right side of (19), the lower the resulting bound on the probability of error. This means that the conditional entropy $H(\mathbf{x}|\mathbf{y}, \mathbf{A})$ should be as small as possible. As a result, on the last line of the right side of (20), the conditional entropy $H(\mathbf{y}|\mathbf{A})$ should be large; conversely, the conditional entropy $H(\mathbf{y}|\mathbf{A}, \mathbf{x})$ should be small.

To do this, let us find the maximum and minimum values of the two conditional entropies, respectively.

$$\begin{array}{rcl}H(\mathbf{y}|\mathbf{A}) \leq H(\mathbf{y}) & = & H(\mathbf{z} \oplus \mathbf{e}) \\ & \leq & M \end{array} \tag{21}$$

where the first inequality is due to the definition of conditional entropy, and the last inequality comes from the fact that each result $y\_j$ is either 0 or 1, the $y\_j$ are independent of each other, and the binary entropy reaches its maximum of 1 when $\Pr\{y\_j = 0\} = \Pr\{y\_j = 1\}$. Next, we take into account the other conditional entropy $H(\mathbf{y}|\mathbf{A}, \mathbf{x})$, which should be minimized,

$$\begin{array}{rcl}H(\mathbf{y}|\mathbf{A},\mathbf{x})&=&H(\mathbf{z}\oplus\mathbf{e}|\mathbf{A},\mathbf{x})\\&=&H(\mathbf{e})\\&=&MH(\eta)\end{array}\tag{22}$$

where the second equality comes from the fact that the randomness of **z** vanishes once **x** and **A** are known, and the last equality is due to the elements of **e** being independent. Using (21) and (22), (20) can be rewritten as

$$H(\mathbf{x}|\mathbf{y}, \mathbf{A}) \le NH(\delta) - M + MH(\eta) \tag{23}$$

Finally, if (19) is rearranged to satisfy the condition $P\_E < \rho$, where *ρ* is an arbitrarily small positive value, the following condition holds:

$$\frac{NH(\delta) - M + MH(\eta) - 1}{N} < \rho \tag{24}$$

This completes the proof of Theorem 2.

#### *4.2. Construction of Noisy Threshold Group Testing*

We now consider the result obtained from Theorem 2. First, Theorem 2 can be expressed as the ratio of the number of tests to the total number of samples as follows:

$$\frac{M}{N} > \frac{H(\delta) - \rho}{1 - H(\eta)}\tag{25}$$

It is advantageous to use the NTGT framework as long as *M* does not exceed *N*; when *M* > *N*, individual testing becomes more effective than GT. Combining this with (25) shows that NTGT can theoretically be used under the following noise condition:

$$H(\eta) < 1 + \rho - H(\delta) \tag{26}$$
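For a sense of scale, the following Python snippet evaluates the required ratio in (25) and the feasibility condition (26) for some hypothetical parameter values of δ, η, and ρ; these numbers are illustrative assumptions, not values from the paper.

```python
from math import log2

def H(p):
    """Binary entropy function H(p) in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

# hypothetical defect probability, noise probability, and target error level
delta, eta, rho = 0.1, 0.05, 0.01

ratio = (H(delta) - rho) / (1 - H(eta))     # required M/N from (25)
noise_ok = H(eta) < 1 + rho - H(delta)      # noise condition (26)
print(f"M/N must exceed {ratio:.3f}; NTGT beats individual testing: {noise_ok}")
```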

In designing an NTGT framework, the construction of the group matrix is important. The key to this is shown in the proof of Theorem 2. Looking carefully at the conditions under which the inequality on the conditional entropy in (21) holds with equality, the maximum conditional entropy $H(\mathbf{y}|\mathbf{A})$ is obtained when $\Pr\{y\_j = 0\} = \Pr\{y\_j = 1\}$. This means that the NTGT system should be designed so that each output has an equal probability of being 0 or 1. Since **x** and **A** are independent of each other, the probability of an output of 0 is as follows:

$$\Pr\left(y\_{j} = 0\right) = \sum\_{t=0}^{T-1} \binom{N}{t} \left(\delta\gamma\right)^{t} \left(1 - \delta\gamma\right)^{N-t} = \frac{1}{2} \tag{27}$$

As shown in (27), it can be seen that there is a trade-off between *δ* and *γ*. In other words, to reconstruct a sparse signal, a high-density group matrix needs to be generated and used. Conversely, if the signal is not sparse, the group matrix should be designed with low density.
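The design rule (27) can be solved numerically for the group matrix density γ given δ, *N*, and *T*. The sketch below uses simple bisection, exploiting the fact that the left side of (27) is decreasing in γ; all parameter values are hypothetical, and for very small δ no balancing γ may exist in (0, 1], in which case the search saturates at γ = 1.

```python
from math import comb

def p_zero(gamma, delta, N, T):
    """Pr(y_j = 0) from (27) for matrix density gamma and defect probability delta."""
    q = delta * gamma
    return sum(comb(N, t) * q**t * (1 - q)**(N - t) for t in range(T))

def balance_gamma(delta, N, T, iters=60):
    """Bisection sketch for the gamma that balances Pr(y_j = 0) = 1/2."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if p_zero(mid, delta, N, T) > 0.5:
            lo = mid          # outputs still too likely to be 0: use a denser matrix
        else:
            hi = mid
    return (lo + hi) / 2

# illustrative trend: sparser signals (smaller delta) call for denser group matrices
for delta in (0.05, 0.1, 0.2):
    print(delta, round(balance_gamma(delta, N=100, T=3), 4))
```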

#### **5. Sufficient Condition for Average Performance**

#### *5.1. Upper Bound*

Now we prove that there is an upper bound on the probability of error for the MAP decoding used in NTGT. We divide the proof into two parts: one considers the definition of the error event, and the other formulates the probability of error.

We first rewrite the a posteriori probability:

$$P(\mathbf{x}|\mathbf{y}, \mathbf{A}) \propto \sum\_{\mathbf{e}} P(\mathbf{x}) P(\mathbf{A}) P(\mathbf{e}) \mathbf{1}\_{\mathbf{y} = \mathbf{z} \oplus \mathbf{e}} \tag{28}$$

Note that both **A** and **y** are given and known. Using MAP decoding, we estimate **x** with (28):

$$\hat{\mathbf{x}} = \arg\max\_{\mathbf{x}} \sum\_{\mathbf{e}} P(\mathbf{x}) P(\mathbf{A}) P(\mathbf{e}) \mathbf{1}\_{\mathbf{y} = \mathbf{z} \oplus \mathbf{e}} \tag{29}$$

An error event occurs if there is a feasible vector $\bar{\mathbf{x}} \neq \mathbf{x}$ such that

$$\sum\_{\mathbf{v}} P(\bar{\mathbf{x}}) P(\mathbf{v}) \mathbf{1}\_{\mathbf{y} = \mathbf{w} \oplus \mathbf{v}} \ge \sum\_{\mathbf{e}} P(\mathbf{x}) P(\mathbf{e}) \mathbf{1}\_{\mathbf{y} = \mathbf{z} \oplus \mathbf{e}} \tag{30}$$

where **w** is obtained by applying the threshold rule (3) to $\mathbf{A}\bar{\mathbf{x}}$, i.e., $w\_j = 1$ if $\sum\_i A\_{ji}\bar{x}\_i \ge T$ and $w\_j = 0$ otherwise, and **v** is a realization from (4). Given **y**, **A**, and **x**, there is exactly one vector **e** such that **e** = **z** ⊕ **y**. Then we can rewrite (30) as

$$P(\bar{\mathbf{x}})P\_{\mathbf{V}}(\mathbf{y}\oplus\mathbf{w}) \ge P(\mathbf{x})P\_{\mathbf{e}}(\mathbf{y}\oplus\mathbf{z})\tag{31}$$

Therefore, an error event is equivalent to the existence of a pair $(\bar{\mathbf{x}}, \mathbf{v})$ such that

$$\begin{array}{c} \bar{\mathbf{x}} \neq \mathbf{x}, \\ \mathbf{y} = \mathbf{w} \oplus \mathbf{v} = \mathbf{z} \oplus \mathbf{e}, \\ P(\bar{\mathbf{x}}) P\_{\mathbf{v}}(\mathbf{y} \oplus \mathbf{w}) \ge P(\mathbf{x}) P\_{\mathbf{e}}(\mathbf{y} \oplus \mathbf{z}) \end{array} \tag{32}$$

So far, we have defined the error event; now we derive an upper bound on the probability of error. Given **x** and **e**, let $P(\mathcal{E}|\mathbf{x}, \mathbf{e})$ be the conditional error probability. The average error probability is then

$$P\_E = \sum\_{\mathbf{x}} \sum\_{\mathbf{e}} P(\mathbf{x}, \mathbf{e}) P(\mathcal{E} | \mathbf{x}, \mathbf{e}) \tag{33}$$

We now introduce two typical sets as defined in [27] (Ch. 3.1). Let $\mathcal{A}\_{[\mathbf{x}]\varepsilon}^{N}$ and $\mathcal{A}\_{[\mathbf{e}]\varepsilon}^{M}$ be the typical sets of **x** and **e** with respect to *P*(**x**) and *P*(**e**) as defined in (1) and (4), respectively. For any positive number *ε* and sufficiently large *N* and *M*, the two typical sets are defined as

$$\mathcal{A}\_{[\mathbf{x}]\varepsilon}^{N} = \left\{ \mathbf{x} \in \mathbf{2}^{N} : \left| -\frac{1}{N} \log P(\mathbf{x}) - H(\delta) \right| \le \varepsilon \right\} \tag{34}$$

and

$$\mathcal{A}\_{[\mathbf{e}]\varepsilon}^{M} = \left\{ \mathbf{e} \in \mathbf{2}^{M} : \left| -\frac{1}{M} \log P(\mathbf{e}) - H(\eta) \right| \le \varepsilon \right\} \tag{35}$$

From the Shannon–McMillan–Breiman theorem [27] (Ch.16.8), we obtain the following two bounds: 

$$P\left(\left|-\frac{1}{N}\log P(\mathbf{x}) - H(\delta)\right| \le \varepsilon\right) \ge 1 - \varepsilon \tag{36}$$

and

$$P\left(\left|-\frac{1}{M}\log P(\mathbf{e}) - H(\eta)\right| \le \varepsilon\right) \ge 1 - \varepsilon \tag{37}$$
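A quick Monte Carlo check of (36) (the check for (37) is analogous) with assumed values of δ, *N*, and ε: for **x** drawn i.i.d. Bernoulli(δ), the normalized log-probability concentrates around *H*(δ).

```python
import numpy as np

rng = np.random.default_rng(0)

def H(p):
    """Binary entropy in bits."""
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

# hypothetical parameters for the typicality check of (36)
delta, N, eps, trials = 0.1, 5000, 0.05, 2000
x = rng.random((trials, N)) < delta                    # i.i.d. defect indicators, cf. (1)
k = x.sum(axis=1)
neg_log_p = -(k * np.log2(delta) + (N - k) * np.log2(1 - delta)) / N
frac = np.mean(np.abs(neg_log_p - H(delta)) <= eps)
print(f"empirical Pr(|-1/N log P(x) - H(delta)| <= eps) = {frac:.4f}  (target >= {1 - eps})")
```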

Now we define the space of the pair (**x**, **e**) with respect to the two typical sets. Let $\mathcal{U}$ and $\mathcal{U}^{c}$ be the sets for the pair (**x**, **e**) such that

$$\mathcal{U} = \left\{ \mathbf{x} \in \mathbf{2}^{N}, \mathbf{e} \in \mathbf{2}^{M} : \left( \mathbf{x} \in \mathcal{A}\_{[\mathbf{x}]\varepsilon}^{N} \bigcap \mathbf{e} \in \mathcal{A}\_{[\mathbf{e}]\varepsilon}^{M} \right) \right\} \tag{38}$$

and

$$\mathcal{U}^{c} = \left\{ \mathbf{x} \in \mathbf{2}^{N}, \mathbf{e} \in \mathbf{2}^{M} : \left( \mathbf{x} \notin \mathcal{A}\_{[\mathbf{x}]\varepsilon}^{N} \bigcup \mathbf{e} \notin \mathcal{A}\_{[\mathbf{e}]\varepsilon}^{M} \right) \right\} \tag{39}$$

where $\mathcal{U}$ is the joint typical set for the pair (**x**, **e**), since **x** and **e** are independent.

**Theorem 3** (Upper bound)**.** *In an NTGT with the distribution of defective samples defined in* (1) *and the noise probability defined in* (4)*, for any small ε, the ratio of the number of tests M to the total number of samples N is bounded as follows:*

$$\frac{M}{N} > \frac{H(\delta) + \varepsilon}{1 - H(\eta) - \varepsilon} \tag{40}$$

**Proof of Theorem 3.** The probability of error is bounded as

$$\begin{split} P\_E &= \sum\_{(\mathbf{x},\mathbf{e}) \in \mathcal{U}} P(\mathbf{x}) P(\mathbf{e}) P(\mathcal{E}|\mathbf{x}, \mathbf{e}) + \sum\_{(\mathbf{x},\mathbf{e}) \in \mathcal{U}^{c}} P(\mathbf{x}) P(\mathbf{e}) P(\mathcal{E}|\mathbf{x}, \mathbf{e}) \\ &\stackrel{(a)}{\leq} \sum\_{(\mathbf{x},\mathbf{e}) \in \mathcal{U}} P(\mathbf{x}) P(\mathbf{e}) P(\mathcal{E}|\mathbf{x}, \mathbf{e}) + \sum\_{(\mathbf{x},\mathbf{e}) \in \mathcal{U}^{c}} P(\mathbf{x}) P(\mathbf{e}) \\ &\stackrel{(b)}{\leq} \sum\_{(\mathbf{x},\mathbf{e}) \in \mathcal{U}} P(\mathbf{x}) P(\mathbf{e}) P(\mathcal{E}|\mathbf{x}, \mathbf{e}) + 2\varepsilon \end{split} \tag{41}$$

where (a) is due to $P(\mathcal{E}|\mathbf{x}, \mathbf{e}) \le 1$, and (b) comes from the following:

$$\begin{split} \sum\_{(\mathbf{x},\mathbf{e}) \in \mathcal{U}^{c}} P(\mathbf{x})P(\mathbf{e}) &= 1 - \sum\_{(\mathbf{x},\mathbf{e}) \in \mathcal{U}} P(\mathbf{x})P(\mathbf{e}) \\ &= 1 - \sum\_{\mathbf{x} \in \mathcal{A}^{N}\_{[\mathbf{x}]\varepsilon}} P(\mathbf{x}) \sum\_{\mathbf{e} \in \mathcal{A}^{M}\_{[\mathbf{e}]\varepsilon}} P(\mathbf{e}) \\ &\leq 1 - (1 - \varepsilon)(1 - \varepsilon) \\ &\leq 2\varepsilon \end{split} \tag{42}$$

Since **A** is randomly generated as defined in (2), we can define the following event:

$$\mathcal{E}\left(\mathbf{x}, \mathbf{e}; \bar{\mathbf{x}}, \mathbf{v}\right) = \left\{ \left(\mathbf{x}, \mathbf{e}; \bar{\mathbf{x}}, \mathbf{v}\right) : \mathbf{z} \oplus \mathbf{e} = \mathbf{w} \oplus \mathbf{v} \right\} \tag{43}$$

The conditional error probability $P(\mathcal{E}|\mathbf{x}, \mathbf{e})$ is the probability of the union of all the events in (43) over all pairs $(\bar{\mathbf{x}}, \mathbf{v})$ that satisfy (32). Thus, the conditional error probability in (33) can be rewritten as

$$P(\mathcal{E}|\mathbf{x}, \mathbf{e}) = \Pr\left\{ \bigcup\_{(\bar{\mathbf{x}}, \mathbf{v}): P(\bar{\mathbf{x}})P(\mathbf{v}) \ge P(\mathbf{x})P(\mathbf{e})} \mathcal{E}(\mathbf{x}, \mathbf{e}; \bar{\mathbf{x}}, \mathbf{v}) \right\} \tag{44}$$

Applying the union bound to (44) and substituting into (41), we have the following bound:

$$\begin{split} P\_E &\leq \sum\_{(\mathbf{x},\mathbf{e}) \in \mathcal{U}} P(\mathbf{x}) P(\mathbf{e}) \sum\_{(\bar{\mathbf{x}},\mathbf{v}) : P(\bar{\mathbf{x}}) P(\mathbf{v}) \geq P(\mathbf{x})P(\mathbf{e})} P\left(\mathcal{E}\left(\mathbf{x},\mathbf{e};\bar{\mathbf{x}},\mathbf{v}\right)\right) + 2\varepsilon \\ &= \sum\_{(\mathbf{x},\mathbf{e}) \in \mathcal{U}} P(\mathbf{x}) P(\mathbf{e}) \sum\_{(\bar{\mathbf{x}},\mathbf{v})} P\left(\mathcal{E}\left(\mathbf{x},\mathbf{e};\bar{\mathbf{x}},\mathbf{v}\right)\right) \Phi(\mathbf{x},\bar{\mathbf{x}},\mathbf{e},\mathbf{v}) + 2\varepsilon \end{split} \tag{45}$$

where $\Phi(\mathbf{x}, \bar{\mathbf{x}}, \mathbf{e}, \mathbf{v})$ is the indicator function of the event $P(\bar{\mathbf{x}})P(\mathbf{v}) \ge P(\mathbf{x})P(\mathbf{e})$:

$$\Phi(\mathbf{x}, \bar{\mathbf{x}}, \mathbf{e}, \mathbf{v}) = \begin{cases} 1 & \text{if } P(\bar{\mathbf{x}})P(\mathbf{v}) \ge P(\mathbf{x})P(\mathbf{e})\\ 0 & \text{if } P(\bar{\mathbf{x}})P(\mathbf{v}) < P(\mathbf{x})P(\mathbf{e}) \end{cases} \tag{46}$$

The indicator function is bounded as follows [29] (Ch. 5.6) for any 0 < *s* ≤ 1:

$$\Phi(\mathbf{x}, \bar{\mathbf{x}}, \mathbf{e}, \mathbf{v}) \le \left(\frac{P(\bar{\mathbf{x}})P(\mathbf{v})}{P(\mathbf{x})P(\mathbf{e})}\right)^{s} \tag{47}$$

For *s* = 1 in (47), we have the following bound:

$$P\_E \leq \sum\_{(\mathbf{x},\mathbf{e}) \in \mathcal{U}} \sum\_{(\bar{\mathbf{x}},\mathbf{v})} P(\bar{\mathbf{x}}) P(\mathbf{v}) P\left(\mathcal{E}\left(\mathbf{x},\mathbf{e};\bar{\mathbf{x}},\mathbf{v}\right)\right) + 2\varepsilon \tag{48}$$

From the definition in (43), note that the probability $P(\mathcal{E}(\mathbf{x}, \mathbf{e}; \bar{\mathbf{x}}, \mathbf{v}))$ is

$$P\left(\mathcal{E}\left(\mathbf{x}, \mathbf{e}; \overline{\mathbf{x}}, \mathbf{v}\right)\right) = \Pr\left(\mathbf{w} \oplus \mathbf{v} = \mathbf{z} \oplus \mathbf{e}\right) \tag{49}$$

Substituting (49) into (48), we obtain

$$\begin{split} P\_E &\leq \sum\_{(\mathbf{x},\mathbf{e}) \in \mathcal{U}} \sum\_{(\bar{\mathbf{x}},\mathbf{v})} P(\bar{\mathbf{x}}) P(\mathbf{v}) P(\mathcal{E}(\mathbf{x},\mathbf{e};\bar{\mathbf{x}},\mathbf{v})) + 2\varepsilon \\ &= \sum\_{(\mathbf{x},\mathbf{e}) \in \mathcal{U}} \sum\_{\|\bar{\mathbf{x}}\|\_0 = d\_1} \sum\_{\|\mathbf{e} \oplus \mathbf{v}\|\_0 = d\_2} P(\bar{\mathbf{x}}) P(\mathbf{v}) P\left( \mathbf{z} \oplus \mathbf{w} = \mathbf{e} \oplus \mathbf{v} \,\middle|\, \|\bar{\mathbf{x}}\|\_0 = d\_1,\, \|\mathbf{e} \oplus \mathbf{v}\|\_0 = d\_2 \right) + 2\varepsilon \end{split} \tag{50}$$

In (50), we evaluate the following probability, which depends on the numbers of nonzero elements $d\_1$ and $d\_2$:

$$\begin{split} P\left(\mathbf{z} \oplus \mathbf{w} = \mathbf{e} \oplus \mathbf{v} \,\middle|\, \|\bar{\mathbf{x}}\|\_0 = d\_1,\, \|\mathbf{e} \oplus \mathbf{v}\|\_0 = d\_2\right) &= \prod\_{j=1}^{M} P\left(z\_j \oplus w\_j = e\_j \oplus v\_j \,\middle|\, \|\bar{\mathbf{x}}\|\_0 = d\_1,\, \|\mathbf{e} \oplus \mathbf{v}\|\_0 = d\_2\right) \\ &= P\left(z\_j \oplus w\_j = 1 \,\middle|\, \|\bar{\mathbf{x}}\|\_0 = d\_1,\, \|\mathbf{e} \oplus \mathbf{v}\|\_0 = d\_2\right)^{d\_2} \\ &\quad \times P\left(z\_j \oplus w\_j = 0 \,\middle|\, \|\bar{\mathbf{x}}\|\_0 = d\_1,\, \|\mathbf{e} \oplus \mathbf{v}\|\_0 = d\_2\right)^{M - d\_2} \\ &= (1 - P\_0)^{d\_2} P\_0^{M - d\_2} \end{split} \tag{51}$$

where each test result is independent. Here, we define the following probability:

$$P\_0 \stackrel{\Delta}{=} \Pr\left(z\_j \oplus w\_j = 0 \,\middle|\, \|\bar{\mathbf{x}}\|\_0 = d\_1\right) \tag{52}$$

We can divide $P\_0$ in (52) into two cases. If $d\_1 < T$,

$$\begin{split} \mathbf{P}\_0 &= \Pr\left(z\_j = 0\right) \Pr\left(w\_j = 0\right) + \Pr\left(z\_j = 1\right) \Pr\left(w\_j = 1\right) \\ &= \Pr\left(z\_j = 0\right) \end{split} \tag{53}$$

Otherwise,

$$\begin{split} P\_0 &= \text{Pr}\{z\_j = 0\} \left(\sum\_{t=0}^{T-1} \binom{d\_1}{t} \gamma^t (1-\gamma)^{(d\_1-t)}\right) + \text{Pr}\{z\_j = 1\} \left(\sum\_{t=T}^{d\_1} \binom{d\_1}{t} \gamma^t (1-\gamma)^{(d\_1-t)}\right) \\ &= P\_{z,0}(\delta,\gamma) P\_{w,0}(d\_1,\gamma) + (1-P\_{z,0}(\delta,\gamma))(1-P\_{w,0}(d\_1,\gamma)) \end{split} \tag{54}$$

where

$$\begin{aligned} P\_{z,0}(\delta,\gamma) \stackrel{\Delta}{=} \Pr(z\_j = 0) &= \sum\_{t=0}^{T-1} \binom{N}{t} (\delta\gamma)^t (1-\delta\gamma)^{N-t}, \\ P\_{w,0}(d\_1,\gamma) \stackrel{\Delta}{=} \Pr(w\_j = 0) &= \sum\_{t=0}^{T-1} \binom{d\_1}{t} \gamma^t (1-\gamma)^{d\_1-t} \end{aligned} \tag{55}$$

The maximum of $P\_0$ is obtained at $P\_{z,0}(\delta, \gamma) = 1/2$ and $P\_{w,0}(d\_1, \gamma) = 1/2$, from the fact that $P\_0$ in (54) is concave with respect to $P\_{z,0}(\delta, \gamma)$ and $P\_{w,0}(d\_1, \gamma)$. Therefore, its bound is

$$P\_0 \le \frac{1}{2} \tag{56}$$
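The case split (53)–(55) translates directly into a short computation. The sketch below evaluates $P\_0$ for a few candidate weights $d\_1$; the parameter values are hypothetical, and γ is chosen near the balancing point of (27) so that $P\_0$ stays close to the bound in (56).

```python
from math import comb

def binom_cdf_lt(n, p, T):
    """Pr(Binomial(n, p) < T) = sum_{t=0}^{T-1} C(n,t) p^t (1-p)^(n-t)."""
    return sum(comb(n, t) * p**t * (1 - p)**(n - t) for t in range(T))

def P0(delta, gamma, N, T, d1):
    """Sketch of P_0 from (53)-(55) for a candidate with d1 nonzero entries."""
    Pz0 = binom_cdf_lt(N, delta * gamma, T)       # Pr(z_j = 0), cf. (55)
    if d1 < T:                                    # case (53): w_j = 0 with certainty
        return Pz0
    Pw0 = binom_cdf_lt(d1, gamma, T)              # Pr(w_j = 0), cf. (55)
    return Pz0 * Pw0 + (1 - Pz0) * (1 - Pw0)      # case (54)

# hypothetical parameters; P_0 stays at 1/2 whenever Pz0 is balanced, cf. (56)
for d1 in (1, 5, 20):
    print(d1, round(P0(delta=0.1, gamma=0.267, N=100, T=3, d1=d1), 3))
```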

Using (51) and (56), (50) can be bounded as follows:

$$\begin{split} P\_E &\leq 2^{-M} \sum\_{(\mathbf{x},\mathbf{e}) \in \mathcal{U}} \sum\_{d\_1} \sum\_{\bar{\mathbf{x}} \neq \mathbf{x}:\, \|\bar{\mathbf{x}}\|\_0 = d\_1} P(\bar{\mathbf{x}}) \left( \sum\_{\mathbf{v}} P(\mathbf{v}) \right) + 2\varepsilon \\ &\leq 2^{-M} \sum\_{(\mathbf{x},\mathbf{e}) \in \mathcal{U}} \sum\_{d\_1} \sum\_{\bar{\mathbf{x}} \neq \mathbf{x}:\, \|\bar{\mathbf{x}}\|\_0 = d\_1} P(\bar{\mathbf{x}}) + 2\varepsilon \\ &\leq 2^{-M} \sum\_{\mathbf{x} \in \mathcal{A}^{N}\_{[\mathbf{x}]\varepsilon}} \sum\_{\mathbf{e} \in \mathcal{A}^{M}\_{[\mathbf{e}]\varepsilon}} \sum\_{\bar{\mathbf{x}} \neq \mathbf{x}} P(\bar{\mathbf{x}}) + 2\varepsilon \\ &= 2^{-M} \left| \mathcal{A}^{N}\_{[\mathbf{x}]\varepsilon} \right| \cdot \left| \mathcal{A}^{M}\_{[\mathbf{e}]\varepsilon} \right| \sum\_{\bar{\mathbf{x}} \neq \mathbf{x}} P(\bar{\mathbf{x}}) + 2\varepsilon \\ &\leq 2^{-M} \left| \mathcal{A}^{N}\_{[\mathbf{x}]\varepsilon} \right| \cdot \left| \mathcal{A}^{M}\_{[\mathbf{e}]\varepsilon} \right| + 2\varepsilon \\ &\leq 2^{-M} 2^{N(H(\delta) + \varepsilon)} 2^{M(H(\eta) + \varepsilon)} + 2\varepsilon \\ &= 2^{N(H(\delta) + \varepsilon) + M(H(\eta) + \varepsilon) - M} + 2\varepsilon \end{split} \tag{57}$$

For the probability of error to be less than 1, the exponent term on the right side of (57) must be bounded as

$$N(H(\delta) + \varepsilon) + M(H(\eta) + \varepsilon) - M < 0\tag{58}$$

Then, the ratio of *M* to *N* is

$$\frac{M}{N} > \frac{H(\delta) + \varepsilon}{1 - H(\eta) - \varepsilon} \tag{59}$$

This completes the proof of Theorem 3.

#### *5.2. Discussion for Necessary and Sufficient Conditions*

In this section, we discuss the results obtained from Theorems 2 and 3. Theorem 2 provides the lower bound for the NTGT problem via Fano's inequality; it gives the minimum number of tests required to recover all defective samples, each of which is defective with probability *δ*, out of *N* samples. In other words, Theorem 2 is a necessary condition for the probability of error to be smaller than *ρ*. Conversely, Theorem 3 leads to the upper bound on the probability of an error under MAP decoding. This condition characterizes the achievable performance and is the sufficient condition that allows us to reconstruct the defective samples.

We showed that the results of Theorems 2 and 3 coincide with each other as *ρ* and *ε* tend to zero, both approaching the ratio *H*(*δ*)/(1 − *H*(*η*)). Finding and presenting the necessary and sufficient conditions on the number of tests required in the NTGT problem is significant for TGT. In addition, as shown in (27) above, a system design method for NTGT was proposed so that the probability that a test result is 0 and the probability that it is 1 are equal, depending on the threshold *T*.
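To illustrate how the necessary condition (25) and the sufficient condition (40) coincide, the following snippet evaluates both ratios for assumed δ and η while shrinking the slack terms ρ and ε; both approach *H*(δ)/(1 − *H*(η)).

```python
from math import log2

def H(p):
    """Binary entropy function H(p) in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

delta, eta = 0.1, 0.05                     # hypothetical defect / noise probabilities
for slack in (0.05, 0.01, 0.001):          # plays the role of rho in (25) and eps in (40)
    necessary = (H(delta) - slack) / (1 - H(eta))
    sufficient = (H(delta) + slack) / (1 - H(eta) - slack)
    print(f"slack={slack}: necessary M/N > {necessary:.3f}, sufficient M/N > {sufficient:.3f}")
```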
