1. Introduction
If a helper can observe the additive noise corrupting a channel and can describe it to the decoder, then the latter can subtract it and thus render the channel noiseless. However, for this to succeed, the description must be nearly lossless and hence possibly of formidable rate. It is thus of interest to study scenarios where the description rate is limited, and to understand how the rate of the help affects performance.
When performance is measured in terms of the Shannon capacity, the problem was solved for a number of channel models [1,2,3], where the former two address assistance to the decoder and the latter to the encoder. When performance is measured in terms of the erasures-only capacity or the list-size capacity, the problem was solved in [4,5]. Error exponents with assistance were studied in [6]. Here we study how rate-limited help affects the identification capacity [7].
We focus on the memoryless modulo-additive noise channel (MMANC), whose time-$k$ output $Y_k$ corresponding to the time-$k$ input $x_k$ is:
$$ Y_k = x_k \oplus Z_k, \qquad (1) $$
where $Z_k$ is the time-$k$ noise sample; the channel input $x_k$, the channel output $Y_k$, and the noise $Z_k$ all take values in the set $\mathcal{X}$—also denoted $\mathcal{Y}$, or $\mathcal{Z}$, or $\{0, 1, \dots, |\mathcal{X}| - 1\}$—comprising the $|\mathcal{X}|$ elements $0, 1, \dots, |\mathcal{X}| - 1$; and ⊕ and ⊖ denote mod-$|\mathcal{X}|$ addition and subtraction, respectively. The noise sequence $\{Z_k\}$ is IID $\sim P_Z$, where $P_Z$ is some PMF on $\mathcal{X}$.
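As a concrete illustration of the channel model, the following is a minimal simulation sketch; the alphabet size, noise PMF, and function names are hypothetical examples rather than anything specified in the paper.

```python
import numpy as np

def mmanc(x, noise_pmf, rng=None):
    """Pass an input sequence through a memoryless modulo-additive noise channel."""
    rng = np.random.default_rng(0) if rng is None else rng
    q = len(noise_pmf)                              # alphabet size |X|
    z = rng.choice(q, size=len(x), p=noise_pmf)     # IID noise samples Z_k ~ P_Z
    y = (np.asarray(x) + z) % q                     # Y_k = x_k (+) Z_k, mod-q addition
    return y, z

# Example: quaternary alphabet; a decoder that learns z exactly recovers x as (y - z) mod q.
y, z = mmanc([0, 3, 2, 1], noise_pmf=[0.7, 0.3, 0.0, 0.0])
x_hat = (y - z) % 4
```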
Irrespective of whether the help is provided to the encoder, to the decoder, or to both, the Shannon capacity of this channel coincides with its erasures-only capacity, and both are given by [3] (Section V) and [4] (Theorems 2 and 6):
$$ C = \log |\mathcal{X}| - \bigl[ H(Z) - R_{\mathrm h} \bigr]^{+}, \qquad (2) $$
where $R_{\mathrm h}$ is the rate of the help, $[\xi]^{+}$ denotes $\max\{\xi, 0\}$, and $H(Z)$ is the Shannon entropy of $P_Z$.
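A short numerical sketch of (2) follows; the alphabet, noise PMF, and help rates are illustrative values and not taken from the paper.

```python
import numpy as np

def entropy_bits(pmf):
    """Shannon entropy (in bits) of a probability vector."""
    p = np.asarray(pmf, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def helper_capacity(noise_pmf, help_rate):
    """log|X| - [H(Z) - R_h]^+ in bits per channel use, per the form of (2)."""
    return np.log2(len(noise_pmf)) - max(entropy_bits(noise_pmf) - help_rate, 0.0)

# Quaternary alphabet with H(Z) ~ 0.88 bits: the capacity grows with the help rate
# until the help rate reaches H(Z), after which the channel is effectively noiseless.
for r_h in (0.0, 0.5, 1.0):
    print(r_h, helper_capacity([0.7, 0.3, 0.0, 0.0], r_h))
```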
Here we study two versions of the identification capacity of this channel: Ahlswede and Dueck’s original identification capacity [7], and the identification capacity subject to no missed identifications [8]. Our main result is that—irrespective of whether the help is provided to the encoder, to the decoder, or to both—the two identification capacities coincide and both equal the right-hand side (RHS) of (2).
2. Problem Formulation
The identification-over-a-channel problem is parameterized by the blocklength n, which tends to infinity in the definition of the identification capacity. The n-length noise sequence $Z^n$ is presented to a helper, which produces its $nR_{\mathrm h}$-bit description $T$:
$$ T = f(Z^n), $$
where
$$ f \colon \mathcal{X}^n \to \{1, \dots, 2^{nR_{\mathrm h}}\}. $$
We refer to the set $\mathcal{I}$ as the set of identification messages and to its cardinality $N = |\mathcal{I}|$ as the number of identification messages. The identification rate is defined (for $N$ sufficiently large) as:
$$ R = \frac{1}{n} \log \log N. $$
A generic element of $\mathcal{I}$—namely, a generic identification message—is denoted i.
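For orientation, and with logarithms taken to base 2, the definition above means that the number of identification messages grows double-exponentially in the blocklength; this is a standard consequence of the definition rather than a statement quoted from the paper:
$$ N = 2^{2^{nR}}, $$
in contrast with ordinary transmission, where the number of messages grows only exponentially, as $2^{nR}$.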
If no help is provided to the encoder, then the latter is specified by a family $\{P^{(i)}_{X^n}\}$ of PMFs on $\mathcal{X}^n$ that are indexed by the identification messages, with the understanding that, to convey the identification message (IM) i, the encoder transmits a random sequence in $\mathcal{X}^n$ that it draws according to the PMF $P^{(i)}_{X^n}$. If help $T$ is provided to the encoder, then the encoder’s operation is specified by a family of PMFs $\{P^{(i,t)}_{X^n}\}$ that is now indexed by pairs of identification messages and noise descriptions, with the understanding that, to convey IM i given the description $T = t$, the encoder produces a random n-length sequence of channel inputs that is distributed according to $P^{(i,t)}_{X^n}$. In either case, the channel output sequence $Y^n$ is
$$ Y^n = X^n \oplus Z^n $$
componentwise.
If help is provided to the encoder, and if IM i is to be conveyed, then the joint distribution of the noise, its description, the channel inputs, and the channel outputs has the form:
$$ P(z^n, t, x^n, y^n) = P_{Z^n}(z^n)\, P_{T \mid Z^n}(t \mid z^n)\, P^{(i,t)}_{X^n}(x^n)\, \mathbb{1}\{y^n = x^n \oplus z^n\}, $$
where
$$ P_{Z^n}(z^n) = \prod_{k=1}^{n} P_Z(z_k), $$
and where
$$ P_{T \mid Z^n}(t \mid z^n) = \mathbb{1}\{t = f(z^n)\}, $$
because we are assuming that the noise description is a deterministic function of the noise sequence. (The results also hold if we allow randomized descriptions: our coding schemes employ deterministic descriptions and the converse allows for randomization.) Here $\mathbb{1}\{\text{statement}\}$ equals 1 if the statement holds and equals 0 otherwise. In the absence of help, the joint distribution has the form:
$$ P(z^n, t, x^n, y^n) = P_{Z^n}(z^n)\, \mathbb{1}\{t = f(z^n)\}\, P^{(i)}_{X^n}(x^n)\, \mathbb{1}\{y^n = x^n \oplus z^n\}. $$
Based on the data available to it—$Y^n$ in the absence of help to the decoder and $(Y^n, T)$ in its presence—the receiver performs binary tests indexed by the identification messages, where the i-th test is whether or not the IM is i. It accepts the hypothesis that the IM is i if the data it observes is in the corresponding acceptance region, which we denote $\mathcal{D}_{i,T}$ in the presence of decoder assistance and $\mathcal{D}_i$ in its absence.
When the help $T$ is provided to the receiver, the probability of missed detection associated with IM i is thus:
$$ p_{\mathrm{MD}}(i) = \sum_{t} \Pr[T = t]\, \Pr\bigl[ Y^n \notin \mathcal{D}_{i,t} \,\big|\, \mathrm{IM} = i,\, T = t \bigr], $$
and the worst-case false alarm associated with it is:
$$ p_{\mathrm{FA}}(i) = \sum_{t} \Pr[T = t]\, \max_{j \neq i} \Pr\bigl[ Y^n \in \mathcal{D}_{i,t} \,\big|\, \mathrm{IM} = j,\, T = t \bigr]. $$
Note that, given $t$, the acceptance regions $\{\mathcal{D}_{i,t}\}$ of the different tests need not be disjoint. We define:
$$ p_{\mathrm{MD}} = \max_{i} p_{\mathrm{MD}}(i) $$
and:
$$ p_{\mathrm{FA}} = \max_{i} p_{\mathrm{FA}}(i). $$
In the absence of help to the receiver, the probability of missed detection associated with IM i is:
$$ p_{\mathrm{MD}}(i) = \Pr\bigl[ Y^n \notin \mathcal{D}_{i} \,\big|\, \mathrm{IM} = i \bigr], $$
and the worst-case probability of false alarm associated with it is:
$$ p_{\mathrm{FA}}(i) = \max_{j \neq i} \Pr\bigl[ Y^n \in \mathcal{D}_{i} \,\big|\, \mathrm{IM} = j \bigr]. $$
In this case, we define $p_{\mathrm{MD}}$ and $p_{\mathrm{FA}}$ as before, namely, as the maxima of the above quantities over the IMs. In both cases we say that a scheme is of zero missed detections if $p_{\mathrm{MD}}$ is zero.
A rate R is an achievable identification rate if, for every $\varepsilon > 0$ and every $\lambda > 0$, there exists some positive integer $n_0$ such that, for all blocklengths n exceeding $n_0$, there exists a scheme with at least $2^{2^{nR}}$ identification messages (19) for which the missed-detection and false-alarm probabilities satisfy $p_{\mathrm{MD}} \le \varepsilon$ and $p_{\mathrm{FA}} \le \lambda$ (20). The supremum of achievable rates is the identification capacity with a helper. Replacing requirement (20) with the requirement that $p_{\mathrm{MD}}$ be zero and $p_{\mathrm{FA}} \le \lambda$ leads to the definition of the zero missed-identification capacity.
Remark 1. Writing out $p_{\mathrm{FA}}$ of (14) as a maximum over the IMs i of an average, over the description t, of a maximum over the IMs $j \neq i$ highlights that (prior to maximizing over i) we first maximize over j and then average the result over t. In this sense, the help—even if provided to both encoder and decoder—cannot be viewed as “common randomness” in the sense of [9,10,11], where the averaging over the common randomness is performed before taking the maximum. Our criterion is more demanding of the direct part (code construction) and less so of the converse. Both criteria are interesting. Ours allows for the notion of “outage”, namely, descriptions that indicate that identification might fail and that therefore call for retransmission. The other criterion highlights the interplay between the noise description and the generation of common randomness (particularly when the help is provided to both transmitter and receiver).
The following theorem is the main result of this paper.
Theorem 1. On the modulo-additive noise channel—irrespective of whether the help is provided to the transmitter, to the receiver, or to both—the identification capacity with a helper and the zero missed-identification capacity with a helper are equal and coincide with the Shannon capacity, where the latter is given in (2).

We prove this result by establishing in Section 3 that both identification capacities are lower-bounded by the Shannon capacity, using a slight strengthening of recent results in [4] in combination with the code construction proposed in [8]. The converse is proved in Section 4, where we use a variation on a theme by Watanabe [12] to analyze the case where the assistance is provided to both transmitter and receiver.
3. Direct Part: Zero Missed Detection
In this section we prove that the zero missed-identification capacity with a helper is at least the Shannon capacity of (2), by proposing identification schemes of no missed detections and of rates approaching this capacity. To this end, we extend to the helper setting the connection—due to Ahlswede, Cai, and Zhang [8]—between the zero-missed-detection identification capacity and the erasures-only capacity. We then call on recent results [4] to infer that, on the modulo-additive noise channel with a helper, the erasures-only capacity is equal to the Shannon capacity. We treat encoder-only assistance and decoder-only assistance separately. Either case also proves achievability when the assistance is provided to both encoder and decoder.
Recall that an erasures-only decoder produces a list comprising the messages under which the observation is of positive likelihood and then acts as follows: if the list contains only one message, it produces that message; otherwise, it declares an erasure. Since the list always contains the transmitted message, this decoder never errs. The erasures-only capacity is defined like the Shannon capacity, but with the additional requirement that the decoder be the erasures-only decoder. This notion extends in a natural way to settings with a helper [4].
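The following is a minimal sketch of the list-then-erase behaviour described above, for a generic discrete channel without help (with decoder assistance, the likelihood would additionally be conditioned on the description); the function and variable names are illustrative and not from the paper.

```python
def erasures_only_decode(y, codebook, likelihood):
    """List-then-erase decoding: never errs, but may declare an erasure.

    codebook   : dict mapping message m -> codeword x^n
    likelihood : function (y, x) -> P(Y^n = y | X^n = x)
    The list holds every message with positive likelihood; since the transmitted
    message is always on the list, a unique list entry must be correct.
    """
    candidates = [m for m, x in codebook.items() if likelihood(y, x) > 0]
    return candidates[0] if len(candidates) == 1 else "erasure"
```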
3.1. Encoder Assistance
A rate-R, blocklength-n, encoder-assisted, erasures-only transmission code comprises a message set $\mathcal{M}$ with $2^{nR}$ messages and a collection of mappings from $\mathcal{M}$ to $\mathcal{X}^n$, one for each value of the help $t$, with the understanding that, to transmit Message m after being presented with the help $t$, the encoder produces the n-tuple of channel inputs $x^n(m, t)$. Since the decoder observes only the channel outputs (and not the help), it forms the list $\mathcal{L}(y^n)$ of all messages under which the observed output sequence $y^n$ has positive likelihood.
The collection of output sequences that cause the erasures-only decoder to produce an erasure is the set of sequences $y^n$ whose list $\mathcal{L}(y^n)$ contains more than one message. The probability of erasure associated with the transmission of Message m with encoder help t is the probability that $Y^n$ falls in this set. On the modulo-additive noise channel with rate-$R_{\mathrm h}$ encoder assistance, the erasures-only capacity and the Shannon capacity coincide and [4]:
$$ C_{\text{e-o}} = \log |\mathcal{X}| - \bigl[ H(Z) - R_{\mathrm h} \bigr]^{+}. \qquad (27) $$
We shall need the following slightly stronger version of the achievability part of this result, where we swap the maximization over the messages with the expectation over the help:
Proposition 1. Consider the modulo-additive noise channel with rate-$R_{\mathrm h}$ encoder assistance. For any transmission rate R smaller than the RHS of (27), there exists a sequence of rate-R transmission codes for which the expectation, over the help, of the maximal (over the messages) probability of erasure tends to zero as n tends to infinity (28). A similar result holds for decoder assistance.

Proof. The proof is presented in Appendix A. It is based on the construction in [4], but with a slightly finer analysis. □
The coding scheme we propose is essentially that of [8]; we just need to account for the help. For each blocklength n, we start out with a transmission code of roughly $2^{nR}$ codewords for which (28) holds, and use Lemma 1 ahead to construct approximately $2^{2^{nR}}$ lightly-intersecting subsets of its message set. We then associate an IM with each of the subsets, with the understanding that, to transmit an IM, we pick uniformly at random one of the messages in the subset associated with it and transmit this message with the helper’s assistance.
Lemma 1 ([7] Proposition 14). Let $\mathcal{A}$ be a finite set, and let $\lambda > 0$ be given. If $\lambda$ is sufficiently small (so that the condition of [7] (Proposition 14) holds), then there exist subsets $\mathcal{A}_1, \dots, \mathcal{A}_N$ of $\mathcal{A}$, all of the same prescribed size and double-exponentially numerous in the sense quantified in [7], such that for all distinct i and j the pairwise intersections satisfy $|\mathcal{A}_i \cap \mathcal{A}_j| \le \lambda\, |\mathcal{A}_i|$.

With the aid of this lemma, we can now prove the achievability of the claimed identification rates.
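The following sketch illustrates the kind of object Lemma 1 provides: many subsets of a message set whose pairwise overlaps are a small fraction of their size. It uses independent random selection and merely checks the overlap empirically; it is not the construction or the proof of [7], and all names and parameter values are illustrative.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
M = 4096            # size of the transmission message set
subset_size = 512   # size of each subset
num_subsets = 64    # number of identification messages in this toy example
lam = 0.25          # target bound on the fractional pairwise overlap

subsets = [set(rng.choice(M, size=subset_size, replace=False))
           for _ in range(num_subsets)]

worst = max(len(a & b) / subset_size
            for a, b in itertools.combinations(subsets, 2))
print(f"largest fractional overlap: {worst:.3f} (target < {lam})")
# With these parameters the expected overlap fraction is subset_size / M = 1/8,
# so random selection comfortably meets the target with high probability.
```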
Proof. Given an erasures-only encoder-assisted transmission code with message set $\mathcal{M}$ for which (28) holds, we apply Lemma 1 to the transmission message set $\mathcal{M}$ with a sufficiently small parameter $\lambda$ to infer, for large enough n, the existence of subsets $\mathcal{A}_1, \dots, \mathcal{A}_N$ of $\mathcal{M}$ that are of equal size, that pairwise intersect in at most a fraction $\lambda$ of their elements, and whose number N is large enough ((34)–(36)). Note that (36) implies that the identification rate of the resulting scheme is at least R (37). To send IM i after obtaining the assistance t, the encoder picks a random element M from $\mathcal{A}_i$ equiprobably and transmits the codeword that the transmission code assigns to M and t, so that, given the help t, the channel output is distributed as the uniform mixture, over the messages in $\mathcal{A}_i$, of the output distributions of the transmission code. To guarantee no missed detections, we set the acceptance region of the i-th IM to be the set of output sequences whose erasures-only decoding list intersects $\mathcal{A}_i$.
It now remains to analyze the scheme’s maximal false-alarm probability.
where in (41) we expressed the output distribution induced by an IM using (7); in (42) we expressed the relevant event as the disjoint union of two parts; in (43) we used the trivial bound on the first of these; in (44) we used the bound (50), which holds because, by the definition of the erasure set, any output sequence that contributes to the LHS of (50) must also cause an erasure; in (45) we used (35); in (46) we replaced each term in the sum with the global maximum and used (34); in (47) we used the trivial bound; and in (48) we could simplify the expression because the dependence on i and j is no longer present.
The above construction demonstrates that every transmission scheme that drives the expected (over the help) maximal probability of erasure to zero induces a zero missed-identification scheme that drives the false-alarm probability to zero. Since the former exists for all transmission rates up to the RHS of (27), we conclude, by (37), that every identification rate below the RHS of (27) is achievable with no missed detections. This, in turn, implies that the zero missed-identification capacity is at least the erasures-only capacity, and hence concludes the achievability proof for encoder assistance because, on the modulo-additive noise channel, the erasures-only capacity coincides with the Shannon capacity. □
3.2. Decoder Assistance
When, rather than to the encoder, the assistance is to the decoder, the transmission codewords are n-tuples in $\mathcal{X}^n$, and we denote the transmission codebook $\mathcal{C}$. For the induced identification scheme we use the same message subsets as before, with IM i being transmitted by choosing uniformly at random a message M from the subset $\mathcal{A}_i$ and transmitting the codeword $x^n(M)$. To avoid any missed detections, we set the acceptance region corresponding to IM i and decoder assistance t to be the set of output sequences whose erasures-only decoding list, computed with the help t, intersects $\mathcal{A}_i$.
The analysis of the false-alarm probability is nearly identical to that with encoder assistance and is omitted.
4. Converse Part: Help Provided to Both Transmitter and Receiver
In this section we establish the converse for all the cases of interest by proving that the inequality (52), namely that the identification capacity with a helper cannot exceed the RHS of (2), holds even when the help is provided to both encoder and decoder. The RHS of (52) is the helper Shannon capacity, irrespective of whether the help is provided to the encoder, to the decoder, or to both [3] (Section V).
There are two main steps to the proof. The first addresses the probabilities of the two types of testing errors conditional on a given description $T = t$. It relates the two to the conditional entropy of the noise given the description, namely, $H(Z^n \mid T = t)$. Very roughly, this corresponds to proving the converse part of the ID-capacity theorem for the channel whose noise is distributed according to the conditional distribution of $Z^n$ given $T = t$. The difficulty in this step is that, given $T = t$, the noise is not memoryless, and the channel may not even be stable. Classical type-based techniques for proving the converse part of the ID-capacity theorem—such as those employed in [7] (Theorem 12), [13] (Section III), or [14] (Section III)—are therefore not applicable. Instead, we extend to the helper setting Watanabe’s technique [12], which is inspired by the partial channel resolvability method introduced by Steinberg [15].
The second step in the proof addresses the unconditional error probabilities. This step is needed because, in the definition of achievability (see (13) and (14)), the error probabilities are averaged over the noise description t. We will show that, when the identification rate exceeds the Shannon capacity, there exists an IM for which the sum of the two types of errors is large whenever the description t is in a subset of the descriptions whose probability is bounded away from zero. This will imply that, for this IM, the sum of the averaged probabilities of error is bounded away from zero, thus contradicting the achievability.
4.1. Additional Notation
Given a PMF $P_X$ and a conditional PMF $P_{Y|X}$, we write $P_X \times P_{Y|X}$ for the joint PMF that assigns the pair $(x, y)$ the probability $P_X(x)\, P_{Y|X}(y \mid x)$. We use $I(X; Y)$ to denote the mutual information between X and Y under the joint distribution at hand. The product PMF of the marginals $P_X$ and $P_Y$ is denoted $P_X \times P_Y$; it assigns $(x, y)$ the probability $P_X(x)\, P_Y(y)$.
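To make the notation concrete, the following small sketch (with illustrative names only) builds a joint PMF from a marginal and a conditional, and computes the mutual information against the product of the marginals.

```python
import numpy as np

P_X = np.array([0.5, 0.5])                      # PMF of X
P_Y_given_X = np.array([[0.9, 0.1],             # row x: PMF of Y given X = x
                        [0.2, 0.8]])

P_XY = P_X[:, None] * P_Y_given_X               # joint: P_XY[x, y] = P_X(x) P_{Y|X}(y|x)
P_Y = P_XY.sum(axis=0)                          # Y-marginal
product = P_X[:, None] * P_Y[None, :]           # product of the marginals

# I(X;Y) = D(P_XY || P_X x P_Y), summing only over pairs with positive mass.
mask = P_XY > 0
I_XY = float((P_XY[mask] * np.log2(P_XY[mask] / product[mask])).sum())
print(f"I(X;Y) = {I_XY:.4f} bits")
```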
For the hypothesis testing problem of guessing whether some observation X was drawn $\sim P$ (the “null hypothesis”) or $\sim Q$ (the “alternative hypothesis”), we use $\phi$ to denote a generic randomized test that, after observing $X = x$, guesses the null hypothesis with probability $\phi(x)$ and the alternative with probability $1 - \phi(x)$. (Here $\phi(x) \in [0, 1]$ for every x.) The type-I error probability associated with $\phi$ is the probability, under P, that the test guesses the alternative, and the type-II error probability is the probability, under Q, that it guesses the null. For a given $\varepsilon \in [0, 1)$ we define $\beta_\varepsilon(P, Q)$ to be the least type-II error probability that can be achieved under the constraint that the type-I error probability does not exceed $\varepsilon$.
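For finite alphabets, $\beta_\varepsilon(P, Q)$ can be computed by the Neyman–Pearson construction: accept the null on the outcomes with the largest likelihood ratio, randomizing at the boundary. The following sketch illustrates this; the function name and the dictionary-based interface are illustrative only.

```python
def beta_eps(P, Q, eps):
    """Least type-II error over tests whose type-I error is at most eps.

    P, Q : dicts mapping outcomes to probabilities (null / alternative).
    A test accepts the null on outcome x with probability phi(x); the optimum
    sets phi = 1 on the outcomes with the largest ratio P(x)/Q(x), randomizing
    at the boundary so that the accumulated acceptance P-mass is 1 - eps.
    """
    ratio = lambda x: P[x] / Q[x] if Q.get(x, 0.0) > 0 else float("inf")
    need, beta = 1.0 - eps, 0.0
    for x in sorted(P, key=ratio, reverse=True):
        if need <= 0:
            break
        accept = min(1.0, need / P[x]) if P[x] > 0 else 1.0
        beta += Q.get(x, 0.0) * accept
        need -= P[x] * accept
    return beta

# Example with a ternary alphabet and eps = 0.05.
P = {"a": 0.5, "b": 0.4, "c": 0.1}
Q = {"a": 0.1, "b": 0.2, "c": 0.7}
print(beta_eps(P, Q, 0.05))
```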
4.2. Conditional Missed-Detection and False-Alarm Probabilities
The following lemma follows directly from Watanabe’s work [
12].
Lemma 2 ([12] Theorem 1 and Corollary 2). Let $P_{Y^n \mid X^n, T = t}$ denote the n-letter conditional distribution of the channel output sequence given that the noise description is t and the input is $x^n$. For any $\varepsilon, \lambda > 0$ with $\varepsilon + \lambda < 1$ and any fixed t, the lemma's condition on the number of IMs implies the bound (57) and hence (58), where the additional term appearing in (58), for any fixed $\lambda$, tends to 0 as n tends to ∞.

Substituting the channel's conditional law for the distributions appearing in the following theorem will allow us to link the RHS of (57) with the conditional mutual information between $X^n$ and $Y^n$ given $T = t$. The theorem’s proof was inspired by the proof of [16] (Theorem 8). See also [17] (Lemma 1).
Theorem 2. Given any $\varepsilon \in (0, 1)$ and any joint PMF $P_{XY} = P_X \times P_{Y|X}$, the infimum over PMFs Q of $-\log \beta_\varepsilon\bigl(P_{XY},\, P_X \times Q\bigr)$ is upper-bounded in terms of the mutual information $I(X;Y)$ and the binary entropy $h_{\mathrm b}(\varepsilon)$, where $h_{\mathrm b}(\cdot)$ is the binary entropy function.

Proof. Applying the data-processing inequality for relative entropy to the binary hypothesis testing setting (see, e.g., [18] (Thm. 30.12.5)), we conclude that, for any randomized test $\phi$, the relative entropy between the two hypotheses is lower-bounded by the binary divergence between the acceptance probabilities under the two hypotheses (61), where $d(\cdot \,\|\, \cdot)$ denotes the binary divergence function. Since there exists a randomized test whose type-I error probability is at most $\varepsilon$ and whose type-II error probability equals $\beta_\varepsilon$ (see, e.g., [18] (Lemma 30.5.4 and Proposition 30.8.1)), we can apply (61) to this test to conclude (63). (The above existence also holds when $\beta_\varepsilon$ is zero, but for this case we can verify (63) directly.) The LHS of (63) can be lower bounded by lower-bounding the binary divergence function as in (64). It follows from (63) and (64) that (65) holds, so the infimum over Q of the LHS is upper bounded by the infimum over Q on the RHS. The latter (for fixed $P_{XY}$) is achieved when Q is the Y-marginal of $P_{XY}$, a marginal that we denote $P_Y$ (66). This is a special case of a more general result on Rényi divergence [19] (Theorem II.2). Here we give a simple proof for K-L divergence: writing $D(P_{XY} \,\|\, P_X \times Q) = I(X;Y) + D(P_Y \,\|\, Q)$ shows that the infimum equals $I(X;Y)$, with equality if and only if Q equals $P_Y$. From (63), (64), and (66) we obtain the theorem.
□
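For the reader's convenience, the steps just described combine into the following standard chain; we state it as a sketch under the conventions used above, and the paper's own displays may differ in form and numbering. For any test whose type-I error is at most $\varepsilon$,
$$ D(P \,\|\, Q) \;\ge\; d\bigl(1 - \varepsilon \,\big\|\, \beta_\varepsilon(P, Q)\bigr) \;\ge\; (1 - \varepsilon)\, \log \frac{1}{\beta_\varepsilon(P, Q)} \;-\; h_{\mathrm b}(\varepsilon), $$
so that, with $P = P_{XY}$ and $Q = P_X \times P_Y$ (for which the divergence equals $I(X;Y)$),
$$ -\log \beta_\varepsilon\bigl(P_{XY},\, P_X \times P_Y\bigr) \;\le\; \frac{I(X;Y) + h_{\mathrm b}(\varepsilon)}{1 - \varepsilon}. $$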
Applying Lemma 2 and Theorem 2 to our channel when its law is conditioned on $T = t$ yields the following corollary.

Corollary 1. On the MMANC, for any $\varepsilon, \lambda > 0$ with $\varepsilon + \lambda < 1$ and any fixed t, the condition (75) on the number of IMs, stated in terms of $n \log|\mathcal{X}| - H(Z^n \mid T = t)$, implies that the conditional (on $T = t$) missed-detection and false-alarm probabilities cannot simultaneously be at most $\varepsilon$ and $\lambda$. Here the vanishing correction term is as in Lemma 2.

Proof. Substituting the conditional (on $T = t$) laws of the input, the output, and the noise for the corresponding distributions in Theorem 2, we obtain (76). Given $T = t$ and the modulo-additive structure of the channel, the mutual information term in (76) can be upper-bounded in terms of $n \log|\mathcal{X}|$ and the conditional entropy of the noise, as in (80). Applying (76) and (80) to (58) in Lemma 2 establishes Corollary 1. □
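The entropy bound invoked in the proof is presumably of the following form; we state it as a sketch that uses only the modulo-additive structure and the conditional independence of the input and the noise given $T = t$:
$$ I(X^n; Y^n \mid T = t) \;=\; H(Y^n \mid T = t) - H(Y^n \mid X^n, T = t) \;=\; H(Y^n \mid T = t) - H(Z^n \mid T = t) \;\le\; n \log|\mathcal{X}| - H(Z^n \mid T = t). $$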
4.3. Averaging over T
Corollary 1 deals with identification for a given fixed t, but our definition of achievability in (13) and (14) entails averaging over t, which we must thus study. We begin by lower-bounding the conditional entropy of the noise sequence $Z^n$ given the assistance T; the resulting bound is (83).
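The bound (83) referenced here is presumably the following standard chain, stated as a sketch with $R_{\mathrm h}$ denoting the help rate of (2): since T is described with $nR_{\mathrm h}$ bits,
$$ H(Z^n \mid T) \;=\; H(Z^n) - I(Z^n; T) \;\ge\; n H(Z) - H(T) \;\ge\; n \bigl( H(Z) - R_{\mathrm h} \bigr). $$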
We next define, for every blocklength n, the subset of descriptions given in (84). These are poor noise descriptions in the sense that, after they are revealed, the remaining uncertainty about the noise is still large. The key point is that their probability is bounded away from zero. In fact, as we next argue, the lower bound (85) holds, where in the second case the probability is 1 because in that case the condition appearing in the definition of the set in (84) is satisfied by every description. As to the first case, we begin with (83) to obtain the chain of inequalities culminating in (88), from which the first case of the bound in (85) follows. Here (87) follows from expressing the expectation over T as a sum over the poor descriptions and their complement, and (88) follows from the definition of the set in (84) and a trivial bound.
Inequality (85) establishes that the probability of a poor description is lower bounded by a positive constant that does not depend on n. Using Corollary 1 for such t’s will be the key to the converse.
Henceforth, we fix some sequence of identification codes of rate R exceeding the RHS of (2), i.e., satisfying $R > \log|\mathcal{X}| - [H(Z) - R_{\mathrm h}]^{+}$, and show that the sum of the missed-detection and false-alarm probabilities cannot tend to 0 as n tends to ∞. For such a rate R, there exist positive constants such that (89) holds; we fix such a choice of constants for the remainder of the proof.
Since the inequality in (89) is strict, and since the correction term of Corollary 1 tends to zero with n, it follows that the inequality continues to hold also when we add this term to the RHS, provided that n is sufficiently large; i.e., there exists some $n_0$ such that (90) holds for all n exceeding $n_0$. It then follows from (90) and the definition of the set of poor descriptions in (84) that, whenever t is a poor description, the quantity appearing in the condition of Corollary 1 exceeds the RHS of (75), as stated in (91). Corollary 1 thus implies that, for every poor description t, the bound (92) holds.
However, we need a stronger statement because, in the above, the IM i for which the errors are large depends on t, whereas in our definition of achievability we are averaging over T for a fixed IM. The stronger result we will establish is that the condition on the LHS of (92) implies that, for all sufficiently large n, there exists some IM (that does not depend on t) which performs poorly for every t in the set of poor descriptions, i.e., for which (93) holds. That is, we will show that (94) holds for sufficiently large n.
To this end, define, for each poor description t, the set of IMs specified in (95), and consider the identification code that results when we restrict our code to the IMs in this set (while keeping the same acceptance regions). Applying Corollary 1 to this restricted code using (91), we obtain (96). Consequently, (97) holds, where the second inequality holds by (96) and the fact that the set defined in (95) is contained in a larger set of known cardinality.
Since (89) holds, there exists some $n_1$ such that (98) holds for all n exceeding $n_1$. We can use this to upper-bound the RHS of (97) to obtain that (99) holds for all such n. The complement (in the set of IMs) of the union on the LHS of (99) is thus not empty, which proves the existence of some IM for which (93) holds.
With such an IM in hand, the converse follows from the fact that the probability that T is in the set of poor descriptions is bounded away from zero (85), because the chain (100)–(103) holds for every sufficiently large n, where (100) follows from the definitions in (13) and (14); in (101) we replaced the maximum over the IMs with the particular IM we found; and (103) follows from (93). Thus, any code of rate R exceeding the RHS of (2) with large enough n must have a sum of error probabilities that is lower-bounded in terms of the probability of a poor description, and the latter is bounded away from zero. This concludes the proof of the converse part.