#### *4.3. Proof of Corollary 2*

For *α* = 1, Corollary 2 is proved in (67). Fix *α* > 1, and let *g* : ℝ → ℝ be

$$g(x) \stackrel{\triangle}{=} \begin{cases} x^{\alpha}, & x \ge 0, \\ 0, & x < 0, \end{cases} \tag{79}$$

which is monotonically increasing and convex on the real line. By Theorem 1a,

$$t_k^{(n)} \ge t_n^{(n)}, \quad k \in [n]. \tag{80}$$

Since by assumption *f* is nonnegative, it follows from (20) and (79) that

$$t_k^{(n)} = \frac{1}{\binom{n}{k}} \sum_{\substack{\mathcal{T} \subseteq \Omega:\, |\mathcal{T}| = k}} g\left(\frac{f(\mathcal{T})}{k}\right) \tag{81a}$$

$$= \frac{1}{k^{\alpha} \binom{n}{k}} \sum_{\substack{\mathcal{T} \subseteq \Omega:\, |\mathcal{T}| = k}} f^{\alpha}(\mathcal{T}). \tag{81b}$$

Combining (80)–(81) and rearranging terms gives, for all *α* > 1,

$$\sum_{\substack{\mathcal{T} \subseteq \Omega:\, |\mathcal{T}| = k}} f^{\alpha}(\mathcal{T}) \ge \left(\frac{k}{n}\right)^{\alpha} \binom{n}{k} f^{\alpha}(\Omega) \tag{82a}$$

$$= \left(\frac{k}{n}\right)^{\alpha-1} \binom{n-1}{k-1} f^{\alpha}(\Omega), \tag{82b}$$

where equality (82b) holds by the identity $\frac{k}{n} \binom{n}{k} = \binom{n-1}{k-1}$. This further gives

$$\sum_{\substack{\mathcal{T} \subseteq \Omega:\, |\mathcal{T}| = k}} \left( f^{\alpha}(\Omega) - f^{\alpha}(\mathcal{T}) \right) = \binom{n}{k} f^{\alpha}(\Omega) - \sum_{\substack{\mathcal{T} \subseteq \Omega:\, |\mathcal{T}| = k}} f^{\alpha}(\mathcal{T}) \tag{83a}$$

$$\leq \left(1 - \frac{k^{\alpha}}{n^{\alpha}}\right) \binom{n}{k} f^{\alpha}(\Omega) \tag{83b}$$

$$= c_{\alpha}(n,k)\, f^{\alpha}(\Omega), \tag{83c}$$

where equality (83c) holds by the definition in (26). This proves (25) for *α* > 1.
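The rearrangement that leads from (80)–(81) to (82a) can be spelled out as follows; this is only a sketch of the omitted step, read off the form of (81a). The only *n*-element subset of Ω is Ω itself, so setting *k* = *n* in (81a) and using (79) with *f*(Ω) ≥ 0 gives

$$t_n^{(n)} = g\left(\frac{f(\Omega)}{n}\right) = \frac{f^{\alpha}(\Omega)}{n^{\alpha}}.$$

Substituting this expression and (81b) into (80) yields

$$\frac{1}{k^{\alpha} \binom{n}{k}} \sum_{\substack{\mathcal{T} \subseteq \Omega:\, |\mathcal{T}| = k}} f^{\alpha}(\mathcal{T}) \ge \frac{f^{\alpha}(\Omega)}{n^{\alpha}},$$

and multiplying both sides by $k^{\alpha} \binom{n}{k}$ gives (82a).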

We next prove Item (b). By assumption, *f* is a rank function, and it is therefore nonnegative. Hence, the leftmost inequality in (27) holds by (82). The rightmost inequality in (27) also holds since *f* : 2<sup>Ω</sup> → ℝ is monotonically increasing, which yields *f*(T) ≤ *f*(Ω) for all T ⊆ Ω. Consequently, for *k* ∈ [*n*] and *α* ≥ 0 (in particular, for *α* ≥ 1),

$$\sum_{\substack{\mathcal{T} \subseteq \Omega:\, |\mathcal{T}| = k}} f^{\alpha}(\mathcal{T}) \le \binom{n}{k} f^{\alpha}(\Omega), \tag{84}$$

where (84) holds since there are $\binom{n}{k}$ *k*-element subsets T of the *n*-element set Ω, and every summand *f*<sup>*α*</sup>(T) (with T ⊆ Ω) is upper bounded by *f*<sup>*α*</sup>(Ω).
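The two-sided bound given by (82b) and (84) can be checked numerically on a small instance. The sketch below is illustrative only: it assumes a coverage function (the number of items covered by a chosen collection of sets), which is monotonically increasing and submodular and vanishes on the empty set, and which is taken here to qualify as a rank function in the sense used above. The names `COVERS`, `f`, `power_sum`, and `check_bounds` are hypothetical and do not appear in the source.

```python
from itertools import combinations
from math import comb

# Hypothetical ground set Omega = {0, ..., 4}: element i "covers" the items in COVERS[i].
COVERS = [{1, 2}, {2, 3}, {3, 4, 5}, {1, 5}, {6}]


def f(subset):
    """Coverage function: number of distinct items covered by the chosen elements."""
    covered = set()
    for i in subset:
        covered |= COVERS[i]
    return len(covered)


def power_sum(k, alpha):
    """Sum of f(T)**alpha over all k-element subsets T of Omega."""
    n = len(COVERS)
    return sum(f(T) ** alpha for T in combinations(range(n), k))


def check_bounds(alpha, tol=1e-9):
    """Return True if (82b) <= power_sum(k, alpha) <= (84) for every k in [n]."""
    n = len(COVERS)
    f_omega = f(range(n))
    for k in range(1, n + 1):
        s = power_sum(k, alpha)
        lower = (k / n) ** (alpha - 1) * comb(n - 1, k - 1) * f_omega ** alpha  # (82b)
        upper = comb(n, k) * f_omega ** alpha  # (84)
        if not (lower - tol <= s <= upper + tol):
            return False
    return True


if __name__ == "__main__":
    for alpha in (1, 2, 2.5):
        print(alpha, check_bounds(alpha))
```

For *k* = *n* the lower bound is attained with equality, since the only *n*-element subset is Ω itself; the small tolerance `tol` guards against floating-point rounding in that case.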

### **5. A Problem in Extremal Graph Theory**

This section applies the generalization of Han's inequality in (28) to the following problem.

#### *5.1. Problem Formulation*

Let A ⊆ {−1, 1}<sup>*n*</sup>, with *n* ∈ ℕ, and let *τ* ∈ [*n*]. Let *G* = *G*<sub>A,*τ*</sub> be an undirected simple graph with vertex set V(*G*) = A, in which two vertices are adjacent (i.e., connected by an edge) if and only if they are represented by distinct vectors in A whose Hamming distance is less than or equal to *τ*:

$$\{\mathbf{x}^{n}, \mathbf{y}^{n}\} \in \mathsf{E}(G) \Leftrightarrow \left(\mathbf{x}^{n}, \mathbf{y}^{n} \in \mathcal{A}, \ \mathbf{x}^{n} \neq \mathbf{y}^{n}, \ \mathsf{d}_{\mathsf{H}}(\mathbf{x}^{n}, \mathbf{y}^{n}) \leq \tau\right). \tag{85}$$

The question is how large the size of *G* (i.e., its number of edges) can be as a function of the cardinality of the set A, and possibly also of some basic properties of A.
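To make the definition in (85) concrete, the following minimal sketch builds the edge set of the graph for a given set of ±1 vectors and a threshold *τ*, and counts its edges by brute force. The function names `hamming_distance` and `confusion_graph_edges` and the example set `cube` are hypothetical choices made for illustration; the enumeration is quadratic in the number of vertices and is meant only for small examples.

```python
from itertools import combinations, product


def hamming_distance(x, y):
    """Number of coordinates in which the equal-length vectors x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def confusion_graph_edges(A, tau):
    """Edges {x, y} as in (85): distinct vectors of A at Hamming distance <= tau."""
    vertices = sorted(set(A))  # remove duplicates so that x != y for every pair
    return [
        (x, y)
        for x, y in combinations(vertices, 2)
        if hamming_distance(x, y) <= tau
    ]


if __name__ == "__main__":
    # Hypothetical example: A is the whole cube {-1, 1}^3.
    cube = list(product((-1, 1), repeat=3))
    for tau in (1, 2, 3):
        print(tau, len(confusion_graph_edges(cube, tau)))
    # prints: 1 12, then 2 24, then 3 28
```

With A equal to the full cube and *τ* = 1, the graph is the 3-dimensional hypercube with 12 edges; with *τ* = 3, every pair of the 8 vertices is adjacent, which gives 28 edges.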

This problem and its related analysis generalize and refine, in a nontrivial way, the bound in Theorem 4.2 of [6], which applies to the special case where *τ* = 1. The motivation for this extension is considered next.

#### *5.2. Problem Motivation*

Constrained coding is common in many data recording systems and data communication systems, where some sequences are more prone to error than others, and a constraint is imposed on the sequences that are allowed to be recorded or transmitted in order to reduce the likelihood of error. Given such a constraint, it is then necessary to encode arbitrary user sequences into sequences that obey the constraint.

From an information-theoretic perspective, this problem can be interpreted as follows. Consider a communication channel W : X → Y with input alphabet X and output alphabet Y, and suppose that a constraint is imposed on the sequences that are allowed to be transmitted over the channel. Under such a constraint, the information sequences are first encoded into codewords by an error-correction encoder, followed by a constrained encoder that maps these codewords into constrained sequences. Let these constrained sequences be binary sequences of length *n* from the set A ⊆ {−1, 1}<sup>*n*</sup>. A channel modulator then maps these sequences into symbols from X. The received sequences at the channel output, with alphabet Y, are first demodulated and then decoded (in the reverse order of the encoding process) by the constrained decoder followed by the error-correction decoder.

Consider a channel model where pairs of binary *n*-length sequences from the set A whose Hamming distance is less than or equal to a fixed number *τ* share a common output sequence with positive probability, whereas this is no longer the case if the Hamming distance is larger than *τ*. In other words, we assume that, by design, pairs of sequences in A whose Hamming distance is larger than *τ* cannot be confused, in the sense that no common output sequence can be received with positive probability at the channel output.

The confusion graph *G* that is associated with this setup is an undirected simple graph whose vertices represent the *n*-length binary sequences in A, and pairs of vertices are adjacent if and only if the Hamming distance between the sequences that they represent is not larger than *τ*. The size of *G* (i.e., its number of edges) is equal to the number of pairs of sequences in A which may not be distinguishable by the decoder.

Further motivation for studying this problem is given later (see Section 5.5).
