#### **Appendix A. Proof of Proposition 1**

For completeness, we prove Proposition 1, which introduces results from [25,28,37].

Let Ω be a non-empty finite set, and let {*X*<sub>*ω*</sub>}<sub>*ω*∈Ω</sub> be a collection of discrete random variables. We first prove Item (a), showing that the entropy set function *f* : 2<sup>Ω</sup> → ℝ in (15) is a rank function.

• *f*(∅) = 0.

• Submodularity: If S, T ⊆ Ω, then

$$f(\mathcal{T}\cup\mathcal{S}) + f(\mathcal{T}\cap\mathcal{S})$$

$$= \mathrm{H}(X\_{\mathcal{T}\cup\mathcal{S}}) + \mathrm{H}(X\_{\mathcal{T}\cap\mathcal{S}}) \tag{A1a}$$

$$= \mathrm{H}(X\_{\mathcal{T}\backslash\mathcal{S}}, X\_{\mathcal{T}\cap\mathcal{S}}, X\_{\mathcal{S}\backslash\mathcal{T}}) + \mathrm{H}(X\_{\mathcal{T}\cap\mathcal{S}}) \tag{A1b}$$

$$= \mathrm{H}(X\_{\mathcal{T}\backslash\mathcal{S}}, X\_{\mathcal{S}\backslash\mathcal{T}} \,|\, X\_{\mathcal{T}\cap\mathcal{S}}) + 2\,\mathrm{H}(X\_{\mathcal{T}\cap\mathcal{S}}) \tag{A1c}$$

$$\begin{aligned} &= \left[ \mathrm{H}(X\_{\mathcal{T}\backslash\mathcal{S}} \,|\, X\_{\mathcal{T}\cap\mathcal{S}}) + \mathrm{H}(X\_{\mathcal{S}\backslash\mathcal{T}} \,|\, X\_{\mathcal{T}\cap\mathcal{S}}) - \mathrm{I}(X\_{\mathcal{T}\backslash\mathcal{S}}; X\_{\mathcal{S}\backslash\mathcal{T}} \,|\, X\_{\mathcal{T}\cap\mathcal{S}}) \right] \\ &\quad + 2\,\mathrm{H}(X\_{\mathcal{T}\cap\mathcal{S}}) \end{aligned} \tag{A1d}$$

$$\begin{aligned} &= \left[ \mathrm{H}(X\_{\mathcal{T}\backslash\mathcal{S}} \,|\, X\_{\mathcal{T}\cap\mathcal{S}}) + \mathrm{H}(X\_{\mathcal{T}\cap\mathcal{S}}) \right] + \left[ \mathrm{H}(X\_{\mathcal{S}\backslash\mathcal{T}} \,|\, X\_{\mathcal{T}\cap\mathcal{S}}) + \mathrm{H}(X\_{\mathcal{T}\cap\mathcal{S}}) \right] \\ &\quad - \mathrm{I}(X\_{\mathcal{T}\backslash\mathcal{S}}; X\_{\mathcal{S}\backslash\mathcal{T}} \,|\, X\_{\mathcal{T}\cap\mathcal{S}}) \end{aligned} \tag{A1e}$$

$$= \mathrm{H}(X\_{\mathcal{T}\backslash\mathcal{S}}, X\_{\mathcal{T}\cap\mathcal{S}}) + \mathrm{H}(X\_{\mathcal{S}\backslash\mathcal{T}}, X\_{\mathcal{T}\cap\mathcal{S}}) - \mathrm{I}(X\_{\mathcal{T}\backslash\mathcal{S}}; X\_{\mathcal{S}\backslash\mathcal{T}} \,|\, X\_{\mathcal{T}\cap\mathcal{S}}) \tag{A1f}$$

$$= \mathrm{H}(X\_{\mathcal{T}}) + \mathrm{H}(X\_{\mathcal{S}}) - \mathrm{I}(X\_{\mathcal{T}\backslash\mathcal{S}}; X\_{\mathcal{S}\backslash\mathcal{T}} \,|\, X\_{\mathcal{T}\cap\mathcal{S}}) \tag{A1g}$$

$$= f(\mathcal{T}) + f(\mathcal{S}) - \mathrm{I}(X\_{\mathcal{T}\backslash\mathcal{S}}; X\_{\mathcal{S}\backslash\mathcal{T}} \,|\, X\_{\mathcal{T}\cap\mathcal{S}}),\tag{A1h}$$

which gives

$$\left[f(\mathcal{T}) + f(\mathcal{S})\right] - \left[f(\mathcal{T} \cup \mathcal{S}) + f(\mathcal{T} \cap \mathcal{S})\right] = \mathrm{I}(X\_{\mathcal{T}\backslash\mathcal{S}}; X\_{\mathcal{S}\backslash\mathcal{T}} \,|\, X\_{\mathcal{T}\cap\mathcal{S}}) \ge 0. \tag{A2}$$

This proves the submodularity of *f*, while also showing that

$$f(\mathcal{T}) + f(\mathcal{S}) = f(\mathcal{T} \cup \mathcal{S}) + f(\mathcal{T} \cap \mathcal{S}) \quad \Longleftrightarrow \quad X\_{\mathcal{T} \backslash \mathcal{S}} \perp\!\!\!\perp X\_{\mathcal{S} \backslash \mathcal{T}} \mid X\_{\mathcal{T} \cap \mathcal{S}}, \tag{A3}$$

i.e., the inequality in (A2) holds with equality if and only if *X*<sub>T∖S</sub> and *X*<sub>S∖T</sub> are conditionally independent given *X*<sub>T∩S</sub>.

• Monotonicity: If S ⊆ T ⊆ Ω, then

$$f(\mathcal{S}) = \mathrm{H}(X\_{\mathcal{S}}) \tag{A4a}$$

$$\le \mathrm{H}(X\_{\mathcal{S}}) + \mathrm{H}(X\_{\mathcal{T}} \,|\, X\_{\mathcal{S}}) \tag{A4b}$$

$$= \mathrm{H}(X\_{\mathcal{T}}) \tag{A4c}$$

$$= f(\mathcal{T}),\tag{A4d}$$

so *f* is monotonically increasing.
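The three properties established for Item (a) can be checked numerically on small examples. The following is a minimal sketch, assuming a joint pmf stored as a NumPy array with one axis per random variable; the helper names `entropy_of`, `entropy_table` and `gap`, and both example distributions, are illustrative assumptions and not part of the original proof. The second example is a Markov chain, for which (A3) predicts a zero submodularity gap.

```python
import itertools
import numpy as np

def entropy_of(pmf, subset):
    """Joint entropy (in bits) of the coordinates in `subset` of a joint pmf array."""
    drop = tuple(i for i in range(pmf.ndim) if i not in subset)
    marginal = pmf.sum(axis=drop) if drop else pmf
    p = np.asarray(marginal).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def entropy_table(pmf):
    """Map every subset T of the coordinate indices to f(T) = H(X_T)."""
    n = pmf.ndim
    subsets = [frozenset(c) for r in range(n + 1)
               for c in itertools.combinations(range(n), r)]
    return {t: entropy_of(pmf, t) for t in subsets}

def gap(f, t, s):
    """f(T) + f(S) - f(T | S) - f(T & S), the left-hand side of (A2)."""
    return f[t] + f[s] - f[t | s] - f[t & s]

# Hypothetical example 1: an arbitrary joint pmf of four discrete variables.
rng = np.random.default_rng(0)
pmf = rng.random((2, 3, 2, 2))
pmf /= pmf.sum()
f = entropy_table(pmf)
submodular = all(gap(f, t, s) >= -1e-12 for t in f for s in f)
monotone = all(f[s] <= f[t] + 1e-12 for t in f for s in f if s <= t)

# Hypothetical example 2: a Markov chain X_0 - X_1 - X_2, so X_0 and X_2 are
# conditionally independent given X_1, and (A3) predicts a zero gap.
p1 = np.array([0.3, 0.7])
p0_given_1 = np.array([[0.9, 0.1], [0.2, 0.8]])   # rows: x1, columns: x0
p2_given_1 = np.array([[0.6, 0.4], [0.5, 0.5]])   # rows: x1, columns: x2
markov = np.einsum('b,ba,bc->abc', p1, p0_given_1, p2_given_1)
g = entropy_table(markov)
markov_gap = gap(g, frozenset({0, 1}), frozenset({1, 2}))
```

The tolerance `1e-12` only absorbs floating-point rounding; the check is an illustration on specific distributions, not a substitute for the proof.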

We next prove Item (b). Consider the set function *f* in (16).

• Supermodularity: If S, T ⊆ Ω, then


$$f(\mathcal{T}\cup\mathcal{S}) + f(\mathcal{T}\cap\mathcal{S}) = \mathrm{H}(X\_{\mathcal{T}\cup\mathcal{S}} \,|\, X\_{\mathcal{T}^c\cap\mathcal{S}^c}) + \mathrm{H}(X\_{\mathcal{T}\cap\mathcal{S}} \,|\, X\_{\mathcal{T}^c\cup\mathcal{S}^c}) \tag{A5a}$$

$$= \left[\mathrm{H}(X\_{\Omega}) - \mathrm{H}(X\_{\mathcal{T}^c \cap \mathcal{S}^c})\right] + \left[\mathrm{H}(X\_{\Omega}) - \mathrm{H}(X\_{\mathcal{T}^c \cup \mathcal{S}^c})\right] \tag{A5b}$$

$$= 2\,\mathrm{H}(X\_{\Omega}) - \left[\mathrm{H}(X\_{\mathcal{T}^c \cup \mathcal{S}^c}) + \mathrm{H}(X\_{\mathcal{T}^c \cap \mathcal{S}^c})\right] \tag{A5c}$$

$$\geq 2\,\mathrm{H}(X\_{\Omega}) - \left[\mathrm{H}(X\_{\mathcal{T}^c}) + \mathrm{H}(X\_{\mathcal{S}^c})\right] \tag{A5d}$$

$$= \left[\mathrm{H}(X\_{\Omega}) - \mathrm{H}(X\_{\mathcal{T}^c})\right] + \left[\mathrm{H}(X\_{\Omega}) - \mathrm{H}(X\_{\mathcal{S}^c})\right] \tag{A5e}$$

$$= \mathrm{H}(X\_{\mathcal{T}} \,|\, X\_{\mathcal{T}^c}) + \mathrm{H}(X\_{\mathcal{S}} \,|\, X\_{\mathcal{S}^c}) \tag{A5f}$$

$$= f(\mathcal{T}) + f(\mathcal{S}),\tag{A5g}$$

where inequality (A5d) holds since the entropy function in (15) is submodular (by Item (a)).

• Monotonicity: If S ⊆ T ⊆ Ω, then

$$f(\mathcal{S}) = \mathrm{H}(X\_{\mathcal{S}} \,|\, X\_{\mathcal{S}^c}) \tag{A6a}$$

$$\le \mathrm{H}(X\_{\mathcal{S}} \,|\, X\_{\mathcal{T}^c}) \quad (\mathcal{T}^c \subseteq \mathcal{S}^c) \tag{A6b}$$

$$\le \mathrm{H}(X\_{\mathcal{T}} \,|\, X\_{\mathcal{T}^c}) \tag{A6c}$$

$$= f(\mathcal{T}),\tag{A6d}$$

so *f* is monotonically increasing.

Item (c) follows easily from Items (a) and (b). Consider the set function *f* : 2<sup>Ω</sup> → ℝ in (17). Then, for all T ⊆ Ω, *f*(T) = I(*X*<sub>T</sub>; *X*<sub>T<sup>c</sup></sub>) = H(*X*<sub>T</sub>) − H(*X*<sub>T</sub> | *X*<sub>T<sup>c</sup></sub>), so *f* is the difference of a submodular function and a supermodular function, which is a submodular function. Furthermore, *f*(∅) = 0 and, by the symmetry of the mutual information, *f*(T) = *f*(T<sup>c</sup>) for all T ⊆ Ω; in particular *f*(Ω) = *f*(∅) = 0, so *f* is not monotonic (unless it is identically zero).
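The claims of Items (b) and (c) can be illustrated on a small joint distribution. The sketch below assumes the forms visible in (A5), (A6) and the paragraph on Item (c): the conditional-entropy function is computed as H(*X*<sub>Ω</sub>) − H(*X*<sub>T<sup>c</sup></sub>) and the mutual-information function as H(*X*<sub>T</sub>) minus that quantity. The helper names and the random pmf are illustrative assumptions.

```python
import itertools
import numpy as np

def entropy_of(pmf, subset):
    """Joint entropy (in bits) of the coordinates in `subset` of a joint pmf array."""
    drop = tuple(i for i in range(pmf.ndim) if i not in subset)
    marginal = pmf.sum(axis=drop) if drop else pmf
    p = np.asarray(marginal).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def gap(f, t, s):
    """f(T) + f(S) - f(T | S) - f(T & S): >= 0 if submodular, <= 0 if supermodular."""
    return f[t] + f[s] - f[t | s] - f[t & s]

# Hypothetical joint pmf of four discrete random variables.
rng = np.random.default_rng(1)
pmf = rng.random((2, 2, 3, 2))
pmf /= pmf.sum()

n = pmf.ndim
omega = frozenset(range(n))
subsets = [frozenset(c) for r in range(n + 1)
           for c in itertools.combinations(range(n), r)]
H = {t: entropy_of(pmf, t) for t in subsets}

# Item (b): f(T) = H(X_T | X_{T^c}) = H(X_Omega) - H(X_{T^c}).
cond_ent = {t: H[omega] - H[omega - t] for t in subsets}
# Item (c): f(T) = I(X_T; X_{T^c}) = H(X_T) - H(X_T | X_{T^c}).
mut_info = {t: H[t] - cond_ent[t] for t in subsets}

pairs = [(t, s) for t in subsets for s in subsets]
supermodular_b = all(gap(cond_ent, t, s) <= 1e-12 for t, s in pairs)
monotone_b = all(cond_ent[s] <= cond_ent[t] + 1e-12 for t, s in pairs if s <= t)
submodular_c = all(gap(mut_info, t, s) >= -1e-12 for t, s in pairs)
symmetric_c = all(abs(mut_info[t] - mut_info[omega - t]) < 1e-12 for t in subsets)
```

Since `mut_info` vanishes at both the empty set and the full set while being non-negative in between, the example also shows why the function in Item (c) cannot be monotonic unless it is identically zero.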

We next prove Item (d). Consider the set function *f* : 2<sup>V</sup> → ℝ in (18). We need to prove that *f* is submodular under the conditions in Item (d), where U, V ⊆ Ω are disjoint subsets and the entries of the random vector *X*<sub>V</sub> are conditionally independent given *X*<sub>U</sub>.

• *f*(∅) = I(*X*<sub>U</sub>; *X*<sub>∅</sub>) = 0.

• Submodularity: If S, T ⊆ V, then

$$f(\mathcal{T}\cup\mathcal{S}) + f(\mathcal{T}\cap\mathcal{S})$$

$$= \mathrm{I}(X\_{\mathcal{U}}; X\_{\mathcal{T}\cup\mathcal{S}}) + \mathrm{I}(X\_{\mathcal{U}}; X\_{\mathcal{T}\cap\mathcal{S}}) \tag{A7a}$$

$$= \left[\mathrm{H}(X\_{\mathcal{T}\cup\mathcal{S}}) - \mathrm{H}(X\_{\mathcal{T}\cup\mathcal{S}} \,|\, X\_{\mathcal{U}})\right] + \left[\mathrm{H}(X\_{\mathcal{T}\cap\mathcal{S}}) - \mathrm{H}(X\_{\mathcal{T}\cap\mathcal{S}} \,|\, X\_{\mathcal{U}})\right] \tag{A7b}$$

$$= \left[\mathrm{H}(X\_{\mathcal{T}\cup\mathcal{S}}) + \mathrm{H}(X\_{\mathcal{T}\cap\mathcal{S}})\right] - \left[\mathrm{H}(X\_{\mathcal{T}\cup\mathcal{S}} \,|\, X\_{\mathcal{U}}) + \mathrm{H}(X\_{\mathcal{T}\cap\mathcal{S}} \,|\, X\_{\mathcal{U}})\right] \tag{A7c}$$

$$\begin{aligned} &= \left[ \mathrm{H}(X\_{\mathcal{T}}) + \mathrm{H}(X\_{\mathcal{S}}) - \mathrm{I}(X\_{\mathcal{T}\backslash\mathcal{S}}; X\_{\mathcal{S}\backslash\mathcal{T}} \,|\, X\_{\mathcal{T}\cap\mathcal{S}}) \right] \\ &\quad - \left[ \mathrm{H}(X\_{\mathcal{T}\cup\mathcal{S}} \,|\, X\_{\mathcal{U}}) + \mathrm{H}(X\_{\mathcal{T}\cap\mathcal{S}} \,|\, X\_{\mathcal{U}}) \right], \end{aligned} \tag{A7d}$$

where equality (A7d) holds by the proof of Item (a) (see (A2)). By the assumption on the conditional independence of the random variables {*X*<sub>*v*</sub>}<sub>*v*∈V</sub> given *X*<sub>U</sub>, we get

$$\mathrm{H}(X\_{\mathcal{T}\cup\mathcal{S}} \,|\, X\_{\mathcal{U}}) + \mathrm{H}(X\_{\mathcal{T}\cap\mathcal{S}} \,|\, X\_{\mathcal{U}}) = \sum\_{\omega \in \mathcal{T} \cup \mathcal{S}} \mathrm{H}(X\_{\omega} \,|\, X\_{\mathcal{U}}) + \sum\_{\omega \in \mathcal{T} \cap \mathcal{S}} \mathrm{H}(X\_{\omega} \,|\, X\_{\mathcal{U}}) \tag{A8a}$$

$$= \sum\_{\omega\in\mathcal{T}} \mathrm{H}(X\_{\omega} \,|\, X\_{\mathcal{U}}) + \sum\_{\omega\in\mathcal{S}} \mathrm{H}(X\_{\omega} \,|\, X\_{\mathcal{U}}) \tag{A8b}$$

$$= \mathrm{H}(X\_{\mathcal{T}} \,|\, X\_{\mathcal{U}}) + \mathrm{H}(X\_{\mathcal{S}} \,|\, X\_{\mathcal{U}}).\tag{A8c}$$

Consequently, combining (A7) and (A8) gives

$$\begin{aligned} f(\mathcal{T}\cup\mathcal{S}) + f(\mathcal{T}\cap\mathcal{S}) &= \left[\mathrm{H}(X\_{\mathcal{T}}) + \mathrm{H}(X\_{\mathcal{S}}) - \mathrm{I}(X\_{\mathcal{T}\backslash\mathcal{S}}; X\_{\mathcal{S}\backslash\mathcal{T}} \,|\, X\_{\mathcal{T}\cap\mathcal{S}})\right] \\ &\quad - \left[\mathrm{H}(X\_{\mathcal{T}} \,|\, X\_{\mathcal{U}}) + \mathrm{H}(X\_{\mathcal{S}} \,|\, X\_{\mathcal{U}})\right] \end{aligned} \tag{A9a}$$

$$\begin{aligned} &= \left[\mathrm{H}(X\_{\mathcal{T}}) - \mathrm{H}(X\_{\mathcal{T}} \,|\, X\_{\mathcal{U}})\right] + \left[\mathrm{H}(X\_{\mathcal{S}}) - \mathrm{H}(X\_{\mathcal{S}} \,|\, X\_{\mathcal{U}})\right] \\ &\quad - \mathrm{I}(X\_{\mathcal{T}\backslash\mathcal{S}}; X\_{\mathcal{S}\backslash\mathcal{T}} \,|\, X\_{\mathcal{T}\cap\mathcal{S}}) \end{aligned} \tag{A9b}$$

$$= \mathrm{I}(X\_{\mathcal{T}}; X\_{\mathcal{U}}) + \mathrm{I}(X\_{\mathcal{S}}; X\_{\mathcal{U}}) - \mathrm{I}(X\_{\mathcal{T}\backslash\mathcal{S}}; X\_{\mathcal{S}\backslash\mathcal{T}} \,|\, X\_{\mathcal{T}\cap\mathcal{S}}) \tag{A9c}$$

$$= f(\mathcal{T}) + f(\mathcal{S}) - \mathrm{I}(X\_{\mathcal{T}\backslash\mathcal{S}}; X\_{\mathcal{S}\backslash\mathcal{T}} \,|\, X\_{\mathcal{T}\cap\mathcal{S}}) \tag{A9d}$$

$$\le f(\mathcal{T}) + f(\mathcal{S}),\tag{A9e}$$

where the inequality (A9e) holds with equality if and only if *X*<sub>T∖S</sub> and *X*<sub>S∖T</sub> are conditionally independent given *X*<sub>T∩S</sub>.

• Monotonicity: If S ⊆ T ⊆ V, then

$$f(\mathcal{S}) = \mathrm{I}(X\_{\mathcal{U}}; X\_{\mathcal{S}}) \le \mathrm{I}(X\_{\mathcal{U}}; X\_{\mathcal{T}}) = f(\mathcal{T}),\tag{A10}$$

so *f* is monotonically increasing.
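The role of the conditional-independence assumption in Item (d) can be seen numerically. The sketch below assumes that *X*<sub>U</sub> is a single variable stored on axis 0 of a joint pmf array and that the variables indexed by V occupy the remaining axes; all helper names and both distributions are illustrative assumptions. The first distribution satisfies the assumption of Item (d); the second (a bit equal to the XOR of two independent fair bits) does not, and submodularity fails for it.

```python
import itertools
import numpy as np

def entropy_of(pmf, subset):
    """Joint entropy (in bits) of the coordinates in `subset` of a joint pmf array."""
    drop = tuple(i for i in range(pmf.ndim) if i not in subset)
    marginal = pmf.sum(axis=drop) if drop else pmf
    p = np.asarray(marginal).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def mutual_info_table(pmf, u_axis, v_axes):
    """Map each T within v_axes to I(X_U; X_T) = H(X_U) + H(X_T) - H(X_U, X_T)."""
    u = frozenset({u_axis})
    table = {}
    for r in range(len(v_axes) + 1):
        for c in itertools.combinations(v_axes, r):
            t = frozenset(c)
            table[t] = entropy_of(pmf, u) + entropy_of(pmf, t) - entropy_of(pmf, u | t)
    return table

def min_gap(f):
    """Smallest value of f(T) + f(S) - f(T | S) - f(T & S) over all pairs."""
    return min(f[t] + f[s] - f[t | s] - f[t & s] for t in f for s in f)

# Hypothetical case 1: X_1, X_2, X_3 conditionally independent given X_U.
rng = np.random.default_rng(2)
p_u = rng.dirichlet(np.ones(3))                                # pmf of X_U
cond = [rng.dirichlet(np.ones(2), size=3) for _ in range(3)]   # p(x_v | x_u), rows: x_u
ci_pmf = np.einsum('u,ua,ub,uc->uabc', p_u, *cond)
f_ci = mutual_info_table(ci_pmf, 0, (1, 2, 3))
monotone_ci = all(f_ci[s] <= f_ci[t] + 1e-12 for t in f_ci for s in f_ci if s <= t)

# Hypothetical case 2: X_U = X_1 XOR X_2 for independent fair bits X_1, X_2,
# so X_1 and X_2 are NOT conditionally independent given X_U.
xor_pmf = np.zeros((2, 2, 2))
for x1, x2 in itertools.product(range(2), repeat=2):
    xor_pmf[x1 ^ x2, x1, x2] = 0.25
f_xor = mutual_info_table(xor_pmf, 0, (1, 2))
```

In the XOR case each bit alone carries no information about *X*<sub>U</sub>, while the pair determines it, so the gap for the two singletons equals −1 bit.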

We finally prove Item (e), where we need to show that the entropy of a sum of independent random variables is a rank function. Let *f* : 2<sup>Ω</sup> → ℝ be the set function given in (19).


• Submodularity: Let S, T ⊆ Ω, and define

$$U \stackrel{\triangle}{=} \sum\_{\omega \in \mathcal{T} \cap \mathcal{S}} X\_{\omega}, \quad V \stackrel{\triangle}{=} \sum\_{\omega \in \mathcal{S} \backslash \mathcal{T}} X\_{\omega}, \quad W \stackrel{\triangle}{=} \sum\_{\omega \in \mathcal{T} \backslash \mathcal{S}} X\_{\omega}. \tag{A11}$$

From the independence of the random variables {*X*<sub>*ω*</sub>}<sub>*ω*∈Ω</sub>, it follows that *U*, *V* and *W* are independent. Hence, we get

$$\left[f(\mathcal{T}) + f(\mathcal{S})\right] - \left[f(\mathcal{T} \cup \mathcal{S}) + f(\mathcal{T} \cap \mathcal{S})\right]$$

$$= \left[ f(\mathcal{T}) - f(\mathcal{T} \cap \mathcal{S}) \right] - \left[ f(\mathcal{T} \cup \mathcal{S}) - f(\mathcal{S}) \right] \tag{A12a}$$

$$= \left[ \mathrm{H}(U + W) - \mathrm{H}(U) \right] - \left[ \mathrm{H}(U + V + W) - \mathrm{H}(U + V) \right] \tag{A12b}$$

$$= \left[ \mathrm{H}(U + W) - \mathrm{H}(U + W \,|\, W) \right] - \left[ \mathrm{H}(U + V + W) - \mathrm{H}(U + V) \right] \tag{A12c}$$

$$= \left[ \mathrm{H}(U + W) - \mathrm{H}(U + W \,|\, W) \right] - \left[ \mathrm{H}(U + V + W) - \mathrm{H}(U + V + W \,|\, W) \right] \tag{A12d}$$

$$= \mathrm{I}(U + W; W) - \mathrm{I}(U + V + W; W) \tag{A12e}$$

$$\ge \mathrm{I}(U + W; W) - \mathrm{I}(U + W, V; W),\tag{A12f}$$

and

$$\mathrm{I}(U + W, V; W) = \mathrm{I}(V; W) + \mathrm{I}(U + W; W \,|\, V) \tag{A13a}$$

$$= \mathrm{I}(U + W; W \,|\, V) \tag{A13b}$$

$$= \mathrm{I}(U + W; W).\tag{A13c}$$

Combining (A12) and (A13) shows that the right-hand side of (A12f) is zero, which gives (11).

• Monotonicity: If S ⊆ T ⊆ Ω, then since {*X*<sub>*ω*</sub>}<sub>*ω*∈Ω</sub> are independent random variables, (A11) implies that *U* and *W* are independent and *V* = 0. Hence,

$$f(\mathcal{T}) - f(\mathcal{S}) = \mathrm{H}(U + W) - \mathrm{H}(U)\tag{A14a}$$

$$= \mathrm{H}(U + W) - \mathrm{H}(U + W \,|\, W)\tag{A14b}$$

$$= \mathrm{I}(U + W; W) \ge 0. \tag{A14c}$$
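Item (e) can likewise be illustrated numerically. The sketch below assumes independent random variables supported on {0, ..., *k* − 1}, so that the pmf of a sum is the convolution of the individual pmfs; the helper names and the particular pmfs are illustrative assumptions.

```python
import itertools
import numpy as np

def entropy(p):
    """Shannon entropy (in bits) of a pmf given as a 1-D array."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def sum_entropy(pmfs, subset):
    """Entropy of the sum of the independent variables indexed by `subset`."""
    total = np.array([1.0])                  # pmf of the empty sum (the constant 0)
    for i in sorted(subset):
        total = np.convolve(total, pmfs[i])  # independence: pmf of a sum is a convolution
    return entropy(total)

# Hypothetical independent variables with supports {0, ..., k-1}.
rng = np.random.default_rng(3)
pmfs = [rng.dirichlet(np.ones(k)) for k in (2, 3, 4, 2)]

n = len(pmfs)
subsets = [frozenset(c) for r in range(n + 1)
           for c in itertools.combinations(range(n), r)]
f = {t: sum_entropy(pmfs, t) for t in subsets}

submodular = all(f[t] + f[s] - f[t | s] - f[t & s] >= -1e-12
                 for t in subsets for s in subsets)
monotone = all(f[s] <= f[t] + 1e-12
               for t in subsets for s in subsets if s <= t)
```

Both flags evaluate the inequalities of the proof over all pairs of subsets, with a small tolerance for floating-point rounding.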

This completes the proof of Proposition 1.

#### **Appendix B. Proof of Proposition 4**

**Lemma A1.** *Let* {B<sub>*j*</sub>}<sup>ℓ</sup><sub>*j*=1</sub> *(with* ℓ ≥ 2*) be a sequence of sets that is not a chain (i.e., there is no permutation π*: [ℓ] → [ℓ] *such that* B<sub>*π*(1)</sub> ⊆ B<sub>*π*(2)</sub> ⊆ ... ⊆ B<sub>*π*(ℓ)</sub>*). Consider a recursive process where, at each step, a pair of sets that are not related by inclusion is replaced with their intersection and union. Then, there exists such a recursive process that leads to a chain in a finite number of steps.*

**Proof.** The lemma is proved by mathematical induction on ℓ. It holds for ℓ = 2 since B<sub>1</sub> ∩ B<sub>2</sub> ⊆ B<sub>1</sub> ∪ B<sub>2</sub>, and the process halts in a single step. Suppose that the lemma holds for a fixed ℓ ≥ 2 and for every sequence of ℓ sets that is not a chain. We aim to show that it also holds for every sequence of ℓ + 1 sets that is not a chain. Let {B<sub>*j*</sub>}<sup>ℓ+1</sup><sub>*j*=1</sub> be such an arbitrary sequence of sets, and consider the subsequence of the first ℓ sets B<sub>1</sub>, ..., B<sub>ℓ</sub>. If it is not a chain, then (by the induction hypothesis) there exists a recursive process as above that transforms it into a chain in a finite number of steps, i.e., we get a chain B′<sub>1</sub> ⊆ B′<sub>2</sub> ⊆ ... ⊆ B′<sub>ℓ</sub> (if the subsequence is already a chain, we only relabel its sets in this order). If B′<sub>ℓ</sub> ⊆ B<sub>ℓ+1</sub> or B<sub>ℓ+1</sub> ⊆ B′<sub>1</sub>, then we get a chain of ℓ + 1 sets. Otherwise, by proceeding with the recursive process where B′<sub>ℓ</sub> and B<sub>ℓ+1</sub> are replaced with their intersection and union, consider the sequence

$$
\mathcal{B}'\_{1}, \ldots, \mathcal{B}'\_{\ell-1}, \; \mathcal{B}'\_{\ell} \cap \mathcal{B}\_{\ell+1}, \; \mathcal{B}'\_{\ell} \cup \mathcal{B}\_{\ell+1}. \tag{A15}
$$

By the induction hypothesis, the first ℓ sets in this sequence can be transformed into a chain (in a finite number of steps) by a recursive process as above; this gives a chain of the form B″<sub>1</sub> ⊆ B″<sub>2</sub> ⊆ ... ⊆ B″<sub>ℓ−1</sub> ⊆ B″<sub>ℓ</sub>. The first ℓ sets in (A15) are all included in B′<sub>ℓ</sub>, so every combination of unions and intersections of these sets is also included in B′<sub>ℓ</sub>. Hence, the considered recursive process leads to a chain of the form

$$\mathcal{B}\_1^{\prime\prime} \subseteq \mathcal{B}\_2^{\prime\prime} \subseteq \ldots \subseteq \mathcal{B}\_{\ell-1}^{\prime\prime} \subseteq \mathcal{B}\_\ell^{\prime\prime} \subseteq \mathcal{B}\_\ell^{\prime} \cup \mathcal{B}\_{\ell+1}, \tag{A16}$$

where the last inclusion in (A16) holds since B″<sub>ℓ</sub> ⊆ B′<sub>ℓ</sub>. The claim thus holds for ℓ + 1 whenever it holds for a given ℓ, and it holds for ℓ = 2; by mathematical induction, it therefore holds for all integers ℓ ≥ 2.
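The recursive process of Lemma A1 can be sketched as a short routine. The version below is an illustrative assumption rather than the specific process constructed in the proof: it always uncrosses the first pair of incomparable sets it finds. For finite sets this greedy choice also terminates, since replacing incomparable sets A and B by A ∩ B and A ∪ B increases the sum of squared cardinalities by 2·|A∖B|·|B∖A| > 0. The function name `uncross_to_chain` and the example family are hypothetical.

```python
from itertools import combinations

def uncross_to_chain(sets):
    """Replace incomparable pairs by (intersection, union) until the family is a chain.

    Returns the chain, sorted by cardinality, and the number of replacement steps.
    """
    family = [frozenset(s) for s in sets]
    steps = 0
    while True:
        # First pair of indices whose sets are not related by inclusion, if any.
        pair = next(((i, j) for i, j in combinations(range(len(family)), 2)
                     if not (family[i] <= family[j] or family[j] <= family[i])),
                    None)
        if pair is None:
            return sorted(family, key=len), steps
        i, j = pair
        family[i], family[j] = family[i] & family[j], family[i] | family[j]
        steps += 1

chain, steps = uncross_to_chain([{1, 2}, {2, 3}, {3, 4}])
print(chain, steps)
```

For this input the routine performs three replacements and ends with the chain ∅ ⊆ {2, 3} ⊆ {1, 2, 3, 4}.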

We first prove Proposition 4a. Suppose that there is a permutation *π*: [*M*] → [*M*] such that S<sub>*π*(1)</sub> ⊆ S<sub>*π*(2)</sub> ⊆ ... ⊆ S<sub>*π*(*M*)</sub> is a chain. Since every element in Ω is included in at least *d* of these subsets, it must be included in (at least) the *d* largest sets of this chain, so S<sub>*π*(*j*)</sub> = Ω for every *j* ∈ [*M* − *d* + 1 : *M*]. Due to the non-negativity of *f*, it follows that

$$\sum\_{j=1}^{M} f(\mathcal{S}\_{j}) \ge \sum\_{j=M-d+1}^{M} f(\mathcal{S}\_{\pi(j)}) \tag{A17a}$$

$$=d\,f(\Omega).\tag{A17b}$$

Otherwise, if we cannot get a chain by possibly permuting the subsets in the sequence {S<sub>*j*</sub>}<sup>*M*</sup><sub>*j*=1</sub>, consider a pair of subsets S<sub>*n*</sub> and S<sub>*m*</sub> that are not related by inclusion, and replace them with their intersection and union. By the submodularity of *f*,

$$\sum\_{j=1}^{M} f(\mathcal{S}\_{j}) = \sum\_{j \neq n, m} f(\mathcal{S}\_{j}) + f(\mathcal{S}\_{n}) + f(\mathcal{S}\_{m}) \tag{A18a}$$

$$\geq \sum\_{j \neq n, m} f(\mathcal{S}\_j) + f(\mathcal{S}\_n \cap \mathcal{S}\_m) + f(\mathcal{S}\_n \cup \mathcal{S}\_m). \tag{A18b}$$

For all *ω* ∈ Ω, let deg(*ω*) be the number of indices *j* ∈ [*M*] such that *ω* ∈ S<sub>*j*</sub>. By replacing S<sub>*n*</sub> and S<sub>*m*</sub> with S<sub>*n*</sub> ∩ S<sub>*m*</sub> and S<sub>*n*</sub> ∪ S<sub>*m*</sub>, the set of values {deg(*ω*)}<sub>*ω*∈Ω</sub> stays unaffected (indeed, if *ω* ∈ S<sub>*n*</sub> and *ω* ∈ S<sub>*m*</sub>, then it belongs to both their intersection and their union; if *ω* belongs to only one of the sets S<sub>*n*</sub> and S<sub>*m*</sub>, then *ω* ∉ S<sub>*n*</sub> ∩ S<sub>*m*</sub> and *ω* ∈ S<sub>*n*</sub> ∪ S<sub>*m*</sub>; finally, if *ω* ∉ S<sub>*n*</sub> and *ω* ∉ S<sub>*m*</sub>, then it belongs to neither their intersection nor their union). Now, consider the recursive process in Lemma A1. Since the profile of the number of inclusions of the elements in Ω is preserved in each step of this recursive process, every element in Ω still belongs to at least *d* sets in the chain that is obtained at the end of the process. Moreover, in light of (A18), the sum on the left-hand side of (A18) cannot increase in any step of the recursive process. Inequality (45) therefore follows from the earlier part of the proof for a chain (see (A17)).
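The two invariants used in this argument (the degree profile is preserved, and the sum of *f* does not increase) can be traced on a small example. The sketch below assumes an illustrative non-negative submodular function, the square root of the cardinality, and a hypothetical family consisting of all 2-element subsets of a 4-element ground set, so that every element lies in *d* = 3 sets; none of these choices comes from the original text.

```python
from itertools import combinations
from math import sqrt

def f(s):
    """Illustrative non-negative submodular function: a concave function of |S|."""
    return sqrt(len(s))

def degrees(family, ground):
    """deg(w): the number of sets in the family that contain w."""
    return {w: sum(w in s for s in family) for w in ground}

def uncross_trace(sets):
    """Uncross incomparable pairs; return the final chain and the sum of f after each step."""
    family = [frozenset(s) for s in sets]
    sums = [sum(f(s) for s in family)]
    while True:
        pair = next(((i, j) for i, j in combinations(range(len(family)), 2)
                     if not (family[i] <= family[j] or family[j] <= family[i])),
                    None)
        if pair is None:
            return sorted(family, key=len), sums
        i, j = pair
        family[i], family[j] = family[i] & family[j], family[i] | family[j]
        sums.append(sum(f(s) for s in family))

ground = frozenset(range(4))
family = [frozenset(c) for c in combinations(ground, 2)]   # every element lies in d = 3 sets
d = 3
chain, sums = uncross_trace(family)
```

In this example the degree profile forces the three largest sets of the final chain to equal the ground set, so the final sum equals *d*·*f*(Ω), which the initial sum dominates as in (A17) and (A18).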

We next prove Proposition 4b. Let A ⊂ Ω, and suppose that every element in A is included in at least *d* ≥ 1 of the subsets {S<sub>*j*</sub>}<sup>*M*</sup><sub>*j*=1</sub>. For all *j* ∈ [*M*], define S′<sub>*j*</sub> ≜ S<sub>*j*</sub> ∩ A, and consider the sequence {S′<sub>*j*</sub>}<sup>*M*</sup><sub>*j*=1</sub> of subsets of A. If *f* is a rank function, then it is monotonically increasing, which yields

$$f(\mathcal{S}\_j') \le f(\mathcal{S}\_j), \qquad j \in [M]. \tag{A19}$$

Each element of A is also included in at least *d* of the subsets {S′<sub>*j*</sub>}<sup>*M*</sup><sub>*j*=1</sub> (by construction, and since (by assumption) each element in A is included in at least *d* of the subsets {S<sub>*j*</sub>}<sup>*M*</sup><sub>*j*=1</sub>). By the non-negativity and submodularity of *f*, Proposition 4a gives

$$\sum\_{j=1}^{M} f(\mathcal{S}\_j') \ge df(\mathcal{A}).\tag{A20}$$

Combining (A19) and (A20) yields (46). This completes the proof of Proposition 4.
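The restriction step in the proof of Proposition 4b can be illustrated with a toy computation. The sketch below assumes the rank function *f*(S) = min(|S|, 2) (the rank of a uniform matroid) and a hypothetical family of four subsets of {0, ..., 4}; these choices are illustrative only.

```python
def rank(s):
    """Illustrative rank function: f(S) = min(|S|, 2), the rank of a uniform matroid."""
    return min(len(s), 2)

a = frozenset({0, 1, 2})                                   # the subset A of the ground set
family = [frozenset({0, 1, 3}), frozenset({1, 2, 4}),
          frozenset({0, 2}), frozenset({3, 4})]

# Every element of A lies in at least d sets of the family.
d = min(sum(w in s for s in family) for w in a)

restricted = [s & a for s in family]                       # S'_j = S_j ∩ A
full_sum = sum(rank(s) for s in family)                    # sum of f(S_j)
restricted_sum = sum(rank(s) for s in restricted)          # sum of f(S'_j)
bound = d * rank(a)                                        # d f(A)

print(d, full_sum, restricted_sum, bound)
```

Here the chain of inequalities from (A19) and (A20) reads 8 ≥ 6 ≥ 4.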

**Remark A1.** *Lemma A1 is weaker than a claim that, in every recursive process as in Lemma A1, the number of pairs of sets that are not related by inclusion is strictly decreasing at each step. Lemma A1 is, however, sufficient for our proof of Proposition 4a.*

#### **References**

