Case (b)

Next, we consider a situation in which both subsystems are updated by extremely constraining information: when the subsystems are treated separately, *q*1(*i*1) is updated to *P*1(*i*1) and *q*2(*i*2) is updated to *P*2(*i*2). When the systems are treated jointly, we require that the joint prior for the combined system *q*1(*i*1)*q*2(*i*2) be updated to *P*1(*i*1)*P*2(*i*2).

First we treat the subsystems separately. Maximize the entropy of subsystem 1,

$$S[p\_1, q\_1] = \sum\_{i\_1} F\left(p\_1(i\_1), q\_1(i\_1)\right) \quad \text{subject to} \quad p\_1(i\_1) = P\_1(i\_1) \,.$$

To each constraint—one constraint for each value of *i*1—we must supply one Lagrange multiplier, *λ*1(*i*1). Then, we obtain

$$\delta \left[ S - \sum\_{i\_1} \lambda\_1(i\_1) \left( p\_1(i\_1) - P\_1(i\_1) \right) \right] = 0 \,.$$

Using Equation (A7),

$$\frac{\partial S}{\partial p\_1} = \frac{\partial}{\partial p\_1} F(p\_1, q\_1) = \phi \left(\frac{p\_1}{q\_1}\right) \,,$$

and, imposing that the selected posterior be *P*1(*i*1), we find that the function *φ* must obey

$$
\phi \left( \frac{P\_1(i\_1)}{q\_1(i\_1)} \right) = \lambda\_1(i\_1) \,. \tag{A8}
$$

Similarly, for system 2 we find the following:

$$
\phi \left( \frac{P\_2(i\_2)}{q\_2(i\_2)} \right) = \lambda\_2(i\_2) \,. \tag{A9}
$$

Next, we treat the two subsystems jointly. Maximize the entropy of the joint system as follows:

$$S[p, q] = \sum\_{i\_1, i\_2} F\left(p(i\_1, i\_2), q\_1(i\_1)q\_2(i\_2)\right)$$

subject to the following constraints on the joint distribution *p*(*i*1, *i*2):

$$
\sum\_{i\_2} p(i\_1, i\_2) = P\_1(i\_1) \qquad \text{and} \qquad \sum\_{i\_1} p(i\_1, i\_2) = P\_2(i\_2) \,.
$$
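The two marginal constraints do not, on their own, single out the product *P*1(*i*1)*P*2(*i*2): correlated joint distributions can share the same marginals. The following is a minimal sketch of that point. The numerical values and the helper names `marginals` and `satisfies_constraints` are hypothetical, chosen only for illustration.

```python
# Hypothetical target marginals, for illustration only.
P1 = [0.3, 0.7]
P2 = [0.6, 0.4]

def marginals(p):
    """Return (sums over i2, sums over i1) of a joint table p[i1][i2]."""
    rows = [sum(row) for row in p]
    cols = [sum(p[i][j] for i in range(len(p))) for j in range(len(p[0]))]
    return rows, cols

def satisfies_constraints(p, tol=1e-12):
    """Check both marginal constraints on the joint distribution."""
    rows, cols = marginals(p)
    return (all(abs(r - t) < tol for r, t in zip(rows, P1))
            and all(abs(c - t) < tol for c, t in zip(cols, P2)))

# The product distribution P1(i1) * P2(i2).
product = [[x * y for y in P2] for x in P1]

# A correlated joint with the same marginals: move probability eps
# onto the diagonal and take it from the off-diagonal entries.
eps = 0.05
correlated = [[product[0][0] + eps, product[0][1] - eps],
              [product[1][0] - eps, product[1][1] + eps]]

print(satisfies_constraints(product), satisfies_constraints(correlated))
```

Both tables pass the check, so it is the form of the entropy, not the constraints, that must be responsible for selecting the product.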

Again, there is one constraint for each value of *i*1 and of *i*2, and we introduce Lagrange multipliers *η*1(*i*1) and *η*2(*i*2). Then,

$$\delta \left[ S - \sum\_{i\_1} \eta\_1(i\_1) \left( \sum\_{i\_2} p(i\_1, i\_2) - P\_1(i\_1) \right) - \{1 \leftrightarrow 2\} \right] = 0 \,,$$

where {1 ↔ 2} indicates a third term, similar to the second, with 1 and 2 interchanged. The independent variations *δp*(*i*1, *i*2) yield

$$\phi\left(\frac{p(i\_1, i\_2)}{q\_1(i\_1)q\_2(i\_2)}\right) = \eta\_1(i\_1) + \eta\_2(i\_2) \,,$$

and we impose that the selected posterior be the product *P*1(*i*1)*P*2(*i*2). Therefore, the function *φ* must be such that

$$
\phi \left( \frac{P\_1 P\_2}{q\_1 q\_2} \right) = \eta\_1 + \eta\_2 \,.
$$
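The condition above says that *φ*, evaluated on the ratio *P*1*P*2/(*q*1*q*2), must split into a sum of a function of *i*1 and a function of *i*2. A table *M* has this additive form exactly when *M*[i][j] − *M*[i][0] − *M*[0][j] + *M*[0][0] vanishes for all entries. The sketch below uses this test on one hypothetical set of priors and posteriors. The helper `separability_defect` and the two trial functions are assumptions for illustration, and a single numerical instance is only a spot check, not a proof.

```python
import math

# Hypothetical priors and posteriors, for illustration only.
q1, P1 = [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]
q2, P2 = [0.6, 0.4], [0.1, 0.9]

def separability_defect(phi):
    """Largest violation of M[i][j] - M[i][0] - M[0][j] + M[0][0] = 0,
    where M[i][j] = phi(P1[i] * P2[j] / (q1[i] * q2[j])).
    The defect is zero exactly when M[i][j] = eta1[i] + eta2[j]."""
    M = [[phi(P1[i] * P2[j] / (q1[i] * q2[j])) for j in range(len(q2))]
         for i in range(len(q1))]
    return max(abs(M[i][j] - M[i][0] - M[0][j] + M[0][0])
               for i in range(len(q1)) for j in range(len(q2)))

a, b = -1.0, 0.7  # arbitrary constants
log_defect = separability_defect(lambda x: a * math.log(x) + b)
linear_defect = separability_defect(lambda x: x)  # trial phi(x) = x

print(log_defect, linear_defect)
```

In this instance the logarithmic trial function gives a defect at the level of rounding error, while the linear trial function gives a defect of order one.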

To solve this equation, we take the exponential of both sides, let *ξ* = exp *φ*, and rewrite as

$$
\xi \left( \frac{P\_1 P\_2}{q\_1 q\_2} \right) e^{-\eta\_2(i\_2)} = e^{\eta\_1(i\_1)} \,. \tag{A10}
$$

This shows that for any value of *i*1, the dependences of the LHS on *i*2 through *P*2/*q*2 and *η*2 must cancel each other out. In particular, if for some subset of values of *i*2 subsystem 2 is updated so that *P*2 = *q*2, which amounts to no update at all, then the *i*2 dependence on the left is eliminated but the *i*1 dependence remains unaffected,

$$
\xi\left(\frac{P\_1}{q\_1}\right)e^{-\eta\_2'} = e^{\eta\_1(i\_1)} \,,
$$

where *η*′2 is some constant independent of *i*2. A similar argument with {1 ↔ 2} yields

$$
\xi \left( \frac{P\_2}{q\_2} \right) e^{-\eta\_1'} = e^{\eta\_2(i\_2)} \,,
$$

where *η*′1 is a constant. Taking the exponential of (A8) and (A9) leads to the following:

$$
\xi \left( \frac{P\_1}{q\_1} \right) e^{-\eta\_2'} = e^{\lambda\_1 - \eta\_2'} = e^{\eta\_1} \quad \text{and} \quad \xi \left( \frac{P\_2}{q\_2} \right) e^{-\eta\_1'} = e^{\lambda\_2 - \eta\_1'} = e^{\eta\_2} \,.
$$

Substituting back into (A10), we obtain

$$
\xi \left( \frac{P\_1 P\_2}{q\_1 q\_2} \right) = \xi \left( \frac{P\_1}{q\_1} \right) \xi \left( \frac{P\_2}{q\_2} \right) \,,
$$

where a constant factor *e*<sup>−(*η*′1 + *η*′2)</sup> is absorbed into a new function *ξ*. The general solution of this functional equation is a power,

$$
\xi(xy) = \xi(x)\xi(y) \Longrightarrow \xi(x) = x^a,
$$

so that

$$
\phi(x) = a \log x + b \,,
$$

where *a* and *b* are constants. Finally, integrate (A7),

$$\frac{\partial F}{\partial p} = \phi \left( \frac{p}{q} \right) = a \log \frac{p}{q} + b \,,$$

to obtain

$$F[p,q] = ap\log\frac{p}{q} + b'p + c \,,$$

where *b*′ and *c* are constants.
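The last two steps can be spot-checked numerically. Integrating *a* log(*p*/*q*) with respect to *p* gives *a p* log(*p*/*q*) − *a p*, so the constant multiplying *p* works out to *b*′ = *b* − *a*. The sketch below checks, for arbitrarily chosen constants, that a power law satisfies the multiplicative functional equation and that a finite-difference derivative of *F* reproduces *φ*(*p*/*q*). The sample points, the step size `h`, and the function names are assumptions for illustration.

```python
import math

a, b, c = -1.0, 0.3, 2.0  # arbitrary constants for the check
b_prime = b - a           # absorbs the -a*p produced by integrating a*log(p/q)

def xi(x):
    """Power-law candidate xi(x) = x**a for xi(x*y) = xi(x) * xi(y)."""
    return x ** a

def phi(x):
    return a * math.log(x) + b

def F(p, q):
    return a * p * math.log(p / q) + b_prime * p + c

# 1. The power law satisfies the multiplicative functional equation.
pairs = [(0.5, 2.0), (1.5, 0.25), (3.0, 0.1)]
product_rule_error = max(abs(xi(x * y) - xi(x) * xi(y)) for x, y in pairs)

# 2. A central finite difference of F in p reproduces phi(p / q).
def derivative_error(p, q, h=1e-6):
    numeric = (F(p + h, q) - F(p - h, q)) / (2 * h)
    return abs(numeric - phi(p / q))

integration_error = max(derivative_error(p, q)
                        for p in (0.1, 0.4, 0.8) for q in (0.2, 0.5))

print(product_rule_error, integration_error)
```

Both errors should be small compared with the quantities involved, limited only by floating-point rounding and the finite-difference step.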

At this point, the entropy takes the general form

$$S[p, q] = \sum\_i \left( a p\_i \log \frac{p\_i}{q\_i} + b^\prime p\_i + c \right).$$

The additive constant *c* may be dropped: it contributes a term that does not depend on the probabilities and has no effect on the ranking scheme. Furthermore, since *S*[*p*, *q*] will be maximized subject to constraints that include normalization, the *b*′ term contributes only a constant, has no effect on the selected distribution, and can also be dropped. Finally, the multiplicative constant *a* has no effect on the overall ranking, except in the trivial sense that inverting the sign of *a* will transform the maximization problem to a minimization problem or vice versa. We can, therefore, set *a* = −1 so that maximum *S* corresponds to maximum preference, which gives us Equation (9) and concludes our derivation.
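The role of the three constants can be illustrated with a small ranking experiment. The sketch below ranks a few normalized candidate distributions by *S*[*p*, *q*] for different choices of *a*, *b*′ and *c*. The prior, the candidates, and the helper names `entropy` and `ranking` are hypothetical and serve only as an illustration of the argument.

```python
import math

q = [0.25, 0.25, 0.5]  # hypothetical prior
candidates = {
    "prior": [0.25, 0.25, 0.5],
    "mild": [0.3, 0.3, 0.4],
    "sharp": [0.7, 0.2, 0.1],
}

def entropy(p, a=-1.0, b_prime=0.0, c=0.0):
    """General form S[p, q] = sum_i (a p_i log(p_i / q_i) + b' p_i + c)."""
    return sum(a * pi * math.log(pi / qi) + b_prime * pi + c
               for pi, qi in zip(p, q))

def ranking(**constants):
    """Candidate names ordered from most to least preferred (largest S first)."""
    return sorted(candidates,
                  key=lambda name: entropy(candidates[name], **constants),
                  reverse=True)

canonical = ranking()                           # a = -1, b' = c = 0
shifted = ranking(a=-1.0, b_prime=3.0, c=-2.0)  # b' and c only shift S
rescaled = ranking(a=-5.0)                      # same sign of a
flipped = ranking(a=2.0)                        # opposite sign of a

print(canonical, shifted, rescaled, flipped)
```

With *a* = −1 the prior itself ranks highest among these candidates. Changing *b*′, *c*, or the magnitude of *a* leaves the order unchanged, while changing the sign of *a* reverses it.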

#### **References and Notes**

