**1. Introduction**

Quantum tomography [1] allows us to associate a unique quantum state with a system over a finite-dimensional Hilbert space, provided that multiple copies of the quantum system are available together with a complete set of measurements. As the number of degrees of freedom increases, the resources required to perform these measurements grow exponentially. However, physically relevant phenomena are entirely determined by few-body correlations (their Hamiltonians are in general highly local [2]), and when we restrict ourselves to *k*-order dependencies, data collection enjoys an exponential speed-up in the number of subsystems, leading to efficient tomography techniques [3]. Clearly, a partial dataset admits many compatible density operators. The overlap between (quantum) statistical mechanics and quantum information theory provides a well-established tool, entropy maximization, for dealing with the remaining degrees of freedom. By using the von Neumann entropy within Jaynes' principle [4], we define a criterion to estimate density operators that are maximally unbiased with regard to the provided partial information.

Problem statement.

A question that naturally arises is the following: is there an efficient and effective procedure for inferring the aforementioned quantum state? More concretely, is it possible to find a density operator describing a finite-dimensional multipartite quantum system that maximizes the von Neumann entropy under the constraints given by its few-body marginals? In this work, we focus on this problem for the case of direct correlations, that is, 2-body marginals.

The problem we address is closely related to the (quantum) Hamiltonian learning problem [5–7], since every density operator is thermal for some Hamiltonian. In general, the Hamiltonian is given, and one tries to find out its properties, so the problem of its characterization is not well explored. Recent developments in (quantum) machine learning techniques [8] renewed the interest in the Hamiltonian learning problem. In [9], an effective neural-network approach to the problem has been proposed, and an upper bound, which is polynomial in the number of qudits, has been established for its sample complexity [10].

**Citation:** Di Giorgio, S.; Mateus, P. On the Complexity of Finding the Maximum Entropy Compatible Quantum State. *Mathematics* **2021**, *9*, 193. https://doi.org/10.3390/math9020193

Received: 15 December 2020 Accepted: 15 January 2021 Published: 19 January 2021

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

One of the main reasons for the limited background on the problem at hand lies in the computational hardness of well-known problems that reduce to it. The first is the *quantum marginal problem* [11,12], which consists of determining whether a set of marginal quantum states has a global density operator compatible with them, and for which a solution is known only in some particular cases [13–15]. The second is the classical problem of inferring a probability distribution via graphical models [16], which also leads to a maximum entropy estimation. Density operators naturally encompass classical probability distributions in the finite-dimensional setting; therefore, when considering direct correlations between the subsystems, the hardness results for classical graph inference must be taken into account. In particular, the classical problem is well known to be computationally hard [17,18]. The only case for which a polynomial procedure is known is when the direct correlations have the structure of a tree (an undirected acyclic graph); moreover, for this case there exists an efficient procedure for determining the most likely tree from a general graph, the Chow–Liu algorithm [19]. The speed-up is due to the Markov condition, which can be directly inferred from the graphical structure and results in the factorization of the maximum entropy joint probability distribution. Many attempts have been made to develop appropriate operatorial graphical models [20,21], but none of them naturally encodes the desired generalization of Markovianity. To obtain a compression of the learning procedure, further conditions [22] need to be verified.
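For readers unfamiliar with the classical Chow–Liu procedure mentioned above, the following is a minimal sketch of it in Python. It is not taken from this paper: the function names (`mutual_information`, `chow_liu_tree`) and the toy samples are illustrative assumptions. It assumes the usual formulation, in which the tree is a maximum-weight spanning tree with edge weights given by empirical pairwise mutual information, and it uses Kruskal's algorithm for the spanning tree.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(samples, i, j):
    """Empirical mutual information (in bits) between columns i and j."""
    n = len(samples)
    pij = Counter((s[i], s[j]) for s in samples)
    pi = Counter(s[i] for s in samples)
    pj = Counter(s[j] for s in samples)
    return sum(
        (c / n) * math.log2((c / n) / ((pi[a] / n) * (pj[b] / n)))
        for (a, b), c in pij.items()
    )


def chow_liu_tree(samples):
    """Maximum-weight spanning tree under pairwise mutual information (Kruskal)."""
    n_vars = len(samples[0])
    edges = sorted(
        ((mutual_information(samples, i, j), i, j)
         for i, j in combinations(range(n_vars), 2)),
        reverse=True,
    )
    parent = list(range(n_vars))  # union-find forest

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for _, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # adding (i, j) creates no cycle
            parent[ri] = rj
            tree.append((i, j))
    return sorted(tree)


# Hypothetical toy data: variables 0-1 and 1-2 are correlated, 0-2 are independent.
samples = [
    (0, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1),
    (1, 1, 1), (1, 1, 1), (1, 1, 0), (1, 0, 0),
]
tree = chow_liu_tree(samples)
print(tree)
```

On this toy data the pairs (0, 1) and (1, 2) each carry positive mutual information while the pair (0, 2) carries none, so the recovered tree is the chain 0 – 1 – 2.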

In this article, we study the aforementioned problem restricted to a tree-structured set of marginal density operators, together with the abstraction of the Chow–Liu algorithm to this setting. Namely, we focus on two questions. First, is the inference efficiency limited to mutually commuting (and acyclic connected) density operators, which encode classical probability distributions? Second, can we determine a broader set of density operators for which an extended efficient procedure is similarly achieved?

Contributions of the paper.

We start by showing that comparing the entropies of 3-chains (quantum states compatible with two given 2-body marginals) is a complete problem for the class QSZK (Quantum Statistical Zero Knowledge) [23–25]. This result hints that finding the maximum entropy compatible state given two marginals should not be feasible, even for a quantum computer [26], at least by performing an entropy-monotonic step-by-step optimization in the compatibility space of the provided marginals. Indeed, the complexity class QSZK, originally defined by J. Watrous in 2002 [25], collects promise problems whose true instances can be verified by a zero-knowledge quantum proof between two quantum entities, generalizing the class Statistical Zero Knowledge (SZK) to quantum computers. Natural complete problems for the class represent its hardness, including distinguishing two quantum states (Problem 4) and determining their quantum entropy difference [27].

Next, we restrict the class of quantum states to make the problem feasible. We consider quantum Markov trees, states for which every 3-subchain forms a quantum Markov chain [28]. In this case, we show that the maximum entropy compatible problem is in P, and also that there exists a polynomial-time quantum circuit that constructs the maximum entropy compatible state. Finally, we use this result to extend the Chow–Liu algorithm [19] to quantum states all of whose 3-subchains are quantum Markov chains. The results obtained in this paper provide a natural extension of prior work [29] to the many-body scenario.

Organization of the paper.

In Section 2, we give some background and state clearly the problems we are addressing. In Section 3, we establish the hardness of comparing the entropy of a compatible chain. In Section 4, we consider the restriction of the maximum entropy problem to quantum Markov trees. There, we provide the polynomial-time solution for this case, show how to construct the solution with a polynomial-time quantum circuit, and give the generalization of the Chow–Liu algorithm. Some of the proofs are left to the appendices. Finally, we draw some conclusions and leave some open problems in Section 5.

#### **2. Background and Problem Statement**

Throughout this work, we assume all quantum states and operators to be defined over a finite-dimensional Hilbert space $\mathcal{H}$ that is composed of $n$ parts, such that $\mathcal{H} = \bigotimes_{i=1}^{n} \mathcal{H}_i$. We denote by $\mathcal{I}$ a collection of subsets of $\{1, \dots, n\}$, and throughout the text we call $\mathcal{I}$ the *set of marginal indexes*. Elements of $\mathcal{I}$ are denoted by $J$, and the complement of $J$ is represented by $\overline{J}$. Given $\mathcal{I}$, we are interested in density operators that are compatible with an $\mathcal{I}$-indexed family of marginal density operators $\mathcal{C} = \{\rho_J \in \mathcal{B}(\mathcal{H}_J)\}_{J \in \mathcal{I}}$ such that

$$\text{Tr}_{\overline{J \cap J'}}[\rho_J] = \text{Tr}_{\overline{J \cap J'}}[\rho_{J'}] \text{ for all } J, J' \in \mathcal{I},\tag{1}$$

where $\mathcal{H}_J = \bigotimes_{i \in J} \mathcal{H}_i$. We call each element $\rho_J$ a *marginal density operator*. We also denote by $Q(\mathcal{C}) = \{Q_J\}_{J \in \mathcal{I}}$ a family of quantum circuits such that $Q_J$ constructs the density operator $\rho_J$.

The *compatibility set* $\text{Comp}(\mathcal{C})$ associated with a given family of compatible marginals $\mathcal{C}$ is the set of density operators over $\mathcal{H}$ that admit all the elements of $\mathcal{C}$ as partial traces, that is:

$$\text{Comp}(\mathcal{C}) := \left\{ \rho \in \mathcal{B}(\mathcal{H}) \,:\, \text{Tr}_{\overline{J}}[\rho] = \rho_J \text{ for all } J \in \mathcal{I} \right\}. \tag{2}$$

The family $\mathcal{C}$ is said to be admissible when $\text{Comp}(\mathcal{C}) \neq \emptyset$, that is, if there is at least one density operator whose marginals coincide with those in $\mathcal{C}$.
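The definitions above can be illustrated numerically. The sketch below is only an illustration under stated assumptions: density operators are stored as nested Python lists in the computational product basis, subsystems are indexed from 0, and the helper `partial_trace` is a hypothetical name introduced here, not something defined in the paper. It computes two-body marginals of a three-qubit GHZ state, checks the overlap condition (1) on the shared subsystem, and shows that a classical mixture has the same two-body marginals, so that the compatibility set contains more than one state.

```python
from itertools import product


def partial_trace(rho, dims, keep):
    """Trace out every subsystem whose (0-based) position is not in `keep`.

    rho  : square nested list indexed in the product basis of `dims`
    dims : local dimensions, e.g. [2, 2, 2] for three qubits
    keep : increasing list of subsystem positions to keep
    """
    traced = [k for k in range(len(dims)) if k not in keep]

    def flat(index):
        # Mixed-radix encoding of a multi-index into a row/column number.
        pos = 0
        for d, i in zip(dims, index):
            pos = pos * d + i
        return pos

    kept_indices = list(product(*[range(dims[k]) for k in keep]))
    traced_ranges = [range(dims[k]) for k in traced]
    out = [[0.0] * len(kept_indices) for _ in kept_indices]
    for r, row in enumerate(kept_indices):
        for c, col in enumerate(kept_indices):
            for t in product(*traced_ranges):
                full_r, full_c = [0] * len(dims), [0] * len(dims)
                for k, i in zip(keep, row):
                    full_r[k] = i
                for k, i in zip(keep, col):
                    full_c[k] = i
                for k, i in zip(traced, t):
                    full_r[k] = full_c[k] = i  # same index on both sides: a trace
                out[r][c] += rho[flat(full_r)][flat(full_c)]
    return out


dims = [2, 2, 2]
# GHZ state (|000> + |111>)/sqrt(2) as a density matrix.
ghz = [[0.0] * 8 for _ in range(8)]
for r in (0, 7):
    for c in (0, 7):
        ghz[r][c] = 0.5
# Classical mixture (|000><000| + |111><111|)/2.
mixture = [[0.0] * 8 for _ in range(8)]
mixture[0][0] = mixture[7][7] = 0.5

rho_12 = partial_trace(ghz, dims, [0, 1])  # marginal on the first two subsystems
rho_23 = partial_trace(ghz, dims, [1, 2])  # marginal on the last two subsystems

# Condition (1): both marginals must agree on the shared (middle) subsystem.
overlap_from_12 = partial_trace(rho_12, [2, 2], [1])
overlap_from_23 = partial_trace(rho_23, [2, 2], [0])

# The mixture has the same two-body marginals as the GHZ state.
same_marginals = (partial_trace(mixture, dims, [0, 1]) == rho_12
                  and partial_trace(mixture, dims, [1, 2]) == rho_23)
print(overlap_from_12 == overlap_from_23, same_marginals)
```

Both checks succeed, so in this example the GHZ state and the classical mixture lie in the same compatibility set even though they are different global states.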

We start by noticing that the problem of admissibility for a compatible set in which all marginal density operators are diagonal in the same basis (that is, density operators encoding discrete probability distributions) reduces to the classical compatible marginal problem [30]. This classical problem has been shown to be NP-complete for the three-dimensional case [31]. There are many cases for which it is solvable [32], and there is always a solution if we consider only two-body (bipartite) marginals that form an acyclic graph.

The relevant case where the marginals are not diagonal in the same basis has been the target of several research works and is called the quantum compatible marginal problem. Liu showed that this problem is Quantum Merlin Arthur (QMA)-complete, that is, it is one of the hardest problems in the computational complexity class QMA [12]. The class QMA [33] collects promise problems whose "yes" answer can be verified by a 1-message quantum interactive proof, generalizing to the quantum realm the class NP of problems classically verifiable in polynomial time.

#### **Problem 1.** *Quantum Compatible Marginal Problem (QCMP)*


In some cases, we know that C is admissible, for instance when we are promised that the marginals *ρJ* are indeed partial traces of a global state. In Physics, it is reasonable to assume that we can prepare many copies of a global system, but in general, we can only partially observe it. In this case, given that we have many copies of the global system, we would be able to characterize in full detail the partial traces and know that they form an admissible set. The question now is to infer the global state with maximum entropy among those in the compatibility set. This leads to the following problem.

#### **Problem 2.** *Maximum Entropy Compatible Marginal Problem (MECMP)*


Given the general complexity of this problem, we focus on the more straightforward case where all sets *J* in I have two indexes. Thus, we consider that we are given a set of compatible two-body marginals, and we want to reconstruct the maximum entropy state compatible with those marginals. For this two-body case, it is possible to construct an associated graph, where each two-body marginal denotes an edge.

**Definition 1.** *Let* $\mathcal{C}$ *be an* $\mathcal{I}$*-indexed family of two-body compatible marginal density operators. The associated graph* $\mathcal{G}_{\mathcal{C}}$ *is* $(\{1, \dots, n\}, E)$*, where* $(i, j) \in E$ *if* $\{i, j\} \in \mathcal{I}$*.*

In the simplest non-trivial case, we have $n = 3$ and $\mathcal{I} = \{\{1, 2\}, \{2, 3\}\}$. We call this case a 3-chain. In the next section, we show that, given two density operators $\rho_0$ and $\rho_1$ in the compatibility set of a 3-chain, deciding which of them has the higher entropy is QSZK-complete. We denote the subspaces $\mathcal{H}_1$, $\mathcal{H}_2$ and $\mathcal{H}_3$ by $\mathcal{H}_A$, $\mathcal{H}_B$ and $\mathcal{H}_C$, respectively.
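The associated graph of Definition 1 and the tree condition used later in the paper can be sketched as follows. The function names (`associated_graph`, `is_tree`) are hypothetical helpers introduced for illustration, and the tree test relies on the standard fact that a graph on $n$ nodes is a tree iff it has $n - 1$ edges and is connected.

```python
def associated_graph(n, index_set):
    """Edge set of the associated graph: one edge (i, j) per two-body marginal."""
    edges = set()
    for J in index_set:
        if len(J) != 2:
            raise ValueError("only two-body marginals define an edge")
        i, j = sorted(J)
        if not (1 <= i < j <= n):
            raise ValueError("indexes must lie in {1, ..., n}")
        edges.add((i, j))
    return edges


def is_tree(n, edges):
    """A graph on n nodes is a tree iff it has n - 1 edges and is connected."""
    if len(edges) != n - 1:
        return False
    adj = {v: [] for v in range(1, n + 1)}
    for i, j in edges:
        adj[i].append(j)
        adj[j].append(i)
    seen, stack = {1}, [1]
    while stack:  # depth-first search from node 1
        v = stack.pop()
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n


three_chain = associated_graph(3, [{1, 2}, {2, 3}])
triangle = associated_graph(3, [{1, 2}, {2, 3}, {1, 3}])
print(is_tree(3, three_chain), is_tree(3, triangle))
```

The 3-chain is accepted as a tree, whereas adding the marginal on $\{1, 3\}$ closes a cycle and the test fails.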

#### **3. Hardness of Comparing Entropy of a Compatible Chain**

Ben-Aroya et al. [27] showed that, given two quantum circuits $Q_0$ and $Q_1$ that generate two mixed states $\rho_0$ and $\rho_1$, respectively, such that $|S(\rho_0) - S(\rho_1)| > \frac{1}{2}$, determining whether $S(\rho_0) > S(\rho_1)$ is QSZK-complete. Thus, they conclude that it is quite unlikely that the von Neumann entropy of a mixed state can be computed in BQP [34]. We further look into this problem by restricting to the case when $\rho_0$ and $\rho_1$ live in the same Hilbert space and have the same marginals. We state our problem as follows:

#### **Problem 3.** *3-Chain Compatible Quantum Entropy Difference (3cQED)*

- $\text{Tr}_A(\rho_0) = \text{Tr}_A(\rho_1)$*;*
- $\text{Tr}_C(\rho_0) = \text{Tr}_C(\rho_1)$*;*
- $|S(\rho_0) - S(\rho_1)| \ge 1/2$*;*

*then,*


Clearly, 3cQED is a particular case of the quantum entropy difference problem (QED) of [27]: in the latter, the Hilbert spaces of $\rho_0$ and $\rho_1$ do not have to be the same, nor do the density operators need to be tripartite.

Since 3cQED is a particular case of QED, it is reducible to QED and therefore lies in QSZK. It remains to show that it is QSZK-hard. To do so, we adapt the proof of Ben-Aroya et al. and reduce QSD$_{\alpha,\beta}$, a natural complete problem for the class QSZK [25], to 3cQED, for $0 \le \alpha < \beta^2 \le 1$.

**Problem 4.** *Quantum state distance (QSD$_{\alpha,\beta}$) with* $0 \le \alpha < \beta^2 \le 1$*:*

- *either* $\|\rho_0 - \rho_1\|_{tr} \ge \beta$*;*
- *or* $\|\rho_0 - \rho_1\|_{tr} \le \alpha$*;*

 *then,*


In Problem 4, ||*ρ*0 − *ρ*1||*tr* denotes the trace distance between the operators *ρ*0 and *ρ*1.

**Theorem 1.** *For any* $0 \le \alpha < \beta^2 \le 1$*, QSD$_{\alpha,\beta}$ is reducible to 3cQED.*


**Proof of Theorem 1.** The idea of the proof is the following. From quantum circuits $Q_0$ and $Q_1$ acting on $m$ bits that generate, respectively, $\rho_0$ and $\rho_1$ fulfilling the promise of QSD$_{\alpha,\beta}$, we are going to construct, in polynomial time, two quantum circuits $Q'_0$ and $Q'_1$ that generate tripartite density operators $\rho'_0$ and $\rho'_1$ fulfilling the promise of 3cQED, such that $(Q_0, Q_1)$ is a NO instance of QSD$_{\alpha,\beta}$ iff $(Q'_0, Q'_1)$ is a NO instance of 3cQED.

Concretely, given circuits $Q_0$, $Q_1$ that construct $\rho_0$ and $\rho_1$, we first apply the polarization lemma (Lemma A1 in Appendix A) with $n = m$ and obtain circuits $R_0$ and $R_1$ that output density operators $\mu_0$ and $\mu_1$, respectively. We then construct two circuits $Z_0$ and $Z_1$ as follows. $Z_1$ is implemented by a circuit that first applies a Hadamard gate on a single qubit $b$, measures $b$, and then, conditioned on the result, applies either $R_0$ or $R_1$. The output of $Z_1$ is $\xi_1 = \frac{1}{2} |0\rangle\langle 0| \otimes \mu_0 + \frac{1}{2} |1\rangle\langle 1| \otimes \mu_1$. Since we need to construct a tripartite system, we introduce a non-orthodox, but useful, notation: $\xi_1^{AC}$ denotes a copy of $\xi_1$ where the qubit part of $\xi_1$ belongs to the system $A$ and the remaining part belongs to the system $C$. Similarly, $\xi_1^{CA}$ denotes a copy of $\xi_1$ where the qubit part belongs to $C$ and the remaining part to $A$. Circuit $Z_0$ is the same as $Z_1$ except that the qubit $b$ is traced out. The output of $Z_0$ is $\xi_0 = \frac{1}{2}\mu_0 + \frac{1}{2}\mu_1$. We shall denote by $\xi_0^A$ and $\xi_0^C$ a copy of $\xi_0$ belonging to the subsystem $A$ or $C$, respectively.

Finally, we denote by $|\phi^{\pm}\rangle_{AC}$ two maximally entangled states between $A$ and $C$. Moreover, take $\zeta = \frac{1}{2} |\phi^{+}\rangle\langle\phi^{+}| + \frac{1}{2} |\phi^{-}\rangle\langle\phi^{-}|$ and note that $S(\zeta) = 1$. We denote by $Q$ the circuit that prepares $\zeta$. Consider:

• $\rho'_0 = \xi_0^{A} \otimes \zeta^{AC} \otimes \xi_0^{C} \otimes |0\rangle\langle 0|_B$;

• $\rho'_1 = \xi_1^{AC} \otimes \xi_1^{CA} \otimes |0\rangle\langle 0|_B$.

Note that in $\rho'_0$, the state built from $\xi_0$ and $\zeta$, the subsystem $A$ contains $\xi_0^{A}$ and one qubit of $\zeta^{AC}$, while the subsystem $C$ contains $\xi_0^{C}$ and the other qubit of $\zeta^{AC}$. Moreover, in $\rho'_1$, the state built from two copies of $\xi_1$, the subsystem $A$ holds a qubit correlated with the $\mu_0$, $\mu_1$ part held in the subsystem $C$ ($\xi_1^{AC}$), and holds another $\mu_0$, $\mu_1$ part correlated with a qubit of $C$ ($\xi_1^{CA}$).

The reduction outputs the pair of density operators $(\rho'_0, \rho'_1)$ together with the circuits that construct them, namely $Q'_0 = Z_0 \otimes Z_0 \otimes Q$ and $Q'_1 = Z_1 \otimes Z_1$. We ignore the construction of the state $|0\rangle\langle 0|_B$, which is trivial.
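For later use, the entropies of the two outputs can be read off from their tensor-product structure. The following worked step is a sketch that assumes only two standard facts: the von Neumann entropy is additive over tensor products, and it vanishes on the pure state $|0\rangle\langle 0|_B$. Writing $\rho'_0$ for the output of $Z_0 \otimes Z_0 \otimes Q$ and $\rho'_1$ for the output of $Z_1 \otimes Z_1$, and using $S(\zeta) = 1$:

$$\begin{aligned} S(\rho'_0) &= 2\,S(\xi_0) + S(\zeta) = 2\,S(\xi_0) + 1, \\ S(\rho'_1) &= 2\,S(\xi_1), \\ S(\rho'_1) - S(\rho'_0) &= 2\,[S(\xi_1) - S(\xi_0)] - 1. \end{aligned}$$

Hence a gap $S(\xi_1) - S(\xi_0)$ close to 1 makes the difference positive, while a gap close to 0 makes it negative.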

Start by observing that tracing out $C$ from either of the two output states yields $(\frac{1}{2} |0\rangle\langle 0| + \frac{1}{2} |1\rangle\langle 1|) \otimes (\frac{1}{2}\mu_0 + \frac{1}{2}\mu_1) \otimes |0\rangle\langle 0|$. The same state is obtained by tracing out subsystem $A$ from either of them. So, the two output states have compatible marginals.

Part 1

If $(Q_0, Q_1) \in (\text{QSD}_{\alpha,\beta})_{NO}$ then $(Z_0 \otimes Z_0 \otimes Q, Z_1 \otimes Z_1) \in \text{3cQED}_{NO}$.

We know that $\|\rho_0 - \rho_1\|_{tr} \le \alpha$. By the polarization lemma (Lemma A1 in Appendix A) we get $\|\mu_0 - \mu_1\|_{tr} \le 2^{-m}$. By the joint-entropy theorem (Lemma A2),

$$S(\xi\_1) = \frac{1}{2}(S(\mu\_0) + S(\mu\_1)) \, + \, 1. \tag{3}$$

On the other hand, $\xi_0$ is very close both to $\mu_0$ and to $\mu_1$. Specifically, $\|\xi_0 - \mu_1\|_{tr} = \|\frac{1}{2}\mu_0 - \frac{1}{2}\mu_1\|_{tr} \le 2^{-m}$. Thus, by Fannes' inequality (Lemma A3 in Appendix A), $|S(\xi_0) - S(\mu_1)| \le 2^{-m} \cdot \text{poly}(m) \le 0.1$ for large enough $m$. Similarly, $|S(\xi_0) - S(\mu_0)| \le 0.1$. It follows that

$$|S(\xi_0) - \frac{1}{2}(S(\mu_0) + S(\mu_1))| \le 0.1. \tag{4}$$

Combining the two equations we get $S(\xi_1) - S(\xi_0) \ge 0.9$. Thus, denoting by $\rho'_0$ the output of $Z_0 \otimes Z_0 \otimes Q$ and by $\rho'_1$ the output of $Z_1 \otimes Z_1$, we have $S(\rho'_1) - S(\rho'_0) \ge 2 \times 0.9 - 1 = 0.8$. Therefore, $(Z_0 \otimes Z_0 \otimes Q, Z_1 \otimes Z_1) \in \text{3cQED}_{NO}$.

Part 2

If $(Q_0, Q_1) \in (\text{QSD}_{\alpha,\beta})_{YES}$ then $(Z_0 \otimes Z_0 \otimes Q, Z_1 \otimes Z_1) \in \text{3cQED}_{YES}$.

By the polarization lemma (Lemma A1 in Appendix A), $\|\mu_0 - \mu_1\|_{tr} \ge 1 - 2^{-m}$. Using Lemma A5 (in Appendix A), we get that $S(\xi_0) \ge \frac{1}{2}[S(\mu_0) + S(\mu_1)] + 1 - H\left(\frac{1}{2} + \frac{\|\mu_0 - \mu_1\|_{tr}}{2}\right) \ge \frac{1}{2}[S(\mu_0) + S(\mu_1)] + 1 - H(2^{-m})$. By Lemma A2 (in Appendix A) we know that $S(\xi_1) = \frac{1}{2}(S(\mu_0) + S(\mu_1)) + 1$. Therefore, $S(\xi_1) - S(\xi_0) \le H(2^{-m}) < 0.1$ for sufficiently large $m$.
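To get a feeling for how large $m$ must be in the last bound, the snippet below evaluates $H(2^{-m})$ numerically. It is a small sketch that assumes $H$ denotes the binary entropy in bits, $H(p) = -p \log_2 p - (1-p)\log_2(1-p)$; the function names are illustrative.

```python
import math


def binary_entropy(p):
    """Binary entropy H(p) in bits, with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def smallest_m(threshold=0.1):
    """Smallest m >= 1 such that H(2^-m) < threshold."""
    m = 1
    while binary_entropy(2.0 ** -m) >= threshold:
        m += 1
    return m


m_min = smallest_m()
print(m_min, binary_entropy(2.0 ** -m_min))
```

Under this assumption the bound $H(2^{-m}) < 0.1$ first holds at $m = 7$, where $H(2^{-7}) \approx 0.066$, while $H(2^{-6}) \approx 0.116$ is still too large.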

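In particular, denoting by $\rho'_0$ the output of $Z_0 \otimes Z_0 \otimes Q$ and by $\rho'_1$ the output of $Z_1 \otimes Z_1$, we have $S(\rho'_1) - S(\rho'_0) \le 2 \times 0.1 - 1 = -0.8$ and $(Z_0 \otimes Z_0 \otimes Q, Z_1 \otimes Z_1) \in \text{3cQED}_{YES}$.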
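The two parts of the proof can be sanity-checked on the two extreme cases. The sketch below assumes that $\mu_0$ and $\mu_1$ are diagonal in a common basis, so that they can be stored as probability vectors and the von Neumann entropy reduces to the Shannon entropy; this is a simplification for illustration, not the general quantum case. It uses the relation from the proof that the entropy difference between the output built from two copies of $\xi_1$ and the output built from $\xi_0$ and $\zeta$ equals $2[S(\xi_1) - S(\xi_0)] - 1$.

```python
import math


def shannon(p):
    """Shannon entropy in bits of a probability vector."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def entropy_gap(mu0, mu1):
    """S(xi_1) - S(xi_0) for diagonal mu0, mu1 given as probability vectors."""
    # xi_1 = (|0><0| (x) mu0 + |1><1| (x) mu1) / 2 is block diagonal.
    xi1 = [0.5 * x for x in mu0] + [0.5 * x for x in mu1]
    # xi_0 = (mu0 + mu1) / 2.
    xi0 = [0.5 * (x + y) for x, y in zip(mu0, mu1)]
    return shannon(xi1) - shannon(xi0)


def output_entropy_difference(mu0, mu1):
    """Entropy of the xi_1-based output minus that of the xi_0-based output."""
    return 2 * entropy_gap(mu0, mu1) - 1


# Identical states (trace distance 0) versus states with orthogonal supports.
close = output_entropy_difference([0.5, 0.5, 0.0, 0.0], [0.5, 0.5, 0.0, 0.0])
far = output_entropy_difference([0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5])
print(close, far)
```

For identical states the gap $S(\xi_1) - S(\xi_0)$ equals 1 and the difference is $+1$; for states with orthogonal supports the gap is 0 and the difference is $-1$, in line with the bounds $\ge 0.8$ and $\le -0.8$ obtained in Parts 1 and 2.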

It follows that comparing the entropies of density operators sharing a set of compatible marginals is QSZK-complete, as this problem is also an instance of QED. As a consequence, we expect that finding the maximum entropy state is also generally hard, at least by performing a step-by-step entropy-increasing procedure. We now focus our attention on a particular sub-case in which this problem can be addressed.

#### **4. Quantum Markov Chains and Trees**

Given that the general problem of finding the maximum entropy state is hard, we focus on a well-behaved subset of density operators, namely *quantum Markov trees* (QMT)— Definition 4—which extends the notion of quantum Markov chains (QMC) [35] to the multi-partite scenario. By defining QMTs, we were able to extend the learning techniques provided by classical graphical models—Bayes composition [16] and Chow–Liu algorithm— to an enlarged set of density operators with respect to the mutually-commuting ones.

We refer to a set of two-body density operators as *tree-structured* when its associated graph—Definition 1—is a tree. In particular, we showed that given a tree-structured set of two-body marginal density operators:


We were then able to show that for QMTs, the MECMP is in P—Theorem 3. Moreover, given a general set of two-body marginals, we found that if all the sub-3-chains are compatible with a QMC, the *optimal-sub-tree*, which is a QMT, can be efficiently determined by generalizing the Chow–Liu learning algorithm—Theorem 5.

The main achievement consists in the exponential speed-up in verifying the general Markov condition. For the case at hand, QMC-compatibility of every tree chain (polynomially many in the number $n$ of 1-body subsystems) implies the QMC-compatibility of every further sub-chain formed by sub-groups of nodes (exponentially many in $n$).

In order to prove the mentioned results, we first give the essential background on QMCs (Section 4.1) and then formally define QMTs (Section 4.2). In Section 4.3 we provide the entropic characterization of QMTs, then in Section 4.4 we derive the compatibility condition for a given set of tree-structured marginals with a QMT, and in Section 4.5 we study the MECM problem restricted to QMTs. Finally, in Section 4.6, we extend the Chow–Liu algorithm for determining the *optimal tree* when the provided set is not tree-structured.

#### *4.1. Background on Quantum Markov Chains*

We consider QMCs on the Hilbert space $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B \otimes \mathcal{H}_C$ and take $\mathcal{C} = \{\rho_{\{A,B\}} \in \mathcal{B}(\mathcal{H}_{\{A,B\}}),\ \rho_{\{B,C\}} \in \mathcal{B}(\mathcal{H}_{\{B,C\}})\}$. To simplify notation, we drop the brackets and commas in the indexes, so that, for instance, the marginal $\rho_{\{A,B\}}$ is just denoted by $\rho_{AB}$ (the same simplification is applied to the Hilbert subspaces $\mathcal{H}_{\{A,B\}}$, which are denoted just by $\mathcal{H}_{AB}$).

Recall the definition of quantum Markov chain:

**Definition 2** ([36])**.** *A quantum Markov chain (QMC) is a 3-chain* $A - B - C$ *for which there exists a recovery map* $\mathcal{R}_{B \to BC} : \mathcal{B}(\mathcal{H}_B) \to \mathcal{B}(\mathcal{H}_{BC})$*, i.e., an arbitrary completely positive trace-preserving (CPTP) map (see, for instance, [37,38]), s.t.* $\rho_{ABC} = (\mathcal{I}_A \otimes \mathcal{R}_{B \to BC})(\rho_{AB})$*, where* $\mathcal{I}_A$ *denotes the identity map on* $\mathcal{B}(\mathcal{H}_A)$*.*

By definition, the recovery map must fulfill $\mathcal{R}_{B \to BC}(\rho_B) = \rho_{BC}$.

**Definition 3.** *A family of QMCs* $\{\rho^{(n)}_{ABC}\}_{n \in \mathbb{N}}$ *is said to be constructed in polynomial time if all elements* $\rho^{(n)}_{ABC}$ *lie in the same (finite-dimensional) Hilbert space* $\mathcal{H}_A \otimes \mathcal{H}_B \otimes \mathcal{H}_C$ *(which does not depend on* $n$*) and there is a polynomial-time family of quantum circuits that generate both* $\rho^{(n)}_{AB}$ *and* $\mathcal{R}^{(n)}_{B \to BC}$*.*

Given that the dimension of a (polynomial-time) quantum Markov chain does not grow with $n$, it can be represented in matrix form in polynomial time by multiplying all the gates involved in the circuits that generate $\rho^{(n)}_{AB}$ and $\mathcal{R}^{(n)}_{B \to BC}$. We stress that, to design circuits for density operators and CPTP maps, we require only an ancilla space of the same dimension as the support of these operators/maps [39]. Therefore, the number of gates is polynomial in $n$, but the full dimension of the space (including ancillae) does not grow with $n$.

From this point on, we assume that *ρABC* is invertible (on its support), as invertible density operators are dense. To derive the main result of the paper, we need to establish a central lemma listing some known characterizations of QMCs. We give the proof in Appendix B.

**Lemma 1.** *Let ρABC be an invertible density operator. The following four assertions are equivalent:* 


The map $\mathcal{P}_{B \to BC}(X)$ is known as the *Petz recovery map* or *transpose map*. Again, to ease notation, we drop the identities whenever they are obvious; for instance, we shorten the expression $\rho_{BC}^{\frac{1}{2}}((\rho_B^{-\frac{1}{2}} X \rho_B^{-\frac{1}{2}}) \otimes \text{id}_C)\rho_{BC}^{\frac{1}{2}}$ to just $\rho_{BC}^{\frac{1}{2}} \rho_B^{-\frac{1}{2}} X \rho_B^{-\frac{1}{2}} \rho_{BC}^{\frac{1}{2}}$, and likewise we write $\log \rho_{ABC} - (\log \rho_{AB}) \otimes \text{id}_C$ simply as $\log \rho_{ABC} - \log \rho_{AB}$.

Observe that we can also recover a tripartite density operator from *ρBC* through P*<sup>B</sup>*→*AB*(·):

$$
\rho_{AB}^{\frac{1}{2}} \rho_B^{-\frac{1}{2}} \rho_{BC} \rho_B^{-\frac{1}{2}} \rho_{AB}^{\frac{1}{2}}, \tag{5}
$$

and by uniqueness, since the von Neumann entropy is operator-concave [40,41], we have $\mathcal{P}_{B \to BC}(\rho_{AB}) = \mathcal{P}_{B \to AB}(\rho_{BC})$. However, it is not known whether, given a family of QMCs that can be constructed in polynomial time via $\mathcal{P}^{(n)}_{B \to BC}(X)$, it is possible to build $\mathcal{P}^{(n)}_{B \to AB}(X)$ in polynomial time. The next result states that the solution to MECMP (Problem 2), and also to QCMP (Problem 1), for 3-chains can be fully determined when a QMC belongs to the compatibility set. The proofs can be found in [29].

**Lemma 2.** *Given a 3-chain* $\{\rho_{AB}, \rho_{BC}\}$ *compatible with a QMC, say* $\rho_{ABC}$*, the maximum entropy estimator is precisely* $\rho_{ABC}$*. Moreover, the 3-chain* $\{\rho_{AB}, \rho_{BC}\}$ *is compatible with a QMC in* $\mathcal{B}(\mathcal{H}_{ABC})$ *iff* $\text{Tr}_A(\rho_{AB}) = \text{Tr}_C(\rho_{BC})$ *and the operator* $\Theta_{ABC} = \rho_{BC}^{\frac{1}{2}} \rho_B^{-\frac{1}{2}} \rho_{AB}^{\frac{1}{2}}$ *is normal. Furthermore, if two marginals* $\{\rho_{AB}, \rho_{BC}\}$ *are compatible with a QMC on* $\mathcal{B}(\mathcal{H}_{ABC})$*, say* $\rho_{ABC}$*, then the operator* $\Theta_{ABC}$ *is its square root.*
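In the commuting case the lemma reduces to a familiar classical construction, which may help intuition. The sketch below assumes that $\rho_{AB}$ and $\rho_{BC}$ are diagonal in a common product basis, so they can be stored as probability tables; then $\Theta_{ABC}$ is diagonal (hence normal) and its square has entries $p(a,b)\,p(b,c)/p(b)$, the classical Markov chain extension. The function names and the numerical tables are illustrative assumptions, not data from the paper.

```python
import math
from itertools import product


def markov_extension(p_ab, p_bc):
    """Diagonal case of Theta_ABC squared: p(a, b, c) = p(a, b) p(b, c) / p(b)."""
    p_b = {}
    for (a, b), v in p_ab.items():
        p_b[b] = p_b.get(b, 0.0) + v
    return {
        (a, b, c): p_ab[a, b] * p_bc[b2, c] / p_b[b]
        for (a, b), (b2, c) in product(p_ab, p_bc)
        if b == b2 and p_b[b] > 0
    }


def marginal(p, keep):
    """Marginal of a probability table on the positions listed in `keep`."""
    out = {}
    for idx, v in p.items():
        key = tuple(idx[k] for k in keep)
        out[key] = out.get(key, 0.0) + v
    return out


def entropy(p):
    """Shannon entropy in bits of a probability table."""
    return -sum(v * math.log2(v) for v in p.values() if v > 0)


def conditional_mutual_information(p_abc):
    """I(A:C|B) = S(AB) + S(BC) - S(B) - S(ABC)."""
    return (entropy(marginal(p_abc, (0, 1))) + entropy(marginal(p_abc, (1, 2)))
            - entropy(marginal(p_abc, (1,))) - entropy(p_abc))


# Hypothetical compatible marginals: both give p(b=0) = 0.6 and p(b=1) = 0.4.
p_ab = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
p_bc = {(0, 0): 0.3, (0, 1): 0.3, (1, 0): 0.1, (1, 1): 0.3}

p_abc = markov_extension(p_ab, p_bc)
cmi = conditional_mutual_information(p_abc)
print(cmi)
```

The extension reproduces both marginals and has vanishing conditional mutual information up to floating-point error, which is the classical counterpart of being a Markov chain $A - B - C$.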

#### *4.2. Definition of Quantum Markov Trees*

We are now able to extend the above result from 3-chains to a more general setting, namely to trees. From this point on, we make the following assumption.

**Assumption 1.** *Assume the graph* $\mathcal{G}_{\mathcal{C}}$ *associated with a Maximum Entropy Compatible Marginal Problem* $\mathcal{C}$ *over* $\mathcal{X} = \{X_1, \dots, X_n\}$ *is a tree, that is,* $\mathcal{G}_{\mathcal{C}}$ *is an acyclic connected graph over* $\mathcal{X}$*.*

By taking any node as the root of $\mathcal{G}_{\mathcal{C}}$, we construct an arborescence (or directed tree). For the sake of readability, we introduce the following notation. We call a *constructive ordering of* $\mathcal{C}$ any total order compatible with the topological order of an arborescence of $\mathcal{G}_{\mathcal{C}}$. Without loss of generality, we consider a constructive ordering of the form $X_1 < \dots < X_n$ and denote by $\mathcal{G}_k$ the induced subgraph of $\mathcal{G}_{\mathcal{C}}$ containing all the nodes $V_k = \{X_1, \dots, X_k\}$ for $k \in \{1, \dots, n\}$. We also denote by $\mathcal{C}_k$ the marginals in $\mathcal{C}$ containing nodes in $\{X_1, \dots, X_k\}$ and by $Y_k$, for $k \ge 2$, the node in $V_{k-1}$ connected to $X_k$ in $\mathcal{G}_k$ (the adjacent node of $X_k$ in $\mathcal{G}_k$). Finally, we denote by $\overline{Y_k}$ the set $V_{k-1} \setminus \{Y_k\}$, which is non-empty for $k \ge 3$.
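One concrete way to obtain a constructive ordering, together with the nodes $Y_k$ and the sets $\overline{Y_k}$, is a breadth-first traversal from the chosen root. The sketch below is one possible choice under that assumption (any topological order of the arborescence would do); the function name and the example tree are illustrative.

```python
from collections import deque


def constructive_ordering(n, edges, root=1):
    """Return (order, Y, Ybar) for a tree on the nodes 1..n.

    order[k - 1] is the node playing the role of X_k,
    Y[k] is its neighbour among the earlier nodes V_{k-1} (k >= 2),
    Ybar[k] lists the remaining earlier nodes, V_{k-1} minus {Y_k}.
    """
    adj = {v: [] for v in range(1, n + 1)}
    for i, j in edges:
        adj[i].append(j)
        adj[j].append(i)
    order, parent = [root], {root: None}
    queue = deque([root])
    while queue:  # breadth-first search yields a topological order of the arborescence
        v = queue.popleft()
        for w in sorted(adj[v]):
            if w not in parent:
                parent[w] = v
                order.append(w)
                queue.append(w)
    if len(order) != n:
        raise ValueError("the graph is not connected")
    Y = {k: parent[order[k - 1]] for k in range(2, n + 1)}
    Ybar = {k: [x for x in order[:k - 1] if x != Y[k]] for k in range(2, n + 1)}
    return order, Y, Ybar


# Hypothetical tree: node 2 is connected to nodes 1, 3 and 4.
order, Y, Ybar = constructive_ordering(4, [(1, 2), (2, 3), (2, 4)])
print(order, Y, Ybar)
```

In this example every newly added node is attached to exactly one earlier node, so each $X_k$ is a leaf of the subtree on $V_k$, as Proposition 1 below states.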

The next result follows easily:

**Proposition 1.** *If* $\mathcal{G}_{\mathcal{C}}$ *is a tree, then all the subgraphs* $\mathcal{G}_k$ *are trees and, moreover,* $X_k$ *is a leaf of* $\mathcal{G}_k$*.*

We now define a quantum Markov tree, which, as we shall see later on, generalizes the notion of Markov random field, when the underlying graph is a tree.

**Definition 4.** *Let* $\rho \in \mathcal{B}(\mathcal{H}_{\mathcal{X}})$ *with* $\mathcal{X} := \{X_1, \dots, X_n\}$ *be an invertible density operator (over its support) and let* $\mathcal{C}$ *be a (non-trivial) set of two-body marginals of* $\rho$*. We say that* $\rho$ *is a* quantum Markov tree (QMT)*, or is factorizable via Petz according to* $\mathcal{C}$*, if its square root* $\Theta$*, with* $\rho = \Theta\Theta^{\dagger} = \Theta^{\dagger}\Theta$*, admits a decomposition, for some constructive ordering* $X_1 < \dots < X_n$*, of the form*

$$\Theta = \Delta_n \cdots \Delta_3 \left(\rho_{X_1 X_2}^{\frac{1}{2}} \otimes \text{id}_{\overline{\{X_1 X_2\}}}\right) \tag{6}$$

*with* $\Delta_k = \rho_{X_k Y_k}^{\frac{1}{2}} \left(\text{id}_{X_k} \otimes \rho_{Y_k}^{-\frac{1}{2}}\right) \otimes \text{id}_{\overline{\{X_k Y_k\}}}$*, for all* $k = 3, \dots, n$*.*

We note that for Equation (6) to be well defined, it must be the case that GC is a tree, that is, that we are working under Assumption 1. It is relatively simple to extend the notion to acyclic graphs (which may not be connected).

#### *4.3. QMT as Max-Entropy Density Operator*

The following result will shed some light on the relationship between Markov random fields and QMTs.

**Theorem 2.** *Let* $\rho \in \mathcal{B}(\mathcal{H}_{\mathcal{X}})$ *be an invertible density operator (over its support) and let* $\mathcal{C}$ *be a (non-trivial) set of two-body marginals s.t.* $\mathcal{G}_{\mathcal{C}}$ *is a spanning tree over* $\mathcal{X}$*. Then there exists* $\rho \in \text{Comp}(\mathcal{C})$ *factorizable via Petz according to* $\mathcal{C}$ *iff there exists* $\rho \in \mathcal{B}(\mathcal{H}_{\mathcal{X}})$ *such that, equivalently, one of the following two holds:*

*(i)* $\log \rho = \sum_{\mathcal{C}} \log \rho_{X_i X_j} - \sum_{i=1}^{n} (\deg(X_i) - 1) \log \rho_{X_i}$*;*

*(ii) we have*

$$\forall k = 2, \dots, n: \quad \rho_k = \text{Tr}_{\overline{V_k}}[\rho] \text{ is s.t. } I_{\rho_k}(X_k : \overline{Y_k} \mid Y_k) = 0 \tag{7}$$

*for some constructive ordering* $X_1 < \dots < X_n$*.*

**Proof of Theorem 2.** The proof proceeds by complete induction on $k$, that is, by adding one node (together with its edge) at a time following a constructive ordering of $\mathcal{C}$. Accordingly, we write

$$\mathcal{C} = \{ \rho_{X_k Y_k} \in \mathcal{B}(\mathcal{H}_{X_k Y_k}) \,:\, Y_k \in \{ X_1, \dots, X_{k-1} \},\ k = 2, \dots, n \}. \tag{8}$$

(Basis $k = 3$): The first chain occurs when the third node is added, that is, when $k = 3$. Assume there exists $\rho_3 \in \text{Comp}(\mathcal{C}_3)$ that is factorizable via Petz, i.e.,

$$
\Theta_3 = \rho_3^{\frac{1}{2}} = \rho_{X_3 Y_3}^{\frac{1}{2}} \rho_{Y_3}^{-\frac{1}{2}} \rho_{\overline{Y_3} Y_3}^{\frac{1}{2}} = \rho_{X_3 Y_3}^{\frac{1}{2}} \rho_{Y_3}^{-\frac{1}{2}} \rho_{X_1 X_2}^{\frac{1}{2}}. \tag{9}
$$

Observe that Lemma 2 applies, so $\Theta_3$ is exactly the operator described in the lemma and, since it is a square root, it is normal. Then, by Lemma 1, we have the following equivalences: $\rho_3$ is a QMC iff $I_{\rho_3}(X_3 : \overline{Y_3} \mid Y_3) = 0$ iff $\log \rho_3 = \log \rho_{X_3 Y_3} + \log \rho_{X_1 X_2} - \log \rho_{Y_3}$. The other direction follows immediately.

(Induction step *k* −→ *k* + 1):

Complete induction hypothesis: for all $j = 3, \dots, k$, there exists $\rho_j \in \text{Comp}(\mathcal{C}_j)$ factorizable via Petz according to $\mathcal{C}_j$ iff there exists $\rho_j \in \mathcal{B}(\mathcal{H}_{V_j})$ such that, equivalently, one of the following two holds:


Induction step: Assume there exists $\rho_{k+1} \in \text{Comp}(\mathcal{C}_{k+1})$ factorizable via Petz according to $\mathcal{C}_{k+1}$. Our goal is to show that the following hold for $\rho_{k+1}$:

• $\log \rho_{k+1} = \sum_{\mathcal{C}_{k+1}} \log \rho_{X_i X_j} - \sum_{i=1}^{k+1} (\deg_{\mathcal{G}_{k+1}}(X_i) - 1) \log \rho_{X_i}$, and

• $I_{\rho_{k+1}}(X_{k+1} : \overline{Y_{k+1}} \mid Y_{k+1}) = 0$.

By assumption, $\rho_{k+1} \in \text{Comp}(\mathcal{C}_{k+1})$ is factorizable via Petz, i.e.,

$$
\Theta\_{k+1} = \rho\_{k+1}^{\frac{1}{2}} = \Delta\_{k+1} \Delta\_k \dots \Delta\_3 \rho\_{X\_1 X\_2}^{\frac{1}{2}} \quad \text{where} \quad \Delta\_i := \rho\_{X\_i Y\_i}^{\frac{1}{2}} \rho\_{Y\_i}^{-\frac{1}{2}}.\tag{10}
$$

Then:

$$\begin{split} \rho_{k+1} &= \Theta_{k+1} \Theta_{k+1}^{\dagger} \\ &= \Delta_{k+1} \Delta_{k} \cdots \Delta_{3}\, \rho_{X_{1}X_{2}}\, \Delta_{3}^{\dagger} \cdots \Delta_{k}^{\dagger} \Delta_{k+1}^{\dagger} \\ &= \rho_{X_{k+1}Y_{k+1}}^{\frac{1}{2}} \rho_{Y_{k+1}}^{-\frac{1}{2}} \rho_{k} \rho_{Y_{k+1}}^{-\frac{1}{2}} \rho_{X_{k+1}Y_{k+1}}^{\frac{1}{2}} \\ &= \Theta_{k+1}^{\dagger} \Theta_{k+1} \\ &= \rho_{X_{1}X_{2}}^{\frac{1}{2}} \Delta_{3}^{\dagger} \cdots \Delta_{k}^{\dagger} \Delta_{k+1}^{\dagger} \Delta_{k+1} \Delta_{k} \cdots \Delta_{3}\, \rho_{X_{1}X_{2}}^{\frac{1}{2}} \\ &= \rho_{k}^{\frac{1}{2}} \rho_{Y_{k+1}}^{-\frac{1}{2}} \rho_{X_{k+1}Y_{k+1}} \rho_{Y_{k+1}}^{-\frac{1}{2}} \rho_{k}^{\frac{1}{2}}. \end{split} \tag{11}$$

We can apply Lemma 2 to the set $\{\rho_{X_{k+1}Y_{k+1}}, \rho_k\}$ and conclude that $\rho_{k+1}$ is a QMC in the order $X_{k+1} - Y_{k+1} - \overline{Y_{k+1}}$. Therefore, using Lemma 1, $\rho_{k+1}$ is a QMC iff $I_{\rho_{k+1}}(X_{k+1} : \overline{Y_{k+1}} \mid Y_{k+1}) = 0$ iff

$$\begin{split} \log \rho_{k+1} &= \log \rho_{X_{k+1} Y_{k+1}} + \log \rho_{\overline{Y_{k+1}} Y_{k+1}} - \log \rho_{Y_{k+1}} \\ &\overset{I.H.}{=} \sum_{\mathcal{C}_{k+1}} \log \rho_{X_i X_j} - \sum_{i=1}^{k+1} (\deg_{\mathcal{G}_{k+1}} (X_i) - 1) \log \rho_{X_i}. \end{split} \tag{12}$$

The other direction is straightforward. Just notice that $\text{Tr}_{X_{k+1}}(\rho_{k+1}) = \rho_k$, and by the induction hypothesis $\rho_k$ is compatible with $\mathcal{C}_k$, and so is $\rho_{k+1}$. Moreover, by construction, $\rho_{k+1}$ is also compatible with $\mathcal{C}_{k+1}$.

Note that the proof of the previous theorem does not depend on which constructive ordering one chooses. This follows from the fact that condition (*i*) is equivalent to condition (*ii*), and condition (*i*) does not assume any ordering.

The reader conversant in Markov random fields will identify condition (ii) as the quantum analogue of the *Local Markov Property* of a Markov random field—any variable *Xi* is conditionally independent of the remaining nodes given its adjacent nodes:

$$X\_i \perp\!\!\!\perp \overline{\{X\_i\} \cup \text{Ad}X\_i} \mid \text{Ad}X\_i, \tag{13}$$

where Ad*Xi* is the set of adjacent nodes to *Xi*. The notion of conditional independence is equivalently replaced by the conditional mutual information being null, that is

$$I\left(X\_i : \overline{\{X\_i\} \cup \text{Ad}X\_i} \mid \text{Ad}X\_i\right) = 0, \tag{14}$$

which, for the tree $\mathcal{G}\_k$ and the node $X\_k$, reads

$$I(X\_k : \overline{Y\_k} \mid Y\_k) = 0. \tag{15}$$
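The condition in Equation (15) can be checked numerically for small systems. The sketch below computes the quantum conditional mutual information from the standard identity I(A : C | B) = S(AB) + S(BC) − S(B) − S(ABC), with entropies in bits. The function names and the two three-qubit test states are illustrative assumptions, not taken from the text.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy in bits."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def partial_trace(rho, dims, keep):
    """Reduced state on the subsystems listed in keep (returned in ascending order)."""
    n = len(dims)
    t = rho.reshape(list(dims) * 2)
    # Trace out unwanted subsystems from the highest index down,
    # so that the remaining axis numbers stay valid.
    for ax in sorted(set(range(n)) - set(keep), reverse=True):
        t = np.trace(t, axis1=ax, axis2=ax + t.ndim // 2)
    d = int(np.prod([dims[i] for i in sorted(keep)]))
    return t.reshape(d, d)

def cond_mutual_info(rho, dims, a, c, b):
    """I(A : C | B) for lists of subsystem indices a, c, b."""
    s = lambda keep: entropy(partial_trace(rho, dims, keep))
    return s(a + b) + s(b + c) - s(b) - s(a + b + c)

dims = (2, 2, 2)  # subsystem 0 plays X_k, 1 plays Y_k, 2 plays the remaining nodes
# Perfectly correlated classical chain 000/111: a Markov chain.
markov = np.diag([0.5, 0, 0, 0, 0, 0, 0, 0.5])
# Ends correlated with each other but not through the middle node.
non_markov = np.diag([0.25, 0, 0.25, 0, 0, 0.25, 0, 0.25])

cmi_markov = cond_mutual_info(markov, dims, [0], [2], [1])
cmi_non_markov = cond_mutual_info(non_markov, dims, [0], [2], [1])
```

The first state satisfies the condition, while the second has one bit of conditional mutual information and therefore violates it.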

The following result states how to compute the solution of the Maximum Entropy Compatible Marginal Problem when $\mathcal{G}\_{\mathcal{C}}$ is a tree and there exists $\rho \in \text{Comp}(\mathcal{C})$ that factorizes via Petz according to $\mathcal{C}$.

**Corollary 1.** *Let ρ* ∈ B(H) *factorize via Petz according to* C *and* GC *a spanning tree. Then,*

$$\rho = \underset{\rho' \in \text{Comp}(\mathcal{C})}{\text{arg }\max} \, \mathcal{S}(\rho'). \tag{16}$$

**Proof of Corollary 1.** When $\rho \in \mathcal{B}(\mathcal{H})$ factorizes via Petz according to $\mathcal{C}$, we have $\log \rho = \sum\_{\mathcal{C}} \log \rho\_{X\_iX\_j} - \sum\_{i=1}^{n}(\deg(X\_i) - 1)\log \rho\_{X\_i}$, which saturates the strong subadditivity of the von Neumann entropy for every 3-chain $\overline{Y\_k} - Y\_k - X\_k$, $k = 3, \dots, n$, in the spanning tree.

#### *4.4. Compatibility with a QMT*

We are now ready to state our main theorem, which gives a stronger characterization for the existence of a compatible density operator that is a QMT. Previously, we needed multivariate measurements to establish whether there exists a QMT in the given compatibility set. Herein, we show that it is enough to consider two-body measurements, which makes the procedure feasible in practice. The proof requires some technical lemmas that we placed in Appendix C.

**Theorem 3.** *Let* $\mathcal{C} := \{\rho\_{X\_iX\_j} \in \mathcal{B}(\mathcal{H}\_{X\_iX\_j}),\ i \neq j \in \{1, \dots, n\}\}$ *be a set of admissible two-body marginals such that the associated graph* $\mathcal{G}\_{\mathcal{C}} = (V, E)$ *is a spanning tree. Then, there exists* $\tilde{\rho} \in \mathcal{B}(\mathcal{H})$ *such that* $\tilde{\rho} \in \text{Comp}(\mathcal{C})$ *is factorizable via Petz according to* $\mathcal{C}$ *iff*

$$I\_{\rho}(X\_i: \operatorname{ad} X\_j | X\_j) = 0, \quad \forall \rho\_{X\_i X\_j} \in \mathcal{C} \text{ and } \forall \operatorname{ad} X\_j,\ \operatorname{ad} X\_j \neq X\_i, \tag{17}$$

*where* $\text{ad}X\_j$ *indicates an adjacent node of* $X\_j$ *in* $\mathcal{G}\_{\mathcal{C}}$*, that is,* $\text{ad}X\_j \in \text{Ad}X\_j$*. Moreover,*

$$\tilde{\rho} = \underset{\rho' \in \text{Comp}(\mathcal{C})}{\text{arg }\max} \; S(\rho'). \tag{18}$$

**Proof of Theorem 3.** As in the previous theorem, we assume a constructive ordering $X\_1 < \cdots < X\_n$ for $\mathcal{C}$, which will be used in the induction proof. Moreover, we can rewrite $\mathcal{C}$ using such an order as in Equation (8). Thus, the conditions in Equation (17) become:

$$I\_{\rho}(X\_k: \text{ad}\,Y\_k \,|\, Y\_k) = 0, \quad \forall\, \text{ad}\,Y\_k \in V\_{k-1}, \ k = 3, \dots, n. \tag{19}$$

(⇒) Using the previous theorem we have that

$$I\_{\rho\_k}(X\_k : \overline{Y\_k} | Y\_k) = I\_{\rho}(X\_k : \overline{Y\_k} | Y\_k) = 0. \tag{20}$$

Moreover, by Proposition 1, $X\_k$ is a leaf in $\mathcal{G}\_k$ and it is only connected to $Y\_k$. Finally, by applying the chain rule of the quantum conditional mutual information (cf. Equation (A13) in Appendix C) and choosing the chain to start in a node adjacent to $Y\_k$, say $\text{ad}Y\_k$, it follows that $I\_{\rho}(X\_k : \text{ad}Y\_k | Y\_k) = 0$.

(⇐) The proof follows again by complete induction in the number of nodes *k*, following the assumed constructive ordering of C. Again, the simplest tree where the equation has any meaning requires three nodes.

(Basis *k* = 3): for this case the statement of this theorem coincides with (ii) of Theorem 2, since $\text{ad}Y\_3 = \overline{Y\_3}$.

(Induction step *k* −→ *k* + 1):

Induction hypothesis: We assume

$$I\_{\rho}(X\_k: \text{ad } \mathbb{Y}\_k | \, \mathbb{Y}\_k) = 0, \,\forall \text{ad } \mathbb{Y}\_k \in V\_{k-1}, \, k = \mathfrak{Z}, \dots, n,\tag{21}$$

and so, by hypothesis, *ρ*- is factorizable via Petz according to C-, and so, by Theorem 2, we have

$$I\_{\rho\_{\ell}}(X\_{\ell}:\overline{Y\_{\ell}}|Y\_{\ell}) = 0 \,\forall \ell = \mathbf{3}, \ldots k. \tag{22}$$

Induction step: We assume *<sup>I</sup>ρ*(*Xk*+<sup>1</sup> : ad*Yk*+<sup>1</sup>|*Yk*+<sup>1</sup>) = 0 ∀ ad*Yk*+<sup>1</sup> ∈ *Vk* and our goal is to show that there exists *ρk*+<sup>1</sup> factorizable via Petz according to C*k*+<sup>1</sup> such that its partial traces hold

$$I\_{\rho\_{k+1}}\left(X\_{k+1} : \overline{Y\_{k+1}} | \mathcal{Y}\_{k+1}\right) = 0. \tag{23}$$

Observe that, by definition, $Y\_{k+1} \in V\_k$. Let $m\_k$ be some step in which $Y\_{k+1}$ was connected to some node (note that it might connect to some node in many steps). Clearly, we have $3 \leq m\_k \leq k$. We consider two cases, depending on the degree of $Y\_{k+1}$ in $\mathcal{G}\_k$.

Case (1) $\deg Y\_{k+1} = 1$: then, by construction, it must be that $Y\_{k+1} = X\_{m\_k}$, and by Equation (22) the partial traces of $\rho\_{m\_k}$ satisfy

$$I\_{\rho\_{m\_k}}\left(X\_{m\_k} : \overline{Y\_{m\_k}} | Y\_{m\_k}\right) = 0. \tag{24}$$

By Lemma A7 (in Appendix C) since

$$V\_k \backslash \{X\_{m\_k}, Y\_{m\_k}\} \supseteq \overline{Y\_{m\_k}} = V\_{m\_k} \backslash \{X\_{m\_k}, Y\_{m\_k}\},\tag{25}$$

we also have for *ρk* that

$$I\_{\rho}(X\_{m\_k} : V\_k \backslash \{X\_{m\_k}, Y\_{m\_k}\} | Y\_{m\_k}) = I\_{\rho}(Y\_{k+1} : V\_k \backslash \{Y\_{k+1}, \text{ad}Y\_{k+1}\} | \text{ad}Y\_{k+1}) = 0,\tag{26}$$

where the last equality is obtained by noticing that $X\_{m\_k} = Y\_{k+1}$ and $Y\_{m\_k} = \text{ad}Y\_{k+1}$. Recall that we have

$$I\_{\rho}(X\_{k+1} : \text{ad}\,Y\_{k+1} | Y\_{k+1}) = 0. \tag{27}$$

Moreover, the set $\{V\_k \backslash \{Y\_{k+1}, \text{ad}Y\_{k+1}\}, \text{ad}Y\_{k+1}, Y\_{k+1}, X\_{k+1}\}$ forms the chain

$$V\_k \backslash \{Y\_{k+1}, \text{ad}Y\_{k+1}\} - \text{ad}\,Y\_{k+1} - Y\_{k+1} - X\_{k+1}.\tag{28}$$

Then, by using Lemma A6 (a) (in Appendix C), there exists a density operator $\rho\_{k+1} \in \mathcal{B}(\mathcal{H}\_{V\_{k+1}})$ such that its partial traces fulfill

$$I\_{\rho}(X\_{k+1} : V\_k \backslash \{Y\_{k+1}\} | Y\_{k+1}) = I\_{\rho\_{k+1}}(X\_{k+1} : \overline{Y\_{k+1}} | Y\_{k+1}) = 0. \tag{29}$$

Furthermore, by construction of this $\rho\_{k+1}$ in Lemma A6 (a) (in Appendix C), we have $\text{Tr}\_{X\_{k+1}}[\rho\_{k+1}] = \rho\_k$, and so $\rho\_{k+1}$ is s.t.:

$$I\_{\rho}(X\_{i}:V\_{i}\backslash\{X\_{i},Y\_{i}\}|Y\_{i})=0 \quad \forall i:2\leq i\leq k+1. \tag{30}$$

Case (2) $\deg Y\_{k+1} > 1$: then $\mathcal{G}\_{k+1}$ can be seen as a star centered in $Y\_{k+1}$, with as many branches as adjacent nodes $(\text{ad}Y\_{k+1})\_i$ in $\mathcal{G}\_{k+1}$, whose number is precisely the degree $r\_k$ of $Y\_{k+1}$ in $\mathcal{G}\_k$, plus the newly added node $X\_{k+1}$ (cf. Figure 1).

**Figure 1.** The associated graph $\mathcal{G}\_{k+1}$ can be seen as a star centered in $Y\_{k+1}$, where every branch is an adjacent node of $Y\_{k+1}$ in $V\_k$, plus the link to $X\_{k+1}$. $\mathcal{G}\_i$ indicates the rest of the graph (a tree) that is connected to the $i$-th adjacent node $(\text{ad}Y\_{k+1})\_i$. The number of adjacent nodes to $Y\_{k+1}$ in $\mathcal{G}\_{k+1}$ is $r\_k + 1$, obtained by adding $X\_{k+1}$ to the other $r\_k$ nodes in $\mathcal{G}\_k$.

To prove the thesis we must find $\rho\_{k+1}$ such that, if

$$I\_{\rho}(X\_{k+1} : (\text{ad}\,Y\_{k+1})\_i | Y\_{k+1}) = 0 \quad \forall i = 1, \ldots, r\_k, \tag{31}$$

then, according to Theorem 2, it is enough to show:

$$I\_{\rho\_{k+1}}\left(X\_{k+1}:\overline{Y\_{k+1}}|Y\_{k+1}\right) = 0.\tag{32}$$

Moreover, by induction hypothesis, we know that

$$I\_{\rho}(X\_{\ell}: \text{ad}\, Y\_{\ell} | Y\_{\ell}) = 0 \quad \forall\, \text{ad}\, Y\_{\ell} \in V\_{k}, \ \ell = 3, \ldots, k, \tag{33}$$

and again, by Theorem 2, we must have:

$$I\_{\rho\_{\ell}} \left( X\_{\ell} : \overline{Y\_{\ell}} | Y\_{\ell} \right) = 0 \quad \forall \ell = 3, \ldots, k. \tag{34}$$

We proceed to show Equation (32) by using Corollary A2 (in Appendix C). Indeed, this result guarantees that the star

$$\{X\_{k+1}, Y\_{k+1}, (\text{ad}\, Y\_{k+1})\_1 \cup \mathcal{G}\_1, \dots, (\text{ad}\, Y\_{k+1})\_{r\_k} \cup \mathcal{G}\_{r\_k}\}\tag{35}$$

factorizes via Petz according to

$$\{X\_{k+1}Y\_{k+1}, Y\_{k+1}(\operatorname{ad}Y\_{k+1})\_1 \cup \mathcal{G}\_1, \dots, Y\_{k+1}(\operatorname{ad}Y\_{k+1})\_{r\_k} \cup \mathcal{G}\_{r\_k}\}\tag{36}$$

iff

$$I\_{\rho}(X\_{k+1} : (\text{ad}\,Y\_{k+1})\_i \cup \mathcal{G}\_i \mid Y\_{k+1}) = 0, \quad \forall i \in \{1, \ldots, r\_k\};\tag{37}$$

$$I\_{\rho}\left( (\operatorname{ad}Y\_{k+1})\_{i} \cup \mathcal{G}\_{i} : (\operatorname{ad}Y\_{k+1})\_{j} \cup \mathcal{G}\_{j} \mid Y\_{k+1} \right) = 0, \quad \forall i \neq j \in \{1, \ldots, r\_{k}\}. \tag{38}$$

Using Theorem 2 in Equation (37), we get the goal stated in Equation (32). The conditions in Equation (38) come from the complete induction hypothesis, Equation (34). On the other hand, the conditions stated in Equation (37) come from observing that, for every $(\text{ad}Y\_{k+1})\_i$, there is a chain

$$X\_{k+1} - Y\_{k+1} - (\text{ad}Y\_{k+1})\_i - \mathcal{G}\_{i}, \tag{39}$$

for which we already have the conditions:

$$I\_{\rho}(X\_{k+1} : (\text{ad}\,Y\_{k+1})\_i | Y\_{k+1}) = 0,\tag{40}$$

$$I\_{\rho}(Y\_{k+1} : \mathcal{G}\_i | (\operatorname{ad} Y\_{k+1})\_i) = 0. \tag{41}$$

Equation (40) follows from the induction hypothesis Equation (33). Moreover, Equation (41) follows from the fact that, by hypothesis, $\rho\_k$ is a QMT, and so

$$Y\_{k+1} - (\text{ad}Y\_{k+1})\_i - \mathcal{G}\_i \tag{42}$$

is a quantum Markov chain. Therefore, by using Lemma A6 (a) (in Appendix C), we get the desired condition

$$I\_{\rho} \left( X\_{k+1} : (\text{ad} \, Y\_{k+1})\_i \cup \mathcal{G}\_i \mid \text{Y}\_{k+1} \right) = 0. \tag{43}$$

Since the argument holds for all the adjacent nodes $(\text{ad}Y\_{k+1})\_i$, we derive the whole set of conditions (37), which ends the proof for case (2).

Finally, the fact that the obtained state maximizes the von Neumann entropy with the provided marginals comes for free from Corollary 1.

#### *4.5. QMT and the MECM Problem*

We are now able to show that for QMTs, the MECM problem is in P and that there is a polynomial quantum circuit that constructs the Maximum entropy compatible density operator. Moreover, we also show that it is possible to extend the Chow–Liu algorithm efficiently for quantum Markov networks. To derive these results, we need first to compute the number of 3-chains in a graph with *n* nodes—proof in Appendix D.

**Lemma 3.** *The number of 3-chains* #*c in a tree with n* ≥ 2 *vertices satisfies* $n - 2 \leq \#c \leq \frac{1}{2}(n-1)(n-2)$*. Moreover, the number of 3-chains for any graph is upper-bounded by* $\frac{1}{2}n(n-1)(n-2)$*, and it reaches the bound for a complete graph of n nodes.*
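The bounds in Lemma 3 are easy to check on small graphs. The sketch below assumes that a 3-chain is a path on three nodes, that is, a pair of distinct edges sharing a middle node, so that each node of degree *d* contributes *d*(*d* − 1)/2 chains; the function name `count_3_chains` is hypothetical.

```python
from math import comb

def count_3_chains(n, edges):
    """Count paths on three nodes in an undirected simple graph with nodes 0..n-1."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    # Every unordered pair of edges meeting at a node forms one 3-chain.
    return sum(comb(d, 2) for d in deg)

n = 5
path = [(i, i + 1) for i in range(n - 1)]                      # lower bound for trees
star = [(0, i) for i in range(1, n)]                           # upper bound for trees
complete = [(i, j) for i in range(n) for j in range(i + 1, n)]

chains_path = count_3_chains(n, path)
chains_star = count_3_chains(n, star)
chains_complete = count_3_chains(n, complete)
```

For *n* = 5 the path attains the lower bound, the star attains the upper bound for trees, and the complete graph attains the general upper bound.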

We are now able to establish a sufficient condition for the MECM problem to be in P.

**Theorem 4.** *The Maximum Entropy Compatible Marginal Problem for* C *is in P when*


*Moreover, there exists a quantum polynomial circuit that constructs the maximum entropy compatible tree.*

**Proof of Theorem 4.** From Theorem 3, the density operator that maximizes the entropy is a QMT. Moreover, we can compute its entropy in polynomial time by considering the constructive ordering of point 2. Indeed, from Theorem 2 (i), when *ρ* is a QMT we have that

$$S(\rho) = \sum\_{\mathcal{C}} S(\rho\_{X\_i X\_j}) - \sum\_{i=1}^n (\deg(X\_i) - 1) S(\rho\_{X\_i}).\tag{44}$$

Moreover, since each $\rho\_{X\_iX\_j}$ belongs to a QMC constructed in polynomial time, we can compute a matrix representation of the density operator of the QMC in polynomial time as well. Recall from Definition 3 that the Hilbert space of a polynomial-time QMC is fixed and does not depend on the complexity parameter; that is, as usual, the dimension of the Hilbert space associated with each node is fixed with respect to the complexity parameter *n* (the number of nodes).
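To make the polynomial-time evaluation concrete, the following sketch computes the right-hand side of Equation (44) directly from the one- and two-body marginals, with entropies in bits. The dictionary-based interface and the names `entropy` and `tree_entropy` are assumptions made for illustration; the routine only evaluates the formula and does not check that the marginals are compatible with a QMT.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy in bits."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def tree_entropy(pair_marginals, single_marginals):
    """Evaluate sum_C S(rho_ij) - sum_i (deg(X_i) - 1) S(rho_i).

    pair_marginals: {(i, j): rho_ij} for the edges of the tree.
    single_marginals: {i: rho_i} for every node.
    """
    deg = {i: 0 for i in single_marginals}
    total = 0.0
    for (i, j), rho_ij in pair_marginals.items():
        total += entropy(rho_ij)
        deg[i] += 1
        deg[j] += 1
    for i, rho_i in single_marginals.items():
        total -= (deg[i] - 1) * entropy(rho_i)
    return total

# Path 0 - 1 - 2 with perfectly correlated classical bits on each edge.
correlated = np.diag([0.5, 0.0, 0.0, 0.5])
mixed = np.eye(2) / 2
pairs = {(0, 1): correlated, (1, 2): correlated}
singles = {0: mixed, 1: mixed, 2: mixed}
s_tree = tree_entropy(pairs, singles)
```

The cost is one small eigenvalue decomposition per edge and per node, which is linear in the number of nodes when the local dimension is fixed.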

Moreover, given the constructive order, we are also able to build a quantum circuit (cf. Figure 2) that constructs the maximum entropy compatible tree by first constructing the Markov chain $\rho\_{X\_1X\_2X\_3}$ and then applying the circuits for the recovery maps $\mathcal{R}$ of the remaining nodes.

**Figure 2.** Quantum circuit that outputs the optimal quantum Markov tree (QMT). Note that the *k*-th block $\mathbb{I}\_{Y\_k} \otimes \mathcal{R}\_{Y\_kX\_k}$ operates only over two components $X\_k$ and $Y\_k$, for all $k = 3, \dots, n$.

#### *4.6. QMT and Chow–Liu Algorithm*

Two-body marginals for which all 3-chains form a QMC have another interesting property: it is possible to find the QMT closest, with regard to the quantum relative entropy (the generalization of the Kullback–Leibler divergence [42]), to the unknown density operator. Note that the number of spanning trees over a complete graph is given by Cayley's formula [43], $n^{n-2}$, which is exponential in *n*. To extract the closest QMT, we construct a weighted graph in which the nodes are the components of the density operator and the edges are weighted with the von Neumann mutual information between every two components. The optimal spanning tree, which can be found in polynomial time using the Chow–Liu algorithm (Algorithm 1), gives the support to a QMT. Moreover, this QMT will be the one closest to the unknown state. When the density operators are diagonal, that is, describe a probability distribution, this algorithm coincides with the well-established Chow–Liu algorithm.

#### **Algorithm 1** Chow–Liu Algorithm

Input: { $\mathcal{C}$, $I\_{\mathcal{C}}$ } from a set of RVs $X = \{X\_1, \dots, X\_n\}$, where


Output: { $\mathcal{C}\_T \subseteq \mathcal{C}$ s.t. $H(p(X) \,\|\, p\_T(X))$ is minimal }, where


3. Iterate: while *α* ≤ *M* do

> if $\mathcal{C}\_T \cup \{p\_\alpha\}$ is s.t. $\mathcal{G}\_T$ is a tree then $\mathcal{C}\_T = \mathcal{C}\_T \cup \{p\_\alpha\}$; *α* = *α* + 1
>
> return $\mathcal{C}\_T$
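One common way to realize the iteration of Algorithm 1 is a Kruskal-style greedy search for a maximum-weight spanning tree. The sketch below is an illustration under that assumption rather than a transcription of the algorithm: it takes precomputed mutual informations as edge weights, scans them in decreasing order, and keeps an edge whenever it does not close a cycle. The name `chow_liu_tree` and the union-find helper are hypothetical.

```python
def chow_liu_tree(n, weights):
    """Greedy maximum-weight spanning tree over nodes 0..n-1.

    weights: {(i, j): mutual information between components i and j}.
    Returns the list of selected edges in the order they were accepted.
    """
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for (i, j), _ in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
        ri, rj = find(i), find(j)
        if ri != rj:          # adding (i, j) keeps the selected graph acyclic
            parent[ri] = rj
            tree.append((i, j))
    return tree

# Illustrative weights on four components (values are made up for the example).
weights = {(0, 1): 0.9, (1, 2): 0.8, (0, 2): 0.7,
           (2, 3): 0.5, (0, 3): 0.2, (1, 3): 0.1}
edges = chow_liu_tree(4, weights)
```

In the quantum setting the weights would be the von Neumann mutual informations of the two-body marginals; in the classical setting they are the Shannon mutual informations.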

**Theorem 5.** *If the set of two-body marginals* $\mathcal{C}$ *is s.t. every 3-chain is compatible with a QMC, then every subtree is a QMT. A QMT that minimizes the quantum relative entropy with respect to the (unknown) given quantum state is the maximum weighted tree* $\mathcal{G}\_{\mathcal{C}\_T}$*, where the weight of each edge is given by the quantum mutual information. Such a tree can be obtained efficiently using the (generalized) Chow–Liu learning algorithm [19].*

**Proof of Theorem 5.** The proof consists in applying Theorem 3 to the main result of Section 6 in the paper [29], that we are going to briefly recall.

Let $\rho\_{\mathcal{X}}$ be the unknown quantum state that best describes the quantum system for which the bipartite marginals are known (for instance, they have been measured and collected in $\mathcal{C}$). Moreover, let $\tilde{\rho}\_{\mathcal{C}\_T}$ be the maximum von Neumann entropy density operator compatible with a subset $\mathcal{C}\_T \subseteq \mathcal{C}$ s.t. $\mathcal{G}\_{\mathcal{C}\_T}$ is a tree. We refer to $\tilde{\rho}\_{\mathcal{C}\_T}$ as a quantum tree. Their relative entropy can be written as

$$\mathcal{S}(\rho\_{\mathcal{X}}||\tilde{\rho}\_{\mathcal{C}\_{\mathcal{T}}}) = -\mathcal{S}(\rho\_{\mathcal{X}}) - \text{Tr}(\rho\_{\mathcal{X}}\log\tilde{\rho}\_{\mathcal{C}\_{\mathcal{T}}}) = \mathcal{S}(\tilde{\rho}\_{\mathcal{C}\_{\mathcal{T}}}) - \mathcal{S}(\rho\_{\mathcal{X}}),\tag{45}$$

where we have used condition (i) in Theorem 2 on $\log \tilde{\rho}\_{\mathcal{C}\_T}$.

Therefore, the optimal maximum entropy estimator $\tilde{\rho}$ is computed over the subtree with minimal von Neumann entropy:

$$\widetilde{\rho} = \underset{\mathcal{C}\_{\mathcal{T}} \subseteq \mathcal{C}}{\text{argmin}} \max\_{\rho \in \text{Comp}(\mathcal{C}\_{\mathcal{T}})} S(\rho). \tag{46}$$

Since the number of possible spanning trees is $n^{n-2}$ [43], we cannot choose the best fitting tree efficiently, in general. However, in the case at hand, we can manipulate Equation (45) and derive subcases for which the computation can be performed efficiently. Observe that

$$\sum\_{\mathcal{C}\_T} S\left(\rho\_{X\_i X\_j}\right) - \sum\_{i=1}^n (\deg X\_i - 1) S\left(\rho\_{X\_i}\right) = -\sum\_{\mathcal{C}\_T} I\_\rho(X\_i, X\_j) + \sum\_{i=1}^n S\left(\rho\_{X\_i}\right),\tag{47}$$

and set

$$\Delta S(\tilde{\rho}\_{\mathcal{C}\_{\mathcal{T}}}) := \sum\_{\mathcal{C}\_{\mathcal{T}}} S\left(\rho\_{X\_i X\_j}\right) - \sum\_{i=1}^n (\deg X\_i - 1) S\left(\rho\_{X\_i}\right) - S(\tilde{\rho}\_{\mathcal{C}\_{\mathcal{T}}}),\tag{48}$$

which is always non-negative. By adding and subtracting the term

$$\sum\_{\mathcal{C}\_T} S\left(\rho\_{X\_i X\_j}\right) - \sum\_{i=1}^n (\deg X\_i - 1)S\left(\rho\_{X\_i}\right) \tag{49}$$

to Equation (45), it assumes the form

$$S\left(\rho\_{\mathcal{X}}||\tilde{\rho}\_{\mathcal{C}\_{\mathcal{T}}}\right) = -\sum\_{\mathcal{C}\_{\mathcal{T}}} I\_{\rho}(X\_{i}, X\_{j}) - \Delta S(\tilde{\rho}\_{\mathcal{C}\_{\mathcal{T}}}) + \sum\_{i=1}^{n} S\left(\rho\_{X\_{i}}\right) - S(\rho\_{\mathcal{X}}).\tag{50}$$

By using condition (i) of Theorem 2, we can replace the log term of $S(\tilde{\rho}\_{\mathcal{C}\_T})$ in Equation (48) and thus, for a QMT, $\Delta S(\tilde{\rho}\_{\mathcal{C}\_T}) = 0$. Moreover, we also have the converse, that is, $\Delta S(\tilde{\rho}\_{\mathcal{C}\_T}) = 0$ holds only for QMTs. The latter result can be derived by observing that

$$\Delta S(\widetilde{\rho}\_{\mathcal{C}\_{\mathcal{T}}}) = \sum\_{i=1}^{n-2} I\_{\rho}(X\_{l\_i} : V\_i \backslash \{X\_{l\_i}, \text{ad}X\_{l\_i}\} | \text{ad}X\_{l\_i}), \tag{51}$$

which, by non-negativity of the quantum conditional mutual information, is 0 iff all the terms in the sum are 0. Then, by Theorem 3, we have $\Delta S(\tilde{\rho}\_{\mathcal{C}\_T}) = 0$ iff all the 3-chains in $\mathcal{C}\_T$ are QMCs.

Therefore, when the provided set of marginals $\mathcal{C}$ is s.t. every 3-chain is compatible with a QMC, $\Delta S(\tilde{\rho}\_{\mathcal{C}\_T}) = 0$ in Equation (50). Hence, the best tree is the one that maximizes the term

$$\sum\_{C\_T} I\_{\rho}(X\_i, X\_j),\tag{52}$$

i.e., the maximum weighted spanning sub-tree, where the weights are given by the mutual information between every couple of linked nodes.
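The edge weights entering Equation (52) depend only on the two-body marginals. As a minimal sketch, the function below computes the quantum mutual information I(X_i, X_j) = S(ρ_{X_i}) + S(ρ_{X_j}) − S(ρ_{X_iX_j}) in bits from a bipartite density matrix; the function names and the example states are illustrative assumptions.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy in bits."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def mutual_information(rho_ij, d_i, d_j):
    """Quantum mutual information of a state on a (d_i x d_j)-dimensional pair."""
    t = rho_ij.reshape(d_i, d_j, d_i, d_j)
    rho_i = np.trace(t, axis1=1, axis2=3)  # trace out the second component
    rho_j = np.trace(t, axis1=0, axis2=2)  # trace out the first component
    return entropy(rho_i) + entropy(rho_j) - entropy(rho_ij)

# Three reference two-qubit states.
bell = np.zeros((4, 4))
bell[np.ix_([0, 3], [0, 3])] = 0.5            # (|00> + |11>)/sqrt(2)
classical = np.diag([0.5, 0.0, 0.0, 0.5])     # perfectly correlated bits
product = np.eye(4) / 4                       # uncorrelated

w_bell = mutual_information(bell, 2, 2)
w_classical = mutual_information(classical, 2, 2)
w_product = mutual_information(product, 2, 2)
```

The maximally entangled pair carries twice the weight of the perfectly correlated classical pair, and the product state carries none.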

This problem is efficiently solved for classical graphs by the Chow–Liu algorithm, which we have here generalized to quantum states by replacing the Shannon entropy with the von Neumann entropy.

The general case of efficiently finding the optimal spanning tree which gives the support to a quantum tree remains open. Minimizing the general form of Equation (50) would require the maximization of the quantity $\sum\_{\mathcal{C}\_T} I\_{\rho}(X\_i, X\_j) + \Delta S(\tilde{\rho}\_{\mathcal{C}\_T})$ with $\Delta S(\tilde{\rho}\_{\mathcal{C}\_T}) > 0$. Already in the tripartite scenario, it is evident that the maximum weighted tree is not necessarily a solution to the problem.

For the sake of completeness, we present the Chow–Liu algorithm in pseudo-code (Algorithm 1). In its quantum version, the Shannon entropy is replaced by the von Neumann entropy, and the relative entropy by the quantum relative entropy.

## **5. Conclusions**

In this paper, we addressed the problem of learning the maximum entropy density operator, describing an unknown quantum system on a finite-dimensional Hilbert space, from a set of two-body marginals.

First, we have shown that comparing the entropies of 3-chains (the simplest nontrivial scenario, where two marginals are known in a tripartite quantum system) is QSZK-complete. The result hints that finding the maximum entropy compatible state is, in general, not feasible with a step-by-step entropy-monotonic procedure.

Then, we determined a subclass of density operators where the addressed problem is in P. Concretely, by observing that the problem at hand naturally abstracts the inference problem for classical probability distributions within graphical models, we asked whether an exact efficient max-entropy learning procedure is limited to classical Markovian systems, where the set of constraints is a tree-structured set of mutually commuting density operators. We generalized and extended the classical procedure to a larger subset of density operators, namely two-body marginals compatible with a quantum Markov tree (QMT), whose 3-chains are polynomial-time quantum Markov chains. In addition, for a general set of quantum states whose 3-subchains are quantum Markov chains, we were able to generalize the Chow–Liu algorithm for extracting the optimal QMT. Moreover, we showed that, in the case at hand, the maximum entropy quantum state can be constructed by a polynomial-time quantum circuit.

We stress that the obtained procedures overcome the quantum marginal problem, for which a solution is known in the case of compatibility of the provided set of marginals with a QMT.

Understanding other classes of quantum states for which this problem is tractable (at least in quantum polynomial time) would be a relevant problem. In particular, a further study of the robustness of the procedure could shed some light on the power of quantum machine learning techniques for solving the same problem beyond the Markovian assumption. Indeed, differently from the classical scenario, quantum Markov chains have been proven to be, in general, far in trace distance from approximately Markovian chains (that is, tripartite density operators $\rho\_{ABC}$ with small but nonzero conditional mutual information, $I\_{\rho}(A : B|C) > 0$), and the result naturally extends to QMTs and many-body density operators.

**Author Contributions:** Conceptualization, S.D.G. and P.M.; Formal analysis, S.D.G.; Funding acquisition, P.M.; Investigation, S.D.G. and P.M.; Methodology, S.D.G. and P.M.; Supervision, P.M.; Writing—original draft, P.M.; Writing—review & editing, S.D.G. and P.M. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work is supported by Security and Quantum Information Group of Instituto de Telecomunicações, by Programme (COMPETE 2020) of the Portugal 2020 framework [Project Q.DOT with Nr. 039728 (POCI-01-0247-FEDER-039728)] and the Fundação para a Ciência e a Tecnologia (FCT) through national funds, by FEDER, COMPETE 2020, and by Regional Operational Program of Lisbon, under UIDB/50008/2020 (actions QuRUNNER, QUESTS), Project QuantumMining POCI-01-0145-FEDER-031826 and Project PREDICT PTDC/CCI-CIF/29877/2017.

**Conflicts of Interest:** The authors declare no conflict of interest.
