**1. Introduction**

The need for a metric that measures the similarity between two distributions has woven itself into a plethora of applications. Fields concerned with statistical inference [1–3], model selection [4–6], and machine learning have all found it necessary to quantify the likeness of two distributions. A common approach is to define a distance or divergence between distributions using the tenets of information geometry, e.g., the Fisher–Rao distance or the f-divergence [7], respectively. To the best of our knowledge, research and results in information geometry have predominantly focused on establishing similarities between two given distributions. Here, we consider an important class of problems where one or both endpoint distributions are not fixed but are instead constrained to live on a subset of the parameter manifold.
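As a concrete illustration of the fixed-endpoint setting (our own sketch, not from the paper): in the univariate case the Fisher–Rao distance between two Gaussians has a well-known closed form, obtained by mapping (μ/√2, σ) into the Poincaré upper half-plane.

```python
import math

def fisher_rao_distance(mu1, sigma1, mu2, sigma2):
    """Closed-form Fisher-Rao distance between the univariate Gaussians
    N(mu1, sigma1^2) and N(mu2, sigma2^2), via the isometry that maps
    (mu/sqrt(2), sigma) into the Poincare upper half-plane."""
    num = (mu1 - mu2) ** 2 / 2.0 + (sigma1 - sigma2) ** 2
    return math.sqrt(2.0) * math.acosh(1.0 + num / (2.0 * sigma1 * sigma2))

# With equal means, the distance reduces to sqrt(2) * |ln(sigma2/sigma1)|.
d = fisher_rao_distance(0.0, 1.0, 0.0, 2.0)
print(d)  # sqrt(2) * ln(2) ≈ 0.9803
```

Note that scaling both standard deviations by a common factor leaves the distance unchanged, reflecting the hyperbolic geometry of this parameter space.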

When one relaxes the fixed-endpoint requirements, the problem of finding the shortest path between a given distribution and a constraint surface (rather than a single distribution) must be reconsidered using transversality conditions [8,9] for the standard length-minimizing functional. This is precisely the focus of the present work, where we derive the transversality conditions for working in the Riemannian space of multivariate Gaussian distributions. This approach opens new avenues for research and application areas in which one no longer needs to provide the ending distribution but rather a description of the constraint set on which the most similar model must be discovered. For example, applications such as domain adaptation [10–12] can be reformulated so that the optimal target-domain distribution is discovered within a constrained family, starting from a known source distribution estimated from training data; likewise, in model selection [4–6,13,14], a search within a constrained family of distributions may be better aligned with the desired objective than evaluating the MLE fit penalized by parameter cardinality.
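A minimal Euclidean analogy may help build intuition for transversality: when an endpoint is free to slide along a constraint curve, the length-minimizing path must meet the constraint orthogonally. The toy example below (a hypothetical illustration in the flat plane, not the Gaussian manifold treated in this paper) verifies this condition in closed form.

```python
# Shortest path from a fixed point p to the constraint line
# {(t, 2 - t) : t real}. At the optimum, the chord from p to the
# line must be orthogonal to the line's tangent direction.

p = (3.0, 1.0)

def closest_t(p):
    # Minimize |(t, 2 - t) - p|^2 over t; setting the derivative
    # d/dt [(t - px)^2 + (2 - t - py)^2] to zero gives:
    return (p[0] - p[1] + 2.0) / 2.0

t = closest_t(p)
q = (t, 2.0 - t)                           # foot of the shortest path
chord = (q[0] - p[0], q[1] - p[1])         # shortest-path direction
tangent = (1.0, -1.0)                      # tangent of the constraint line
dot = chord[0] * tangent[0] + chord[1] * tangent[1]
print(dot)  # 0.0: the transversality (here, orthogonality) condition holds
```

In the Riemannian setting developed in this paper, the same principle applies with the inner product taken with respect to the Fisher metric rather than the Euclidean one.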

**Citation:** Herntier, T.; Peter, A.M. Transversality Conditions for Geodesics on the Statistical Manifold of Multivariate Gaussian Distributions. *Entropy* **2022**, *24*, 1698. https://doi.org/10.3390/e24111698

Academic Editors: Karagrigoriou Alexandros and Makrides Andreas

Received: 30 September 2022; Accepted: 18 November 2022; Published: 21 November 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

In this work, we have purposefully chosen to work in the natural parameter space of multivariate Gaussians (mean vector, *μ*, and covariance matrix, Σ) and to address the problem formulation in a completely Riemannian context. We are aware of the usual dually flat constructions [2,15,16] afforded by information geometry using divergence measures and the Legendre transformation. Though elegant in their algebraic construction, these alternate parameterizations have yet to be employed in most statistical and machine-learning applications. Hence, we develop all geometric motivations and the consequent mathematical derivations using Riemannian geometry in the natural parameter space, i.e., we employ the Fisher information matrix as the metric tensor and find the length-minimizing curve to the constraint set.
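To make the metric concrete in the simplest case (an illustrative one-dimensional sketch of our own, not the multivariate derivation of this paper): for univariate Gaussians the Fisher information metric is ds² = dμ²/σ² + 2dσ²/σ², and the length of any discretized curve through parameter space can be accumulated numerically. The code below checks a mean-preserving geodesic against its closed-form length √2 ln(σ₂/σ₁).

```python
import math

def path_length(mus, sigmas):
    """Riemannian length of a discretized curve through univariate
    Gaussian parameter space under the Fisher information metric
    ds^2 = dmu^2/sigma^2 + 2*dsigma^2/sigma^2."""
    L = 0.0
    for i in range(len(mus) - 1):
        dmu = mus[i + 1] - mus[i]
        dsg = sigmas[i + 1] - sigmas[i]
        sig = 0.5 * (sigmas[i] + sigmas[i + 1])  # midpoint evaluation
        L += math.sqrt((dmu ** 2 + 2.0 * dsg ** 2) / sig ** 2)
    return L

# A mean-preserving geodesic sigma(s) = sigma1^(1-s) * sigma2^s has
# closed-form length sqrt(2) * ln(sigma2/sigma1); the discretized
# length converges to this value as the grid is refined.
n = 2000
sig1, sig2 = 1.0, 3.0
curve = [sig1 ** (1 - i / n) * sig2 ** (i / n) for i in range(n + 1)]
print(path_length([0.0] * (n + 1), curve), math.sqrt(2.0) * math.log(3.0))
```

The same accumulation strategy extends to the multivariate case by replacing the scalar integrand with the quadratic form induced by the full Fisher information matrix.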

The remainder of this paper is organized as follows. In Section 2, we give a summary of important results concerning geodesics on manifolds. In Section 3, we provide a brief introduction to the techniques of the calculus of variations, with the goal of developing the Euler–Lagrange equations necessary to find the shortest path between two multivariate Gaussian distributions; in that section, we restrict ourselves to problems in which both the initial and final distributions are known exactly. Following this, in Section 3.3, we develop the conditions required to satisfy transversality constraints, i.e., conditions that apply when the initial and/or final distribution is constrained to lie on a defined subsurface of the manifold rather than being known exactly. The results from these sections are employed in Section 4, where we explore various constraint surfaces and numerical experiments to demonstrate the utility of our variable-endpoint framework. Finally, some closing perspectives and remarks are given in Section 5.
