Probability via Expectation Measures
Abstract
1. Introduction
1.1. Organization of the Paper
1.2. Terminology
2. Probability Theory Since Kolmogorov
2.1. Kolmogorov’s Contribution
2.2. Probabilities or Expectations?
2.3. Probability Theory and Category Theory
- For any fixed measurable set $B$, the mapping $x \mapsto P(B \mid x)$ is measurable.
- For every fixed $x$, the mapping $B \mapsto P(B \mid x)$ is a probability measure (a concrete kernel is sketched below).
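A standard example of a Markov kernel, added here for concreteness (the notation $P(B \mid \lambda)$ is mine, not necessarily the paper's): the Poisson family defines a kernel from $[0, \infty)$ to $\mathbb{N}_0$ by

$$P(\{k\} \mid \lambda) = e^{-\lambda} \frac{\lambda^k}{k!}.$$

For each fixed $k$ the map $\lambda \mapsto P(\{k\} \mid \lambda)$ is continuous, hence measurable, and for each fixed $\lambda$ the map $B \mapsto P(B \mid \lambda)$ is a probability measure on $\mathbb{N}_0$, so both conditions above are satisfied.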
2.4. Preliminaries on Point Processes
- For all $\omega$, the measure $N_\omega$ is locally finite.
- For all bounded sets $B$, the random variable $N(B)$ is a count variable (an example follows this list).
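For instance (an added illustration in standard point-process notation), a finite point process on a space $X$ is a random counting measure

$$N = \sum_{i=1}^{n} \delta_{X_i}, \qquad N(B) = \#\{i \le n : X_i \in B\},$$

where the points $X_1, \dots, X_n$ (and possibly $n$ itself) are random. Each $N(B)$ counts the points falling in $B$, so it is a count variable, and every realization is locally finite when $n$ is finite.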
2.5. Poisson Distributions and Poisson Point Processes
- For all bounded measurable sets $B$, the random variable $N(B)$ is Poisson distributed with mean value $\Lambda(B)$.
- If $B_1$ and $B_2$ are disjoint, then the random variables $N(B_1)$ and $N(B_2)$ are independent (a sampling sketch follows this list).
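These two properties yield the textbook recipe for simulating a Poisson point process. The following is a minimal sketch (my own, not from the paper), assuming a homogeneous intensity on the unit square and NumPy:

```python
import numpy as np

def sample_poisson_process(intensity, rng=None):
    """Sample a homogeneous Poisson point process on the unit square.

    `intensity` is the expected number of points per unit area, so on
    [0, 1]^2 the expected total count equals `intensity`.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = rng.poisson(intensity)       # total count N([0,1]^2) is Poisson
    return rng.uniform(size=(n, 2))  # given the count, points are IID uniform

points = sample_poisson_process(intensity=50.0)
# Counts in disjoint regions are independent Poisson variables:
left = np.sum(points[:, 0] < 0.5)    # distributed as Po(25)
right = np.sum(points[:, 0] >= 0.5)  # distributed as Po(25), independent of left
print(left, right)
```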
2.6. Valuations
- Strictness: $\nu(\emptyset) = 0$.
- Monotonicity: For all subsets $U$ and $V$, $U \subseteq V$ implies $\nu(U) \le \nu(V)$.
- Modularity: For all subsets $U$ and $V$, $\nu(U) + \nu(V) = \nu(U \cup V) + \nu(U \cap V)$.
- Continuity: $\nu\left(\bigcup_i U_i\right) = \sup_i \nu(U_i)$ for any directed net $(U_i)$ (a worked example follows).
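A standard example from the valuation literature, added here for concreteness: each point $x$ induces a Dirac valuation on open sets,

$$\delta_x(U) = \begin{cases} 1 & \text{if } x \in U, \\ 0 & \text{otherwise,} \end{cases}$$

which is strict and monotone by inspection, modular because checking the four membership cases of $x$ in $U$ and $V$ gives $\delta_x(U) + \delta_x(V) = \delta_x(U \cup V) + \delta_x(U \cap V)$, and continuous because $x$ lies in a directed union exactly when it lies in some member of the net.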
3. Observations
3.1. Observations as Multiset Classifications
3.2. Observations as Empirical Measures
- Addition of two empirical measures.
- Restriction of an empirical measure to a subset.
- Inducing an empirical measure along a mapping (the three operations are illustrated below).
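Under the standard reading of these terms (addition as the sum of measures, restriction to a set $B$, and inducing as the pushforward along a map $f$; this reading is an assumption about the paper's definitions), the operations act on an empirical measure $\mu = \sum_i \delta_{x_i}$ as

$$\mu + \nu = \sum_i \delta_{x_i} + \sum_j \delta_{y_j}, \qquad \mu|_B(A) = \mu(A \cap B), \qquad f_*\mu = \sum_i \delta_{f(x_i)}.$$

Addition concatenates two samples, restriction keeps only the observations lying in $B$, and inducing replaces each observation $x_i$ by its image $f(x_i)$.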
3.3. Categorical Properties of the Empirical Measures and Some Generalizations
3.4. Lossless Compression of Data
3.5. Lossy Compression of Data
4. Expectations
4.1. Simple Expectation Measures
- There are many ways of writing $t$ as a product $t = s \cdot n$, where $s > 0$ and $n$ is an integer.
- There are many different sampling schemes that will lead to a multiplication by $n$ (a numeric example follows this list).
- There are many ways of generating the randomness that is needed to perform the sampling.
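For example (an added illustration; the symbols $t$, $s$, and $n$ follow the reconstruction above), a total expectation of $t = 3.6$ admits many such factorizations,

$$3.6 = 1.2 \times 3 = 0.9 \times 4 = 0.4 \times 9,$$

each corresponding to a different number $n$ of sampled instances with a different weight $s$ per instance.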
4.2. Categorical Properties of the Expectation Measures and Some Generalizations
4.3. The Poisson Interpretation
- $N(B)$ is Poisson distributed for any open set $B$.
- For any open sets $B_1$ and $B_2$, the random variable $N(B_1)$ is independent of the random variable $N(B_2)$ given the random variable $N(B_1 \cap B_2)$ if and only if $N$ is a Poisson point process (a worked identity follows).
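Concretely, if $N$ is a Poisson point process whose expectation measure is $\mu$, then for any open set $B$ (a standard identity, restated in the notation used above)

$$\Pr(N(B) = k) = e^{-\mu(B)} \frac{\mu(B)^k}{k!}, \qquad \mathbb{E}[N(B)] = \mu(B),$$

so the expectation measure determines the full distribution of the process.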
4.4. Normalization, Conditioning, and Other Operations on Expectation Measures
4.5. Independence
4.6. Information Divergence for Expectation Measures
- $D(\mu \Vert \nu) \ge 0$, with equality when $\mu = \nu$.
- $\nu \mapsto D(\mu \Vert \nu)$ is minimal when $\nu = \mu$.
- $D(\mu \Vert \nu) \ge D(f_*\mu \Vert f_*\nu)$ for all measurable mappings $f$ (a defining formula is given below).
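For reference, the standard extension of KL-divergence from probability measures to finite measures, which I assume matches the paper's information divergence, reads

$$D(\mu \Vert \nu) = \int f \ln f \, \mathrm{d}\nu - \mu(X) + \nu(X), \qquad f = \frac{\mathrm{d}\mu}{\mathrm{d}\nu},$$

which is nonnegative because $x \ln x \ge x - 1$, and which reduces to the usual KL-divergence when $\mu$ and $\nu$ are probability measures, since then $\mu(X) = \nu(X) = 1$.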
5. Applications
5.1. Goodness-of-Fit Tests
5.2. Improper Prior Distributions
5.3. Markov Chains
5.4. Inequalities for Information Projections
6. Discussion and Conclusions
Probability theory | Expectation theory
Probability | Expected value
Outcome | Instance
Outcome space | Multiset monad
P-value | E-value
Probability measure | Expectation measure
Binomial distribution | Poisson distribution
Density | Intensity
Bernoulli random variable | Count variable
Empirical distribution | Empirical measure
KL-divergence | Information divergence
Uniform distribution | Poisson point process
State space | State cone
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
bin | Binomial distribution
DCC | Descending chain condition
E-statistic | Evidence statistic
E-value | Observed value of an E-statistic
hyp | Hypergeometric distribution
IID | Independent identically distributed
KL-divergence | Information divergence restricted to probability measures
MDL | Minimum description length
Mset | Multiset
N | Gaussian distribution
PM | Probability measure
Po | Poisson distribution
mset | Multiset
Poset | Partially ordered set
Pr | Probability