
Entropy, Statistical Evidence, and Scientific Inference: Evidence Functions in Theory and Applications

A special issue of Entropy (ISSN 1099-4300). This special issue belongs to the section "Multidisciplinary Applications".

Deadline for manuscript submissions: closed (31 May 2024) | Viewed by 11,375

Special Issue Editors


Prof. Dr. Brian Dennis
Guest Editor
1. Department of Mathematics and Statistical Science, University of Idaho, Moscow, ID 83844, USA
2. Professor Emeritus, Department of Fish and Wildlife Sciences, University of Idaho, Moscow, ID 83844, USA
Interests: statistical ecology; biometrics; mathematical modeling; theoretical ecology; conservation biology; population dynamics

Dr. Mark L. Taper
Guest Editor
1. Department of Ecology, Montana State University, Bozeman, MT 59717, USA
2. Marine Science Institute, University of California, Santa Barbara, CA 93106, USA
Interests: theoretical ecology; ecological statistics; statistical inference; evolution; philosophy of science

Prof. Dr. José Miguel Ponciano
Guest Editor
1. Biology Department, University of Florida, Gainesville, FL 32611, USA
2. Mathematics Department, University of Florida, Gainesville, FL 32611, USA
Interests: statistical ecology; population dynamics; theoretical ecology; statistical phylogenetics; conservation biology; mathematical population genetics

Special Issue Information

Dear Colleagues,

Modern statistical evidence compares the relative support in scientific data for mathematical models. The fundamental tool of comparison is the evidence function, which is a contrast of generalized entropy discrepancies. The most commonly used evidence functions are the differences of information criterion values. Statistical evidence has many desirable properties, combining attractive features of both Bayesian and classical frequentist analysis while simultaneously avoiding many of their philosophical and practical issues. The goals of this Special Issue are to stimulate the further theoretical development of statistical evidence and present real-world examples where the use of statistical evidence clarifies scientific inference. While many of the applications featured here are ecological, reflecting the editors’ areas of expertise, we welcome and anticipate accounts or critiques of evidence functions applied in other scientific areas.
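
To make the idea concrete for readers new to evidential statistics, the sketch below is a minimal illustration, entirely ours and with invented data, of an evidence function realized as a difference of information criterion values: a ΔAIC contrast between two candidate models, read as a graded measure of support.

```python
# Hypothetical data and models, for illustration only: we contrast an
# exponential fit against a gamma fit on data simulated from a gamma.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
x = rng.gamma(shape=2.0, scale=1.5, size=200)

def aic(loglik, k):
    """AIC = -2 * maximized log-likelihood + 2 * (number of parameters)."""
    return -2.0 * loglik + 2.0 * k

# Model 1: exponential, one parameter; the MLE of the scale is the mean.
ll_exp = stats.expon.logpdf(x, scale=x.mean()).sum()

# Model 2: gamma, two parameters, fitted by maximum likelihood.
a, loc, scale = stats.gamma.fit(x, floc=0.0)
ll_gam = stats.gamma.logpdf(x, a, loc=loc, scale=scale).sum()

# The evidence function: a contrast of estimated discrepancies.
delta_aic = aic(ll_exp, 1) - aic(ll_gam, 2)
print(f"Delta-AIC (exponential minus gamma): {delta_aic:.1f}")
# Positive values favor the gamma model; the magnitude is read as the
# strength of evidence, not as a binary accept/reject decision.
```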

Prof. Dr. Brian Dennis
Dr. Mark L. Taper
Prof. Dr. José Miguel Ponciano
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Entropy is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • entropy
  • evidential statistics
  • evidence
  • hypothesis testing
  • information theory
  • Kullback–Leibler discrepancy
  • model misspecification
  • model selection

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (9 papers)


Editorial

8 pages, 251 KiB  
Editorial
Entropy, Statistical Evidence, and Scientific Inference: Evidence Functions in Theory and Applications
by Mark L. Taper, José Miguel Ponciano and Brian Dennis
Entropy 2022, 24(9), 1273; https://doi.org/10.3390/e24091273 - 9 Sep 2022
Cited by 1 | Viewed by 1730
Abstract
Scope and Goals of the Special Issue: There is a growing realization that despite being the essential tool of modern data-based scientific discovery and model testing, statistics has major problems [...]

Research

20 pages, 1179 KiB  
Article
Empirical Bayes Methods, Evidentialism, and the Inferential Roles They Play
by Samidha Shetty, Gordon Brittan, Jr. and Prasanta S. Bandyopadhyay
Entropy 2024, 26(10), 859; https://doi.org/10.3390/e26100859 - 12 Oct 2024
Viewed by 606
Abstract
Empirical Bayes-based Methods (EBM) is an increasingly popular form of Objective Bayesianism (OB). It is identified in particular with the statistician Bradley Efron. The main aims of this paper are, first, to describe and illustrate its main features and, second, to locate its role by comparing it with two other statistical paradigms, Subjective Bayesianism (SB) and Evidentialism. EBM’s main formal features are illustrated in some detail by schematic examples. The comparison between what Efron calls their underlying “philosophies” is by way of a distinction made between confirmation and evidence. Although this distinction is sometimes made in the statistical literature, it is relatively rare and never to the same point as here. That is, the distinction is invariably spelled out intra- and not inter-paradigmatically, solely in terms of one or the other account. The distinction made in this paper between confirmation and evidence is illustrated by two well-known statistical paradoxes: the base-rate fallacy and Popper’s paradox of ideal evidence. The general conclusion reached is that each of the paradigms has a basic role to play and all are required by an adequate account of statistical inference from a technically informed and fine-grained philosophical perspective.
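
As a hedged numerical companion to the base-rate fallacy mentioned in the abstract (the numbers below are invented, not the authors'), the sketch shows how a positive test can carry strong evidence in the likelihood-ratio sense while conferring little confirmation in the posterior-probability sense.

```python
# All numbers are invented for illustration.
prevalence = 0.001        # P(disease): the base rate
sensitivity = 0.99        # P(test + | disease)
false_positive = 0.05     # P(test + | no disease)

# Evidence: the likelihood ratio compares the two hypotheses directly
# and ignores the base rate.
likelihood_ratio = sensitivity / false_positive            # ~19.8

# Confirmation: the posterior probability folds the base rate in.
p_positive = sensitivity * prevalence + false_positive * (1 - prevalence)
posterior = sensitivity * prevalence / p_positive          # ~0.019

print(f"likelihood ratio (evidence): {likelihood_ratio:.1f}")
print(f"posterior P(disease | +):    {posterior:.3f}")
# Strong evidence favoring 'disease' over 'no disease', yet the
# hypothesis stays improbable: the two notions answer different questions.
```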

21 pages, 360 KiB  
Article
Statistics in Service of Metascience: Measuring Replication Distance with Reproducibility Rate
by Erkan O. Buzbas and Berna Devezer
Entropy 2024, 26(10), 842; https://doi.org/10.3390/e26100842 - 5 Oct 2024
Viewed by 745
Abstract
Motivated by the recent putative reproducibility crisis, we discuss the relationship between the replicability of scientific studies, the reproducibility of results obtained in these replications, and the philosophy of statistics. Our approach focuses on challenges in specifying scientific studies for scientific inference via statistical inference and is complementary to classical discussions in the philosophy of statistics. We particularly consider the challenges in replicating studies exactly, using the notion of the idealized experiment. We argue against treating reproducibility as an inherently desirable property of scientific results, and in favor of viewing it as a tool to measure the distance between an original study and its replications. To sensibly study the implications of replicability and results reproducibility on inference, such a measure of replication distance is needed. We present an effort to delineate such a framework here, addressing some challenges in capturing the components of scientific studies while identifying others as ongoing issues. We illustrate our measure of replication distance by simulations using a toy example. Rather than replications, we present purposefully planned modifications as an appropriate tool to inform scientific inquiry. Our ability to measure replication distance serves scientists in their search for replication-ready studies. We believe that likelihood-based and evidential approaches may play a critical role towards building statistics that effectively serve the practical needs of science.
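
The following toy simulation is our own sketch in the spirit of the paper's toy example, not the authors' code; it assumes a two-sample t-test with a "significant at 0.05" result criterion and treats a reduced sample size as one crude axis of replication distance.

```python
# Toy setup: the 'result' of a study is whether a two-sample t-test
# yields p < 0.05; all settings are our assumptions, not the paper's.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
effect = 0.5                      # true standardized mean difference

def significant(n):
    """Run one two-sample study of size n per arm; report p < 0.05."""
    x = rng.normal(0.0, 1.0, n)
    y = rng.normal(effect, 1.0, n)
    return stats.ttest_ind(x, y).pvalue < 0.05

# Original design uses n = 60; replications drift to smaller samples.
for n_rep in (60, 30, 15):
    rate = np.mean([significant(n_rep) for _ in range(2000)])
    print(f"n = {n_rep:2d} per arm: estimated reproducibility rate = {rate:.2f}")
# The rate is a property of the original-replication pair, not a virtue
# of the result: faithful replications of an underpowered design also
# reproduce poorly.
```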

17 pages, 452 KiB  
Article
Bootstrap Approximation of Model Selection Probabilities for Multimodel Inference Frameworks
by Andres Dajles and Joseph Cavanaugh
Entropy 2024, 26(7), 599; https://doi.org/10.3390/e26070599 - 15 Jul 2024
Viewed by 786
Abstract
Most statistical modeling applications involve the consideration of a candidate collection of models based on various sets of explanatory variables. The candidate models may also differ in terms of the structural formulations for the systematic component and the posited probability distributions for the random component. A common practice is to use an information criterion to select a model from the collection that provides an optimal balance between fidelity to the data and parsimony. The analyst then typically proceeds as if the chosen model was the only model ever considered. However, such a practice fails to account for the variability inherent in the model selection process, which can lead to inappropriate inferential results and conclusions. In recent years, inferential methods have been proposed for multimodel frameworks that attempt to provide an appropriate accounting of modeling uncertainty. In the frequentist paradigm, such methods should ideally involve model selection probabilities, i.e., the relative frequencies of selection for each candidate model based on repeated sampling. Model selection probabilities can be conveniently approximated through bootstrapping. When the Akaike information criterion is employed, Akaike weights are also commonly used as a surrogate for selection probabilities. In this work, we show that the conventional bootstrap approach for approximating model selection probabilities is impacted by bias. We propose a simple correction to adjust for this bias. We also argue that Akaike weights do not provide adequate approximations for selection probabilities, although they do provide a crude gauge of model plausibility.
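
Below is a minimal sketch of the conventional (uncorrected) bootstrap approximation discussed in the abstract, alongside Akaike weights, using an invented polynomial-regression example; the paper's bias correction is not reproduced here.

```python
# Invented example: candidate polynomial regressions of degree 1-3,
# with degree 1 as the generating model; Gaussian errors throughout.
import numpy as np

rng = np.random.default_rng(7)
n = 100
x = rng.uniform(-2, 2, n)
y = 1.0 + 0.4 * x + rng.normal(0.0, 1.0, n)

def aic_poly(x, y, degree):
    """AIC of a degree-d polynomial fit with Gaussian errors."""
    resid = y - np.polyval(np.polyfit(x, y, degree), x)
    sigma2 = np.mean(resid ** 2)
    loglik = -0.5 * len(y) * (np.log(2 * np.pi * sigma2) + 1)
    return -2 * loglik + 2 * (degree + 2)   # coefficients + variance

degrees = [1, 2, 3]

# Conventional bootstrap: refit on resamples, tally the winning model.
wins = np.zeros(len(degrees))
for _ in range(1000):
    idx = rng.integers(0, n, n)
    aics = [aic_poly(x[idx], y[idx], d) for d in degrees]
    wins[np.argmin(aics)] += 1
print("bootstrap selection proportions:", np.round(wins / wins.sum(), 3))

# Akaike weights from the single observed sample, for comparison.
aics = np.array([aic_poly(x, y, d) for d in degrees])
w = np.exp(-0.5 * (aics - aics.min()))
print("Akaike weights:                 ", np.round(w / w.sum(), 3))
```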

8 pages, 226 KiB  
Article
Multimodel Approaches Are Not the Best Way to Understand Multifactorial Systems
by Benjamin M. Bolker
Entropy 2024, 26(6), 506; https://doi.org/10.3390/e26060506 - 11 Jun 2024
Cited by 2 | Viewed by 923
Abstract
Information-theoretic (IT) and multi-model averaging (MMA) statistical approaches are widely used but suboptimal tools for pursuing a multifactorial approach (also known as the method of multiple working hypotheses) in ecology. (1) Conceptually, IT encourages ecologists to perform tests on sets of artificially simplified models. (2) MMA improves on IT model selection by implementing a simple form of shrinkage estimation (a way to make accurate predictions from a model with many parameters relative to the amount of data, by “shrinking” parameter estimates toward zero). However, other shrinkage estimators such as penalized regression or Bayesian hierarchical models with regularizing priors are more computationally efficient and better supported theoretically. (3) In general, the procedures for extracting confidence intervals from MMA are overconfident, providing overly narrow intervals. If researchers want to use limited data sets to accurately estimate the strength of multiple competing ecological processes along with reliable confidence intervals, the current best approach is to use full (maximal) statistical models (possibly with Bayesian priors) after making principled, a priori decisions about model complexity.
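
As a hedged illustration of the shrinkage idea the abstract invokes, the sketch below (ours, with simulated data) applies ridge regression in closed form; larger penalties pull coefficient estimates toward zero, the effect MMA achieves only implicitly.

```python
# Simulated data; most true coefficients are zero, so shrinkage helps.
import numpy as np

rng = np.random.default_rng(3)
n, p = 50, 8
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, 0.5, 0.25] + [0.0] * (p - 3))
y = X @ beta_true + rng.normal(size=n)

def ridge(X, y, lam):
    """Penalized least squares: argmin ||y - Xb||^2 + lam * ||b||^2."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

for lam in (0.0, 5.0, 50.0):    # lam = 0 is ordinary least squares
    b = ridge(X, y, lam)
    print(f"lambda = {lam:4.0f}: ||b_hat|| = {np.linalg.norm(b):.3f}")
# Larger penalties pull every estimate toward zero, with the penalty
# weight chosen explicitly (e.g., by cross-validation) rather than
# implicitly through a model-averaging recipe.
```
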
15 pages, 1792 KiB  
Article
Likelihood Ratio Test and the Evidential Approach for 2 × 2 Tables
by Peter M. B. Cahusac
Entropy 2024, 26(5), 375; https://doi.org/10.3390/e26050375 - 28 Apr 2024
Viewed by 1392
Abstract
Categorical data analysis of 2 × 2 contingency tables is extremely common, not least because they provide risk difference, risk ratio, odds ratio, and log odds statistics in medical research. A χ² test analysis is most often used, although some researchers use likelihood ratio test (LRT) analysis. Does it matter which test is used? This paper draws on a review of the literature, an examination of the theoretical foundations, and analyses of simulations and empirical data to argue that only the LRT should be used when we are interested in testing whether the binomial proportions are equal. This so-called test of independence is by far the most popular, meaning the χ² test is widely misused. By contrast, the χ² test should be reserved for cases where the data appear to match a particular hypothesis (e.g., the null hypothesis) too closely, that is, where the variance is of interest and is less than expected. Low variance can be of interest in various scenarios, particularly in investigations of data integrity. Finally, it is argued that the evidential approach provides a consistent and coherent method that avoids the difficulties posed by significance testing. The approach facilitates the calculation of appropriate log likelihood ratios to suit our research aims, whether this is to test the proportions or to test the variance. The conclusions from this paper apply to larger contingency tables, including multi-way tables.
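
The sketch below, with an invented 2 × 2 table rather than data from the paper, computes the Pearson χ² statistic, the likelihood ratio statistic G, and the log likelihood ratio that the evidential approach treats as the primary quantity.

```python
# An invented 2x2 table: rows are groups, columns are outcomes.
import numpy as np
from scipy import stats

table = np.array([[30, 10],
                  [20, 25]])

# Expected counts under the null of equal binomial proportions.
expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / table.sum()

pearson = ((table - expected) ** 2 / expected).sum()
g = 2.0 * (table * np.log(table / expected)).sum()   # LRT statistic

print(f"Pearson X^2 = {pearson:.3f}, p = {stats.chi2.sf(pearson, df=1):.4f}")
print(f"LRT G       = {g:.3f}, p = {stats.chi2.sf(g, df=1):.4f}")

# The evidential quantity is the log likelihood ratio itself (G / 2 on
# the natural-log scale), read as graded support for unequal versus
# equal proportions rather than as an accept/reject verdict.
print(f"log likelihood ratio = {g / 2.0:.3f}")
```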

23 pages, 501 KiB  
Article
How the Post-Data Severity Converts Testing Results into Evidence for or against Pertinent Inferential Claims
by Aris Spanos
Entropy 2024, 26(1), 95; https://doi.org/10.3390/e26010095 - 22 Jan 2024
Viewed by 1157
Abstract
The paper makes a case that the current discussions on replicability and the abuse of significance testing have overlooked a more general contributor to the untrustworthiness of published empirical evidence, which is the uninformed and recipe-like implementation of statistical modeling and inference. It is argued that this contributes to the untrustworthiness problem in several different ways, including [a] statistical misspecification, [b] unwarranted evidential interpretations of frequentist inference results, and [c] questionable modeling strategies that rely on curve-fitting. What is more, the alternative proposals to replace or modify frequentist testing, including [i] replacing p-values with observed confidence intervals and effect sizes, and [ii] redefining statistical significance, will not address the untrustworthiness of evidence problem since they are equally vulnerable to [a]–[c]. The paper calls for distinguishing unduly data-dependent ‘statistical results’, such as a point estimate, a p-value, and accept/reject H0, from ‘evidence for or against inferential claims’. The post-data severity (SEV) evaluation of the accept/reject H0 results converts them into evidence for or against germane inferential claims. These claims can be used to address/elucidate several foundational issues, including (i) statistical vs. substantive significance, (ii) the large n problem, and (iii) the replicability of evidence. Also, the SEV perspective sheds light on the impertinence of the proposed alternatives [i]–[iii], and oppugns [iii] the alleged arbitrariness of framing H0 and H1 which is often exploited to undermine the credibility of frequentist testing.
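
Below is a hedged numerical sketch of a post-data severity evaluation for the simple Normal model with known variance, our illustration of the general idea rather than the paper's own computations; the observed mean and sample size are invented.

```python
# Simple Normal model, known sigma; one-sided test of H0: mu <= 0.
# The observed mean and sample size are invented.
import numpy as np
from scipy import stats

sigma, n = 1.0, 100
se = sigma / np.sqrt(n)
xbar = 0.25                     # observed mean; H0 is rejected at 0.05

def severity(mu1):
    """SEV(mu > mu1): the probability of a result that accords less
    well with the claim than the observed one, were mu equal to mu1."""
    return stats.norm.cdf((xbar - mu1) / se)

for mu1 in (0.0, 0.1, 0.2, 0.3):
    print(f"SEV(mu > {mu1:.1f}) = {severity(mu1):.3f}")
# The rejection severely passes modest discrepancies (mu > 0.1) but not
# larger ones (mu > 0.3), separating statistical from substantive
# significance; with very large n even tiny observed means reject H0,
# and the severity curve makes that explicit.
```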

14 pages, 334 KiB  
Article
Profile Likelihood for Hierarchical Models Using Data Doubling
by Subhash R. Lele
Entropy 2023, 25(9), 1262; https://doi.org/10.3390/e25091262 - 25 Aug 2023
Viewed by 1061
Abstract
In scientific problems, an appropriate statistical model often involves a large number of canonical parameters. Oftentimes, the quantities of scientific interest are real-valued functions of these canonical parameters. Statistical inference for a specified function of the canonical parameters can be carried out via the Bayesian approach by simply using the posterior distribution of the specified function of the parameter of interest. Frequentist inference is usually based on the profile likelihood for the parameter of interest. When the likelihood function is analytical, computing the profile likelihood is simply a constrained optimization problem with many numerical algorithms available. However, for hierarchical models, computing the likelihood function and hence the profile likelihood function is difficult because of the high-dimensional integration involved. We describe a simple computational method to compute profile likelihood for any specified function of the parameters of a general hierarchical model using data doubling. We provide a mathematical proof for the validity of the method under regularity conditions that assure that the distribution of the maximum likelihood estimator of the canonical parameters is a non-singular multivariate Gaussian.
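
The abstract notes that with an analytical likelihood, profiling reduces to constrained optimization. The sketch below, ours and deliberately simple, profiles out the variance of a Normal sample; it is the baseline case, not the paper's data-doubling algorithm, which targets hierarchical models where no such closed form exists.

```python
# Normal sample; interest parameter mu, nuisance sigma^2 profiled out.
import numpy as np

rng = np.random.default_rng(11)
x = rng.normal(loc=2.0, scale=1.5, size=50)
n = len(x)

def profile_loglik(mu):
    """Maximize the Normal log-likelihood over sigma^2 at fixed mu;
    here the inner maximization has a closed form."""
    s2_hat = np.mean((x - mu) ** 2)
    return -0.5 * n * (np.log(2 * np.pi * s2_hat) + 1)

mu_grid = np.linspace(x.mean() - 1.0, x.mean() + 1.0, 201)
pl = np.array([profile_loglik(m) for m in mu_grid])

# 95% profile-likelihood interval: points within chi2(1)_{0.95} / 2
# (about 1.92) log-likelihood units of the maximum.
keep = pl >= pl.max() - 1.92
print(f"MLE of mu    ~ {mu_grid[pl.argmax()]:.3f}")
print(f"95% interval ~ ({mu_grid[keep][0]:.3f}, {mu_grid[keep][1]:.3f})")
# In hierarchical models the likelihood is itself a high-dimensional
# integral, so this direct profiling is unavailable; that is the gap
# the paper's data-doubling method addresses.
```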

16 pages, 924 KiB  
Article
Evidence of an Absence of Inbreeding Depression in a Wild Population of Weddell Seals (Leptonychotes weddellii)
by John H. Powell, Steven T. Kalinowski, Mark L. Taper, Jay J. Rotella, Corey S. Davis and Robert A. Garrott
Entropy 2023, 25(3), 403; https://doi.org/10.3390/e25030403 - 22 Feb 2023
Cited by 1 | Viewed by 1809
Abstract
Inbreeding depression can reduce the viability of wild populations. Detecting inbreeding depression in the wild is difficult; developing accurate estimates of inbreeding can be time and labor intensive. In this study, we used a two-step modeling procedure to incorporate uncertainty inherent in estimating individual inbreeding coefficients from multilocus genotypes into estimates of inbreeding depression in a population of Weddell seals (Leptonychotes weddellii). The two-step modeling procedure presented in this paper provides a method for estimating the magnitude of a known source of error, which is assumed absent in classic regression models, and incorporating this error into inferences about inbreeding depression. The method is essentially an errors-in-variables regression with non-normal errors in both the dependent and independent variables. These models, therefore, allow for a better evaluation of the uncertainty surrounding the biological importance of inbreeding depression in non-pedigreed wild populations. For this study we genotyped 154 adult female seals from the population in Erebus Bay, Antarctica, at 29 microsatellite loci, 12 of which are novel. We used a statistical evidence approach to inference rather than hypothesis testing because the discovery of both low and high levels of inbreeding are of scientific interest. We found evidence for an absence of inbreeding depression in lifetime reproductive success, adult survival, age at maturity, and the reproductive interval of female seals in this population.
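
To illustrate why the errors-in-variables structure matters, the simulation below (entirely hypothetical: the slope, error model, and Gaussian noise are our assumptions, not the paper's estimates) shows how measurement error in estimated inbreeding coefficients attenuates a naive regression slope toward zero.

```python
# Hypothetical numbers: a negative 'true' slope is imposed to show the
# mechanics; the paper itself reports evidence for an absence of
# inbreeding depression.
import numpy as np

rng = np.random.default_rng(5)
n = 154                               # the number of seals genotyped
f_true = rng.beta(2, 20, n)           # true inbreeding coefficients
slope_true = -3.0                     # hypothetical inbreeding depression
fitness = 1.0 + slope_true * f_true + rng.normal(0.0, 0.2, n)

for err_sd in (0.00, 0.05, 0.10):     # error in the estimated coefficients
    f_obs = f_true + rng.normal(0.0, err_sd, n)
    slope_hat = np.polyfit(f_obs, fitness, 1)[0]
    print(f"measurement error sd = {err_sd:.2f}: fitted slope = {slope_hat:.2f}")
# The naive slope attenuates toward zero as measurement error grows,
# which is why the paper propagates genotype-based uncertainty through
# an errors-in-variables model before weighing the evidence.
```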
