Stats, Volume 7, Issue 2 (June 2024) – 14 articles

Cover Story: In this paper, we propose an innovative approach to studying consumers’ preferences for coffee, which integrates a choice experiment with consumer sensory tests and chemical analyses. The same choice experiment is administered on two consecutive occasions, i.e., before and after the guided tasting session, to analyze the role of tasting and awareness about coffee composition in the consumers’ preferences. A Bayesian optimal design, based on a compound design criterion, is applied to build the choice experiment; the compound criterion addresses two main issues related to the efficient estimation of the attributes and the evaluation of the sensorial part. Mixed logit models are applied; the results are promising, confirming the validity of the proposed approach.
27 pages, 1193 KiB  
Article
Assessing Spillover Effects of Medications for Opioid Use Disorder on HIV Risk Behaviors among a Network of People Who Inject Drugs
by Joseph Puleo, Ashley Buchanan, Natallia Katenka, M. Elizabeth Halloran, Samuel R. Friedman and Georgios Nikolopoulos
Stats 2024, 7(2), 549-575; https://doi.org/10.3390/stats7020034 - 19 Jun 2024
Viewed by 902
Abstract
People who inject drugs (PWID) have an increased risk of HIV infection partly due to injection behaviors often related to opioid use. Medications for opioid use disorder (MOUD) have been shown to reduce HIV infection risk, possibly by reducing injection risk behaviors. MOUD may benefit individuals who do not receive it themselves but are connected through social, sexual, or drug use networks with individuals who are treated. This is known as spillover. Valid estimation of spillover in network studies requires considering the network’s community structure. Communities are groups of densely connected individuals with sparse connections to other groups. We analyzed a network of 277 PWID and their contacts from the Transmission Reduction Intervention Project. We assessed the effect of MOUD on reductions in injection risk behaviors and the possible benefit for network contacts of participants treated with MOUD. We identified communities using modularity-based methods and employed inverse probability weighting with community-level propensity scores to adjust for measured confounding. We found that MOUD may have beneficial spillover effects on reducing injection risk behaviors. The magnitudes of estimated effects were sensitive to the community detection method. Careful consideration should be paid to the significance of community structure in network studies evaluating spillover.
(This article belongs to the Section Statistical Methods)

12 pages, 855 KiB  
Perspective
Redefining Significance: Robustness and Percent Fragility Indices in Biomedical Research
by Thomas F. Heston
Stats 2024, 7(2), 537-548; https://doi.org/10.3390/stats7020033 - 17 Jun 2024
Viewed by 1206
Abstract
The p-value has long been the standard for statistical significance in scientific research, but this binary approach often fails to consider the nuances of statistical power and the potential for large sample sizes to show statistical significance despite trivial treatment effects. Including a statistical fragility assessment can help overcome these limitations. One common fragility metric is the fragility index, which assesses statistical fragility by incrementally altering the outcome data in the intervention group until the statistical significance flips. The robustness index takes a different approach by maintaining the integrity of the underlying data distribution while examining changes in the p-value as the sample size changes. The percent fragility index is another useful alternative that is more precise than the fragility index and is more uniformly applied to both the intervention and control groups. Incorporating these fragility metrics into routine statistical procedures could address the reproducibility crisis and increase research efficacy. Using these fragility indices can be seen as a step toward a more mature phase of statistical reasoning, where significance is a multi-faceted and contextually informed judgment.
(This article belongs to the Section Biostatistics)
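As an illustration of the fragility index described in this abstract, the following is a minimal sketch rather than the author's implementation. It assumes a binary outcome in a 2×2 trial, a two-sided Fisher exact test, and a 0.05 threshold; the function names `fisher_two_sided` and `fragility_index` are hypothetical. Outcomes in the intervention group are flipped one at a time until the p-value is no longer below the threshold.

```python
from math import comb


def fisher_two_sided(a, n1, b, n2):
    """Two-sided Fisher exact p-value for a/n1 events versus b/n2 events."""
    k, n = a + b, n1 + n2
    denom = comb(n, k)

    def prob(x):
        # Hypergeometric probability of x events in the first group.
        return comb(n1, x) * comb(n2, k - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, k - n2), min(n1, k)
    # Sum over all tables that are no more likely than the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def fragility_index(a, n1, b, n2, alpha=0.05):
    """Number of intervention-group outcome flips needed to reach p >= alpha."""
    if fisher_two_sided(a, n1, b, n2) >= alpha:
        return 0  # the result is not significant to begin with
    # Assumption: flips move the intervention event rate toward the control rate.
    step = 1 if a * n2 < b * n1 else -1
    flips = 0
    while 0 <= a + step <= n1 and fisher_two_sided(a, n1, b, n2) < alpha:
        a += step
        flips += 1
    return flips
```

For example, `fragility_index(1, 100, 10, 100)` counts how many non-events in a hypothetical intervention arm (1 event in 100) would have to become events, against a control arm with 10 events in 100, before significance is lost. A small count indicates a fragile result.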

16 pages, 617 KiB  
Article
An Optimal Design through a Compound Criterion for Integrating Extra Preference Information in a Choice Experiment: A Case Study on Moka Ground Coffee
by Rossella Berni, Nedka Dechkova Nikiforova and Patrizia Pinelli
Stats 2024, 7(2), 521-536; https://doi.org/10.3390/stats7020032 - 8 Jun 2024
Viewed by 1209
Abstract
In this manuscript, we propose an innovative approach to studying consumers’ preferences for coffee, which integrates a choice experiment with consumer sensory tests and chemical analyses (caffeine contents obtained through a High-Performance Liquid Chromatography (HPLC) method). The same choice experiment is administered on two consecutive occasions, i.e., before and after the guided tasting session, to analyze the role of tasting and awareness about coffee composition in the consumers’ preferences. To this end, a Bayesian optimal design, based on a compound design criterion, is applied in order to build the choice experiment; the compound criterion allows for addressing two main issues related to the efficient estimation of the attributes and the evaluation of the sensorial part, e.g., the HPLC effects and the scores obtained through the consumer sensory test. All these elements, e.g., the attributes involved in the choice experiment, the scores obtained for each coffee through the sensory tests, and the HPLC quantitative evaluation of caffeine, are analyzed through suitable Random Utility Models. The initial results are promising, confirming the validity of the proposed approach.

13 pages, 947 KiB  
Article
A Spatial Gaussian-Process Boosting Analysis of Socioeconomic Disparities in Wait-Listing of End-Stage Kidney Disease Patients across the United States
by Sounak Chakraborty, Tanujit Dey, Lingwei Xiang and Joel T. Adler
Stats 2024, 7(2), 508-520; https://doi.org/10.3390/stats7020031 - 7 Jun 2024
Viewed by 841
Abstract
In this study, we employed a novel approach of combining Gaussian processes (GPs) with boosting techniques to model the spatial variability inherent in End-Stage Kidney Disease (ESKD) data. Our use of the Gaussian processes boosting, or GPBoost, methodology underscores the efficacy of this hybrid method in capturing intricate spatial dynamics and enhancing predictive accuracy. Specifically, our analysis demonstrates a notable improvement in out-of-sample prediction accuracy regarding the percentage of the population remaining on the wait list within geographic regions. Furthermore, our investigation unveils race and gender-based factors that significantly influence patient wait-listing. By leveraging the GPBoost approach, we identify these pertinent factors, shedding light on the complex interplay between demographic variables and access to kidney transplantation services. Our findings underscore the imperative for a multifaceted strategy aimed at reducing spatial disparities in kidney transplant wait-listing. Key components of such an approach include mitigating gender disparities, bolstering access to healthcare services, fostering greater awareness of transplantation options, and dismantling structural barriers to care. By addressing these multifactorial challenges, we can strive towards a more equitable and inclusive landscape in kidney transplantation.
(This article belongs to the Special Issue Bayes and Empirical Bayes Inference)

17 pages, 3974 KiB  
Article
Residual Analysis for Poisson-Exponentiated Weibull Regression Models with Cure Fraction
by Cleanderson R. Fidelis, Edwin M. M. Ortega and Gauss M. Cordeiro
Stats 2024, 7(2), 492-507; https://doi.org/10.3390/stats7020030 - 20 May 2024
Viewed by 793
Abstract
The use of cure-rate survival models has grown in recent years. Even so, proposals for assessing the goodness of fit of these models remain infrequent. Residual analysis can be used to check the adequacy of a fitted regression model. In this context, we provide Cox–Snell residuals for Poisson-exponentiated Weibull regression with cure fraction. We developed several simulations under different scenarios for studying the distributions of these residuals. They were applied to a melanoma dataset for illustrative purposes.

11 pages, 1044 KiB  
Case Report
Testing for Level–Degree Interaction Effects in Two-Factor Fixed-Effects ANOVA When the Levels of Only One Factor Are Ordered
by J. C. W. Rayner and G. C. Livingston, Jr.
Stats 2024, 7(2), 481-491; https://doi.org/10.3390/stats7020029 - 15 May 2024
Cited by 1 | Viewed by 761
Abstract
In testing for main effects, the use of orthogonal contrasts for balanced designs with the factor levels not ordered is well known. Here, we consider two-factor fixed-effects ANOVA with the levels of one factor ordered and one not ordered. The objective is to extend the idea of decomposing the main effect to decomposing the interaction. This is achieved by defining level–degree coefficients and testing if they are zero using permutation testing. These tests give clear insights into what may be causing a significant interaction, even for the unbalanced model.
(This article belongs to the Section Statistical Methods)

19 pages, 4945 KiB  
Article
Multivariate Time Series Change-Point Detection with a Novel Pearson-like Scaled Bregman Divergence
by Tong Si, Yunge Wang, Lingling Zhang, Evan Richmond, Tae-Hyuk Ahn and Haijun Gong
Stats 2024, 7(2), 462-480; https://doi.org/10.3390/stats7020028 - 13 May 2024
Viewed by 1659
Abstract
Change-point detection (CPD) is a challenging problem that has a number of applications across various real-world domains. The primary objective of CPD is to identify specific time points where the underlying system undergoes transitions between different states, each characterized by its distinct data distribution. Precise identification of change points in time series omics data can provide insights into the dynamic and temporal characteristics inherent to complex biological systems. Many change-point detection methods have traditionally focused on the direct estimation of data distributions. However, these approaches become unrealistic in high-dimensional data analysis. Density ratio methods have emerged as promising approaches for change-point detection since estimating density ratios is easier than directly estimating individual densities. Nevertheless, the divergence measures used in these methods may suffer from numerical instability during computation. Additionally, the most popular α-relative Pearson divergence measures dissimilarity with respect to a mixture of distributions rather than between the two data distributions themselves. To overcome the limitations of existing density ratio-based methods, we propose a novel approach called the Pearson-like scaled-Bregman divergence-based (PLsBD) density ratio estimation method for change-point detection. Our theoretical studies derive an analytical expression for the Pearson-like scaled Bregman divergence using a mixture measure. We integrate the PLsBD with a kernel regression model and apply a random sampling strategy to identify change points in both synthetic data and real-world high-dimensional genomics data of Drosophila. Our PLsBD method demonstrates superior performance compared to many other change-point detection methods.
(This article belongs to the Section Statistical Methods)

17 pages, 293 KiB  
Article
Multivariate and Matrix-Variate Logistic Models in the Real and Complex Domains
by A. M. Mathai
Stats 2024, 7(2), 445-461; https://doi.org/10.3390/stats7020027 - 11 May 2024
Viewed by 978
Abstract
Several extensions of the basic scalar variable logistic density to the multivariate and matrix-variate cases, in the real and complex domains, are given where the extended forms end up in extended zeta functions. Several cases of multivariate and matrix-variate Bayesian procedures, in the real and complex domains, are also given. It is pointed out that there is a range of applications of Gaussian and Wishart-based matrix-variate distributions in the complex domain in multi-look data from radar and sonar. It is hoped that the distributions derived in this paper will be highly useful in such applications in physics, engineering, statistics and communication problems, because, in the real scalar case, a logistic model is seen to be more appropriate compared to a Gaussian model in many industrial applications. Hence, logistic-based multivariate and matrix-variate distributions, especially in the complex domain, are expected to perform better where Gaussian and Wishart-based distributions are currently used.
11 pages, 1897 KiB  
Brief Report
Bayesian Inference for Multiple Datasets
by Renata Retkute, William Thurston and Christopher A. Gilligan
Stats 2024, 7(2), 434-444; https://doi.org/10.3390/stats7020026 - 10 May 2024
Viewed by 1334
Abstract
Estimating parameters for multiple datasets can be time consuming, especially when the number of datasets is large. One solution is to sample from multiple datasets simultaneously using Bayesian methods such as adaptive multiple importance sampling (AMIS). Here, we use the AMIS approach to fit a von Mises distribution to multiple datasets for wind trajectories derived from a Lagrangian Particle Dispersion Model driven by 3D meteorological data. A posterior distribution of parameters can help to characterise the uncertainties in wind trajectories in a form that can be used as inputs for predictive models of wind-dispersed insect pests and the pathogens of agricultural crops for use in evaluating risk and in planning mitigation actions. The novelty of our study is in testing the performance of the method on a very large number of datasets (>11,000). Our results show that AMIS can significantly improve the efficiency of parameter inference for multiple datasets.
(This article belongs to the Section Bayesian Methods)

32 pages, 10179 KiB  
Article
Contrastive Learning Framework for Bitcoin Crash Prediction
by Zhaoyan Liu, Min Shu and Wei Zhu
Stats 2024, 7(2), 402-433; https://doi.org/10.3390/stats7020025 - 8 May 2024
Viewed by 995
Abstract
Due to spectacular gains during periods of rapid price increase and unpredictably large drops, Bitcoin has become a popular emergent asset class over the past few years. In this paper, we are interested in predicting the crashes of the Bitcoin market. To tackle this task, we propose a framework for deep learning time series classification based on contrastive learning. The proposed framework is evaluated against six machine learning (ML) and deep learning (DL) baseline models, and outperforms them by 15.8% in balanced accuracy. Thus, we conclude that the contrastive learning strategy significantly enhances the model’s ability to extract informative representations, and our proposed framework performs well in predicting Bitcoin crashes.

13 pages, 445 KiB  
Article
On Non-Occurrence of the Inspection Paradox
by Diana Rauwolf and Udo Kamps
Stats 2024, 7(2), 389-401; https://doi.org/10.3390/stats7020024 - 24 Apr 2024
Viewed by 1266
Abstract
The well-known inspection paradox or waiting time paradox states that, in a renewal process, the inspection interval is stochastically larger than a common interarrival time having a distribution function F, where the inspection interval is given by the particular interarrival time containing the specified time point of process inspection. The inspection paradox may also be expressed in terms of expectations, where the order is strict, in general. A renewal process can be utilized to describe the arrivals of vehicles, customers, or claims, for example. As the inspection time may also be considered a random variable T with a left-continuous distribution function G independent of the renewal process, the question arises as to whether the inspection paradox inevitably occurs in this general situation, apart from in some marginal cases with respect to F and G. For a random inspection time T, it is seen that non-trivial choices lead to non-occurrence of the paradox. In this paper, a complete characterization of the non-occurrence of the inspection paradox is given with respect to G. Several examples and related assertions are shown, including the deterministic time situation.
(This article belongs to the Section Applied Stochastic Models)
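The classical (deterministic inspection time) form of the paradox stated in this abstract can be seen in a short simulation. This is a hedged sketch under assumptions that are not taken from the paper: interarrival times are exponential with rate 1 (so F has mean 1), the inspection time is fixed at t = 20, and the function names are hypothetical.

```python
import random


def inspected_interval_length(t, rng, rate=1.0):
    """Length of the interarrival interval that contains inspection time t."""
    start = 0.0
    while True:
        gap = rng.expovariate(rate)  # next interarrival time drawn from F
        if start + gap > t:
            return gap  # this interval covers the inspection time
        start += gap


def mean_inspected_length(t=20.0, runs=5000, seed=0):
    """Monte Carlo estimate of the expected length of the inspected interval."""
    rng = random.Random(seed)
    total = sum(inspected_interval_length(t, rng) for _ in range(runs))
    return total / runs
```

Under these assumptions the estimate from `mean_inspected_length()` comes out close to 2, roughly twice the mean interarrival time of 1, because longer intervals are more likely to cover a fixed inspection point.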

17 pages, 1471 KiB  
Article
New Goodness-of-Fit Tests for the Kumaraswamy Distribution
by David E. Giles
Stats 2024, 7(2), 373-388; https://doi.org/10.3390/stats7020023 - 22 Apr 2024
Viewed by 1219
Abstract
The two-parameter distribution known as the Kumaraswamy distribution is a very flexible alternative to the beta distribution with the same (0,1) support. Originally proposed in the field of hydrology, it has subsequently received a good deal of positive attention in both the theoretical and applied statistics literatures. Interestingly, the problem of testing formally for the appropriateness of the Kumaraswamy distribution appears to have received little or no attention to date. To fill this gap, in this paper, we apply a “biased transformation” methodology to several standard goodness-of-fit tests based on the empirical distribution function. A simulation study reveals that these (modified) tests perform well in the context of the Kumaraswamy distribution, in terms of both their low size distortion and respectable power. In particular, the “biased transformation” Anderson–Darling test dominates the other tests that are considered.
(This article belongs to the Section Statistical Methods)
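For readers unfamiliar with the distribution named in this abstract, the sketch below shows the Kumaraswamy distribution function F(x) = 1 − (1 − x^a)^b on (0,1), inverse-transform sampling, and a plain Kolmogorov–Smirnov statistic based on the empirical distribution function. It assumes the parameters a and b are known and does not implement the paper's “biased transformation” tests; all function names are hypothetical.

```python
import random


def kumaraswamy_cdf(x, a, b):
    """Distribution function F(x) = 1 - (1 - x**a)**b for 0 <= x <= 1."""
    return 1.0 - (1.0 - x ** a) ** b


def kumaraswamy_quantile(u, a, b):
    """Inverse of the distribution function for 0 <= u < 1."""
    return (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)


def sample_kumaraswamy(n, a, b, seed=0):
    """Draw n values by inverse-transform sampling."""
    rng = random.Random(seed)
    return [kumaraswamy_quantile(rng.random(), a, b) for _ in range(n)]


def ks_statistic(sample, a, b):
    """Largest gap between the empirical and hypothesised distribution functions."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = kumaraswamy_cdf(x, a, b)
        # Compare F with the empirical step function just after and just before x.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d
```

A small statistic for a sample drawn with the same a and b, and a larger one when the hypothesised parameters are wrong, is the behaviour a goodness-of-fit test of this kind builds on.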

12 pages, 988 KiB  
Article
Bayesian Mediation Analysis with an Application to Explore Racial Disparities in the Diagnostic Age of Breast Cancer
by Wentao Cao, Joseph Hagan and Qingzhao Yu
Stats 2024, 7(2), 361-372; https://doi.org/10.3390/stats7020022 - 19 Apr 2024
Viewed by 1010
Abstract
A mediation effect refers to the effect transmitted by a mediator intervening in the relationship between an exposure variable and a response variable. Mediation analysis is widely used to identify significant mediators and to make inferences on their effects. The Bayesian method allows researchers to incorporate prior information from previous knowledge into the analysis, deal with the hierarchical structure of variables, and estimate the quantities of interest from the posterior distributions. This paper proposes three Bayesian mediation analysis methods to make inferences on mediation effects. Our proposed methods are the following: (1) the function of coefficients method; (2) the product of partial difference method; and (3) the re-sampling method. We apply these three methods to explore racial disparities in the diagnostic age of breast cancer patients in Louisiana. We found that African American (AA) patients are diagnosed at an average of 4.37 years younger compared with Caucasian (CA) patients (57.40 versus 61.77, p < 0.0001). We also found that the racial disparity can be explained by patients’ insurance (12.90%), marital status (17.17%), cancer stage (3.27%), and residential environmental factors, including the percent of the population under age 18 (3.07%) and the environmental factor of intersection density (9.02%).
(This article belongs to the Section Bayesian Methods)

11 pages, 437 KiB  
Article
Combined Permutation Tests for Pairwise Comparison of Scale Parameters Using Deviances
by Scott J. Richter and Melinda H. McCann
Stats 2024, 7(2), 350-360; https://doi.org/10.3390/stats7020021 - 28 Mar 2024
Viewed by 995
Abstract
Nonparametric combinations of permutation tests for pairwise comparison of scale parameters, based on deviances, are examined. Permutation tests for comparing two or more groups based on the ratio of deviances have been investigated, and a procedure based on Higgins’ RMD statistic was found to perform well, but two other tests were sometimes more powerful. Thus, combinations of these tests are investigated. A simulation study shows a combined test can be more powerful than any single test.
(This article belongs to the Section Statistical Methods)
