Stats

17 pages, 3964 KiB

Open Access Article

Residual Analysis for Poisson-Exponentiated Weibull Regression Models with Cure Fraction

by Cleanderson R. Fidelis, Edwin M. M. Ortega and Gauss M. Cordeiro

Stats 2024, 7(2), 492-508; https://doi.org/10.3390/stats7020030 - 20 May 2024

Viewed by 164

The use of cure-rate survival models has grown in recent years. Even so, proposals for assessing the goodness of fit of these models have been infrequent. Residual analysis can be used to check the adequacy of a fitted regression model. In this context, we provide Cox–Snell residuals for Poisson-exponentiated Weibull regression with cure fraction. We developed several simulations under different scenarios for studying the distributions of these residuals. They were applied to a melanoma dataset for illustrative purposes.

11 pages, 1044 KiB

Open Access Case Report

Testing for Level–Degree Interaction Effects in Two-Factor Fixed-Effects ANOVA When the Levels of Only One Factor Are Ordered

by J. C. W. Rayner and G. C. Livingston, Jr.

Stats 2024, 7(2), 481-491; https://doi.org/10.3390/stats7020029 - 15 May 2024

Viewed by 275

Abstract

In testing for main effects, the use of orthogonal contrasts for balanced designs with the factor levels not ordered is well known. Here, we consider two-factor fixed-effects ANOVA with the levels of one factor ordered and one not ordered. The objective is to extend the idea of decomposing the main effect to decomposing the interaction. This is achieved by defining level–degree coefficients and testing if they are zero using permutation testing. These tests give clear insights into what may be causing a significant interaction, even for the unbalanced model.

(This article belongs to the Section Statistical Methods)


19 pages, 4945 KiB

Open Access Article

Multivariate Time Series Change-Point Detection with a Novel Pearson-like Scaled Bregman Divergence

by Tong Si, Yunge Wang, Lingling Zhang, Evan Richmond, Tae-Hyuk Ahn and Haijun Gong

Stats 2024, 7(2), 462-480; https://doi.org/10.3390/stats7020028 - 13 May 2024

Viewed by 469

Abstract

Change-point detection (CPD) is a challenging problem that has a number of applications across various real-world domains. The primary objective of CPD is to identify specific time points where the underlying system undergoes transitions between different states, each characterized by its distinct data distribution. Precise identification of change points in time series omics data can provide insights into the dynamic and temporal characteristics inherent to complex biological systems. Many change-point detection methods have traditionally focused on the direct estimation of data distributions. However, these approaches become unrealistic in high-dimensional data analysis. Density ratio methods have emerged as promising approaches for change-point detection since estimating density ratios is easier than directly estimating individual densities. Nevertheless, the divergence measures used in these methods may suffer from numerical instability during computation. Additionally, the most popular α-relative Pearson divergence cannot measure the dissimilarity between two distributions of data but a mixture of distributions. To overcome the limitations of existing density ratio-based methods, we propose a novel approach called the Pearson-like scaled-Bregman divergence-based (PLsBD) density ratio estimation method for change-point detection. Our theoretical studies derive an analytical expression for the Pearson-like scaled Bregman divergence using a mixture measure. We integrate the PLsBD with a kernel regression model and apply a random sampling strategy to identify change points in both synthetic data and real-world high-dimensional genomics data of Drosophila. Our PLsBD method demonstrates superior performance compared to many other change-point detection methods.

(This article belongs to the Section Statistical Methods)


17 pages, 293 KiB

Open Access Article

Multivariate and Matrix-Variate Logistic Models in the Real and Complex Domains

by A. M. Mathai

Stats 2024, 7(2), 445-461; https://doi.org/10.3390/stats7020027 - 11 May 2024

Viewed by 358

Abstract

Several extensions of the basic scalar variable logistic density to the multivariate and matrix-variate cases, in the real and complex domains, are given where the extended forms end up in extended zeta functions. Several cases of multivariate and matrix-variate Bayesian procedures, in the real and complex domains, are also given. It is pointed out that there are a range of applications of Gaussian and Wishart-based matrix-variate distributions in the complex domain in multi-look data from radar and sonar. It is hoped that the distributions derived in this paper will be highly useful in such applications in physics, engineering, statistics and communication problems, because, in the real scalar case, a logistic model is seen to be more appropriate compared to a Gaussian model in many industrial applications. Hence, logistic-based multivariate and matrix-variate distributions, especially in the complex domain, are expected to perform better where Gaussian and Wishart-based distributions are currently used.

11 pages, 1897 KiB

Open Access Brief Report

Bayesian Inference for Multiple Datasets

by Renata Retkute, William Thurston and Christopher A. Gilligan

Stats 2024, 7(2), 434-444; https://doi.org/10.3390/stats7020026 - 10 May 2024

Viewed by 536

Abstract

Estimating parameters for multiple datasets can be time consuming, especially when the number of datasets is large. One solution is to sample from multiple datasets simultaneously using Bayesian methods such as adaptive multiple importance sampling (AMIS). Here, we use the AMIS approach to fit a von Mises distribution to multiple datasets for wind trajectories derived from a Lagrangian Particle Dispersion Model driven from 3D meteorological data. A posterior distribution of parameters can help to characterise the uncertainties in wind trajectories in a form that can be used as inputs for predictive models of wind-dispersed insect pests and the pathogens of agricultural crops for use in evaluating risk and in planning mitigation actions. The novelty of our study is in testing the performance of the method on a very large number of datasets (>11,000). Our results show that AMIS can significantly improve the efficiency of parameter inference for multiple datasets.

(This article belongs to the Section Bayesian Methods)


32 pages, 10179 KiB

Open Access Article

Contrastive Learning Framework for Bitcoin Crash Prediction

by Zhaoyan Liu, Min Shu and Wei Zhu

Stats 2024, 7(2), 402-433; https://doi.org/10.3390/stats7020025 - 8 May 2024

Viewed by 357

Abstract

Due to spectacular gains during periods of rapid price increase and unpredictably large drops, Bitcoin has become a popular emergent asset class over the past few years. In this paper, we are interested in predicting crashes of the Bitcoin market. To tackle this task, we propose a framework for deep learning time series classification based on contrastive learning. The proposed framework is evaluated against six machine learning (ML) and deep learning (DL) baseline models, and outperforms them by 15.8% in balanced accuracy. Thus, we conclude that the contrastive learning strategy significantly enhances the model’s ability to extract informative representations, and our proposed framework performs well in predicting Bitcoin crashes.


13 pages, 445 KiB

Open Access Article

On Non-Occurrence of the Inspection Paradox

by Diana Rauwolf and Udo Kamps

Stats 2024, 7(2), 389-401; https://doi.org/10.3390/stats7020024 - 24 Apr 2024

Viewed by 559

Abstract

The well-known inspection paradox or waiting time paradox states that, in a renewal process, the inspection interval is stochastically larger than a common interarrival time having a distribution function F, where the inspection interval is given by the particular interarrival time containing the specified time point of process inspection. The inspection paradox may also be expressed in terms of expectations, where the order is strict, in general. A renewal process can be utilized to describe the arrivals of vehicles, customers, or claims, for example. As the inspection time may also be considered a random variable T with a left-continuous distribution function G independent of the renewal process, the question arises as to whether the inspection paradox inevitably occurs in this general situation, apart from in some marginal cases with respect to F and G. For a random inspection time T, it is seen that non-trivial choices lead to non-occurrence of the paradox. In this paper, a complete characterization of the non-occurrence of the inspection paradox is given with respect to G. Several examples and related assertions are shown, including the deterministic time situation.

(This article belongs to the Section Applied Stochastic Models)
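
As a rough illustration of the classical paradox described in this abstract (not of the paper's characterization result), the following sketch simulates a renewal process and compares the mean length of the interval covering a fixed inspection time with the mean interarrival time. The exponential interarrival distribution, the rate, the inspection time, and the function names are all assumptions chosen for the example.

```python
import random

def covering_interval(t, rate, rng):
    """Return the length of the interarrival interval that contains time t.

    Interarrival times are drawn as Exponential(rate); this distribution is an
    assumption made for the illustration, not something fixed by the abstract.
    """
    arrival = 0.0
    while True:
        gap = rng.expovariate(rate)
        if arrival + gap > t:
            return gap  # this gap straddles the inspection time
        arrival += gap

def mean_covering_interval(t=50.0, rate=1.0, n=20000, seed=0):
    """Monte Carlo estimate of the expected inspection-interval length."""
    rng = random.Random(seed)
    return sum(covering_interval(t, rate, rng) for _ in range(n)) / n

mean_interarrival = 1.0  # 1 / rate for the default rate above
mean_inspection = mean_covering_interval()
print("mean interarrival time:", mean_interarrival)
print("mean inspection interval:", mean_inspection)
```

With exponential interarrivals and a deterministic inspection time far from the origin, the estimated inspection interval should come out at roughly twice the mean interarrival time, since longer intervals are more likely to contain the inspection point.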


17 pages, 1471 KiB

Open Access Article

New Goodness-of-Fit Tests for the Kumaraswamy Distribution

by David E. Giles

Stats 2024, 7(2), 373-388; https://doi.org/10.3390/stats7020023 - 22 Apr 2024

Viewed by 511

Abstract

The two-parameter distribution known as the Kumaraswamy distribution is a very flexible alternative to the beta distribution with the same (0,1) support. Originally proposed in the field of hydrology, it has subsequently received a good deal of positive attention in both the theoretical and applied statistics literatures. Interestingly, the problem of testing formally for the appropriateness of the Kumaraswamy distribution appears to have received little or no attention to date. To fill this gap, in this paper, we apply a “biased transformation” methodology to several standard goodness-of-fit tests based on the empirical distribution function. A simulation study reveals that these (modified) tests perform well in the context of the Kumaraswamy distribution, in terms of both their low size distortion and respectable power. In particular, the “biased transformation” Anderson–Darling test dominates the other tests that are considered.

(This article belongs to the Section Statistical Methods)
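
For readers unfamiliar with the distribution, the sketch below draws a Kumaraswamy sample by inverting its distribution function F(x) = 1 - (1 - x^a)^b and computes a plain Kolmogorov–Smirnov statistic against the known F. This is a minimal illustration of an EDF-based statistic with known parameters; it is not the "biased transformation" methodology of the paper, which handles estimated parameters. The parameter values and function names are assumptions.

```python
import random

def kumaraswamy_cdf(x, a, b):
    """Kumaraswamy distribution function on (0, 1): F(x) = 1 - (1 - x^a)^b."""
    return 1.0 - (1.0 - x ** a) ** b

def kumaraswamy_sample(a, b, n, rng):
    """Draw n values by inverse-CDF sampling: x = (1 - (1 - u)^(1/b))^(1/a)."""
    return [(1.0 - (1.0 - rng.random()) ** (1.0 / b)) ** (1.0 / a)
            for _ in range(n)]

def ks_statistic(data, cdf):
    """Kolmogorov-Smirnov distance between the EDF of data and a given CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # compare F with the EDF just after and just before each data point
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

rng = random.Random(0)
a, b = 2.0, 3.0  # illustrative shape parameters, chosen arbitrarily
sample = kumaraswamy_sample(a, b, 1000, rng)
d_stat = ks_statistic(sample, lambda x: kumaraswamy_cdf(x, a, b))
print("KS statistic:", d_stat)
```

Because the sample is generated from the same distribution it is tested against, the statistic should be small; in practice the parameters would be estimated from the data, which changes the null distribution of the statistic and is the situation the paper addresses.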


12 pages, 988 KiB

Open Access Article

Bayesian Mediation Analysis with an Application to Explore Racial Disparities in the Diagnostic Age of Breast Cancer

by Wentao Cao, Joseph Hagan and Qingzhao Yu

Stats 2024, 7(2), 361-372; https://doi.org/10.3390/stats7020022 - 19 Apr 2024

Viewed by 503

Abstract

A mediation effect refers to the effect transmitted by a mediator intervening in the relationship between an exposure variable and a response variable. Mediation analysis is widely used to identify significant mediators and to make inferences on their effects. The Bayesian method allows researchers to incorporate prior information from previous knowledge into the analysis, deal with the hierarchical structure of variables, and estimate the quantities of interest from the posterior distributions. This paper proposes three Bayesian mediation analysis methods to make inferences on mediation effects. Our proposed methods are the following: (1) the function of coefficients method; (2) the product of partial difference method; and (3) the re-sampling method. We apply these three methods to explore racial disparities in the diagnostic age of breast cancer patients in Louisiana. We found that African American (AA) patients are diagnosed at an average of 4.37 years younger compared with Caucasian (CA) patients (57.40 versus 61.77, p < 0.0001). We also found that the racial disparity can be explained by patients’ insurance (12.90%), marital status (17.17%), cancer stage (3.27%), and residential environmental factors, including the percent of the population under age 18 (3.07%) and the environmental factor of intersection density (9.02%).

(This article belongs to the Section Bayesian Methods)


11 pages, 437 KiB

Open Access Article

Combined Permutation Tests for Pairwise Comparison of Scale Parameters Using Deviances

by Scott J. Richter and Melinda H. McCann

Stats 2024, 7(2), 350-360; https://doi.org/10.3390/stats7020021 - 28 Mar 2024

Viewed by 634

Abstract

Nonparametric combinations of permutation tests for pairwise comparison of scale parameters, based on deviances, are examined. Permutation tests for comparing two or more groups based on the ratio of deviances have been investigated, and a procedure based on Higgins’ RMD statistic was found to perform well, but two other tests were sometimes more powerful. Thus, combinations of these tests are investigated. A simulation study shows a combined test can be more powerful than any single test.

(This article belongs to the Section Statistical Methods)



Stats, Volume 7, Issue 2 (June 2024) – 10 articles
