Statistics and Data Science

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Probability and Statistics".

Deadline for manuscript submissions: 31 October 2024

Special Issue Editors


Guest Editor
Department of Mathematics and Statistics, Old Dominion University, Norfolk, VA 23529, USA
Interests: big data analytics; machine learning; computational statistics; quantitative finance; statistical process control; robust statistics; nonparametric and semiparametric techniques

Guest Editor
School of Mathematical and Physical Sciences, University of New England, Biddeford, ME 04005, USA
Interests: cluster analysis; robust statistical procedures

Special Issue Information

Dear Colleagues,

As large datasets become ubiquitous, the demand for data-driven methodologies that provide valuable insights into complex phenomena and facilitate computer-guided decision making continues to grow. Fueled by theoretical and methodological advancements and the increasing availability of computing resources, a growing number of cutting-edge developments are taking place at the interface of statistics and data science and have proven to be a major driver of innovation in science and technology. This Special Issue aims to promote the convergence between modern research agendas and practices in data science and statistics and to explore collaborative synergies in addressing theoretical and real-world problems.

As data science becomes increasingly important in statistics, leading to a broader use of computationally intensive methods, a heavier reliance on resampling and bootstrapping techniques, and the utilization of multimodal datasets, traditional statistical approaches have in turn catalyzed major developments in tree-based learning, Bayesian deep learning, robust uncertainty quantification, model reduction, feature selection, and many other areas of data science. This two-way influence underscores the importance of interdisciplinary research that connects experts from both fields.

This Special Issue welcomes original manuscripts on a broad variety of topics in statistics and data science including, but not limited to, the following: theoretical and methodological developments in multivariate and high-dimensional statistics, robust and nonparametric statistics, statistical learning, computational statistics, Bayesian statistics, machine learning, big data analytics, deep learning, image analysis and computer vision, text mining and large language models, multimodal learning, explainable AI, the application of advanced data analytics in solving real-world problems, and reviews of the modern data science and statistics literature.

Dr. Michael Pokojovy
Dr. Andrews T. Anum
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • data science
  • big data analytics
  • advanced data analytics
  • computational statistics
  • robust inference
  • Bayesian inference
  • statistical learning
  • machine learning
  • deep learning
  • explainable AI
  • trustworthy AI
  • text mining
  • image mining
  • multimodal learning
  • supervised learning
  • unsupervised learning
  • reinforcement learning
  • transfer learning

Published Papers (1 paper)

Research

15 pages, 533 KiB  
Article
Testing Informativeness of Covariate-Induced Group Sizes in Clustered Data
by Hasika K. Wickrama Senevirathne and Sandipan Dutta
Mathematics 2024, 12(11), 1623; https://doi.org/10.3390/math12111623 - 22 May 2024
Abstract
Clustered data are a special type of correlated data where units within a cluster are correlated while units between different clusters are independent. The number of units in a cluster can be associated with that cluster’s outcome. This is called the informative cluster size (ICS), which is known to impact clustered data inference. However, when comparing the outcomes from multiple groups of units in clustered data, investigating ICS may not be enough. This is because the number of units belonging to a particular group in a cluster can be associated with the outcome from that group in that cluster, leading to an informative intra-cluster group size or IICGS. This phenomenon of IICGS can exist even in the absence of ICS. Ignoring the existence of IICGS can result in a biased inference for group-based outcome comparisons in clustered data. In this article, we mathematically formulate the concept of IICGS while distinguishing it from ICS and propose a nonparametric bootstrap-based statistical hypothesis-testing mechanism for testing any claim of IICGS in a clustered data setting. Through simulations and real data applications, we demonstrate that our proposed statistical testing method can accurately identify IICGS, with substantial power, in clustered data.
(This article belongs to the Special Issue Statistics and Data Science)