Innovations in Biostatistical Methods and Data for Public Health Research

A special issue of International Journal of Environmental Research and Public Health (ISSN 1660-4601).

Deadline for manuscript submissions: closed (31 December 2018) | Viewed by 33817

Special Issue Editors


Guest Editor
Department of Health Policy and Community Health, Jiann-Ping Hsu College of Public Health, Georgia Southern University, Statesboro, GA 30460, USA
Interests: HIV/AIDS services and programs; eHealth and public health informatics; data improvement tools; practice-based public health services and systems research (PHSSR); public health finances

Guest Editor
Jiann-Ping Hsu College of Public Health, Georgia Southern University, Hendricks Hall 1006, P.O. Box 8015, Statesboro, GA 30460, USA
Interests: biostatistics methods; ranked set sample; bootstrap and resampling methods; parametric and nonparametric inference; Monte Carlo methods; MCMC methods; data analysis; inference for randomized clinical trials; mediator with regression analysis; simultaneous equations models; latent class analysis; missing data; diagnostics measure and inference

Special Issue Information

Dear Colleagues,

Innovations in public health data analytics are critically important for taking full advantage of big data from the public health and healthcare sectors, as well as from the many other industries that can affect health through the social determinants of health. Meanwhile, evidence-based public health is becoming indispensable as public health systems grow more complex and communities around the globe present them with new challenges. After the economic recession of 2008, public health agencies operated in an environment characterized by increasing demands to improve quality, adopt evidence-based decision-making, and improve efficiency in disease surveillance and outbreak detection. At the same time, large amounts of electronic data from public health, healthcare, and other sectors (economic, social, environmental, transportation, and education, to name a few) are available to be analyzed innovatively to inform healthcare and public health. In a dynamic public health environment, filled with emerging demands for evidence-based practice, it is ever more imperative for public health and healthcare agencies to have access to evidence created from innovative biostatistical analyses of real-time or near-real-time data.

To take advantage of new advances in big data, data integration, and data analytics, this Special Issue is open for contributions in the field of biostatistics and public health. The International Journal of Environmental Research and Public Health invites research articles, systematic reviews, and short reports showcasing innovations in theories, methods, modeling, data integration, and data analytics, and their application to current public health issues with respect to policy and practice. Examples of innovative biostatistical and data integration topics include:

  • Meta-analysis methods for public health research
  • Data improvement and integration (e.g., probabilistic record linkage), and innovation in using big data
  • Biostatistical modelling for current public health topics
  • Innovative methods and designs for public health research

The public health domains to which biostatistical and data integration innovations may be applied include (but are not limited to) the following:

  • Health systems and services, and infrastructure
  • Health disparities and inequities
  • Health economics
  • Health policy and management
  • Epidemiology
  • Environmental public health
  • Comparative effectiveness research models in healthcare/public health

Prof. Dr. Gulzar Shah
Prof. Dr. Hani Michel Samawi
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. International Journal of Environmental Research and Public Health is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2500 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Innovative public health data modeling
  • Probabilistic data linkage
  • Longitudinal public health data modeling
  • Partially correlated public health data analysis
  • Survival analysis for public health data
  • Modeling incomplete or missing data
  • Analysis of censored public health data
  • Health systems and services, and infrastructure
  • Health disparities and inequities
  • Health economics
  • Environmental public health
  • Comparative effectiveness research models

Published Papers (7 papers)


Research

15 pages, 979 KiB  
Article
Asking Sensitive Questions Using the Randomized Response Approach in Public Health Research: An Empirical Study on the Factors of Illegal Waste Disposal
by Andy C. Y. Chong, Amanda M. Y. Chu, Mike K. P. So and Ray S. W. Chung
Int. J. Environ. Res. Public Health 2019, 16(6), 970; https://doi.org/10.3390/ijerph16060970 - 18 Mar 2019
Cited by 16 | Viewed by 4082
Abstract
A survey study is a research method commonly used to quantify population characteristics in biostatistics and public health research, two fields that often involve sensitive questions. However, if answering sensitive questions could cause social undesirability, respondents may not provide honest responses to questions that are asked directly. To mitigate the response distortion arising from dishonest answers to sensitive questions, the randomized response technique (RRT) is a useful and effective statistical method. However, research has seldom addressed how to apply the RRT in public health research using an online survey with multiple sensitive questions. Thus, we help fill this research gap by employing an innovative unrelated question design method. To illustrate how the RRT can be implemented in a multivariate analysis setting, we conducted a survey study to examine the factors affecting the intention of illegal waste disposal. This study demonstrates an application of the RRT to investigate the factors affecting people’s intention of illegal waste disposal. The potential factors of the intention were adopted from the theory of planned behavior and the general deterrence theory, and a self-administered online questionnaire was employed to collect data. Using the RRT, a covariance matrix was extracted for examining the hypothesized model via structural equation modeling. The survey results show that people’s attitude toward the behavior and their perceived behavioral control significantly positively affect their intention. This paper is useful for showing researchers and policymakers how to conduct surveys in environmental or public health related research that involves multiple sensitive questions. Full article
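
As a rough illustration of the idea behind the unrelated question design mentioned in this abstract, the sketch below estimates the prevalence of a sensitive trait from a single randomized yes/no item. It assumes the textbook single-question setup: each respondent answers the sensitive question with known probability `p_sensitive`, and otherwise answers an unrelated question whose "yes" rate `pi_unrelated` is known. The function name and parameters are hypothetical, and this is not the multivariate covariance-extraction procedure used in the paper.

```python
def unrelated_question_estimate(yes_count, n, p_sensitive, pi_unrelated):
    """Estimate sensitive-trait prevalence under an unrelated question design.

    Assumed model (illustrative): P(yes) = p * pi_sensitive + (1 - p) * pi_unrelated,
    where p is the probability of being routed to the sensitive question.
    Returns (estimate, approximate variance of the estimate).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 < p_sensitive <= 1:
        raise ValueError("p_sensitive must be in (0, 1]")

    # Observed proportion of "yes" answers across all respondents.
    lam = yes_count / n

    # Solve the mixture equation for the sensitive-trait prevalence.
    pi_hat = (lam - (1 - p_sensitive) * pi_unrelated) / p_sensitive

    # Binomial variance of lam, scaled by 1 / p^2 from the linear transform.
    variance = lam * (1 - lam) / (n * p_sensitive ** 2)
    return pi_hat, variance
```

The estimate can fall outside [0, 1] in small samples because it is a linear correction of the observed proportion; the division by `p_sensitive ** 2` in the variance shows the price paid for privacy protection.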

14 pages, 2427 KiB  
Article
Prediction of Prehypertenison and Hypertension Based on Anthropometry, Blood Parameters, and Spirometry
by Byeong Mun Heo and Keun Ho Ryu
Int. J. Environ. Res. Public Health 2018, 15(11), 2571; https://doi.org/10.3390/ijerph15112571 - 16 Nov 2018
Cited by 21 | Viewed by 3976
Abstract
Hypertension and prehypertension are risk factors for cardiovascular diseases. However, the associations of both prehypertension and hypertension with anthropometry, blood parameters, and spirometry have not been investigated. The purpose of this study was to identify the risk factors for prehypertension and hypertension in middle-aged Korean adults and to study prediction models of prehypertension and hypertension combined with anthropometry, blood parameters, and spirometry. Binary logistic regression analysis was performed to assess the statistical significance of prehypertension and hypertension, and prediction models were developed using logistic regression, naïve Bayes, and decision trees. Among all risk factors for prehypertension, body mass index (BMI) was identified as the best indicator in both men [odds ratio (OR) = 1.429, 95% confidence interval (CI) = 1.304–1.462)] and women (OR = 1.428, 95% CI = 1.204–1.453). In contrast, among all risk factors for hypertension, BMI (OR = 1.993, 95% CI = 1.818–2.186) was found to be the best indicator in men, whereas the waist-to-height ratio (OR = 2.071, 95% CI = 1.884–2.276) was the best indicator in women. In the prehypertension prediction model, men exhibited an area under the receiver operating characteristic curve (AUC) of 0.635, and women exhibited a predictive power with an AUC of 0.777. In the hypertension prediction model, men exhibited an AUC of 0.700, and women exhibited an AUC of 0.845. This study proposes various risk factors for prehypertension and hypertension, and our findings can be used as a large-scale screening tool for controlling and managing hypertension. Full article
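
The AUC values quoted in this abstract can be read as the probability that a randomly chosen case receives a higher predicted score than a randomly chosen non-case. The following is a minimal sketch of that pairwise definition; the function name and the toy inputs are hypothetical and are not taken from the study.

```python
def auc_pairwise(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted risk scores; labels: 1 for cases, 0 for non-cases.
    Ties between a case and a non-case count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one case and one non-case")

    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.635 to 0.845 sit.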

18 pages, 3673 KiB  
Article
Robust Compositional Analysis of Physical Activity and Sedentary Behaviour Data
by Nikola Štefelová, Jan Dygrýn, Karel Hron, Aleš Gába, Lukáš Rubín and Javier Palarea-Albaladejo
Int. J. Environ. Res. Public Health 2018, 15(10), 2248; https://doi.org/10.3390/ijerph15102248 - 14 Oct 2018
Cited by 26 | Viewed by 4928
Abstract
Although there is an increasing awareness of the suitability of using compositional data methodology in public health research, classical methods of statistical analysis have been primarily used so far. The present study aims to illustrate the potential of robust statistics to model movement behaviour using Czech adolescent data. We investigated: (1) the inter-relationship between various physical activity (PA) intensities, extended to model relationships by age; and (2) the associations between adolescents’ PA and sedentary behavior (SB) structure and obesity. These research questions were addressed using three different types of compositional regression analysis—compositional covariates, compositional response, and regression between compositional parts. Robust counterparts of classical regression methods were used to lessen the influence of possible outliers. We outlined the differences in both classical and robust methods of compositional data analysis. There was a pattern in Czech adolescents’ movement/non-movement behavior—extensive SB was related to higher amounts of light-intensity PA, and vigorous PA ratios formed the main source of potential aberrant observations; aging is associated with more SB and vigorous PA at the expense of light-intensity PA and moderate-intensity PA. The robust counterparts indicated that they might provide more stable estimates in the presence of outlying observations. The findings suggested that replacing time spent in SB with vigorous PA may be a powerful tool against adolescents’ obesity. Full article
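
Compositional methods treat daily time-use data (for example, hours of sedentary behaviour and of light, moderate, and vigorous activity) as parts of a whole, so only ratios between parts carry information. A minimal sketch of one common log-ratio transform, the centred log-ratio, is shown below. It is an illustrative assumption and not necessarily the coordinate system or the robust estimators used in the paper; the function name is hypothetical.

```python
import math


def clr(parts):
    """Centred log-ratio transform of a composition with strictly positive parts.

    Each part is expressed as the log of its ratio to the geometric mean,
    so the result sums to zero and is unchanged by rescaling the parts.
    """
    if any(p <= 0 for p in parts):
        raise ValueError("all parts must be strictly positive")
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]
```

Because the transform is scale-invariant, the same coordinates result whether the parts are recorded in hours, minutes, or proportions of the day.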

21 pages, 20182 KiB  
Article
Wave2Vec: Vectorizing Electroencephalography Bio-Signal for Prediction of Brain Disease
by Seonho Kim, Jungjoon Kim and Hong-Woo Chun
Int. J. Environ. Res. Public Health 2018, 15(8), 1750; https://doi.org/10.3390/ijerph15081750 - 15 Aug 2018
Cited by 17 | Viewed by 4596
Abstract
Interest in research involving health-medical information analysis based on artificial intelligence, especially for deep learning techniques, has recently been increasing. Most of the research in this field has been focused on searching for new knowledge for predicting and diagnosing disease by revealing the relation between disease and various information features of data. These features are extracted by analyzing various clinical pathology data, such as EHR (electronic health records), and academic literature using the techniques of data analysis, natural language processing, etc. However, still needed are more research and interest in applying the latest advanced artificial intelligence-based data analysis technique to bio-signal data, which are continuous physiological records, such as EEG (electroencephalography) and ECG (electrocardiogram). Unlike the other types of data, applying deep learning to bio-signal data, which is in the form of time series of real numbers, has many issues that need to be resolved in preprocessing, learning, and analysis. Such issues include leaving feature selection, learning parts that are black boxes, difficulties in recognizing and identifying effective features, high computational complexities, etc. In this paper, to solve these issues, we provide an encoding-based Wave2vec time series classifier model, which combines signal-processing and deep learning-based natural language processing techniques. To demonstrate its advantages, we provide the results of three experiments conducted with EEG data of the University of California Irvine, which are a real-world benchmark bio-signal dataset. After converting the bio-signals (in the form of waves), which are a real number time series, into a sequence of symbols or a sequence of wavelet patterns that are converted into symbols, through encoding, the proposed model vectorizes the symbols by learning the sequence using deep learning-based natural language processing. 
The models of each class can be constructed through learning from the vectorized wavelet patterns and training data. The implemented models can be used for prediction and diagnosis of diseases by classifying the new data. The proposed method enhanced data readability and intuition of feature selection and learning processes by converting the time series of real number data into sequences of symbols. In addition, it facilitates intuitive and easy recognition, and identification of influential patterns. Furthermore, real-time large-capacity data analysis is facilitated, which is essential in the development of real-time analysis diagnosis systems, by drastically reducing the complexity of calculation without deterioration of analysis performance by data simplification through the encoding process. Full article
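
To make the "time series to sequence of symbols" step in this abstract concrete, here is a minimal sketch that maps each sample to a letter by equal-width amplitude binning. This is only an assumed stand-in for an encoder: the paper's actual encoding and wavelet-pattern scheme are not described in enough detail here to reproduce, and the function name and alphabet are hypothetical.

```python
def encode_signal(values, alphabet="abcd"):
    """Encode a real-valued series as a string of symbols.

    The observed amplitude range is split into len(alphabet) equal-width bins,
    and each sample is replaced by the letter of the bin it falls into.
    """
    if not values:
        return ""
    low, high = min(values), max(values)
    if high == low:
        # A flat signal carries no amplitude information; use the first symbol.
        return alphabet[0] * len(values)

    bins = len(alphabet)
    width = (high - low) / bins
    symbols = []
    for v in values:
        # The maximum value would land in bin index `bins`; clamp it to the last bin.
        index = min(int((v - low) / width), bins - 1)
        symbols.append(alphabet[index])
    return "".join(symbols)
```

Once a signal is a string, sequence models developed for text can in principle be applied to it, which is the general idea the abstract describes.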

1174 KiB  
Article
Cusp Catastrophe Regression and Its Application in Public Health and Behavioral Research
by Ding-Geng Chen and Xinguang Chen
Int. J. Environ. Res. Public Health 2017, 14(10), 1220; https://doi.org/10.3390/ijerph14101220 - 13 Oct 2017
Cited by 12 | Viewed by 4856
Abstract
The cusp catastrophe model is an innovative approach for investigating a phenomenon that consists of both continuous and discrete changes in one modeling framework. However, its application to empirical health and behavior data has been hindered by the complexity in data-model fit. In this study, we reported our work in the development of a new modeling method—cusp catastrophe regression (RegCusp in short) by casting the cusp catastrophe into a statistical regression. With the RegCusp approach, unbiased model parameters can be estimated with the maximum likelihood estimation method. To validate the RegCusp method, a series of simulations were conducted to demonstrate the unbiasedness of parameter estimation. Since the estimated residual variance with the Fisher information matrix method was over-dispersed, a bootstrap re-sampling procedure was developed and used as a remedy. We also demonstrate the practical applicability of the RegCusp with empirical data from an NIH-funded project to evaluate an HIV prevention intervention program to educate adolescents in the Bahamas for condom use. Study findings indicated that the model parameters estimated with RegCusp were practically more meaningful than those estimated with comparable methods, especially the estimated cusp point. Full article
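
For orientation, the canonical deterministic cusp catastrophe and one plausible way of casting it as a regression are sketched below. The potential function and equilibrium surface are the standard textbook form. The linear links for the control parameters and the additive normal error are assumptions made here for illustration and may differ from the RegCusp specification in the paper.

```latex
% Canonical cusp potential with asymmetry parameter \alpha and bifurcation parameter \beta
V(z;\alpha,\beta) = \tfrac{1}{4}z^{4} - \tfrac{1}{2}\beta z^{2} - \alpha z

% Equilibria satisfy \partial V / \partial z = z^{3} - \beta z - \alpha = 0, i.e.
\alpha + \beta z - z^{3} = 0

% Cardan's discriminant: three real equilibria when \Delta < 0, one when \Delta > 0
\Delta = 27\alpha^{2} - 4\beta^{3}

% Assumed regression casting: covariates x_{ij} drive the control parameters,
% and the observed outcome is an equilibrium z_i plus noise
\alpha_i = a_0 + \sum_{j} a_j x_{ij}, \qquad
\beta_i  = b_0 + \sum_{j} b_j x_{ij}, \qquad
y_i = z_i + \varepsilon_i, \quad \varepsilon_i \sim N(0,\sigma^{2})
```

The region where \Delta < 0 is where sudden jumps between two stable states can occur, which is how the model accommodates both continuous and discrete change.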

1201 KiB  
Article
Using Structural Equation Modeling to Assess the Links between Tobacco Smoke Exposure, Volatile Organic Compounds, and Respiratory Function for Adolescents Aged 6 to 18 in the United States
by Bonnie E. Shook-Sa, Ding-Geng Chen and Haibo Zhou
Int. J. Environ. Res. Public Health 2017, 14(10), 1112; https://doi.org/10.3390/ijerph14101112 - 25 Sep 2017
Cited by 7 | Viewed by 4885
Abstract
Asthma is an inflammatory airway disease that affects 22 million Americans in the United States. Research has found associations between impaired respiratory function, including asthma and increased symptoms among asthmatics, and common indoor air pollutants, including tobacco smoke exposure and volatile organic compounds (VOCs). However, findings linking VOC exposure and asthma are inconsistent and studies are of mixed quality due to design limitations, challenges measuring VOC exposure, small sample sizes, and suboptimal statistical methodologies. Because of the correlation between tobacco smoke exposure and VOCs, and associations between both tobacco smoke and VOCs with respiratory function, it is crucial that statistical methodology employed to assess links between respiratory function and individual air pollutants control for these complex relationships. This research uses Structural Equation Modeling (SEM) to assess the relationships between respiratory function, tobacco smoke exposure, and VOC exposure among a nationally-representative sample of adolescents. SEM allows for multiple outcome variables, the inclusion of both observed and latent variables, and controls the effects of confounding and correlated variables, which is critically important and is lacking in earlier studies when estimating the effects of correlated air pollutants on respiratory function. We find evidence of associations between respiratory function and some types of VOCs, even when controlling for the effects of tobacco smoke exposure and additional covariates. Furthermore, we find that poverty has an indirect effect on respiratory function through its relationships with tobacco smoke exposure and some types of VOCs. This analysis demonstrates how SEM is a robust analytic tool for assessing associations between respiratory function and multiple exposures to pollutants. Full article

1762 KiB  
Article
Longitudinal Study-Based Dementia Prediction for Public Health
by HeeChel Kim, Hong-Woo Chun, Seonho Kim, Byoung-Youl Coh, Oh-Jin Kwon and Yeong-Ho Moon
Int. J. Environ. Res. Public Health 2017, 14(9), 983; https://doi.org/10.3390/ijerph14090983 - 30 Aug 2017
Cited by 18 | Viewed by 5721
Abstract
The issue of public health in Korea has attracted significant attention given the aging of the country’s population, which has created many types of social problems. The approach proposed in this article aims to address dementia, one of the most significant symptoms of aging and a public health care issue in Korea. The Korean National Health Insurance Service Senior Cohort Database contains personal medical data of every citizen in Korea. There are many different medical history patterns between individuals with dementia and normal controls. The approach used in this study involved examination of personal medical history features from personal disease history, sociodemographic data, and personal health examinations to develop a prediction model. The prediction model used a support-vector machine learning technique to perform a 10-fold cross-validation analysis. The experimental results demonstrated promising performance (80.9% F-measure). The proposed approach supported the significant influence of personal medical history features during an optimal observation period. It is anticipated that a biomedical “big data”-based disease prediction model may assist the diagnosis of any disease more correctly. Full article
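
The F-measure reported in this abstract is the harmonic mean of precision and recall. A minimal sketch of the computation from confusion-matrix counts follows; the function name and the example counts are hypothetical and are not drawn from the study's results.

```python
def f_measure(true_pos, false_pos, false_neg):
    """F1 score: harmonic mean of precision and recall.

    Returns 0.0 when there are no true positives, where precision
    and recall are zero or undefined.
    """
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)
```

In a 10-fold cross-validation such as the one the abstract mentions, a score like this would typically be computed on each held-out fold and then averaged.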
