**5. Application: Creatine Kinase Levels during Football Preseason**

In this section, we apply the BDCP to a data set from a biomedical setting. The goal of this application is to understand the changes in creatine kinase (CK) levels observed in blood samples of college football players during preseason training. To properly explain the variation in CK, we must select among competing models that use different demographic and clinical variables. We analyze the models selected by the *kb*-corrected, the *k*-corrected, and the uncorrected BDCP, and we compare the results with model selection via the more conventional *p*-value approach.

#### *5.1. Overview of Application*

During strenuous exercise, skeletal muscle cells break down and release a variety of intracellular contents. When this breakdown is excessive, a condition known as exertional rhabdomyolysis (ER) can occur, which may result in life-threatening complications such as renal failure, cardiac arrhythmia, and compartment syndrome. Creatine kinase (CK) is one of the proteins released during muscle breakdown, and measuring its levels is the most sensitive test for assessing muscular damage that could lead to ER [7].

During the off-season workouts in January 2011, a group of 13 University of Iowa football players developed ER. This event led to a prospective study in which 30 University of Iowa football athletes were followed during a 34-day preseason workout camp. Variables such as body mass index (BMI) and CK levels were obtained from blood samples drawn on the first, third, and seventh days of the camp. Other demographic and clinical variables, such as age, number of semesters in the program, and history of rhabdomyolysis, were also collected.

The initial results of the study, published by Smoot et al. [8], show that the CK levels at later time points were significantly different from the levels at earlier times. However, most of the clinical and demographic variables were not significant in explaining the levels of CK. One of the underlying issues with this type of modeling analysis is that the significance of each variable can only be assessed by hypothesis tests based on nested models. For example, suppose that we wish to determine the significance of BMI in the presence of semesters in the program. To obtain a *p*-value for BMI, we need to formulate a hypothesis test where the null model contains only semesters in the program, while the alternative model contains both BMI and semesters in the program.
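For concreteness, the nested comparison described above is typically carried out with a partial *F*-test in a Gaussian linear model. The source does not state which test was used, so the statistic below is given as the standard choice rather than as the authors' procedure. Here RSS<sub>0</sub> and RSS<sub>1</sub> denote the residual sums of squares of the null and alternative models, *q* = 1 is the number of added coefficients (the one for BMI), *p*<sub>1</sub> = 3 is the number of coefficients in the alternative model (intercept, semesters, BMI), and *n* is the number of athletes with complete data; all of these symbols are introduced here for illustration.

$$F = \frac{(\mathrm{RSS}_0 - \mathrm{RSS}_1)/q}{\mathrm{RSS}_1/(n - p_1)} \sim F_{q,\, n - p_1} \quad \text{under the null model.}$$

The statistic requires RSS<sub>0</sub> ≥ RSS<sub>1</sub>, which is guaranteed only when the null model is a special case of the alternative; this is why the construction does not extend to non-nested pairs.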

Although this setting may be useful in some scenarios, it is too limiting. For instance, suppose that we wish to choose between two non-nested models where one contains BMI and the other contains semesters in the program. Although a conventional test based on linear regression models would not be able to answer this question, the BDCP approach could indeed determine the propriety of either model in this type of non-nested setting.

In the analysis of this data set, we let *CK*3 be the log of CK levels measured on the seventh day of the camp, *CK*1 be the log of CK levels measured on the first day of the camp, and *Semesters* be the number of semesters in the program. Of note, the log transformation is routinely applied in studies involving CK levels in order to justify approximate normality, as the raw levels tend to have heavily right-skewed distributions.

Now, consider the following hypothesis testing settings.

Setting 1: *Testing the propriety of the model containing CK*1*.*

$$\begin{aligned} H_1 &: \text{CK3} = \beta_1, \\ H_2 &: \text{CK3} = \beta_1 + \beta_2 \, \text{CK1}. \end{aligned}$$

Setting 2: *Testing the propriety of the model containing CK*1 *and Semesters over the model containing only CK*1*.*

$$\begin{aligned} H_1 &: \text{CK3} = \beta_1 + \beta_2 \, \text{CK1}, \\ H_2 &: \text{CK3} = \beta_1 + \beta_2 \, \text{CK1} + \beta_3 \, \text{Semesters}. \end{aligned}$$

Setting 3: *Head-to-head comparison of non-nested models.*

$$\begin{aligned} H_1 &: \text{CK3} = \beta_1 + \beta_2 \, \text{CK1} + \beta_3 \, \text{BMI}, \\ H_2 &: \text{CK3} = \beta_1 + \beta_2 \, \text{CK1} + \beta_3 \, \text{Semesters}. \end{aligned}$$
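To make the non-nested comparison in Setting 3 more tangible, the following Python sketch estimates a bootstrap comparison probability on simulated data. It rests on several assumptions that are not taken from the source: the uncorrected BDCP is approximated as the fraction of bootstrap samples in which the null model, fitted to the bootstrap sample, attains a smaller Gaussian −2 log-likelihood on the original data than the alternative; the helper names (`solve`, `fit`, `discrepancy`, `bdcp`) are hypothetical; the data are synthetic stand-ins for the study variables; and the *k* and *kb* corrections are omitted.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit(X, y):
    """Least-squares coefficients and the maximum-likelihood variance estimate."""
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    beta = solve(xtx, xty)
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(X, y))
    return beta, rss / len(y)


def discrepancy(beta, sigma2, X, y):
    """Gaussian -2 log-likelihood of (X, y) under the given fitted parameters."""
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(X, y))
    return len(y) * math.log(2.0 * math.pi * sigma2) + rss / sigma2


def bdcp(y, X_null, X_alt, n_boot=200, seed=0):
    """Fraction of bootstrap samples in which the null model has the smaller
    discrepancy (assumed, uncorrected form; not the authors' code)."""
    rng = random.Random(seed)
    n = len(y)
    null_wins = 0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample rows with replacement
        yb = [y[i] for i in idx]
        d = []
        for X in (X_null, X_alt):
            beta, s2 = fit([X[i] for i in idx], yb)  # fit on the bootstrap sample
            d.append(discrepancy(beta, s2, X, y))    # evaluate on the original data
        null_wins += d[0] < d[1]
    return null_wins / n_boot


# Synthetic data only: 30 simulated athletes, NOT the study measurements.
rng = random.Random(1)
n = 30
ck1 = [rng.gauss(5.5, 0.5) for _ in range(n)]
bmi = [rng.gauss(30.0, 4.0) for _ in range(n)]
sem = [float(rng.randint(1, 8)) for _ in range(n)]
ck3 = [1.0 + 0.6 * a + 0.05 * b + rng.gauss(0.0, 0.4) for a, b in zip(ck1, bmi)]

X_bmi = [[1.0, a, b] for a, b in zip(ck1, bmi)]  # H1: intercept + CK1 + BMI
X_sem = [[1.0, a, s] for a, s in zip(ck1, sem)]  # H2: intercept + CK1 + Semesters

p_hat = bdcp(ck3, X_bmi, X_sem)
print(f"Estimated probability that the BMI model is preferred: {p_hat:.3f}")
```

Because both candidate models are simply fitted and scored on the same footing, nothing in this procedure requires one design matrix to be a subset of the other, which is the property that Setting 3 relies on.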

#### *5.2. Results of Application*

The results for the application are summarized in Table 13. Settings 1 and 2 illustrate the congruence between BDCP and *p*-values in the case of hypothesis testing based on nested models. Setting 1 assesses the propriety of a model that includes only the intercept against a model that includes both the intercept and *CK*1. The *p*-value for *CK*1 in this setting is 0.001, which means that, at a level *α* of 0.05, *CK*1 is significant in explaining the variation in *CK*3. Both the BDCPk and the BDCPb are 0.075, which means that the null model is preferred in only 7.5% of the bootstrap samples, indicating that the model containing *CK*1 is superior.

Once we establish that *CK*1 is an important variable to include in our model, the next step is to determine whether additional variables can improve the model fit. Setting 2 displays a hypothesis test where the null model contains only *CK*1, while the alternative contains both *CK*1 and *Semesters*. The *p*-value for *Semesters* is 0.734, which means that *Semesters* is not statistically significant, and a reasonable investigator would choose to exclude *Semesters* from the final model. The corrected BDCP values arrive at the same conclusion. For instance, the BDCPb is 0.995, which indicates that, across multiple bootstrap samples, the null model is chosen 99.5% of the time; therefore, the BDCP encourages us to choose the model that excludes *Semesters*.
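As a reading aid, and assuming that each reported BDCP is a simple proportion over the 200 bootstrap samples noted in the caption of Table 13 (an assumption made here; the corrections may change which model wins in a given sample, but presumably not the denominator), the values quoted for Settings 1 and 2 correspond to the following numbers of bootstrap samples favoring the null model:

$$0.075 \times 200 = 15, \qquad 0.995 \times 200 = 199.$$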

**Table 13.** From left to right: results for Setting 1, Setting 2, and Setting 3. BDCPk is the BDCP corrected by *k*, BDCPb is the BDCP corrected by *kb*, and BDCP is the uncorrected BDCP. Results are based on 200 bootstrap samples.


The rationale for testing *Semesters* is based on the idea that more senior athletes tend to rigorously maintain their workout habits during the off season, mostly because of experience and maturity. Therefore, *Semesters* is a variable that may confound the effects of *CK*1 on the variation of *CK*3. Additionally, medical literature has shown that BMI highly correlates with CK levels and the development of ER [9], which means that one should also test for the propriety of models that include *BMI*. Thus, one could ask if a model featuring *BMI* would be better than a model featuring *Semesters*. This results in a hypothesis testing scenario where the null and alternative models are non-nested, as exhibited in Setting 3.

First, note that the *p*-values displayed in the table for Setting 3 do not answer the question at hand. These *p*-values are obtained from partial tests applied to the full model containing both variables. On the other hand, the BDCP gives us meaningful information about the performance of adding *BMI* versus adding *Semesters*. The BDCPb tells us that there is a 78% probability that the model containing *BMI* is a better fit than the model containing *Semesters*. If we use the BDCPk instead, the probability increases to 81.5%. In both cases, if we are debating whether to include *BMI* or *Semesters* as an adjusting variable, the BDCP clearly favors the inclusion of *BMI*.
