Application of an Intensive Longitudinal Functional Model with Multiple Time Scales in Objectively Measured Children’s Physical Activity

Zahed, Mostafa; Lalonde, Trent; Skafyan, Maryam

doi:10.3390/math11081973


by Mostafa Zahed ¹,*, Trent Lalonde ² and Maryam Skafyan ³

¹ Department of Mathematics and Statistics, East Tennessee State University, Johnson City, TN 37614, USA

² Colorado Department of Human Services, Denver, CO 80203, USA

³ Department of Applied Statistics and Research Methods, University of Northern Colorado, Greeley, CO 80639, USA

* Author to whom correspondence should be addressed.

Mathematics 2023, 11(8), 1973; https://doi.org/10.3390/math11081973

Submission received: 20 March 2023 / Revised: 15 April 2023 / Accepted: 18 April 2023 / Published: 21 April 2023

(This article belongs to the Special Issue New Advance in Operations Research and Analytics)


Abstract: This study proposes an intensive longitudinal functional model with multiple time scales (ILFMM) and subject-specific random intercepts, formulated through mixed model equivalence, that includes multiple functional predictors and one or more scalar covariates. An estimation framework is proposed for a time-varying coefficient function that is modeled as a linear combination of time-invariant functions with time-varying coefficients. The model takes advantage of the information structure of the penalty, while the estimation procedure exploits the equivalence between penalized least squares estimation and linear mixed models. Simulations were conducted to evaluate the procedure empirically. The mean square errors of the functional coefficients decreased with increasing sample size and level of association. Sample size had a greater impact when the level of association was small, and the level of association had a greater impact when the sample size was small. These results provide empirical evidence that ILFMM estimates of the functional coefficients are close to the true functional coefficients. The results also indicated that the AIC can be used to guide the choice of ridge weights and that, when ridge weight ratios were sufficiently large, they had minimal impact on estimation performance. Studying two time scales is important in a wide range of fields, including physics, chemistry, biology, engineering, and economics, because it allows researchers to better understand complex systems and processes that operate over different time frames. Consequently, studying physical activities with two time scales is critical for advancing our understanding of human performance and health and for developing effective strategies to optimize physical activity and exercise programs. Therefore, the proposed model was applied to physical activity data from the Active Schools Institute of the University of Northern Colorado to determine which time-structure patterns of activity adequately describe the relationship between daily total magnitude and children's daily and weekly physical activity.

Keywords: functional data analysis; intensive longitudinal data; functional penalized regression; decomposition-based penalty; generalized ridge regression; activity pattern; accelerometry

MSC: 62P99

1. Introduction

Functional Data Analysis (FDA) deals with the analysis and theory of data that are in the form of functions, images and shapes, or more general objects [1,2]. The atom of functional data is a function, where for each subject in a random sample one or several functions are recorded. While the term “functional data analysis” was coined by [1,3], the history of this area is much older and dates back to Greven et al. (2020) [4] and Rao et al. (1950) [5]. There are several ways in which data can be called functional. For instance, in archaeological studies the three-dimensional image of each bone can be treated as functional data [6].

FDA has many applications in areas such as biostatistics, environmental monitoring, finance, and image analysis. For example, in biostatistics, FDA can be used to study the patterns of gene expression over time or to model the progression of a disease. In finance, FDA can be used to analyze the dynamics of financial markets over time, and to develop models for predicting future stock prices or other financial variables [7].

Functional data come in many forms; however, they are defined in terms of functions, which are often smooth curves [8]. Smoothing filters the noise in the raw data as efficiently as possible [9]. There are several approaches for representing functional data in smooth form: basis expansion, least squares, and roughness penalties. One basis expansion is based on B-splines. In this approach, both the functional coefficient and the functional predictor are expanded in B-splines, and a regularized estimate of the functional coefficient is then computed [10]. Müller (2005) [11] showed that the regression coefficient can be expressed in an orthonormal basis determined by the eigenfunctions of the covariance of the functional regressor. Goldsmith [12], in 2011, combined the approaches introduced by Ramsay [10] and Müller [11]. This combined method is used for smoothing in this study; it is called a roughness penalty approach because it penalizes fits that are too rough [13].

In many cases, data are observed longitudinally, that is, collected over time. Longitudinal data are an extension of time series, which are sequences of data points ordered in time such that each data point corresponds to a specific moment [14]. Longitudinal data analysis can provide more accurate estimates of the effects of interventions or treatments over time [15]. Mixed-effects regression models have been used to study longitudinal data [16]. These models support subject-specific conclusions and statements about individual changes, trends, or effects over time.

In practice, data sometimes have both functional and longitudinal characteristics. One such case arises when the outcome is not functional but the covariates are collected functionally and longitudinally over one time scale. Two approaches are commonly used. The first is based on functional models in which the functional covariates are smoothed with a roughness penalty; it is called Longitudinal Penalized Functional Regression (LPFR) [17]. The second is built on principal component analysis for longitudinal functional data [18]. A main disadvantage of both approaches is that the functional covariates are assumed to remain constant over time. In practice, however, both the functional covariates and the scalar outcome may change over time, which is the situation considered in this study. Longitudinal functional models have not yet been extended to longitudinal functional covariates on multiple time scales; however, Kundu et al. (2016) [19] addressed the same data situation for one time scale. This study therefore extends the work of Kundu et al. to incorporate multilevel longitudinal functional covariates.

Studying two time scales is important in a wide range of fields, including physics, chemistry, biology, engineering, economics, and more. It allows researchers to gain a better understanding of complex systems and processes that operate over different time frames [20].

One of the key benefits of studying two time scales is that it can help researchers identify the underlying mechanisms that contribute to the behavior of a system. For example, in physics, studying the behavior of particles over different time scales can help researchers understand how those particles interact with each other and how their interactions lead to the formation of larger structures.

In chemistry, studying chemical reactions over different time scales can help researchers identify the intermediates and transition states that are involved in the reaction, and help them design more efficient and effective reactions [21].

Studying two time scales can also help researchers design and optimize systems for specific applications. For example, in engineering, studying the behavior of materials over different time scales can help researchers design materials with specific mechanical, electrical, or chemical properties.

Overall, studying two time scales is critical for advancing our understanding of complex systems and processes, and for developing new technologies and applications that can improve our lives.

Studying physical activities with two time scales is important because it allows us to understand how different processes operate at different timescales and how they interact with each other. Physical activities involve a wide range of processes that occur at different timescales, from short-term processes such as muscle activation and joint movements, to longer-term processes such as training adaptations and recovery [22].

By studying physical activities with two time scales, researchers can better understand the underlying mechanisms that contribute to performance and health outcomes [23]. For example, studying the short-term and long-term effects of exercise on muscle function and metabolism can help us understand how exercise training improves overall health and athletic performance [24].

Furthermore, studying physical activities with two time scales can help inform the development of effective training programs and interventions. By understanding how different processes interact over time, we can design training programs that are optimized for different goals, such as improving strength or endurance, and minimizing the risk of injury or overtraining [25].

Overall, studying physical activities with two time scales is critical for advancing our understanding of human performance and health and for developing effective strategies to optimize physical activity and exercise programs [26].

The proposed model is well suited to the data collected by the Active Schools Institute of the University of Northern Colorado. The institute was interested in the associations between daily and weekly activity profiles, as measured by accelerometers, and academic and behavioral outcomes. To address these interests, students from one primary school in a suburban area in the western United States wore accelerometers during the school day for 5 consecutive days over 4 different weeks of the year. Corresponding academic and behavioral data were also obtained.

In the data collected by the Active Schools Institute, daily student activity is crossed with weeks. Because of this structure, a multilevel mixed-effects regression model that includes demographics or teacher effects and accelerometer wear time was essential. Hence, the proposed model was well suited to the physical activity data set.

Accordingly, this article addresses the following questions:

How can a longitudinal functional regression model with time-varying regression functions be represented for multiple time scales when the time components are crossed? This question is addressed in Section 2.

How can the mean parameters of the proposed model be estimated? This question is addressed in Section 2.3.

How does the proposed model compare with similar existing single-time-scale models in terms of model MSE, across sample sizes and levels of association? A detailed answer can be found in Section 3.

How can the model and its estimation be implemented in software? A response can be found in Section 5.

2. Intensive Longitudinal Functional Model with Multiple Time Scales

The intensive longitudinal method represents sequences of repeated measurements frequently recorded to characterize a separate change process for each subject [27].

Intensive longitudinal data refers to data that is collected repeatedly and frequently over a relatively short period of time, often at a high frequency (e.g., several times a day). This type of data is particularly useful for studying processes that unfold over time, such as mood fluctuations, stress reactivity, or medication effects [27,28].

Intensive longitudinal data can be collected using a variety of methods, including ecological momentary assessment (EMA), experience sampling methods (ESM), ambulatory monitoring, and diary or journal methods [29]. These methods typically involve asking participants to report on their experiences, thoughts, and behaviors at multiple points throughout the day, either through self-reports on electronic devices (such as smartphones or smartwatches), or by filling out paper or online forms [30].

The analysis of intensive longitudinal data requires specialized statistical methods that can capture the dynamic nature of the data, such as time-series analysis, multilevel modeling, and dynamic systems modeling. These methods allow researchers to examine how variables change over time, how they are related to each other, and how they are influenced by external factors [31].

Intensive longitudinal data has been used in a variety of fields, including psychology, medicine, and public health, to study phenomena such as mood disorders, substance use, physical activity, and sleep patterns.

This study considers a longitudinal model where the time scales of functional predictors cross. Additionally, the proposed model is suitable when data is collected frequently over a short period of time, often at a high frequency (e.g., several times a day). As a consequence, the proposed model may be referred to as an Intensive Longitudinal Functional Model with Multiple Time Scales (ILFMM) [32].

The ILFMM also includes a specific estimation process. Parameters are estimated using a generalized ridge estimate, from which a linear mixed model representation can be derived. With the mixed model approach, tuning parameters are selected automatically, and the parameters can be estimated by fitting the equivalent linear mixed model. The model has two notable advantages: there is no restriction on the time course of the regression functions, and the structure of the regression function can be incorporated directly into the estimation process [19]. The approach described by Kundu et al. (2016) [19] is extended here to incorporate longitudinal functional predictors spanning multiple time scales.

2.1. Statistical Model of ILFMM

To represent a longitudinal functional regression model with time-varying regression functions for multiple time scales when time components are crossed, let $X$ denote the scalar predictors, let $W_{i t_{w}}$ denote a predictor function for the $i$th subject at time point $t_{w}$, and let $D_{i t_{d}}$ denote a predictor function for the $i$th subject at time point $t_{d}$, where $i = 1, 2, \dots, N$ and $t \in \{t_{1}, t_{2}, \dots, t_{n_{i}}\}$. The longitudinal time points $t$ can be decomposed in terms of $t_{w}$ and $t_{d}$: each $t_{k}$ ($k = 1, 2, \dots, n_{i}$) corresponds to one value $t_{w_{k}}$ and one value $t_{d_{k}}$. It is assumed that each observed predictor is sampled at the same $p$ locations, $s_{1}, \dots, s_{p} \in Ω$, and that there are an equal number of observations per subject (i.e., the $n_{i}$ are equal). Let $W_{i t_{w}} := {[W_{i t_{w}} (s_{1}), \dots, W_{i t_{w}} (s_{p})]}^{⊤}$ and $D_{i t_{d}} := {[D_{i t_{d}} (s_{1}), \dots, D_{i t_{d}} (s_{p})]}^{⊤}$ be the $p \times 1$ vectors of values sampled from the realized functions $W_{i t_{w}}$ and $D_{i t_{d}}$, respectively. Then the observed data have the form $\{y_{i t}; X_{i t}; W_{i t_{w}}; D_{i t_{d}}\}$, where $y_{i t}$ is a scalar outcome, $X_{i t}$ is a $K \times 1$ column vector of measurements on $K$ scalar predictors, $W_{i t_{w}}$ is the sampled predictor from the $i$th subject corresponding to time $t_{w}$, and $D_{i t_{d}}$ is the sampled predictor from the $i$th subject corresponding to time $t_{d}$. The longitudinal functional linear model with a scalar outcome and functional predictors on multiple time scales can be written as

y_{i t} = X_{i t}^{⊤} β + \int_{Ω} W_{i t_{w}} (s) γ (t_{w}, s) d s + \int_{Ω} D_{i t_{d}} (s) η (t_{d}, s) d s + z_{i t}^{⊤} b_{i} + ε_{i t_{w} t_{d}},

(1)

where $γ \equiv γ (t_{w}, s)$ is the functional coefficient at time $t_{w}$, $η \equiv η (t_{d}, s)$ is the functional coefficient at time $t_{d}$, $z_{i t}$ is the corresponding $r \times 1$ design vector, $b_{i}$ is the vector of $r$ subject-specific random effects pertaining to subject $i$, and $ε_{i t_{w} t_{d}}$ is the random error term.

Similar to a linear mixed model with a time-related slope for longitudinal data, it is assumed that $γ$ and $η$ can be decomposed into several time-invariant component functions $γ_{0}, \dots, γ_{W}$ and $η_{0}, \dots, η_{D}$ as follows:

γ (t_{w}, s) = γ_{0} (s) + f_{1} (t_{w}) γ_{1} (s) + \dots + f_{W} (t_{w}) γ_{W} (s),

(2)

η (t_{d}, s) = η_{0} (s) + g_{1} (t_{d}) η_{1} (s) + \dots + g_{D} (t_{d}) η_{D} (s),

(3)

where $f_{1}, \dots, f_{W}$ and $g_{1}, \dots, g_{D}$ are functions of $t_{w}$ and $t_{d}$, respectively, and the component functions $γ_{0}, \dots, γ_{W}$ and $η_{0}, \dots, η_{D}$ are functions of $s$. Together, $γ (t_{w}, s)$ and $η (t_{d}, s)$ involve three components: $t_{w}$, $t_{d}$, and $s$. The time component $t_{w}$ enters $γ (t_{w}, s)$ through $f_{1} (t_{w}), \dots, f_{W} (t_{w})$, and the functional component enters through $γ_{0} (s), \dots, γ_{W} (s)$. Likewise, the time component $t_{d}$ enters $η (t_{d}, s)$ through $g_{1} (t_{d}), \dots, g_{D} (t_{d})$, and the functional component enters through $η_{0} (s), \dots, η_{D} (s)$. The model accommodates any functions of $t_{w}$ and $t_{d}$ that satisfy $f (0) = 0$ and $g (0) = 0$ (e.g., $f (t_{w}) = t_{w}$ or $t_{w} e^{t_{w}}$, and $g (t_{d}) = t_{d}$ or $t_{d} e^{t_{d}}$). Additionally, $ε_{i t_{w} t_{d}} \sim N (0, σ_{ε}^{2})$ and $b_{i}$ is distributed as $N (0, Σ_{b_{i}})$. It is assumed that $ε_{i t_{w} t_{d}}$ and $b_{i}$ are independent, that $ε_{i t_{w} t_{d}}$ and $ε_{i^{'} t_{w}^{'} t_{d}^{'}}$ are independent whenever $i \neq i^{'}$, $t_{w} \neq t_{w}^{'}$, or $t_{d} \neq t_{d}^{'}$, and that $b_{i}$ and $b_{i^{'}}$ are independent if $i \neq i^{'}$.
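The decomposition in Equations (2) and (3) can be illustrated with a minimal Python sketch. The component functions `gamma_0` and `gamma_1` below are hypothetical choices made only for illustration, and the helper name `time_varying_coef` is not part of the original implementation; the linear time structure $f_{1}(t) = t$ is one of the options mentioned above.

```python
import math


def time_varying_coef(s, t, components, time_funcs):
    """Evaluate gamma(t, s) = gamma_0(s) + sum_j f_j(t) * gamma_j(s).

    components: [gamma_0, gamma_1, ..., gamma_W], each a function of s.
    time_funcs: [f_1, ..., f_W], each a function of t with f_j(0) = 0.
    """
    value = components[0](s)
    for f_j, gamma_j in zip(time_funcs, components[1:]):
        value += f_j(t) * gamma_j(s)
    return value


# Hypothetical time-invariant component functions (illustration only).
def gamma_0(s):
    return math.sin(math.pi * s)


def gamma_1(s):
    return 0.5 * s


# Linear time structure f_1(t) = t, which satisfies f_1(0) = 0.
def f_1(t):
    return t


baseline = time_varying_coef(0.5, 0, [gamma_0, gamma_1], [f_1])
later = time_varying_coef(0.5, 2, [gamma_0, gamma_1], [f_1])
```

At $t = 0$ only the baseline component contributes, which is why the condition $f(0) = 0$ gives $γ_{0}$ its interpretation as the coefficient function at the origin of the time scale.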

In Model (1), $X_{i t}^{⊤} β$ is the standard fixed effect and $z_{i t}^{⊤} b_{i}$ is the standard multilevel random effect. The term $\int_{Ω} W_{i t_{w}} (s) γ (t_{w}, s) d s$ is the subject- and time-specific functional effect corresponding to level $w$, and $\int_{Ω} D_{i t_{d}} (s) η (t_{d}, s) d s$ is the subject- and time-specific functional effect corresponding to level $d$. Both the variances and the covariances across time are assumed to be constant, with $Var (y_{i t_{w} t_{d}}) = σ_{b}^{2} + σ_{ε}^{2}$ and $Cov (y_{i t_{w} t_{d}}, y_{i t_{w}^{'} t_{d}^{'}}) = σ_{b}^{2}$ whenever $t_{w} \neq t_{w}^{'}$ or $t_{d} \neq t_{d}^{'}$. This implies a compound symmetry assumption for the variances and covariances.
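As a small illustration of the compound symmetry structure, the following Python sketch builds the within-subject covariance matrix of the outcomes. The function name and the numerical variance values are assumptions chosen for illustration; they are not estimates from the study.

```python
def compound_symmetry(n_obs, var_b, var_eps):
    """Within-subject covariance matrix under compound symmetry.

    Diagonal entries equal var_b + var_eps; off-diagonal entries equal var_b.
    """
    return [
        [var_b + (var_eps if row == col else 0.0) for col in range(n_obs)]
        for row in range(n_obs)
    ]


# Hypothetical variance components for a subject with three observations.
V_i = compound_symmetry(n_obs=3, var_b=0.5, var_eps=1.0)
```

Every pair of observations from the same subject shares the covariance `var_b`, regardless of how far apart the two time points are.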

Model (1) can be written based on Equations (2) and (3) as

\begin{aligned} y_{i t} = {} & X_{i t}^{⊤} β + \int_{Ω} W_{i t_{w}} (s) [γ_{0} (s) + f_{1} (t_{w}) γ_{1} (s) + \dots + f_{W} (t_{w}) γ_{W} (s)] d s \\ & + \int_{Ω} D_{i t_{d}} (s) [η_{0} (s) + g_{1} (t_{d}) η_{1} (s) + \dots + g_{D} (t_{d}) η_{D} (s)] d s \\ & + z_{i t}^{⊤} b_{i} + ε_{i t_{w} t_{d}} . \end{aligned}

(4)

A primary goal in developing Model (4) is the estimation of $β$, $γ_{w} = {[γ_{w} (s_{1}), \dots, γ_{w} (s_{p})]}^{⊤}$ for $0 \leq w \leq W$, and $η_{d} = {[η_{d} (s_{1}), \dots, η_{d} (s_{p})]}^{⊤}$ for $0 \leq d \leq D$.
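Because each predictor is observed only at the $p$ sampling locations, the integrals in Model (4) are evaluated in practice from the sampled vectors. The sketch below approximates one such integral by a Riemann sum on an equally spaced grid; the equal spacing, the midpoint grid, and the toy curves are assumptions made for this illustration only.

```python
def functional_term(curve, coef, width):
    """Approximate the integral of curve(s) * coef(s) ds by a Riemann sum.

    curve and coef hold values at the same p sampling locations;
    width is the (assumed constant) spacing between locations.
    """
    if len(curve) != len(coef):
        raise ValueError("curve and coef must be sampled at the same locations")
    return width * sum(x * g for x, g in zip(curve, coef))


p = 100
grid = [(j + 0.5) / p for j in range(p)]  # midpoints of p cells on [0, 1]
W_it = [1.0 for _ in grid]                # toy predictor curve W(s) = 1
gamma_0 = [2.0 * s for s in grid]         # toy coefficient gamma_0(s) = 2s

approx = functional_term(W_it, gamma_0, 1.0 / p)
```

For these toy inputs the exact integral is 1, and the midpoint sum reproduces it up to floating-point error. Up to the constant spacing, the sum is an inner product of the sampled vectors, which is what makes the matrix form of the model possible.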

2.2. Matrix Form of ILFMM

This section shows how the proposed model can be written in matrix form. The vector of observed outcomes and the design matrix of the fixed effects in Model (1) are

y = \begin{pmatrix} y_{1 t_{1}} \\ y_{1 t_{2}} \\ \vdots \\ y_{1 t_{n_{1}}} \\ y_{2 t_{1}} \\ \vdots \\ y_{2 t_{n_{2}}} \\ \vdots \\ y_{N t_{1}} \\ \vdots \\ y_{N t_{n_{N}}} \end{pmatrix} \in R^{\sum_{i = 1}^{N} n_{i} \times 1},

(5)

X = \begin{pmatrix} X_{1 t_{1}}^{⊤} \\ \vdots \\ X_{1 t_{n_{1}}}^{⊤} \\ X_{2 t_{1}}^{⊤} \\ \vdots \\ X_{2 t_{n_{2}}}^{⊤} \\ \vdots \\ X_{N t_{1}}^{⊤} \\ \vdots \\ X_{N t_{n_{N}}}^{⊤} \end{pmatrix} \in M_{\sum_{i = 1}^{N} n_{i} \times K} (R) .

(6)

Recall that an equal number of observations per subject is assumed (i.e., the $n_{i}$ are equal). In addition, the design matrices of the functional predictors, $W$ and $D$, have the form

W = \begin{pmatrix} W_{1} \\ \vdots \\ W_{N} \end{pmatrix} \in M_{\sum_{i = 1}^{N} n_{i} \times (W + 1) p} (R) \quad \text{and} \quad D = \begin{pmatrix} D_{1} \\ \vdots \\ D_{N} \end{pmatrix} \in M_{\sum_{i = 1}^{N} n_{i} \times (D + 1) p} (R),

(7)

where

W_{i} = \begin{pmatrix} W_{i t_{1}}^{⊤} & f_{1} (t_{1}) W_{i t_{1}}^{⊤} & \dots & f_{W} (t_{1}) W_{i t_{1}}^{⊤} \\ \vdots & \vdots & \ddots & \vdots \\ W_{i t_{n_{i}}}^{⊤} & f_{1} (t_{n_{i}}) W_{i t_{n_{i}}}^{⊤} & \dots & f_{W} (t_{n_{i}}) W_{i t_{n_{i}}}^{⊤} \end{pmatrix}

(8)

and

D_{i} = \begin{pmatrix} D_{i t_{1}}^{⊤} & g_{1} (t_{1}) D_{i t_{1}}^{⊤} & \dots & g_{D} (t_{1}) D_{i t_{1}}^{⊤} \\ \vdots & \vdots & \ddots & \vdots \\ D_{i t_{n_{i}}}^{⊤} & g_{1} (t_{n_{i}}) D_{i t_{n_{i}}}^{⊤} & \dots & g_{D} (t_{n_{i}}) D_{i t_{n_{i}}}^{⊤} \end{pmatrix} .

(9)

Model (4) can therefore be written as

Y = X β + W γ + D η + Z b + ε,

(10)

where $β$ is the coefficient vector of the scalar predictors, $γ = {[γ_{0}^{⊤}, γ_{1}^{⊤}, \dots, γ_{W}^{⊤}]}^{⊤} \in R^{(W + 1) p \times 1}$ and $η = {[η_{0}^{⊤}, η_{1}^{⊤}, \dots, η_{D}^{⊤}]}^{⊤} \in R^{(D + 1) p \times 1}$ are the coefficient vectors of the functional predictors, $b \in R^{r N \times 1}$ is a vector of random effects, and $Z \in M_{\sum_{i = 1}^{N} n_{i} \times r N} (R)$ is the corresponding design matrix.
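The block structure of $W_{i}$ can be sketched in a few lines of Python: each row holds the sampled curve followed by copies of the same curve scaled by the time functions. The helper name `design_rows` and the toy curves with $p = 2$ sampling locations are hypothetical and serve only to show the layout.

```python
def design_rows(curves, times, time_funcs):
    """Build the rows of W_i: [W_it, f_1(t) * W_it, ..., f_W(t) * W_it].

    curves: one sampled curve (list of p values) per time point.
    times: the time value attached to each curve.
    time_funcs: [f_1, ..., f_W], functions of time.
    """
    rows = []
    for curve, t in zip(curves, times):
        row = list(curve)
        for f_j in time_funcs:
            row.extend(f_j(t) * x for x in curve)
        rows.append(row)
    return rows


# Toy subject with two time points and p = 2 sampling locations.
curves = [[1.0, 2.0], [3.0, 4.0]]
times = [1, 2]

# Linear time structure f_1(t) = t, so each row has (1 + 1) * p = 4 entries.
W_i = design_rows(curves, times, [lambda t: t])
```

With this layout, multiplying $W_{i}$ by the stacked vector $γ$ reproduces the sum of the component-wise functional effects for every time point of subject $i$. The blocks $D_{i}$ follow the same pattern with the functions $g_{1}, \dots, g_{D}$.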

In this framework, the estimation approach extends penalized generalized ridge regression to the longitudinal setting in a manner that allows the estimated regression functions $γ (t_{w}, s)$ and $η (t_{d}, s)$ to vary with time.

2.3. Estimation Based on the Mixed Model Representation

The goal of this section is to estimate the parameters based on the matrix form of Model (4), which was described in Section 2.2 and can be written as

Y = X β + W γ + D η + Z b + ε,

(11)

where $X$, $W$, $D$, and $Z$ are known matrices, $β$ is an unobservable vector of fixed effects, $γ$ and $η$ are unobservable vectors of functional fixed effects for the two time scales, and $b$ is an unobservable vector of random effects with $E (b) = 0$, $Cov (b) = B$, and $Cov (b, ε) = 0$. Let $Cov (ε) = R$ and $E (ε) = 0$. Then $Z b + ε$ is distributed as $N (0, V)$, because $E (Z b + ε) = Z E (b) + E (ε) = 0$ and $Cov (Z b + ε) = Z B Z^{⊤} + R \equiv V$. In addition, it is assumed that $B$ and $R$ are known.

The well-known mixed model equations, developed by Henderson (1950) [33], can now be applied. Henderson’s mixed model equations are similar in spirit to the normal equations, but they simultaneously provide Best Linear Unbiased Estimators (BLUEs) and Best Linear Unbiased Predictors (BLUPs) by estimating the variance component $V$.

Theorem 1.

If ${(\hat{α}^{⊤}, \hat{b}^{⊤})}^{⊤}$ is a solution to the mixed model equations for the model

Y = C α + Z b + ε,

(12)

where $C = [X \; W \; D]$, $α = {(β^{⊤}, γ^{⊤}, η^{⊤})}^{⊤}$, $Cov (b) = B$, and $Cov (ε) = R$, then $C \hat{α}$ is a BLUE of $C α$ and, simultaneously, $\hat{b}$ is a BLUP of $b$ if $B$ and $R$ are known, and likewise when the variance component $V$ is estimated.

Theorem 1 shows how the mixed Model (11) provides BLUEs and BLUPs simultaneously. Appendix A.1 contains the proof of Theorem 1.
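For reference, Henderson's mixed model equations for a model of the form (12) are usually stated as the following linear system. This is the standard textbook form written in the notation of this section, under the added assumption that $B$ and $R$ are invertible; it is not quoted from Appendix A.1.

```latex
\begin{pmatrix}
C^{\top} R^{-1} C & C^{\top} R^{-1} Z \\
Z^{\top} R^{-1} C & Z^{\top} R^{-1} Z + B^{-1}
\end{pmatrix}
\begin{pmatrix} \hat{\alpha} \\ \hat{b} \end{pmatrix}
=
\begin{pmatrix} C^{\top} R^{-1} Y \\ Z^{\top} R^{-1} Y \end{pmatrix}
```

The system resembles weighted normal equations for the stacked design $[C \; Z]$, except that $B^{-1}$ is added to the block belonging to the random effects, which shrinks $\hat{b}$ toward zero.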

2.4. Roughness Penalties Estimate

This section discusses a roughness penalty estimate based on the matrix form of Model (4), which was described in Section 2.2. As mentioned, the mixed model can be written as

Y = X β + W γ + D η + Z b + ε,

(13)

where $X$, $W$, $D$, and $Z$ are known matrices, $β$ is an unobservable vector of fixed effects, $γ$ and $η$ are unobservable vectors of functional fixed effects for the two time scales, and $b$ is an unobservable vector of random effects with $E (b) = 0$, $Cov (b) = B$, and $Cov (b, ε) = 0$. Let $Cov (ε) = R$ and $E (ε) = 0$. Then $Z b + ε$ is distributed as $N (0, V)$ (see Section 2.3).

It is also assumed that $V$ is known.

Theorem 2.

For each $w = 0, \dots, W$ and $d = 0, \dots, D$, let $L_{w}$ and $L_{d}$ be penalty operators for $γ_{w}$ and $η_{d}$, with $τ_{w}^{2}$ and $λ_{d}^{2}$ as the associated tuning parameters. The generalized ridge estimate associated with Model (13), obtained by minimizing over $β$, $γ$, and $η$ the criterion

{∥ y - X β - W γ - D η ∥}_{V^{- 1}}^{2} + \sum_{w = 0}^{W} τ_{w}^{2} {∥γ_{w}∥}_{L_{w}^{⊤} L_{w}}^{2} + \sum_{d = 0}^{D} λ_{d}^{2} {∥η_{d}∥}_{L_{d}^{⊤} L_{d}}^{2},

(14)

is

\begin{pmatrix} \hat{β} \\ \hat{γ} \\ \hat{η} \end{pmatrix} = {(C^{⊤} V^{- 1} C + M)}^{- 1} C^{⊤} V^{- 1} y,

(15)

where $C = [X \; W \; D]$, $M = blockdiag \{0, L_{Γ}^{⊤} L_{Γ}, L_{H}^{⊤} L_{H}\}$, $L_{Γ} = blockdiag \{τ_{w} L_{w} : w = 0, \dots, W\}$ collects the penalty operators for $γ$, and $L_{H} = blockdiag \{λ_{d} L_{d} : d = 0, \dots, D\}$ collects the penalty operators for $η$.

Theorem 2 gives an expression for the generalized ridge estimate of Model (13). Together, the two theorems show that the ridge estimate is the BLUP from an equivalent mixed model; hence, the estimation procedure takes advantage of the equivalence between penalized least-squares estimation and the linear mixed model representation. Appendix A.2 contains the proof of Theorem 2.
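The form of the generalized ridge estimate can be checked with a short derivation. Writing $α$ for the stacked vector of $β$, $γ$, and $η$, and $M$ for the symmetric block-diagonal penalty matrix, the penalized criterion is a quadratic function of $α$, and setting its gradient to zero gives the estimating equations. This is a sketch in the notation of this section, assuming that $C^{⊤} V^{-1} C + M$ is invertible; it is not the proof given in Appendix A.2.

```latex
\begin{aligned}
Q(\alpha) &= (y - C\alpha)^{\top} V^{-1} (y - C\alpha) + \alpha^{\top} M \alpha, \\
\frac{\partial Q}{\partial \alpha} &= -2\, C^{\top} V^{-1} (y - C\alpha) + 2 M \alpha = 0, \\
\left( C^{\top} V^{-1} C + M \right) \hat{\alpha} &= C^{\top} V^{-1} y .
\end{aligned}
```

The penalty matrix therefore enters the estimating equations with a positive sign, and the block of $M$ that corresponds to $β$ is zero because the scalar coefficients are not penalized.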

2.5. Selection of Time-Structure in $γ (t_{w}, s)$ and $η (t_{d}, s)$

The proposed approach leaves freedom to choose a time structure for both $γ (t_{w}, s)$ and $η (t_{d}, s)$. Selecting an appropriate time structure in $γ (t_{w}, s)$ and $η (t_{d}, s)$ is analogous, in essence, to selecting the time structure in a linear mixed-effects model.

For example, choosing among the time structures $η_{0} + t_{d} η_{1}$, $η_{0} + t_{d} η_{1} + t_{d}^{2} η_{2}$, and $η_{0} + t_{d} η_{1} + t_{d}^{2} η_{2} + t_{d}^{3} η_{3}$ is similar to variable selection in multiple linear regression models based on a linear mixed-effects model.

When comparing different time structures, it is common to apply the Akaike Information Criterion (AIC). The AIC, $- 2 l + 2 p$, where $l$ is the log-likelihood obtained from Restricted (or Residual) Maximum Likelihood (REML) estimation of the mixed model and $p$ is the number of parameters of the model, was used to determine the time structure in $γ (t_{w}, s)$ and $η (t_{d}, s)$.
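The AIC comparison of candidate time structures can be sketched as follows. The log-likelihood values and parameter counts below are made-up numbers used only to show the mechanics; they are not results from this study, and in practice $l$ would come from the fitted mixed model.

```python
def aic(log_lik, n_params):
    """Akaike Information Criterion: -2 * l + 2 * p."""
    return -2.0 * log_lik + 2.0 * n_params


# Hypothetical (log-likelihood, number of parameters) pairs per time structure.
candidates = {
    "linear": (-120.4, 6),
    "quadratic": (-118.9, 8),
    "cubic": (-118.7, 10),
}

scores = {name: aic(l, p) for name, (l, p) in candidates.items()}

# The preferred time structure is the one with the smallest AIC.
best = min(scores, key=scores.get)
```

In this toy comparison the richer structures improve the log-likelihood only slightly, so the penalty term $2p$ favors the linear structure.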

2.6. Selection of a, b, c, and d for Decomposition Penalties

The values $\{1, 2, 3\}$ are considered for each of $a$, $b$, $c$, and $d$ in this study. They are treated as weights in a trade-off between the preferred and non-preferred subspaces, with $a b =$ constant and $c d =$ constant.

Additionally, REML is used to estimate the tuning parameters when $b$ and $d$ are held fixed and a grid search determines the values of $a$ and $c$. These values are used to jointly select the tuning parameters.

The selected values of $a$ and $c$ can then be applied to the preferred subspace. The AIC, $- 2 l + 2 p$, is used to choose the values of $a$ and $c$, where $l$ is the log-likelihood obtained from Restricted (or Residual) Maximum Likelihood (REML) estimation of the mixed model and $p$ is the number of parameters of the model.

3. Simulation Steps

This section describes the simulation of the proposed model. Without loss of generality, it was assumed that $b = 1$ and $d = 1$, while $a$ and $c$ were varied over the values $\{1, 2, 3\}$. Larger values of $a$ and $c$ indicate that more prior information was used in the estimation process. The response data were generated for 100 subjects by simulating functional covariates $W$ at 4 time points ($t_{w} = 1, 2, 3, 4$) and functional covariates $D$ at 5 time points ($t_{d} = 1, 2, 3, 4, 5$) according to

\begin{aligned} y_{i t} = {} & β_{0} + β_{1} t + \int_{0}^{1} W_{i t_{w}} (s) γ (s, t_{w}) d s + \int_{0}^{1} D_{i t_{d}} (s) η (s, t_{d}) d s \\ & + b_{i} + ε_{i t_{w} t_{d}}, \quad i = 1, \dots, 100, \end{aligned}

(16)

where $γ (s, t_{w}) = γ_{1} (s) + t_{w} γ_{2} (s)$ and $η (s, t_{d}) = η_{1} (s) + t_{d} η_{2} (s)$. The functional covariates $W$ and $D$ have the forms $W = (W_{1} \; W_{2})$ and $D = (D_{1} \; D_{2})$. Consequently, Model (16) can be rewritten as

\begin{aligned} y_{i t} = {} & β_{0} + β_{1} t + \int_{0}^{1} W_{1_{i t_{w}}} (s) γ_{1} (s) d s + \int_{0}^{1} t_{w} W_{2_{i t_{w}}} (s) γ_{2} (s) d s \\ & + \int_{0}^{1} D_{1_{i t_{d}}} (s) η_{1} (s) d s + \int_{0}^{1} t_{d} D_{2_{i t_{d}}} (s) η_{2} (s) d s \\ & + b_{i} + ε_{i t_{w} t_{d}}, \quad i = 1, \dots, 100 . \end{aligned}

(17)

Functional covariates $W$ were generated by considering three sets of sampling points (wide, moderate, and narrow) corresponding to one, five, and two bumps, respectively [19].

As an example, the first set of 100 columns in $W$ was constructed as follows:

a. All of the observed values for the first subject at the first time point were recorded in the first $1 \times 100$ row of the $W$ matrix.
b. The observed values of $W$ for all remaining time points and subjects were arranged in the following rows, so that each row holds the observed values of $W$ for a specific time point and subject. In accordance with $y$, the rows were ordered by time and subject.

Having constructed the first set of 100 columns, the second set of 100 columns was constructed by multiplying the first 100 columns by $t_{w}$.

Since it was impossible to generate curves with a functional form from discrete observations at different points of the domain interval alone, it was necessary to consider a set of sampling points on the interval $[0, 1]$ [19]. To simulate the true functional forms of the curves, prior knowledge about the functional variable was applied. Accordingly, the observed value of $W$ was calculated considering the degrees of curvature at the wide, moderate, and narrow bumps [19]. White noise was added to the predictor functions to account for instrumental measurement noise [19].

Functional covariates $D$ were generated by considering three sets of sampling points (wide, moderate, and narrow) corresponding to one, five, and two bumps, respectively:

H_{wide} = \{0.40\}, \quad H_{moderate} = \{0.10, 0.25, 0.65, 0.75, 0.80\}, \quad H_{narrow} = \{0.50, 0.85\} .

The first set of 100 columns in $D$ was constructed in the same way as described earlier for $W$. Consequently, the observed value of $D$ was generated in the same way as that of $W$.

The functional coefficients

γ_{1}

,

γ_{2}

were generated according to [19] considering different degrees of curvature at the wide, moderate, and narrow bumps. But

η_{1}

, and

η_{2}

were generated as follows with bumps centered at

H_{η_{1}} = {0.10, 0.45, 0.75}

, and

H_{η_{2}} = {0.25, 0.65}

,

\begin{matrix} η_{1} (s) & = & \sum_{h \in H_{η_{1}}} a_{w} (h) exp [- c_{w} (h) * {(\frac{s - h}{100})}^{2}], \\ η_{2} (s) & = & \sum_{h \in H_{η_{n}}} a_{n} (h) exp [- c_{n} (h) * {(\frac{s - h}{100})}^{2}], \forall s \in [0, 1], \end{matrix}

(18)

where

a_{w} (h)

,

c_{w} (h)

,

a_{n} (h)

, and

c_{n} (h)

correspond to amplitude and degree of curvature, respectively. Here, a “bump” refers to a “turning” point of the functional coefficient; the formal term is local maximum. The amplitude is the height from the centerline to the peak (or dip), that is, half the distance between the highest and lowest points. The amplitude values

0.1, 0.4, 0.5,

and

0.6

were considered to simulate functional coefficients [19]. Another term that played an important role in the simulation was the degree of curvature, which refers to the degree of a polynomial [34]. The amplitude and curvature used to generate the functional coefficients

η_{1}

, and

η_{2}

are specified in Table 1.
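
The bump construction in Equation (18) can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `bump_sum` is hypothetical, the amplitudes and curvatures below are placeholder values rather than the entries of Table 1, and the divisor 100 is kept as a parameter because it follows Equation (18) as printed.

```python
import math


def bump_sum(s, centers, amplitude, curvature, scale=100.0):
    """Evaluate sum_h a(h) * exp(-c(h) * ((s - h) / scale) ** 2) at the point s."""
    return sum(
        amplitude[h] * math.exp(-curvature[h] * ((s - h) / scale) ** 2)
        for h in centers
    )


# Placeholder settings for illustration only; Table 1 holds the values actually used.
centers = [0.25, 0.65]
amplitude = {0.25: 0.4, 0.65: -0.1}
curvature = {0.25: 250.0, 0.65: 250.0}

# Evaluate the coefficient function on 100 equally spaced points of [0, 1].
grid = [k / 99 for k in range(100)]
eta = [bump_sum(s, centers, amplitude, curvature) for s in grid]
```

At a bump center the exponential equals one, so a single bump contributes exactly its amplitude there; larger curvature values make the bump narrower.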

The model was applied to analyze the physical activities of children using longitudinal functional data with two time scales. The results suggested that

β_{0} = 0.06

and

β_{1} = 0.4

were reasonable choices for fixed parameters in the simulation study. Having made these choices, the fixed effects, the random effect, and the errors were generated as shown in Table 2.

Throughout this simulation the following decomposition penalties, an extension of Kundu et al. (2016) [19] for multiple time scales, were considered:

b = 1

and

d = 1

while a and c were varied over

1, 2,

and 3. The structured penalties were

L_{Q_{w}} = 10^{a} (I - P_{Q_{w}}) + b P_{Q_{w}}

and

L_{Q_{d}} = 10^{c} (I - P_{Q_{d}}) + d P_{Q_{d}} .
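
The effect of such a structured penalty can be illustrated numerically. The sketch below assumes that the projection matrix is an orthogonal projection, here built from a single hypothetical vector `q`; the helper names are illustrative and do not come from the authors' package.

```python
def projection_onto(q):
    """Orthogonal projection matrix onto the span of a single nonzero vector q."""
    norm_sq = sum(x * x for x in q)
    return [[qi * qj / norm_sq for qj in q] for qi in q]


def structured_penalty(P, a, b):
    """Build L = 10**a * (I - P) + b * P for a projection matrix P."""
    n = len(P)
    return [
        [10 ** a * ((1.0 if i == j else 0.0) - P[i][j]) + b * P[i][j] for j in range(n)]
        for i in range(n)
    ]


def mat_vec(M, v):
    """Plain matrix-vector product for lists of lists."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


q = [1.0, 1.0, 0.0]  # hypothetical direction carrying the prior information
L = structured_penalty(projection_onto(q), a=3, b=1)

print(mat_vec(L, q))                 # vector inside the subspace: scaled by b
print(mat_vec(L, [1.0, -1.0, 0.0]))  # vector orthogonal to q: scaled by 10**a
```

Directions inside the subspace are penalized with weight b, whereas directions orthogonal to it receive the much larger weight 10^a, which is why increasing a (or c) places more emphasis on the prior structure.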

R 3.5.2 GUI 1.70 El Capitan build (7612) and RStudio Version 1.1.383 were used to analyze the simulated data. The method discussed in this section was implemented in the R package nlme via the lme function. The R code has been uploaded to GitHub and it is viewable at https://github.com/Mostafa2020-bit/PackageILFMM since November 2020.

About 100 data sets were generated for each of two sample sizes (100 and 200) and two values of

R^{2}

(

0.6

and

0.9

).

The primary interests were in the estimation of the functional coefficients and the measurement of the squared distance between the fitted and the true value of

y

. Estimation error was summarized in terms of the Mean Squared Errors (MSE) of the functional coefficients. Also, the MSE of

\tilde{y}

was obtained.

With respect to the MSE of both the functional coefficients and the response, we investigated the effect of increasing the sample size and

R^{2}

on these values of

M S E

. If these were increased, would

M S E (γ_{1})

,

M S E (γ_{2})

,

M S E (η_{1})

,

M S E (η_{2})

, and

M S E (\tilde{y})

decrease or not? In this scenario, the following criteria were computed:

M S E (γ_{1}) = {∥ γ_{1} - \tilde{γ_{1}} ∥}^{2} = \sum_{s = 1}^{p} \int_{Ω} {({γ_{1}}_{s} - {\tilde{γ_{1}}}_{s})}^{2} d s,

where

\tilde{γ_{1}}

is the estimate of

γ_{1}

and each observed predictor

γ_{1}

is sampled at the same p locations,

s_{1}, \dots, s_{p} \in Ω

. Similarly,

M S E (γ_{2})

,

M S E (η_{1})

,

M S E (η_{2})

were computed; however, the calculation of

M S E (\tilde{y})

was different because of its discrete characteristics and it was computed as

M S E (\tilde{y}) = \frac{∥ y - \tilde{y} ∥^{2}}{N} = \frac{\sum_{i = 1}^{N} \sum_{j = 1}^{n_{i}} {(y_{i t_{j}} - {\tilde{y}}_{i t_{j}})}^{2}}{N},

where

\tilde{y}

denotes the model fitted values of the true

y

.
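
Both criteria can be approximated numerically. The following sketch is one possible reading of the definitions above, assuming the functional coefficient is available on a sampling grid and the integral is approximated by the trapezoidal rule; the function names and the toy inputs are hypothetical.

```python
def mse_function(true_vals, est_vals, grid):
    """Approximate the integral of (gamma - gamma_tilde)^2 with the trapezoidal rule."""
    sq = [(g - e) ** 2 for g, e in zip(true_vals, est_vals)]
    return sum(
        0.5 * (sq[k] + sq[k + 1]) * (grid[k + 1] - grid[k])
        for k in range(len(grid) - 1)
    )


def mse_response(y, y_fit, n_subjects):
    """Sum of squared residuals over all subjects and time points, divided by N."""
    total = sum(
        (obs - fit) ** 2
        for y_i, fit_i in zip(y, y_fit)
        for obs, fit in zip(y_i, fit_i)
    )
    return total / n_subjects


# Toy inputs: a constant true coefficient of 1.0 estimated as 0.5 on [0, 1].
grid = [k / 100 for k in range(101)]
coef_mse = mse_function([1.0] * 101, [0.5] * 101, grid)  # close to 0.25

# Two subjects with two and one observations, respectively.
resp_mse = mse_response([[1.0, 2.0], [3.0]], [[1.0, 1.0], [1.0]], n_subjects=2)
print(resp_mse)  # (0 + 1 + 4) / 2 = 2.5
```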

The

M S E (γ_{1})

,

M S E (γ_{2})

,

M S E (η_{1})

, and

M S E (η_{2})

could provide empirical evidence on whether the estimates are close to the corresponding true function or not.

To determine the effect of the decomposed penalty weights a, b, c, and d without loss of generality, b and d were considered a fixed value of 1. Then a and c were increased up to 3 on an exponential scale (i.e.,

{1, 2, 3}

). The reason for this was to determine which combination of a, b, c, and d values would improve the estimation of

γ_{1}

,

γ_{2}

,

η_{1}

, and

η_{2}

. In this manner, it could also be evaluated whether estimation performance remained almost unchanged across these values.

Decomposition penalty values were selected according to the

- 2 l + 2 p

formula for AIC. It was expected that minimized AIC for selected decomposition penalty values leads to a minimized MSE for different sample sizes and

R^{2}

s. AIC values were computed for all combinations of a and c values. Consequently, the simulation parameters were the a and c values, the two sample sizes, and the two

R^{2}

s.
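
The selection rule can be sketched as a small grid search. The log-likelihoods and parameter counts below are hypothetical placeholders, not results from this study; in practice they would be taken from the fitted mixed models.

```python
def aic(log_lik, n_params):
    """AIC = -2 * l + 2 * p."""
    return -2.0 * log_lik + 2.0 * n_params


def select_penalty_weights(fits):
    """Return the (a, c) pair with the smallest AIC.

    `fits` maps (a, c) to a tuple (log-likelihood, number of parameters).
    """
    return min(fits, key=lambda ac: aic(*fits[ac]))


# Hypothetical fits for illustration only.
fits = {
    (1, 1): (-120.0, 6),
    (2, 2): (-110.0, 6),
    (3, 3): (-101.5, 6),
}
best = select_penalty_weights(fits)
print(best, aic(*fits[best]))  # (3, 3) 215.0
```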

4. Simulation Conclusions

The simulation study illustrated the potential advantage of an Intensive Longitudinal Functional Model with Multiple time scales (ILFMM) estimate in exploiting an informed, structured penalty. The simulation suggested that as the sample size and

R^{2}

increased, the MSE for both the functional coefficients and the fitted values decreased (Figure 1 and Table 3).

Furthermore, sample size had a larger impact for smaller

R^{2}

, and also

R^{2}

had a greater impact for smaller sample size (Figure 2).

These results provided empirical evidence that the ILFMM estimates of functional coefficients were close to the true functions (basically unchanged). In other words, the estimation of the functional coefficients rose to the level of the estimation of true functions. These results suggested that AIC could guide the choice of ridge weights (Figure 3 and Figure 4).

The results for AIC are displayed graphically in Figure 3. They show that AIC is minimized among all cases when c = 3. Therefore, the choice of decomposition penalties is a = 3, b = 1, c = 3, and d = 1. Also, with sufficiently large values of the ridge weights, there was minimal impact on the estimation performance.

In addition, the model with the selected ridge penalties had the lowest MSE values of

\tilde{y}

compared to the model with the defined selection of ridge penalties (Figure 4 and Table 3). These results implied that the model with the selected ridge penalties would perform better than the model with the basic selection of ridge penalties.

Furthermore, for smaller sample sizes and

R^{2}

s, the ILFMM estimate may oversmooth the estimated regression function. However, by increasing the sample size to 200 or

R^{2}

to

0.9

, it was observed that the average ILFMM estimate of the functional coefficients approached the true functions.

All in all, the model with ridge penalties

a = 3, b = 1, c = 3,

and

d = 1

appears to perform better than the remaining models with other selections of the ridge penalties.

In the absence of a model with two time scales for comparison, it was decided to evaluate the effect of sample size and

R^{2}

on MSE of functional coefficients, the fitted and AIC values, which would indicate whether the patterns are consistent with those seen in previous models.

Kundu’s approach was based on a single time scale. The results of the proposed approach were compared with those for a single time scale by means of AIC values [19] (i.e., single versus multiple time scales).

Also, Kundu concluded that the selected ridge penalties had the lowest MSE values when compared to the model with the basic selection. This demonstrated that the selected ridge penalties were reasonable. In addition, when the sample size and

R^{2}

were increased, MSE values were decreased. Furthermore, sample size had a larger impact for smaller

R^{2}

, and also

R^{2}

had a greater impact for smaller sample size [19].

5. Physical Activity Study Application

The proposed method can be implemented with the ILFMM() (i.e., Intensive Longitudinal Functional Model with Multiple time scales) package in R. The author has written the ILFMM() package for a longitudinal functional model with multiple time scales along with scalar outcome, multiple functional predictors, one or more scalar covariates, and subject-specific random intercepts through mixed model equivalence. This package also works for a longitudinal functional model with a single time scale. The ILFMM() package has been uploaded to GitHub and has been viewable at https://github.com/Mostafa2020-bit/PackageILFMM since November 2020.

It is more common in the Physical Activity (PA) field to calculate the amount of PA a child gets in a single day, rather than a week. So this might be why it has been difficult to find information on weekly amounts of PA. Consequently, it is standard in PA projects to get week data by taking the average of all weekdays [35,36].

Up until about five years ago most physical activity researchers only measured and reported five days of activity at a time. Most reviews of handling accelerometer data focus on the epochs, the device or the type of outcome reported (Moderate-to-Vigorous Physical Activity (MVPA) or light PA) [37].

In the PA field, the daily accelerometer-assessed time in MVPA over a week is usually calculated as the mean of data over 7 days. For participants with fewer than 7 valid days of data, the following formula is commonly used to standardize measurements to one week for all participants [37]:

[(5 \times

mean daily weekday MVPA time

+ 2 \times

mean daily weekend MVPA time

)] / 7 .
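
As a worked illustration of this weighting, the following sketch applies the formula to a hypothetical participant; the function name and the input values are assumptions made only for the example.

```python
def standardized_daily_mvpa(weekday_mean, weekend_mean):
    """Weighted daily MVPA: (5 * weekday mean + 2 * weekend mean) / 7."""
    return (5 * weekday_mean + 2 * weekend_mean) / 7


# Hypothetical participant: 63 min/day on weekdays, 35 min/day on weekends.
print(standardized_daily_mvpa(63, 35))  # (315 + 70) / 7 = 55.0
```

The weights 5 and 2 reflect the number of weekdays and weekend days, so a participant with few valid weekend days is not over-represented by weekday behavior.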

The dataset was stored in a secure password-protected Super R machine server utilized by IT at University of Northern Colorado for the Active Schools Institute. The author had permission from the Active Schools Institute to access the dataset. The data used included the five-second records from all students in school 1 for all weeks 1, 2, 3, and 4 over five days of school time. The standard approach was used to get week data.

For all weeks the “Axis-1” measurements ranged approximately from 0 to

2647.0

, with a mean of

20.75

, where “Axis-1” refers to the physical activity recorded along the x-axis.

For week 1, many observations were irrelevant (i.e., zeros recorded late in the evening), so observations were truncated to consider only the time between 8:00 AM and 3:30 PM. For example, on the first day of data collection the number of observations was

1,059,840

. Restricting the hours to 8:00 AM through 3:30 PM on the first day reduced this number further, resulting in a final data set of

496,800

observations. This is around 5900 observations per student, which is close to, but a little higher than, the number expected for 5-second observations. It would be anticipated that there would be 5400 observations per student (12 per minute, 60 min. per hour,

7.5

h). However, 8 students show twice as many observations, suggesting that they have two days labeled as 9 November 2016. For all other weeks there was no such problem.
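
The expected count and the truncation to school hours can be sketched as follows. This is a minimal illustration, assuming five-second epochs and a half-open window from 8:00 AM up to, but excluding, 3:30 PM; the function names and the example time stamps are hypothetical.

```python
from datetime import date, datetime, time

EPOCH_SECONDS = 5                      # five-second epochs, as in the text
START, END = time(8, 0), time(15, 30)  # school-time window used in the text


def expected_epochs(start=START, end=END, epoch_seconds=EPOCH_SECONDS):
    """Number of whole epochs between start and end within a single day."""
    anchor = date(2000, 1, 1)  # arbitrary day, needed only to subtract two times
    span = datetime.combine(anchor, end) - datetime.combine(anchor, start)
    return int(span.total_seconds()) // epoch_seconds


def within_school_hours(stamp, start=START, end=END):
    """True when the time of day of `stamp` lies in the half-open window [start, end)."""
    return start <= stamp.time() < end


print(expected_epochs())  # 12 per minute * 60 min * 7.5 h = 5400

# Hypothetical time stamps around the window boundaries.
stamps = [
    datetime(2016, 11, 9, 7, 59, 55),
    datetime(2016, 11, 9, 8, 0, 0),
    datetime(2016, 11, 9, 15, 29, 55),
    datetime(2016, 11, 9, 15, 30, 0),
]
kept = [s for s in stamps if within_school_hours(s)]
print(len(kept))  # 2
```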

For all weeks, the Stata default for time-stamps was New Year’s 1960. All official time stamps showed dates around this time, but for analysis, dates were inherited from the “Date” variable that the research team included.

Time stamps were translated in an unexpected way. For Day 1, all activity appeared to begin at 1:00 AM for each student. We assumed this should have been 8:00 AM and translated the time-stamps accordingly. For all other days, recording appeared to begin at 5:00 PM, but all activity appeared to begin at midnight. We again assumed these times should have been 8:00 AM and translated time-stamps accordingly. We then removed all observations outside the hours of 8:00 AM through 3:30 PM. Plots of the five-second data are usually “smoothed” to show the true patterns in the data. Smoothing is a method of taking the data in “windows of time” to estimate means or regression trends. The size of the window has a strong impact on the graphic produced but is subjective, and it is usually determined by a span value.

In designing the matrix for weekly activities, 162,000 data points per subject were considered for each week. The researcher chose 162,000 data points because it was large enough to capture physical activity patterns from 8:00 AM to 3:30 PM. This implies 648,000 observations per subject for four weeks (i.e., 162,000 × 4 = 648,000). Since 80 subjects were considered, the total number of observations was 51,840,000. These design matrices are shown graphically for 11 subjects in Figure 5 and Figure 6, which provide information on the overall patterns of physical activity on a daily and weekly basis.

For daily physical activity analysis, we chose 162,000 because it was large enough to capture physical activity patterns every five seconds from 8:00 AM to 3:30 PM. This implies 3,240,000 observations per subject for five days across four weeks (i.e., 162,000 × 5 × 4 = 3,240,000). Since 80 subjects were considered, the total number of observations was 259,200,000. These design matrices are shown graphically in Figure 7 and Figure 8.
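
The sizes quoted for the weekly and daily design matrices follow from simple multiplication, which can be checked with a short script; the variable names are illustrative only.

```python
POINTS = 162_000                 # data points per subject in one block
WEEKS, DAYS, SUBJECTS = 4, 5, 80

weekly_per_subject = POINTS * WEEKS
weekly_total = weekly_per_subject * SUBJECTS
daily_per_subject = POINTS * DAYS * WEEKS
daily_total = daily_per_subject * SUBJECTS

print(f"{weekly_per_subject:,} {weekly_total:,}")  # 648,000 51,840,000
print(f"{daily_per_subject:,} {daily_total:,}")    # 3,240,000 259,200,000
```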

The proposed model with two time scales, together with

a = 3

,

b = 1

,

c = 3

, and

d = 1

as ridge penalties was applied to real data as

y_{i t_{w} t_{d}} = β_{0} + β_{p} \mathrm{Teacher}_{p} + \int_{0}^{1} W_{i t_{w}} (s) γ (s, t_{w}) d s + \int_{0}^{1} D_{i t_{d}} (s) η (s, t_{d}) d s + b_{i} + ε_{i t_{w} t_{d}},

(19)

where

γ (s, t_{w}) = γ_{1} (s) + t_{w} γ_{2} (s)

,

η (s, t_{d}) = η_{1} (s) + t_{d} η_{2} (s) + t_{d}^{2} η_{3} (s) + t_{d}^{3} η_{4} (s)

,

i = 1, \dots, 100

, and

p = 1, 2, 3, 4, 5, 6

. The expanded form of Model (19) would be

\begin{matrix} y_{i t_{w} t_{d}} & = & β_{0} + β_{2} \mathrm{Teacher}_{2} + β_{3} \mathrm{Teacher}_{3} + β_{4} \mathrm{Teacher}_{4} + β_{5} \mathrm{Teacher}_{5} + β_{6} \mathrm{Teacher}_{6} \\ & + & \int_{0}^{1} W_{1_{i t_{w}}} (s) γ_{1} (s) d s + \int_{0}^{1} t_{w} W_{2_{i t_{w}}} (s) γ_{2} (s) d s \\ & + & \int_{0}^{1} D_{1_{i t_{d}}} (s) η_{1} (s) d s + \int_{0}^{1} t_{d} D_{2_{i t_{d}}} (s) η_{2} (s) d s + \int_{0}^{1} t_{d}^{2} D_{3_{i t_{d}}} (s) η_{3} (s) d s \\ & + & \int_{0}^{1} t_{d}^{3} D_{4_{i t_{d}}} (s) η_{4} (s) d s \\ & + & b_{i} + ε_{i t_{w} t_{d}} . \end{matrix}

(20)
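
The functional terms of this model can be approximated on a sampling grid. The sketch below is an illustration under stated assumptions: curves and coefficient functions are stored as lists of values on a common grid, the integral is approximated by the trapezoidal rule, and all names and numerical values are hypothetical.

```python
def time_varying_coefficient(components, t):
    """Return sum_k t**k * component_k(s) on the grid (k = 0 gives the first component)."""
    n = len(components[0])
    return [sum(t ** k * comp[j] for k, comp in enumerate(components)) for j in range(n)]


def functional_term(curve, coef, grid):
    """Trapezoidal approximation of the integral of curve(s) * coef(s) over the grid."""
    prod = [w * g for w, g in zip(curve, coef)]
    return sum(
        0.5 * (prod[k] + prod[k + 1]) * (grid[k + 1] - grid[k])
        for k in range(len(grid) - 1)
    )


grid = [k / 100 for k in range(101)]
W = [1.0] * 101        # hypothetical constant predictor curve
gamma1 = [2.0] * 101   # hypothetical gamma_1(s)
gamma2 = [3.0] * 101   # hypothetical gamma_2(s)

# Linear time structure for weeks: gamma(s, t_w) = gamma_1(s) + t_w * gamma_2(s).
gamma_t = time_varying_coefficient([gamma1, gamma2], t=2)  # 2 + 2 * 3 = 8 at every s
print(functional_term(W, gamma_t, grid))                   # close to 8
```

Passing four components instead of two gives the cubic time structure used for the daily scale.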

The ILFMM estimates were obtained as BLUPs, assuming

ε_{i t_{w} t_{d}} \sim N (0, σ_{ε}^{2})

and the subject-specific random intercepts

b_{i} \sim N (0, σ_{b}^{2})

. The model was fitted using the ILFMM() R package, and the estimates of

σ_{ε}^{2}

and

σ_{b}^{2}

were

0.000320

and

0.7281

, respectively.

We compared models based on AIC values with a different selection of weight penalties and time structures (linear, quadratic, and cubic). The results are summarized in Table 4.

Based on the AIC, Model 16, with a cubic time structure for days and linear time structure for weeks and weight penalties

a = 3

,

b = 1

,

c = 3

, and

d = 1

appears to be better than the other models.

6. Discussion

The proposed intensive longitudinal functional model with multiple time-varying scales with scalar outcome, multiple functional predictors, one or more scalar covariates, and subject-specific random intercepts through mixed model equivalence has been defined. There were three primary advantages of this framework. First, this model estimated a time-dependent regression function. Second, it was also able to incorporate structural information into the estimation process. And third, it was easily implemented through linear mixed model equivalence.

One of the significant limitations of this study was the hardware required to organize and analyze the data. Providing storage space to house the data, networking bandwidth to transfer it to and from analytics systems, and computing resources to perform those analytics was very time-consuming.

The physical activity data study results indicate that the proposed method is an appropriate one for the tested application. By using the proposed model, we could investigate what kind of time-structure for activity patterns would adequately describe the relationship between the subjects’ daily total magnitude and their weekly physical activities. Likewise, it described this relationship for daily physical activities.

So, the interest of the Active Schools Institute has been addressed through the proposed model. The proposed model indicated that the time structure of weekly activity was linear, whereas that of daily activity was cubic. This conclusion was reasonable because the general patterns of movement throughout the day had more fluctuations than weekly movement, and the weekly movements showed smoother patterns. Also, several dips in data points around the sedentary category were observed for daily movement. For weekly movement, however, this sedentary behavior appeared to change (i.e., two separate definitions of sedentary behavior need to be considered for weekly and daily activities). This analysis also gives us a slightly better sense of how movement intensity changes over time for both time scales.

The following are recommendations for future studies. One area of further investigation is the effect of different time structures. In this study, a linear form for simulation was considered; however, other forms of

f (t)

, such as

e^{(t)} - 1

or

\log (t + 1)

could be explored. A relative improvement in AIC due to the use of these complex structures would be anticipated. For instance, it would be expected to have a minimal AIC associated with

γ (t, s) = γ_{0} (s) + [e^{(t)} - 1] γ_{1} (s)

in comparison to AIC for

γ (t, s) = γ_{0} (s) + t γ_{1} (s)

.
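
These alternative time structures can be compared within a single helper. The sketch below is hypothetical: the dictionary `TIME_STRUCTURES`, the function `coefficient_at`, and the toy coefficient values are assumptions made only for illustration.

```python
import math

# Candidate time structures f(t); each of them satisfies f(0) = 0.
TIME_STRUCTURES = {
    "linear": lambda t: t,
    "exponential": lambda t: math.exp(t) - 1.0,
    "logarithmic": lambda t: math.log(t + 1.0),
}


def coefficient_at(gamma0, gamma1, t, structure="linear"):
    """Evaluate gamma(t, s) = gamma0(s) + f(t) * gamma1(s) on the sampling grid."""
    f_t = TIME_STRUCTURES[structure](t)
    return [g0 + f_t * g1 for g0, g1 in zip(gamma0, gamma1)]


gamma0 = [1.0, 2.0]  # hypothetical gamma_0(s) on a two-point grid
gamma1 = [3.0, 4.0]  # hypothetical gamma_1(s) on the same grid
for name in TIME_STRUCTURES:
    print(name, coefficient_at(gamma0, gamma1, t=1.0, structure=name))
```

Because every candidate vanishes at t = 0, the first component keeps the same interpretation across structures, so AIC values of models fitted with different structures compare only the assumed form of the time dependence.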

In the proposed model, the ridge weights were set as

b = d = 1

while a and c varied on an exponential scale. Larger values of a and c indicated greater emphasis of prior information on the estimation process. Future studies could investigate other scales of ridge weights to make the estimation process more accurate and faster.

A possible extension of this work could be to incorporate multiple functional predictors. For example, in PA projects, the physical activity and patterns might be studied over an entire year. This implies there would be three-time scales: days, weeks, and years. Consequently, an extension of the proposed approach would be needed.

Additionally, an interesting and powerful use of the framework underlying the proposed model arises when the response variable is binary. In this case, different problems arise: multicollinearity and high dimensionality impair the estimation of the model and the interpretation of its parameters. These problems can be overcome by using an intensive longitudinal multilevel functional logistic regression model with principal component analysis. This study could be extended to any differentiable Hilbert space for analyzing image, spatial, and spatial-temporal data.

Author Contributions

Conceptualization, M.Z.; methodology, M.Z.; software, M.Z.; validation, M.Z., T.L. and M.S.; formal analysis, M.Z.; investigation, M.Z.; resources, M.Z. and M.S.; data curation, M.Z.; writing—original draft preparation, M.Z.; writing—review and editing, M.S.; visualization, M.Z. and M.S.; supervision, T.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Due to legal and ethical restrictions, the data supporting the findings of this study cannot be made publicly available.

Acknowledgments

The authors would like to thank the Active Schools Institute at the University of Northern Colorado, for allowing us to use the data set.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

LPFR	Longitudinal Penalized Functional Regression
ILFMM	Intensive Longitudinal Functional Model with Multiple time scales
BLUEs	Best Linear Unbiased Estimators
BLUPs	Best Linear Unbiased Predictors
AIC	Akaike Information Criterion
REML	Residual Maximum Likelihood
MSE	Mean Squared Errors
PA	Physical Activity
MVPA	Moderate-to-Vigorous Physical Activity

Appendix A

Appendix A.1. Proof of Theorem 1

Proof.

If

\hat{α}

is a solution to

C^{'} V^{- 1} C α = C^{'} V^{- 1} Y

, then

C \hat{α}

will be a BLUE of

C α

. To implement this equation it is necessary to know a form for

V^{- 1}

. According to Proposition B.56 page 356 of [38], the inverse of

V

in terms of

Z

,

B

, and

R

is as follows:

V^{- 1} = R^{- 1} - R^{- 1} Z {[B^{- 1} + Z^{'} R^{- 1} Z]}^{- 1} Z^{'} R^{- 1} .

If

\hat{α}

and

\hat{b}

are solutions, then the second row of the normal equation for the mixed model Equation (12) gives

Z^{'} R^{- 1} C \hat{α} + [B^{- 1} + Z^{'} R^{- 1} Z] \hat{b} = Z^{'} R^{- 1} Y

(A1)

or

\hat{b} = {[B^{- 1} + Z^{'} R^{- 1} Z]}^{- 1} Z^{'} R^{- 1} (Y - C \hat{α}) .

(A2)

The first row of the normal equations in Equation (12) is

C^{'} R^{- 1} C \hat{α} + C^{'} R^{- 1} Z \hat{b} = C^{'} R^{- 1} Y .

(A3)

Substituting for

\hat{b}

gives

C^{'} R^{- 1} C \hat{α} + C^{'} R^{- 1} Z {[B^{- 1} + Z^{'} R^{- 1} Z]}^{- 1} Z^{'} R^{- 1} (Y - C \hat{α}) = C^{'} R^{- 1} Y

(A4)

or

\begin{matrix} C^{'} R^{- 1} C \hat{α} - C^{'} R^{- 1} Z {[B^{- 1} + Z^{'} R^{- 1} Z]}^{- 1} Z^{'} R^{- 1} C \hat{α} \\ = C^{'} R^{- 1} Y - C^{'} R^{- 1} Z {[B^{- 1} + Z^{'} R^{- 1} Z]}^{- 1} Z^{'} R^{- 1} Y, \end{matrix}

(A5)

which is

C^{'} V^{- 1} C \hat{α} = C^{'} V^{- 1} Y

. Therefore,

\hat{α}

is a generalized least squares solution and

C \hat{α}

is a BLUE.

Equation (A2) can be rewritten as follows:

\begin{matrix} \hat{b} & = & (B [B^{- 1} + Z^{'} R^{- 1} Z] - B Z^{'} R^{- 1} Z) {[B^{- 1} + Z^{'} R^{- 1} Z]}^{- 1} Z^{'} R^{- 1} (Y - C \hat{α}) \\ & = & (B Z^{'} R^{- 1} - B Z^{'} R^{- 1} Z {[B^{- 1} + Z^{'} R^{- 1} Z]}^{- 1} Z^{'} R^{- 1}) (Y - C \hat{α}) \\ & = & B Z^{'} V^{- 1} (Y - C \hat{α}), \end{matrix}

(A6)

which is the BLUP

\hat{b}

of

b

. □
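
The inverse formula used in this proof can be checked numerically. The sketch below treats the scalar case with exact rational arithmetic and assumes the usual mixed-model covariance V = Z B Z' + R; the values of z, b, and r are hypothetical.

```python
from fractions import Fraction


def v_inverse_direct(z, b, r):
    """Invert V = z * b * z + r directly (scalar case)."""
    return 1 / (z * b * z + r)


def v_inverse_identity(z, b, r):
    """Evaluate R^-1 - R^-1 Z [B^-1 + Z' R^-1 Z]^-1 Z' R^-1 in the scalar case."""
    r_inv, b_inv = 1 / r, 1 / b
    return r_inv - r_inv * z * (1 / (b_inv + z * r_inv * z)) * z * r_inv


z, b, r = Fraction(2), Fraction(3), Fraction(5)  # hypothetical values
print(v_inverse_direct(z, b, r), v_inverse_identity(z, b, r))  # 1/17 1/17
```

Both expressions agree exactly, which is the property that allows the mixed model equations to be solved without inverting V directly.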

Appendix A.2. Proof of Theorem 2

Proof.

Let

U = (\begin{matrix} β \\ γ \\ η \end{matrix}) .

By knowing that

{∥ a ∥}_{B}^{2} = a^{⊤} B a

Equation (14) can be written as

\begin{matrix} {(y - X β - W γ - D η)}^{⊤} V^{- 1} (y - X β - W γ - D η) + U^{⊤} M U \end{matrix}

or

\begin{matrix} {(y - C U)}^{⊤} V^{- 1} (y - C U) + U^{⊤} M U . \end{matrix}

Now, consider

l = {(y - C U)}^{⊤} V^{- 1} (y - C U) + U^{⊤} M U

. By taking the partial derivative with respect to

U

the following would be true:

\begin{matrix} \frac{\partial l}{\partial U} & = & - 2 C^{⊤} V^{- 1} (y - C U) + 2 M U . \end{matrix}

Set

\begin{matrix} \frac{\partial l}{\partial U} & = & 0 . \end{matrix}

Then, the following becomes apparent:

\begin{matrix} 2 C^{⊤} V^{- 1} (y - C U) & = & 2 M U . \end{matrix}

It implies that

\begin{matrix} C^{⊤} V^{- 1} y - C^{⊤} V^{- 1} C U & = & M U \\ C^{⊤} V^{- 1} y & = & C^{⊤} V^{- 1} C U + M U \\ C^{⊤} V^{- 1} y & = & (C^{⊤} V^{- 1} C + M) U . \end{matrix}

So,

\begin{matrix} \hat{U} & = & {(C^{⊤} V^{- 1} C + M)}^{- 1} C^{⊤} V^{- 1} y . \end{matrix}

Consequently, a generalized ridge estimate of

β

,

γ

, and

η

can be obtained

(\begin{matrix} \hat{β} \\ \hat{γ} \\ \hat{η} \end{matrix}) = {(C^{⊤} V^{- 1} C + M)}^{- 1} C^{⊤} V^{- 1} y .

(A7)

□

References

Ramsay, J.O. When the data are functions. Psychometrika 1982, 47, 379–396.
Ramsay, J.O.; Hooker, G.; Graves, S. Functional Data Analysis with R and MATLAB; Springer: Berlin/Heidelberg, Germany, 2021.
Ramsay, J.O.; Dalzell, C. Some tools for functional data analysis. J. R. Stat. Soc. Ser. B (Methodol.) 1991, 53, 539–572.
Greven, S.; Crainiceanu, C.; Caffo, B.; Reich, D. Longitudinal functional principal component analysis. Electron. J. Stat. 2010, 4, 1022–1054.
Rao, C.R. Some statistical methods for comparison of growth curves. Biometrics 1950, 14, 1–17.
Wang, J.L.; Chiou, J.M.; Müller, H.G. Functional data analysis. Annu. Rev. Stat. Its Appl. 2016, 3, 257–295.
Di, C.; Crainiceanu, C.M.; Mueller, H.G. Functional Data Analysis. In Handbook of Big Data Analytics; Springer: Berlin/Heidelberg, Germany, 2020; pp. 409–435.
Cardot, H.; Ferraty, F.; Sarda, P. Functional linear model. Stat. Probab. Lett. 1999, 45, 11–22.
Cuevas, A.; Febrero, M.; Fraiman, R. Linear functional regression: The case of fixed design and functional response. Can. J. Stat. 2002, 39, 285–300.
Ramsay, J.O.; Heckman, N.; Silverman, B.W. Spline smoothing with model-based penalties. Behav. Res. Methods Instruments Comput. 1997, 29, 99–106.
Müller, H.G. Functional modelling and classification of longitudinal data. Scand. J. Stat. 2005, 32, 223–240.
Goldsmith, J.; Bobb, J.; Crainiceanu, C.; Caffo, B.; Reich, D. Penalized functional regression. J. Comput. Graph. Stat. 2011, 4, 453–469.
Ruppert, D.; Wand, M.; Carroll, R. Semiparametric Regression; Cambridge Series in Statistical and Probabilistic Mathematics; Cambridge University Press: Cambridge, UK, 2003.
Zahed, M. Forecasting tourist’s arrivals to the USA with SARIMA models, Paper 61-2018, SAS 9.4-2018. In Proceedings of the Western Users of SAS Software (WUSS) 2018, Sacramento, CA, USA, 5–7 September 2018.
Hailemariam, M.; O’Neill, S.; O’Hare, G.M. Longitudinal data analysis of mental health outcomes among postpartum women using a Bayesian multilevel model. J. Affect. Disord. 2021, 292, 699–707.
Hedeker, D.; Gibbons, R.D. Longitudinal Data Analysis; Wiley-Interscience: Hoboken, NJ, USA, 2006.
Goldsmith, J.; Crainiceanu, C.M.; Caffo, B.; Reich, D. Longitudinal penalized functional regression for cognitive outcomes on neuronal tract measurements. J. R. Stat. Soc. Ser. C (Appl. Stat.) 2012, 3, 453–469.
Gertheiss, J.; Goldsmith, J.; Crainiceanu, C.; Greven, S. Longitudinal scalar-on-functions regression with application to tractography data. Biostatistics 2013, 3, 447–461.
Kundu, M.G.; Harezlak, J.; Randolph, T.W. Longitudinal functional models with structured penalties. Biostatistics 2016, 2, 114–139.
Fukagata, K.; Iwamoto, K.; Kasagi, N. Contribution of Reynolds stress distribution to the skin friction in wall-bounded flows. Phys. Fluids 2002, 14, L73–L76.
Sokolov, I.M. Models of anomalous diffusion in crowded environments. Soft Matter 2012, 8, 9043–9052.
Costell, M.H.; Ancellin, N.; Bernard, R.E.; Zhao, S.; Upson, J.J.; Morgan, L.A.; Maniscalco, K.; Olzinski, A.R.; Ballard, V.L.; Herry, K.; et al. Comparison of soluble guanylate cyclase stimulators and activators in models of cardiovascular disease associated with oxidative stress. Front. Pharmacol. 2012, 3, 128.
Zahed, M.; Skafyan, M. Application of Feature Selection and Dimension Reduction Techniques on Large-Scale CT Dataset for Lung Cancer Diagnosis Based on Radiomics, Paper 222-2023, 2018. In Proceedings of the Southeast SAS Users Group (SESUG), Mobile, AL, USA, 14–17 October 2018.
Burt, D.; Lamb, K.; Nicholas, C.; Twist, C. Lower-volume muscle-damaging exercise protects against high-volume muscle-damaging exercise and the detrimental effects on endurance performance. Eur. J. Appl. Physiol. 2015, 115, 1523–1532.
Meeusen, R.; Duclos, M.; Foster, C.; Fry, A.; Gleeson, M.; Nieman, D.; Raglin, J.; Rietjens, G.; Steinacker, J.; Urhausen, A. Prevention, diagnosis and treatment of the overtraining syndrome: Joint consensus statement of the European College of Sport Science (ECSS) and the American College of Sports Medicine (ACSM). Eur. J. Sport Sci. 2013, 13, 1–24.
Seiler, S.; Jøranson, K.; Olesen, B.; Hetlelid, K. Adaptations to aerobic interval training: Interactive effects of exercise intensity and total work duration. Scand. J. Med. Sci. Sport. 2013, 23, 74–83.
Bolger, N.; Laurenceau, J.P. Intensive Longitudinal Methods: An Introduction to Diary and Experience Sampling Research; Guilford Press: New York, NY, USA, 2013.
Stone, A.; Shiffman, S.; Atienza, A.; Nebeling, L. The Science of Real-Time Data Capture: Self-Reports in Health Research; Oxford University Press: Oxford, UK, 2007.
Trull, T.; Ebner-Priemer, U. Using experience sampling methods/ecological momentary assessment (ESM/EMA) in clinical assessment and clinical research: Introduction to the special section. Psychol. Assess. 2009, 21, 457–462.
Fahrenberg, J.; Myrtek, M.; Pawlik, K.; Perrez, M. Ambulatory assessment-monitoring behavior in daily life settings. Eur. J. Psychol. Assess. 2007, 23, 206–213.
Hektner, J.M.; Schmidt, J.A.; Csikszentmihalyi, M. Experience Sampling Method: Measuring the Quality of Everyday Life; Sage: Thousand Oaks, CA, USA, 2007.
Zahed, M. An Intensive Longitudinal Functional Linear Model with Multiple Time Scales; University of Northern Colorado, ProQuest LLC: Greeley, CO, USA, 2020.
Henderson, C. Estimation of genetic parameters (abstract). Ann. Math. Stat. 1950, 1, 817–827.
Carden, R.L.; Jahromi, M.Z. The inverse q-numerical range problem and connections to the Davis–Wielandt shell and the pseudospectra of a matrix. Linear Algebra Appl. 2017, 531, 479–497.
Laird, D.A. Relative performance of college students as conditioned by time of day and day of week. J. Exp. Psychol. 1925, 8, 50.
McLellan, G.; Arthur, R.; Donnelly, S.; Buchan, D.S. Segmented sedentary time and physical activity patterns throughout the week from wrist-worn ActiGraph GT3X+ accelerometers among children 7–12 years old. J. Sport Health Sci. 2020, 9, 179–188.
Menai, M.; Van Hees, V.T.; Elbaz, A.; Kivimaki, M.; Singh-Manoux, A.; Sabia, S. Accelerometer assessed moderate-to-vigorous physical activity and successful ageing: Results from the Whitehall II study. Sci. Rep. 2017, 7, 45772.
Christensen, R. Plane Answers to Complex Questions; Springer: New York, NY, USA, 2002; Volume 35.

Figure 1. Average MSE of functional coefficients versus simulation cases (3-D plot).

Figure 2. Average MSE of functional coefficients versus simulation cases (2-D plot).

Figure 3. Comparison of AIC for selection of c (

a = 3, b = 1, d = 1

).

Figure 4. Average MSEs of

M S E (\tilde{y})

for the defined selection of ridge penalties.

Figure 5. First column of weekly activity for 11 subjects over five days.

Figure 6. Second column of weekly activity for 11 subjects over five days.

Figure 7. First column of daily activity for 11 subjects across four weeks.

Figure 8. Second column of daily activity for 11 subjects across four weeks.

Table 1. The amplitude and curvature parameters for

η_{1}

and

η_{2}

.

h	Amplitude $a_{w} (h)$	Curvature $c_{w} (h)$	Amplitude $a_{m} (h)$	Curvature $c_{m} (h)$
5	0.15
15			0.15	500
30	−0.10	250
50
70	0.10	250
80			−0.15	1000
90
95

The second and third columns are parameters for

η_{1}

, and the last two columns are for

η_{2}

.

Table 2. Choices for fixed effects, random effect, and errors in the simulation study.

Fixed Effect	$β_{0}$	0.06
Fixed Effect	$β_{1}$	0.04
Random Effect	$b_{i}$	$\sim N (0, 0 . 05^{2})$
Errors	$ϵ_{i t_{w} t_{d}}$	$\sim N (0, 0 . 02^{2})$

Table 3. Averages

M S E (γ_{1})

,

M S E (γ_{2})

,

M S E (η_{1})

,

M S E (η_{2})

, and

M S E (\tilde{y})

for four cases of simulation.

Case	N	$R^{2}$	$MSE (γ_{1})$	$MSE (γ_{2})$	$MSE (η_{1})$	$MSE (η_{2})$	$MSE (\tilde{y})$
Case 1	100	0.6	$1.000522 \times 10^{-24}$	$1.991303 \times 10^{-25}$	0.5035614	0.3827327	0.01625859
Case 2	100	0.9	$2.912084 \times 10^{-25}$	$1.343186 \times 10^{-24}$	0.5027712	0.3765218	0.01378987
Case 3	200	0.6	$8.249432 \times 10^{-27}$	$8.167412 \times 10^{-27}$	0.5286615	0.3614322	0.01240092
Case 4	200	0.9	$3.903604 \times 10^{-27}$	$6.183226 \times 10^{-27}$	0.5178613	0.3525342	0.01182925

Table 4. Comparison of AIC for selection of ridge weights and time structures.

Model	Time Structure for Days	Time Structure for Weeks	a	c	AIC
Model 1	$η_{1} (s)$	$γ_{1} (s)$	1	1	2,096,978
Model 2	$η_{1} (s)$	$γ_{1} (s) + t γ_{2} (s)$	1	1	1,978,254
Model 3	$η_{1} (s) + t η_{2} (s)$	$γ_{1} (s)$	1	1	1,947,720
Model 4	$η_{1} (s) + t η_{2} (s)$	$γ_{1} (s) + t γ_{2} (s)$	1	1	1,947,720
Model 5	$η_{1} (s) + t η_{2} (s) + t^{2} η_{3} (s)$	$γ_{1} (s)$	1	1	1,947,720
Model 6	$η_{1} (s) + t η_{2} (s) + t^{2} η_{3} (s)$	$γ_{1} (s) + t γ_{2} (s)$	1	1	1,947,720
Model 7	$η_{1} (s) + t η_{2} (s) + t^{2} η_{3} (s) + t^{3} η_{4} (s)$	$γ_{1} (s)$	1	1	1,947,720
Model 8	$η_{1} (s) + t η_{2} (s) + t^{2} η_{3} (s) + t^{3} η_{4} (s)$	$γ_{1} (s) + t γ_{2} (s)$	1	1	1,947,720
Model 9	$η_{1} (s)$	$γ_{1} (s)$	3	3	1,789,975
Model 10	$η_{1} (s)$	$γ_{1} (s) + t γ_{2} (s)$	3	3	16,898,224
Model 11	$η_{1} (s) + t η_{2} (s)$	$γ_{1} (s)$	3	3	1,574,037
Model 12	$η_{1} (s) + t η_{2} (s)$	$γ_{1} (s) + t γ_{2} (s)$	3	3	1,442,020
Model 13	$η_{1} (s) + t η_{2} (s) + t^{2} η_{3} (s)$	$γ_{1} (s)$	3	3	1,347,340
Model 14	$η_{1} (s) + t η_{2} (s) + t^{2} η_{3} (s)$	$γ_{1} (s) + t γ_{2} (s)$	3	3	1,304,740
Model 15	$η_{1} (s) + t η_{2} (s) + t^{2} η_{3} (s) + t^{3} η_{4} (s)$	$γ_{1} (s)$	3	3	1,300,120
Model 16	$η_{1} (s) + t η_{2} (s) + t^{2} η_{3} (s) + t^{3} η_{4} (s)$	$γ_{1} (s) + t γ_{2} (s)$	3	3	1,235,622
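
As a small illustration of how the AIC column in Table 4 can guide the choice of time structure and ridge weights, the sketch below picks the row with the smallest AIC. The AIC values are copied from the table as printed; the names `AIC_TABLE` and `best_by_aic` are hypothetical, and selecting purely on the minimum AIC is a simplifying assumption rather than a statement of the authors' full selection procedure.

```python
# (model number, a, c, AIC), copied from Table 4 as printed.
AIC_TABLE = [
    (1, 1, 1, 2_096_978),
    (2, 1, 1, 1_978_254),
    (3, 1, 1, 1_947_720),
    (4, 1, 1, 1_947_720),
    (5, 1, 1, 1_947_720),
    (6, 1, 1, 1_947_720),
    (7, 1, 1, 1_947_720),
    (8, 1, 1, 1_947_720),
    (9, 3, 3, 1_789_975),
    (10, 3, 3, 16_898_224),
    (11, 3, 3, 1_574_037),
    (12, 3, 3, 1_442_020),
    (13, 3, 3, 1_347_340),
    (14, 3, 3, 1_304_740),
    (15, 3, 3, 1_300_120),
    (16, 3, 3, 1_235_622),
]


def best_by_aic(rows):
    """Return the row with the smallest AIC; ties go to the earliest row."""
    return min(rows, key=lambda row: row[3])


best = best_by_aic(AIC_TABLE)
print(f"Model {best[0]} (a={best[1]}, c={best[2]}) has the lowest AIC: {best[3]:,}")
```

Resolving ties in favor of the earliest row means that, among candidates with identical AIC, the one with the simplest time structure is returned, since the table lists models in order of increasing complexity within each ridge-weight setting.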

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
