
A U-Statistic for Testing the Lack of Dependence in Functional Partially Linear Regression Model

1 School of Mathematics and Statistics, Shanxi University, Taiyuan 030006, China
2 School of Statistics, Capital University of Economics and Business, Beijing 100070, China
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(16), 2588; https://doi.org/10.3390/math12162588
Submission received: 17 July 2024 / Revised: 14 August 2024 / Accepted: 19 August 2024 / Published: 21 August 2024

Abstract
The functional partially linear regression model comprises a functional linear part and a non-parametric part. Testing the linear relationship between the response and the functional predictor is of fundamental importance. In cases where functional data cannot be approximated by a few principal components, we develop a second-order U-statistic using a pseudo-estimate of the unknown non-parametric component. Under some regularity conditions, the asymptotic normality of the proposed test statistic is established via the martingale central limit theorem. The finite-sample properties of the proposed test are evaluated through simulation studies and an application to real data.

1. Introduction

In the past few decades, functional data analysis has been widely developed and applied in various fields, such as medicine, biology, economics, environmetrics, and chemistry (see [1,2,3,4,5]). An important model in functional data analysis is the partial functional linear model, which includes the parametric linear part and the functional linear part. To make the relationships between variables more flexible, the parametric linear part is usually replaced by the non-parametric part. This model is known as the functional partially linear regression model, which has been studied in [6,7,8]. The functional partially linear regression model is formulated as follows:
$$Y = g(U) + \int_0^1 \alpha(s) X(s)\,ds + \varepsilon, \qquad (1)$$
where Y is the response variable. X ( · ) denotes the functional predictor, characterized by its mean function, μ 0 ( · ) , and covariance operator, C . The slope function α ( · ) is an unknown function. g ( · ) is a general continuous function defined on a compact support Ω . The random error ε has a mean of zero and a finite variance σ 2 , and is statistically independent of the predictor X ( · ) . When g ( · ) is a constant, model (1) reduces to a functional linear model. Refer to [9,10,11] for further details. With g ( · ) representing the parametric linear component, model (1) is identified as a partially functional linear model, an area explored in [12,13,14].
Hypothesis testing plays a critical role in statistical inference. For testing the linear relationship between the response and the functional predictor in the functional linear model, functional principal component analysis (FPCA) is a major tool for constructing test statistics; see [9,10,15]. Taking into account the flexibility of non-parametric functions, Ref. [6] introduced the functional partially linear model. Refs. [7,8] constructed estimators of the slope function based on splines and FPCA, respectively, using B-splines to estimate the non-parametric component. In the context of predictors with additive measurement error, Ref. [16] investigated estimators of the slope function and the non-parametric component using FPCA and kernel smoothing methods. Ref. [17] established estimators of the slope function, the non-parametric component, and the mean of the response variable in the presence of randomly missing responses.
However, testing the relationship between the response variable and the functional predictor in the functional partially linear regression model has rarely been considered. In this paper, the following hypothesis test for model (1) will be considered:
$$H_0: \alpha(t) = \alpha_0(t) \quad \text{vs.} \quad H_1: \alpha(t) \ne \alpha_0(t), \qquad (2)$$
where $\alpha_0(t)$ denotes a prespecified function; we assume $\alpha_0(t) = 0$ without loss of generality. To test (2) within the framework of model (1), a chi-square test was devised by [18]. This test relies on estimators of the nonlinear and slope functions, under the assumption that the functional data can be well approximated by a small number of principal components.
In particular, we focus on functional data that cannot be approximated by a few principal components, such as the velocity and acceleration of changes in China's Air Quality Index (AQI). If these changes are represented by curves, the velocity and acceleration correspond to the first and second derivatives of the AQI, respectively, and the number of principal components selected by FPCA may approach 30. Only a few studies have considered this data structure in functional data analysis. Ref. [19] constructed the FLUTE test based on an order-four U-statistic for testing in the functional linear model, which can be computationally very costly. To save computation time, Ref. [20] developed a faster test using an order-two U-statistic. Inspired by this, we introduce a non-parametric U-statistic that integrates functional data analysis with the classical kernel method to test (2).
The structure of the paper is as follows. Section 2 details the development of a new test procedure for the functional partially linear regression model. Section 3 presents the theoretical properties of the proposed test statistic under some regularity conditions. Section 4 includes a simulation study to evaluate the finite sample performance of the proposed test. Section 5 presents the application of the test to spectrometric data. The proofs of the primary theoretical results are presented in Appendix A.

2. Test Statistic

Assume $Y$ and $U$ are real-valued random variables. $X(\cdot)$ is a stochastic process with sample paths in $L^2[0,1]$, the set of all square-integrable functions on $[0,1]$. Let $\langle\cdot,\cdot\rangle$ and $\|\cdot\|$ denote the inner product and norm in $L^2[0,1]$, respectively. $\{(Y_i, X_i(\cdot), U_i), i = 1,2,\ldots,n\}$ constitutes a random sample drawn from model (1),
$$Y_i = \int_0^1 \alpha(s) X_i(s)\,ds + g(U_i) + \varepsilon_i, \quad i = 1,2,\ldots,n. \qquad (3)$$
For any given $\alpha(t) \in L^2[0,1]$, moving the functional linear term to the left-hand side yields
$$Y_i - \langle X_i, \alpha\rangle = g(U_i) + \varepsilon_i, \quad i = 1,2,\ldots,n. \qquad (4)$$
Hence, model (4) reduces to a classical non-parametric model. A pseudo-estimate of the non-parametric function, based on the Nadaraya–Watson method, can be formulated as follows:
$$\hat g(U_i) = \frac{\sum_{j\ne i}^{n} K_h(U_j - U_i)\,\big(Y_j - \langle X_j, \alpha\rangle\big)}{\sum_{k\ne i}^{n} K_h(U_k - U_i)}, \quad i = 1,2,\ldots,n, \qquad (5)$$
where $K_h(\cdot) = K(\cdot/h)/h$ with $K(\cdot)$ a preselected kernel function. A kernel function maps the real line to the real line and satisfies the following properties: (i) non-negativity: $K(\cdot)$ must be non-negative; (ii) normalization: the integral (or sum in the discrete case) of $K(\cdot)$ over the entire real line equals 1, so it can be interpreted as a probability density function. The bandwidth $h$ in (5) is typically selected through data-driven procedures, such as cross-validation. Note that $g(U_i)$ is estimated without the $i$-th sample, i.e., in a leave-one-out fashion.
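A leave-one-out Nadaraya–Watson smoother of this kind can be sketched as follows; the Epanechnikov kernel and the function name `loo_nadaraya_watson` are illustrative choices of ours, not prescribed by the paper.

```python
import numpy as np

def loo_nadaraya_watson(U, R, h):
    """Leave-one-out Nadaraya-Watson pseudo-estimate g_hat(U_i).

    U : (n,) covariate values; R : (n,) partial residuals Y_i - <X_i, alpha>;
    h : bandwidth.  The Epanechnikov kernel is used here as an illustration.
    """
    U = np.asarray(U, dtype=float)
    R = np.asarray(R, dtype=float)
    t = (U[None, :] - U[:, None]) / h          # t[i, j] = (U_j - U_i) / h
    K = np.where(np.abs(t) <= 1, 0.75 * (1 - t**2), 0.0) / h
    np.fill_diagonal(K, 0.0)                   # leave the i-th sample out
    W = K / K.sum(axis=1, keepdims=True)       # weights W_ij sum to 1 per row
    return W @ R                               # g_hat(U_i) = sum_j W_ij R_j
```

Because the weights in each row sum to one, smoothing a constant partial residual reproduces that constant exactly, which is a quick sanity check on the implementation.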
Let
$$W_i = (W_{i1},\ldots,W_{i(i-1)},W_{i(i+1)},\ldots,W_{in})^T, \quad \langle \mathbf{X}_{-i},\alpha\rangle = \big(\langle X_1,\alpha\rangle,\ldots,\langle X_{i-1},\alpha\rangle,\langle X_{i+1},\alpha\rangle,\ldots,\langle X_n,\alpha\rangle\big)^T,$$
$$\mathbf{Y}_{-i} = (Y_1,\ldots,Y_{i-1},Y_{i+1},\ldots,Y_n)^T, \quad \mathbf{X}_{-i} = (X_1,\ldots,X_{i-1},X_{i+1},\ldots,X_n)^T,$$
where $W_{ij} = K_h(U_j - U_i)/\sum_{k\ne i} K_h(U_k - U_i)$. The pseudo-estimate (5) of the non-parametric function can then be written in matrix form as
$$\check g(U_i) = W_i^T\big(\mathbf{Y}_{-i} - \langle \mathbf{X}_{-i},\alpha\rangle\big).$$
Substituting g ˇ ( U i ) for g ( U i ) in model (3), we have
$$\check Y_i = \langle \check X_i, \alpha\rangle + \varepsilon_i, \qquad (6)$$
where $\check X_i(t) = X_i(t) - W_i^T \mathbf{X}_{-i}(t)$ and $\check Y_i = Y_i - W_i^T \mathbf{Y}_{-i}$. Denote $\mu_{it} \triangleq \mu(U_i, t) = E[X_1(t) \mid U_i]$, where "≜" stands for "defined as". Then $\hat\mu_{it} = W_i^T \mathbf{X}_{-i}(t)$ is an estimator of the conditional expectation $\mu_{it}$ for any $t \in [0,1]$.
Given an arbitrary orthonormal basis $\{\psi_j\}_{j=1}^{\infty}$ of $L^2[0,1]$, the functional predictor $X(\cdot)$ and the slope function $\alpha(\cdot)$ admit the following series expansions, where $p$ denotes the truncation level:
$$X_i(t) = \sum_{j=1}^{p} \xi_{ij}\psi_j(t) + \sum_{j=p+1}^{\infty} \xi_{ij}\psi_j(t), \quad \alpha(t) = \sum_{j=1}^{p} \beta_j\psi_j(t) + \sum_{j=p+1}^{\infty} \beta_j\psi_j(t), \qquad (7)$$
where $\xi_{ij} = \langle X_i, \psi_j\rangle$ and $\beta_j = \langle \alpha, \psi_j\rangle$. Let $\check\xi_{ij} = \langle \check X_i, \psi_j\rangle$; then model (6) can be rewritten as follows:
$$\check Y_i = \sum_{j=1}^{\infty} \check\xi_{ij}\beta_j + \varepsilon_i = \sum_{j=1}^{p} \check\xi_{ij}\beta_j + \sum_{j=p+1}^{\infty} \check\xi_{ij}\beta_j + \varepsilon_i.$$
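In practice, the scores $\xi_{ij} = \langle X_i, \psi_j\rangle$ are computed by numerical quadrature over the observation grid. A minimal sketch follows, using the Fourier system on $[0,1]$ as one convenient orthonormal choice (the Fourier basis is the one used later in Simulation 2); the function name `basis_scores` and the trapezoidal rule are our assumptions.

```python
import numpy as np

def basis_scores(X_grid, t_grid, p):
    """Truncated scores xi_{ij} = <X_i, psi_j> on a regular grid.

    X_grid : (n, m) curves sampled at t_grid; p : truncation level.
    Uses the orthonormal Fourier system on [0, 1] and the trapezoidal rule.
    """
    psi = np.empty((p, t_grid.size))
    psi[0] = 1.0
    for j in range(1, p):
        k = (j + 1) // 2                       # frequency of the j-th function
        psi[j] = (np.sqrt(2) * np.sin(2 * np.pi * k * t_grid) if j % 2 == 1
                  else np.sqrt(2) * np.cos(2 * np.pi * k * t_grid))
    # xi[i, j] = integral of X_i(t) * psi_j(t) dt, trapezoidal approximation
    return np.trapz(X_grid[:, None, :] * psi[None, :, :], t_grid, axis=2)
```

For a curve that is an exact multiple of one basis function, the recovered score equals that multiple up to quadrature error, which makes the routine easy to verify.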
Denote ξ i = ( ξ i 1 , ξ i 2 , , ξ i p ) T , which has mean μ and covariance matrix Σ . Let
$$\boldsymbol{\xi}_{-i} = (\xi_1,\ldots,\xi_{i-1},\xi_{i+1},\ldots,\xi_n)^T, \quad \check\xi_i = (\check\xi_{i1},\check\xi_{i2},\ldots,\check\xi_{ip})^T,$$
$$\check{\boldsymbol{\xi}}_{-i} = (\check\xi_1,\ldots,\check\xi_{i-1},\check\xi_{i+1},\ldots,\check\xi_n)^T, \quad \beta = (\beta_1,\beta_2,\ldots,\beta_p)^T.$$
For model (3), the approximation error is defined as follows:
$$e_i = \int_0^1 \alpha(s) X_i(s)\,ds - \sum_{k=1}^{p} \xi_{ik}\beta_k.$$
To investigate the influence of the approximation error, we impose the following conditions on the functional predictors and regression function:
(C1) The functional predictors { X i ( · ) } i = 1 n and the regression function α ( t ) adhere to the following conditions:
(i) The functional predictors $\{X_i(\cdot)\}_{i=1}^n$ reside within a Sobolev ellipsoid of order two; that is, there exists a universal constant $C$ such that $\sum_{j=1}^{\infty} \xi_{ij}^2 j^4 \le C^2$ for $i = 1,\ldots,n$.
(ii) The regression function satisfies $\int \alpha^2(t)\,dt \le D$, where $D$ is a constant.
By applying the Cauchy–Schwarz inequality, we obtain the following:
$$e_i^2 = \Big(\sum_{j=p+1}^{\infty} \xi_{ij}\beta_j\Big)^2 \le \sum_{j=p+1}^{\infty} \xi_{ij}^2 j^4 \sum_{j=p+1}^{\infty} j^{-4}\beta_j^2 \le C^2 D\, p^{-4}.$$
Hence, the approximation error can be ignored as $p \to \infty$, and model (6) becomes
$$\check Y_i = \sum_{j=1}^{p} \check\xi_{ij}\beta_j + \varepsilon_i,$$
which is a high-dimensional partial linear model. Note that
$$\Big\| E\big[(X_i - E[X_i \mid U_i])(Y_i - E[Y_i \mid U_i])\big] \Big\|^2 \qquad (8)$$
is an effective measure of the distance between $\alpha(\cdot)$ and zero for test (2). Motivated by [21], we construct the following test statistic by estimating (8):
$$T_{np} = \frac{1}{2\binom{n}{2}\binom{n-2}{1}} \sum_{i=2}^{n}\sum_{j=1}^{i-1} \Delta_{ij}(\check X)\,\Delta_{ij}(\check Y),$$
where
$$\Delta_{ij}(\check X) = \big\langle \check X_i - \bar{\check X},\ \check X_j - \bar{\check X}\big\rangle + \frac{\big\langle \check X_i - \check X_j,\ \check X_i - \check X_j\big\rangle}{2n}, \quad \Delta_{ij}(\check Y) = \big(\check Y_i - \bar{\check Y}\big)\big(\check Y_j - \bar{\check Y}\big) + \frac{\big(\check Y_i - \check Y_j\big)^2}{2n},$$
where $\bar{\check X}(t)$ and $\bar{\check Y}$ denote the sample means of $\check X_i(t)$ and $\check Y_i$, respectively. By some calculations, we can obtain $E[\Delta_{ij}(\check X)] = 0$ and $E[\Delta_{ij}(\check Y)] = 0$. The test statistic $T_{np}$ quantifies the discrepancy between $\alpha(\cdot)$ and the null value 0. Large values of $T_{np}$ provide evidence in favor of the alternative hypothesis, prompting rejection of the null hypothesis.
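The statistic can be assembled entirely from the Gram matrix of the partialled-out predictors. The sketch below implements the $\Delta$ quantities above; the normalizing constant $2\binom{n}{2}\binom{n-2}{1} = n(n-1)(n-2)$ follows our reconstruction of the garbled display and is a labelled assumption, as is the function name `u_statistic`.

```python
import numpy as np

def u_statistic(G, Y):
    """Order-two U-statistic T_np, assuming the reconstructed
    normalization 1 / (n * (n-1) * (n-2)).

    G : (n, n) symmetric Gram matrix, G[i, j] = <X_check_i, X_check_j>;
    Y : (n,) partialled-out responses Y_check_i.
    """
    n = Y.size
    xbar_ip = G.mean(axis=1)                 # <X_check_i, X_check_bar>
    xbar_norm = G.mean()                     # <X_check_bar, X_check_bar>
    d = np.diag(G)
    # Delta_ij(X): <X_i - Xbar, X_j - Xbar> + ||X_i - X_j||^2 / (2n)
    dX = (G - xbar_ip[:, None] - xbar_ip[None, :] + xbar_norm
          + (d[:, None] + d[None, :] - 2 * G) / (2 * n))
    # Delta_ij(Y): (Y_i - Ybar)(Y_j - Ybar) + (Y_i - Y_j)^2 / (2n)
    yc = Y - Y.mean()
    dY = yc[:, None] * yc[None, :] + (Y[:, None] - Y[None, :])**2 / (2 * n)
    iu = np.triu_indices(n, k=1)             # sum over pairs j < i
    return (dX[iu] * dY[iu]).sum() / (n * (n - 1) * (n - 2))
```

A constant response makes every $\Delta_{ij}(\check Y)$ vanish, so the statistic is exactly zero in that degenerate case, a convenient correctness check.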

3. Asymptotic Theory

To achieve the asymptotic properties of the proposed test, we first suppose the following conditions based on [19,21]. We denote the following:
$$\mu(U_i) = \big(\mu_1(U_i), \mu_2(U_i), \ldots, \mu_p(U_i)\big)^T \triangleq E[\xi \mid U_i],$$
$$\Sigma^*(U_i) = E[\xi_i \xi_i^T \mid U_i] - \mu(U_i)\mu^T(U_i), \quad \Sigma^* = \Sigma - E\big[\mu(U_1)\mu^T(U_1)\big].$$
A condition on the dimensionality of matrix Σ * is stipulated as follows:
(C2) As $n \to \infty$, $p \to \infty$; $\Sigma^* > 0$, and $\operatorname{tr}(\Sigma^{*4}) = o\big(\operatorname{tr}^2(\Sigma^{*2})\big)$.
(C3) For a constant m p , there exists an m-dimensional random vector Z i = ( Z i 1 , , Z i m ) T such that ξ i = E [ ξ i | U i ] + Γ ( U i ) Z i . The vector Z i is characterized by E ( Z i ) = 0 , var ( Z i ) = I m , and for any U i , Γ ( U i ) is a p × m matrix with Γ ( U i ) Γ T ( U i ) = Σ * ( U i ) . It is assumed that each random vector { Z i , i = 1 , , n } has finite fourth moments and E ( Z i j 4 ) = 3 + Δ for some constant Δ . Moreover, we assume the following:
$$E\big[Z_{ij_1}^{l_1} Z_{ij_2}^{l_2} \cdots Z_{ij_d}^{l_d}\big] = E\big[Z_{ij_1}^{l_1}\big] E\big[Z_{ij_2}^{l_2}\big] \cdots E\big[Z_{ij_d}^{l_d}\big]$$
for $\sum_{k=1}^{d} l_k \le 4$ and $j_1 \ne j_2 \ne \cdots \ne j_d$, where $d$ is a positive integer.
(C4) $\beta^T \Sigma^* \beta = o(h^2)$, and $\beta^T \Sigma^{*3} \beta = o\big(\operatorname{tr}(\Sigma^{*2})/n\big)$.
(C5) The error term satisfies E [ ε 4 ] < + .
(C6) The random variable $U$ is confined to a compact domain $\Omega$, and its density function $f$ has a continuous second derivative and is bounded away from 0 on its support. The kernel $K(\cdot)$ is a symmetric probability density with compact support and is Lipschitz continuous.
(C7) E ( ξ 1 | U 1 ) and g ( · ) are Lipschitz continuous and admit continuous second-order derivatives.
(C8) The sample size $n$ and the smoothing parameter $h$ satisfy $\lim_{n\to\infty} h = 0$, $\lim_{n\to\infty} nh = \infty$, and $\lim_{n\to\infty} nh^4 = 0$.
(C9) The truncated number p and the sample size n are assumed to satisfy p = o ( n 2 h 2 ) .
Condition (C2) is widely utilized in high-dimensional data research (see [21,22,23]). Condition (C3) resembles a factor model. To assess local power, we further impose condition (C4) on the coefficient vector $\beta$; in fact, (C4) serves as a local alternative, since $\beta^T \Sigma^* \beta$ measures the distance between $\beta$ and 0. A similar local alternative can be found in [21]. (C5) is a typical assumption on the error term $\varepsilon$. Conditions (C6–C8) are standard in non-parametric smoothing. (C9) is a technical condition needed to derive the theorems.
In practical applications, the data must satisfy conditions (C1–C3) and (C7). Conditions (C1) and (C7) are generally met for most datasets. (C2) does not specify a relationship between $p$ and $n$; the positive definiteness of $\Sigma^*$ ensures that the regression coefficients are identifiable, and $\operatorname{tr}(\Sigma^{*4}) = o(\operatorname{tr}^2(\Sigma^{*2}))$ holds if the eigenvalues of $\Sigma^*$ are all bounded, or if the largest eigenvalue is of smaller order than $(p-b)^{1/2} b^{-1/4}$, where $b$ is the number of unbounded eigenvalues. Condition (C3) essentially assumes that the functional predictor $X_i(t)$ arises from a latent factor model in which the factor loadings satisfy the pseudo-independence assumption. If $X(t)$ is a Gaussian process, it can be expanded as $X(t) = \sum_{j=1}^{m} \sqrt{\lambda_j}\, N_j u_j(t)$, with $N_j$ independent standard normal random variables; this expansion is a special case of (C3) when the $(i,j)$-th element of the transformation matrix $\Gamma$ is $\sqrt{\lambda_j}\,\langle u_j, \phi_i\rangle$. These conditions are generally met for most data and do not affect the validity of the proposed test. Many datasets can be regarded as following a Gaussian process, such as changes in gene expression levels, logarithmic returns on financial asset prices, soil moisture, and temperature distribution.
We present the asymptotic theory for the proposed test statistic under the null hypothesis and local alternative (C4) in the subsequent two theorems:
Theorem 1.
Under the assumptions of conditions (C1), (C3–C9), it follows that
$$\begin{aligned}
(i)\ & E(T_{np}) = \|C^*(\alpha)\|^2 + o\big(\sqrt{\operatorname{tr}(\Sigma^{*2})}/n\big);\\
(ii)\ & T_{np} - \|C^*(\alpha)\|^2 = \frac{1}{\binom{n}{2}} \sum_{i=2}^{n}\sum_{j=1}^{i-1} \langle X_i - \mu_{it},\, X_j - \mu_{jt}\rangle\, \varepsilon_i \varepsilon_j + o_p\big(\sqrt{\operatorname{tr}(\Sigma^{*2})}/n\big),
\end{aligned}$$
where $C^*(\alpha) = E\big[\langle X_i - \mu_{it}, \alpha\rangle (X_i - \mu_{it})\big]$; it can be regarded as the covariance operator of $X_i - \mu_{it}$ applied to $\alpha$.
Theorem 2.
Assume conditions (C1–C3) and (C5–C9) hold, we then have the following results under either the null hypothesis or the local alternative (C4):
$$\frac{n\big(T_{np} - \|C^*(\alpha)\|^2\big)}{\sigma^2 \sqrt{2\operatorname{tr}(\Sigma^{*2})}} \xrightarrow{D} N(0,1), \quad \text{as } n \to \infty,$$
where D represents convergence in distribution.
Theorem 2 demonstrates that, under the local alternative hypothesis (C4), the proposed test statistic possesses the following asymptotic local power at the nominal significance level α :
$$\Psi(\beta) = \Phi\bigg({-z_\alpha} + \frac{n\|C^*(\alpha)\|^2}{\sigma^2\sqrt{2\operatorname{tr}(\Sigma^{*2})}}\bigg),$$
where $\Phi(\cdot)$ denotes the cumulative distribution function of the standard normal and $z_\alpha$ its $(1-\alpha)$th quantile. Define the signal-to-noise ratio $\eta(\alpha) = \|C^*(\alpha)\|^2 / \big(\sigma^2\sqrt{2\operatorname{tr}(\Sigma^{*2})}\big)$. When $\eta(\alpha) = o(1/n)$, the power converges to $\alpha$; when $\eta(\alpha)$ is of larger order than $1/n$, the power converges to 1. This implies that the proposed test is consistent. The power performance will be demonstrated through simulations in Section 4.
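The local power formula above is straightforward to evaluate numerically. A minimal sketch follows; the helper name `asymptotic_power` and the use of Python's standard-library `statistics.NormalDist` are our choices, and `eta` stands for the signal-to-noise ratio $\eta(\alpha)$.

```python
from statistics import NormalDist

def asymptotic_power(n, eta, alpha=0.05):
    """Asymptotic local power Phi(-z_alpha + n * eta), where
    eta = ||C*(alpha)||^2 / (sigma^2 * sqrt(2 * tr(Sigma*^2)))
    is the signal-to-noise ratio.  Illustrative helper, not the paper's code.
    """
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha)    # upper alpha quantile of N(0, 1)
    return nd.cdf(-z_alpha + n * eta)
```

At `eta = 0` the function returns the nominal level, and it increases monotonically toward 1 as the scaled signal $n\eta$ grows, mirroring the consistency statement above.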
According to Theorem 2, the proposed test statistic leads to the rejection of H 0 at a significance level α when
$$\frac{n\, T_{np}}{\hat\sigma^2 \sqrt{2\,\widehat{\operatorname{tr}(\Sigma^{*2})}}} \ge z_\alpha,$$
where $\hat\sigma^2$ and $\widehat{\operatorname{tr}(\Sigma^{*2})}$ serve as consistent estimators of $\sigma^2$ and $\operatorname{tr}(\Sigma^{*2})$, respectively. We use a method similar to that of [24] to estimate the trace. That is,
$$\widehat{\operatorname{tr}(\Sigma^{*2})} = Y_{1n} - 2Y_{2n} + Y_{3n},$$
where
$$Y_{1n} = \frac{1}{A_n^2}\sum_{i\ne j}\langle \check X_i, \check X_j\rangle^2, \quad Y_{2n} = \frac{1}{A_n^3}\sum_{i\ne j\ne k}\langle \check X_i, \check X_j\rangle\langle \check X_j, \check X_k\rangle, \quad Y_{3n} = \frac{1}{A_n^4}\sum_{i\ne j\ne k\ne l}\langle \check X_i, \check X_j\rangle\langle \check X_k, \check X_l\rangle,$$
with $A_n^m = n!/(n-m)!$ and the sums running over pairwise distinct indices. The simple estimator $\hat\sigma^2 = (n-1)^{-1}\sum_{i=1}^{n}(\check Y_i - \bar{\check Y})^2$ is used, which is consistent under the null hypothesis.
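The three U-statistic sums can be evaluated in $O(n^2)$ time from the Gram matrix of the partialled-out predictors, rather than by looping over index tuples. The sketch below is ours: the name `trace_sigma2_hat` is hypothetical, and the closed forms for the sums over distinct indices follow from inclusion–exclusion on shared indices.

```python
import numpy as np

def trace_sigma2_hat(G):
    """Plug-in estimator tr(Sigma*^2)-hat = Y1n - 2*Y2n + Y3n, computed in
    O(n^2) from the symmetric Gram matrix G[i, j] = <X_check_i, X_check_j>."""
    n = G.shape[0]
    G0 = G - np.diag(np.diag(G))          # zero the diagonal: only i != j terms
    Q = (G0**2).sum()                     # sum over i != j of G_ij^2
    s = G0.sum(axis=1)
    R = (s**2).sum() - Q                  # sum over distinct i, j, k of G_ij G_jk
    S = G0.sum()
    P4 = S**2 - 2 * Q - 4 * R             # sum over four pairwise distinct indices
    A2 = n * (n - 1)
    A3 = A2 * (n - 2)
    A4 = A3 * (n - 3)
    return Q / A2 - 2 * R / A3 + P4 / A4
```

The inclusion–exclusion identities can be checked against brute-force summation over index tuples for small $n$, which is how the sketch was validated.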

4. Simulation

This section evaluates the finite sample performance of the proposed test, including its size and power. The assessment is conducted through a series of simulation studies. Through numerical simulations, we will validate that the distribution of the proposed test statistic under the null hypothesis is consistent with the properties stated in Theorem 1. For each simulation, we create 1000 Monte Carlo samples. The basis expansion and FPCA are conducted using the R package fda.
To control both Type I and Type II error rates, the sample size must be adequately large; however, to keep the numerical simulations computationally tractable, it should not be excessively large. Consequently, the sample size $n$ in this study is set between 50 and 200. To validate the effectiveness of the proposed test, the parameters are set flexibly.
Here we compare the proposed test T n p with the chi-square test T n constructed by [18]. The cumulative percentage of total variance (CPV) method is used to estimate the number of principal components in T n . Let CPV, explained by the first m empirical functional principal components, be defined as follows:
$$\mathrm{CPV}(m) = \frac{\sum_{i=1}^{m} \hat\lambda_i}{\sum_{i=1}^{p} \hat\lambda_i},$$
where $\{\hat\lambda_i\}_{i=1}^{p}$ are the estimated eigenvalues of the covariance operator. The smallest value of $m$ for which $\mathrm{CPV}(m)$ surpasses the threshold of 95% is selected in this section. We denote by $p$ the number of basis functions used to fit the curves. The simulated data are generated from the following model:
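The CPV rule reduces to a cumulative-sum search over the estimated eigenvalues; a small sketch, with the function name `n_components_cpv` being our own:

```python
import numpy as np

def n_components_cpv(eigvals, threshold=0.95):
    """Smallest m whose cumulative proportion of variance reaches the
    threshold (95% here, as in the simulation section)."""
    lam = np.asarray(eigvals, dtype=float)
    cpv = np.cumsum(lam) / lam.sum()              # CPV(1), CPV(2), ...
    return int(np.searchsorted(cpv, threshold) + 1)
```

For instance, eigenvalues `[8, 1, 1]` give CPV values 0.8, 0.9, 1.0, so all three components are needed to reach 95%.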
$$Y_i = \int_0^1 \alpha(s) X_i(s)\,ds + g(U_i) + \varepsilon_i, \quad i = 1,2,\ldots,n,$$
where $g(U_i) = 2U_i$ or $g(U_i) = 2 + \sin(2\pi U_i)$, and $\{U_i, i = 1,2,\ldots,n\}$ are drawn independently from the uniform distribution on $(0,1)$. To analyze the impact of different error distributions, the following four distributions are considered: (1) $\varepsilon_i \sim N(0,1)$; (2) $\varepsilon_i \sim t(3)/\sqrt{3}$; (3) $\varepsilon_i \sim \Gamma(1,1) - 1$; (4) $\varepsilon_i \sim (\mathrm{lnorm}(0,1) - \sqrt{e})/\sqrt{e(e-1)}$. All results for $g(U_i) = 2U_i$ are presented in the Supplementary Materials.
We next report the simulation results for two data structures of the predictor X ( t ) .
1. The predictor is $X(t) = \sum_{j=1}^{50} \xi_j \phi_j(t)$, with $\xi_j$ normally distributed with mean 0 and variance $\lambda_j = 10\big((j-1/2)\pi\big)^{-2}$, and $\phi_j(t) = \sqrt{2}\sin\big((j-1/2)\pi t\big)$ for $j = 1,2,\ldots,50$. The slope function is $\alpha(t) = c\big(\sqrt{2}\sin(\pi t/2) + 3\sqrt{2}\sin(3\pi t/2)\big)$, where the coefficient $c$ ranges from 0 to 0.2; $c = 0$ corresponds to the null hypothesis. The number of basis functions used to fit the curves and the sample size are taken as $p = 11, 49$ and $n = 50, 100$. Under different error distributions, Table 1 and Table 2 report the empirical size and power of both tests for the two non-parametric functions at the nominal level $\alpha = 0.05$.
From Table 1 and Table 2, the following can be seen: (i) the performance of both tests remains consistent across the error distributions and non-parametric functions; (ii) because $T_{np}$ is intended for functional data beyond the reach of a few principal components, the power of the proposed test is somewhat lower than that of $T_n$ here; (iii) the power of each test increases with the sample size $n$ but is not significantly affected by increases in $p$. In fact, for the functional data structure of Simulation 1, the number of selected principal components is small regardless of the number of basis functions used to fit the functional data.
2. The functional predictor is constructed using the expansion in (7), with $\phi_k$ the Fourier basis functions on $[0,1]$: $\phi_1(t) = 1$, $\phi_2(t) = \sqrt{2}\sin(2\pi t)$, $\phi_3(t) = \sqrt{2}\cos(2\pi t)$, $\phi_4(t) = \sqrt{2}\sin(4\pi t)$, $\phi_5(t) = \sqrt{2}\cos(4\pi t), \ldots$. The first $p$ basis functions are used to generate the predictor and the slope function: $X_i(t) = \sum_{j=1}^{p} \xi_{ij}\phi_j(t)$ and $\alpha(t) = \sum_{j=1}^{p} \beta_j\phi_j(t)$, where $p = 11, 49, 201, 365$ and $n = 50, 100, 200$. The slope coefficients are $\beta_i = \|\beta\|/\sqrt{p}$, $i = 1,\ldots,p$, with $\|\beta\|^2 = c \cdot 10^{-2}$ and $c$ varying from 0 to 1; $c = 0$ corresponds to the case in which $H_0$ is true. The coefficients $\xi_{ij}$ of the predictor follow the moving average model:
$$\xi_{ij} = \rho_1 Z_{ij} + \rho_2 Z_{i(j+1)} + \cdots + \rho_T Z_{i(j+T-1)},$$
where the constant $T$ adjusts the degree of dependence among the elements of the predictor, and $\{Z_{i1}, \ldots, Z_{i(p+T-1)}\}$ are drawn independently from $N(0, I_{p+T-1})$ with $T = 10$. The $(j,k)$ element of the covariance matrix $\Sigma$ of the coefficient vector $\xi_i$ is $\sum_{l=1}^{T-|j-k|} \rho_l \rho_{l+|j-k|} I\{|j-k| < T\}$, where $\{\rho_k, k = 1,\ldots,T\}$ are generated independently from the uniform distribution $U(0,1)$.
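The moving-average design above can be generated with a few lines of vectorized code. The following is a sketch of the stated design under our reading of the garbled display ($\xi_{ij} = \sum_{l=1}^{T} \rho_l Z_{i(j+l-1)}$); the function name `generate_ma_scores` is hypothetical.

```python
import numpy as np

def generate_ma_scores(n, p, T=10, rng=None):
    """Coefficients xi_{ij} = sum_{l=1}^{T} rho_l Z_{i(j+l-1)} of the
    moving-average model in Simulation 2 (a sketch, not the paper's code)."""
    rng = np.random.default_rng(rng)
    rho = rng.uniform(0.0, 1.0, size=T)        # rho_1, ..., rho_T ~ U(0, 1)
    Z = rng.standard_normal((n, p + T - 1))    # i.i.d. N(0, 1) innovations
    xi = np.zeros((n, p))
    for l in range(T):
        xi += rho[l] * Z[:, l:l + p]           # shifted innovation windows
    return xi, rho
```

By construction each score has variance $\sum_{l=1}^{T} \rho_l^2$, which gives a simple empirical check of the generator.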
The bandwidth is chosen using cross-validation (CV). At a significance level of α = 0.05 , Table 3 delineates the empirical size and power of the two tests when the function g ( · ) is linear. Table 4 presents the results for the case where g ( · ) is a trigonometric function.
From Table 3 and Table 4, the number of basis functions used to fit the curves has a substantial impact on the tests. Specifically: (i) across the error distributions, as $p$ increases, the empirical size of $T_n$ significantly exceeds the nominal level, whereas the proposed test $T_{np}$ remains stable; (ii) the power of each test increases with the sample size $n$ and decreases as $p$ increases; (iii) the proposed test is robust across all scenarios in this simulation study. Indeed, for the functional data structure of Simulation 2, selecting too many principal components negates the effectiveness of FPCA-based test statistics, and the proposed test shows clear advantages (see the bold numbers in Table 3 and Table 4).
To more effectively verify the accuracy of the asymptotic theory underlying our proposed test statistic, Table 5 provides the mean and standard deviation (sd) of the test statistic under different scenarios. From Table 5, it is observed that when c = 0 , the mean of our proposed test statistic fluctuates around zero, and the standard deviation fluctuates around one. This aligns with the theoretical expectations. As c increases, the mean of the test statistic moves further away from zero, and the standard deviation moves further away from one, indicating a departure from the null hypothesis.
Furthermore, to verify the asymptotic theory of the proposed test, we consider the case $(n, p) = (200, 365)$. Figure 1 and Figure 2 display the null distributions and the q–q plots of $T_{np}$, corresponding to $g(u) = 2u$ and $g(u) = 2 + \sin(2\pi u)$, respectively. The null distributions are shown as dashed lines, while the solid lines are the density curves of the standard normal distribution.
For different $(n, p)$, Figure 3 and Figure 4 show the empirical power functions of the proposed test statistic under the four error distributions; $g(\cdot)$ is linear in Figure 3 and trigonometric in Figure 4. For $(n, p) = (200, 201), (100, 201), (200, 365)$, the empirical power functions are represented by solid, dashed, and dotted lines, respectively. From Figure 3 and Figure 4, the power rises rapidly even when $c$ increases only slightly; it is positively related to the sample size $n$ and inversely related to the magnitude of $p$; and the proposed test is stable under the different error distributions. These observations are consistent with the conclusions drawn from Table 3 and Table 4.
It is worth noting that, theoretically, a kernel function K ( · ) is sufficient if it satisfies the conditions of symmetry and Lipschitz continuity. In practical applications, however, the choice of kernel function should be based on the characteristics and requirements of the data. For instance, the Epanechnikov kernel is more suitable for bounded data, while the Gaussian kernel is better suited for data with long tails. In this simulation study, according to the given data setting, the Epanechnikov kernel was chosen. To compare the effects of the two kernels, we replaced the Epanechnikov kernel used to generate Figure 4 with a Gaussian kernel to produce Figure 5. From Figure 4 and Figure 5, it can be observed that the impact of the two kernels on the test is relatively minor.
The numerical simulations show that the proposed test performs well for the targeted data types. However, with larger sample sizes the simulations require considerable computational time, which is a limitation of the proposed test statistic. Additionally, its performance on datasets that violate assumptions (C1–C3) and (C7), for example when the real data do not lie in a Sobolev ellipsoid of order two, remains to be investigated.

5. Application

This section applies the proposed test to spectrometric data that have been described and analyzed in the literature (see [25,26]). The dataset is available at http://lib.stat.cmu.edu/datasets/tecator (accessed on 16 July 2024). Each meat sample is characterized by a 100-channel spectrum of absorbance, along with its moisture (water), fat, and protein contents. The absorbance is the negative base-10 logarithm of the transmittance measured by the spectrometer, and the three contents, measured in percent, are determined by analytic chemistry. The dataset comprises 240 samples, partitioned into 5 subsets for model validation and extrapolation studies. Here we use 215 samples, comprising the training and test samples drawn from the 5 subsets. The spectral measurements are curves $X_i(\cdot)$ of absorbance recorded at 100 equally spaced wavelengths from 850 nm to 1050 nm. Let $Y_i$ denote the fat content (the response variable), $Z_i$ the protein content, and $U_i$ the moisture content. Similar to [27], the following two models are used to describe the relationships among them:
$$Y_i = \int_{850}^{1050} \alpha(t) X_i(t)\,dt + g(Z_i) + \varepsilon_i, \qquad (10)$$
$$Y_i = \int_{850}^{1050} \alpha(t) X_i(t)\,dt + g(U_i) + \varepsilon_i. \qquad (11)$$
The present investigation focuses on testing $\alpha(t) = 0$ in models (10) and (11). The number of basis functions used to fit the curves is $p = 129$. Figure 6 shows the estimates of the slope function $\alpha(t)$ in models (10) and (11).
The calculation results are as follows: (i) for model (10), the value of the statistic is $T_{np} = 31.186$, with a p-value of essentially 0; (ii) for model (11), $T_{np} = 0.867$, with a p-value of 0.386. Hence, the test is significant for model (10) but not for model (11). This is also reflected in Figure 6: the estimated $\alpha(t)$ in the right panel is much smaller in magnitude than that in the left panel.

6. Conclusions

To test (2), this paper first provides a pseudo-estimate of the non-parametric function $g(u)$ using kernel methods with the slope function held fixed. Substituting the pseudo-estimate into the model converts the original model into a linear one, which allows the construction of the second-order U-statistic employed in this paper, drawing on the corresponding testing methods for functional linear models. The proposed test does not require estimating the covariance operator of the predictor function, and it is asymptotically normal under both the null hypothesis and the local alternative. Moreover, numerical simulations show that the proposed test outperforms the test constructed in [18] when functional data cannot be approximated by a few principal components. Finally, the proposed test is applied to real data to verify its feasibility.
Additionally, the proposed test can be adapted to the case of a functional response, which is a focus of our upcoming research. In practice, the proposed test requires the data to meet some technical conditions (C1–C3, C7); when data fail to satisfy these conditions, the viability of the test requires further investigation. Therefore, future work may focus on broadening the test's applicability and on optimizing the computation of the statistic for efficiency.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/math12162588/s1.

Author Contributions

Conceptualization, F.Z. and B.Z.; methodology, B.Z.; Validation, B.Z.; formal analysis, F.Z.; investigation, F.Z.; data curation, F.Z.; writing—original draft preparation, F.Z.; writing—review and editing, B.Z.; visualization, B.Z.; supervision, B.Z.; project administration, B.Z.; funding acquisition, B.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (nos. 12271370, 12301349), the Natural Science Foundation of Shanxi Province, China (no. 202203021222009).

Data Availability Statement

Data are contained within the article.

Acknowledgments

The authors are grateful to the editor, associate editor, and referees for their valuable feedback, which has significantly enhanced the quality of this manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Several lemmas are established to facilitate the proofs of Theorems 1 and 2. Without loss of generality, we assume that $\mu_0(t) = 0$ and $E[g(U)] = 0$. Let $C_n = \sqrt{\log(1/h)/(nh)} + h^2$. By the asymptotic theory of non-parametric estimation, the pseudo-estimate of the non-parametric function satisfies $\sup_{u\in\Omega} |\check g(u) - g(u)| = O_p(C_n)$. Denote
$$D_g(U_i) = g(U_i) - \check g(U_i), \quad D_{\mu_{it}} = \mu_{it} - \hat\mu_{it},$$
for $i = 1,2,\ldots,n$. Similarly to the lemmas in [21], the following lemmas are easy to derive.
Lemma A1.
If (C1), (C3), and (C4) hold, it can be demonstrated that for any square matrix, M,
$$\begin{aligned}
(i)\ & E\big[Z_1 Z_1^T M Z_1 Z_1^T\big] = M + M^T + \operatorname{tr}(M) I_p + \Delta \operatorname{diag}(M);\\
(ii)\ & E\big[Z_1 Z_2^T M Z_2 Z_1^T\big] = \operatorname{tr}(M) I_p;\\
(iii)\ & E\big\langle \langle X_1 - \mu_{1t}, \alpha\rangle (X_1 - \mu_{1t}),\ \langle X_2 - \mu_{2t}, \alpha\rangle (X_2 - \mu_{2t}) \big\rangle^2 = o\big(\operatorname{tr}(\Sigma^{*2})\big).
\end{aligned}$$
Lemma A2.
Given that conditions (C1–C3) and (C5–C9) are satisfied, the following results are obtained.
$$(i)\ E\langle \check X_1, \check X_2\rangle^4 = o\big(n \operatorname{tr}^2(\Sigma^{*2})\big); \quad (ii)\ E\langle C^*(\check X_1), \check X_1\rangle^2 = o\big(n \operatorname{tr}^2(\Sigma^{*2})\big).$$
Lemma A3.
If (C1–C9) hold, then we can obtain the following:
$$\begin{aligned}
(i)\ & E\langle \check X_1, \check X_1\rangle = O\big(\operatorname{tr}(\Sigma^*)\big); \quad (ii)\ E[\check Y_1^2] = O(1); \quad (iii)\ E\langle \bar{\check X}_{12}, \bar{\check X}_{12}\rangle = O\big(\operatorname{tr}(\Sigma^*)/n\big);\\
(iv)\ & E[\bar{\check Y}_{12}^2] = O(C_n^2); \quad (v)\ E\big[\langle \bar{\check X}_{12}, \bar{\check X}_{12}\rangle\, \bar{\check Y}_{12}^2\big] = O\big(\operatorname{tr}(\Sigma^*)/n^2\big),
\end{aligned}$$
where $\bar{\check X}_{ij}(t)$ and $\bar{\check Y}_{ij}$ denote the sample means of $\check X(t)$ and $\check Y$ computed without the $i$-th and $j$-th samples, for $i, j = 1,2,\ldots,n$. That is,
X ˇ ¯ i j ( t ) = 1 n 2 k i , j X ˇ k ( t ) , Y ˇ ¯ i j = 1 n 2 k i , j Y ˇ k .
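Computationally, these leave-two-out means need not be recomputed from scratch for every pair ( i , j ) ; they follow from the full-sample mean via the identity X ˇ ¯ i j = ( n X ˇ ¯ − X ˇ i − X ˇ j ) / ( n − 2 ) . A minimal sketch (the array names below are illustrative, not from the authors' code):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 8, 5
X = rng.standard_normal((n, p))   # row k holds X_k(t) evaluated on a grid

full_mean = X.mean(axis=0)
i, j = 2, 5
# leave-two-out mean via the identity (n * xbar - X_i - X_j) / (n - 2)
loo2 = (n * full_mean - X[i] - X[j]) / (n - 2)

# direct computation for comparison
mask = np.ones(n, dtype=bool)
mask[[i, j]] = False
direct = X[mask].mean(axis=0)
print(np.allclose(loo2, direct))  # → True
```

The identity avoids an O(n) pass over the data for each of the O(n²) pairs.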
Proof of Theorem 1. 
Rewrite
n n 2 Δ i j X ˇ P i j ( 1 ) + P i j ( 2 ) + P i j ( 3 ) + P i j ( 4 ) , n n 2 Δ i j Y ˇ L i j ( 1 ) + L i j ( 2 ) + L i j ( 3 ) + L i j ( 4 ) ,
where
P i j ( 1 ) = 1 1 n X ˇ i , X ˇ j , P i j ( 2 ) = 1 2 n X ˇ i , X ˇ i + X ˇ j , X ˇ j 2 E X ˇ 1 , X ˇ 1 , P i j ( 3 ) = 1 2 n X ˇ i + X ˇ j , X ˇ ¯ i j , P i j ( 4 ) = 1 2 n X ˇ ¯ i j , X ˇ ¯ i j E [ X ˇ 1 , X ˇ 1 ] n 2 , L i j ( 1 ) = 1 1 n Y ˇ i Y ˇ j , L i j ( 2 ) = 1 2 n Y ˇ i 2 + Y ˇ j 2 2 E Y ˇ 1 2 , L i j ( 3 ) = 1 2 n Y ˇ i + Y ˇ j Y ˇ ¯ i j , L i j ( 4 ) = 1 2 n Y ˇ ¯ i j 2 E Y ˇ 1 2 n 2 ,
then the expectation of the test statistic T n p is as follows:
E [ T n p ] = j < i l , k = 1 4 E P i j ( l ) L i j ( k ) .
To prove conclusion (i) in Theorem 1, E P i j ( l ) L i j ( k ) needs to be calculated for each pair ( l , k ) , l , k = 1 , 2 , 3 , 4 . Since the calculations for the different cases of ( l , k ) are similar, we mainly consider the case ( l , k ) = ( 1 , 1 ) ,
E P i j ( 1 ) L i j ( 1 ) G 1 ( 1 , 1 ) + G 2 ( 1 , 1 ) + G 3 ( 1 , 1 ) + G 4 ( 1 , 1 ) + G 5 ( 1 , 1 ) + G 6 ( 1 , 1 ) ,
where
G 1 ( 1 , 1 ) = ( n 1 ) 2 n 2 E X 1 μ ^ 1 t , X 2 μ ^ 2 t D g ( U 1 ) D g ( U 2 ) , G 2 ( 1 , 1 ) = ( n 1 ) 2 n 2 E [ X 1 μ ^ 1 t , α X 1 μ ^ 1 t , X 2 μ ^ 2 t X 2 μ ^ 2 t , α ] , G 3 ( 1 , 1 ) = ( n 1 ) 2 n 2 E [ X 1 μ ^ 1 t , X 2 μ ^ 2 t ε 1 ε 2 ] , G 4 ( 1 , 1 ) = 2 ( n 1 ) 2 n 2 E X 1 μ ^ 1 t , X 2 μ ^ 2 t X 2 μ ^ 2 t , α D g ( U 1 ) , G 5 ( 1 , 1 ) = 2 ( n 1 ) 2 n 2 E X 1 μ ^ 1 t , X 2 μ ^ 2 t D g ( U 1 ) ε 2 , G 6 ( 1 , 1 ) = 2 ( n 1 ) 2 n 2 E X 1 μ ^ 1 t , X 2 μ ^ 2 t X 1 μ ^ 1 t , α ε 2 .
We analyze the above six terms one by one. Firstly, for the first term, we have the following:
E X 1 μ ^ 1 t , X 2 μ ^ 2 t D g ( U 1 ) D g ( U 2 ) = 2 E [ D μ 1 t , X 2 μ 2 t D g ( U 1 ) D g ( U 2 ) ] + E [ D μ 1 t , D μ 2 t D g ( U 1 ) D g ( U 2 ) ] = O W 12 W 21 ξ 2 T ξ 2 μ T ( U 2 ) ξ 2 g 2 ( U 1 ) = O tr ( Σ * ) / n 2 h ,
then G 1 ( 1 , 1 ) = o tr ( Σ * 2 ) / n holds. For the second term, we have the following:
E [ X 1 μ ^ 1 t , α X 1 μ ^ 1 t , X 2 μ ^ 2 t X 2 μ ^ 2 t , α ] = E [ X 1 μ 1 t , α X 1 μ 1 t , X 2 μ 2 t X 2 μ 2 t , α ] + 2 E D μ 1 t , α D μ 1 t , D μ 2 t D μ 2 t , α + 2 E X 1 μ 1 t , α D μ 1 t , D μ 2 t D μ 2 t , α + 2 E X 1 μ 1 t , α X 2 μ 2 t , D μ 1 t D μ 2 t , α + 2 E X 1 μ 1 t , α X 1 μ 1 t , D μ 2 t D μ 2 t , α + 2 E D μ 1 t , α X 1 μ 1 t , D μ 2 t D μ 2 t , α + 2 E X 1 μ 1 t , α D μ 1 t , D μ 2 t X 2 μ 2 t , α = C * ( α ) 2 + O 2 β Σ * 2 β / n h + β T Σ * β tr ( Σ * ) / n 2 .
Combined with (C1), (C3), and (C9), G 2 ( 1 , 1 ) = C * ( α ) 2 + o tr ( Σ * 2 ) / n holds. Since the error term ε i has mean zero and is independent of the predictors, both the third term G 3 ( 1 , 1 ) and the sixth term G 6 ( 1 , 1 ) are zero. For the two remaining cross terms, G 4 ( 1 , 1 ) and G 5 ( 1 , 1 ) , we need to show that they are higher-order infinitesimals of tr ( Σ * 2 ) / n . In fact,
E X 1 μ ^ 1 t , X 2 μ ^ 2 t X 2 μ ^ 2 t , α D g ( U 1 ) = E X 1 μ 1 t , α X 1 μ 1 t , D μ 2 t D g ( U 2 ) + E D μ 1 t , α D μ 1 t , X 2 μ 2 t D g ( U 2 ) + E D μ 1 t , α X 1 μ 1 t , D μ 2 t D g ( U 2 ) + E X 1 μ 1 t , α D μ 1 t , D μ 2 t D g ( U 2 ) + E D μ 1 t , α D μ 1 t , D μ 2 t D g ( U 2 ) = O n E [ W 23 2 β T ( ξ 1 ξ 1 T ξ 1 μ T ( U 1 ) ) ξ 3 g ( U 3 ) ] = O E [ β T Σ * ( U 1 ) μ ( U 2 ) g ( U 2 ) f 1 ( U 2 ) ] / n h = o tr ( Σ * 2 ) / n .
Finally, for G 5 ( 1 , 1 ) , we have the following:
E X 1 μ ^ 1 t , X 2 μ ^ 2 t D g ( U 1 ) ε 2 = E [ X 1 μ 1 t + D μ 1 t , X 2 μ 2 t + D μ 2 t ( W 12 ε 2 2 ) ] = E [ ( ξ 1 μ ( U 1 ) ) T ξ 1 W 21 W 12 σ 2 ] + E [ W 12 2 ξ 2 T ( ξ 2 μ ( U 1 ) ) σ 2 ] = O tr ( Σ * ) / n 2 h .
Using (C3) and the fact that tr 2 ( Σ * ) ≤ p tr ( Σ * 2 ) , we obtain tr ( Σ * ) / tr ( Σ * 2 ) = o ( n h ) , i.e., G 5 ( 1 , 1 ) = o tr ( Σ * 2 ) / n . Then, the following can be seen:
E P i j ( 1 ) L i j ( 1 ) = C * ( α ) 2 + o tr ( Σ * 2 ) / n .
Conclusion (i) of Theorem 1 then follows from the calculation of E P i j ( 1 ) L i j ( 1 ) and the proof of Theorem 1 in [21]. Conclusion (ii) follows from arguments in the proof of Theorem 2 and is therefore omitted here. □
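The trace inequality tr 2 ( Σ * ) ≤ p tr ( Σ * 2 ) invoked above is a one-line consequence of the Cauchy–Schwarz inequality applied to the eigenvalues λ 1 , … , λ p of Σ * :

```latex
\operatorname{tr}^2(\Sigma^{*})
  = \Bigl(\sum_{k=1}^{p} \lambda_k \cdot 1\Bigr)^{2}
  \le \Bigl(\sum_{k=1}^{p} \lambda_k^{2}\Bigr)\Bigl(\sum_{k=1}^{p} 1^{2}\Bigr)
  = p \operatorname{tr}(\Sigma^{*2}).
```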
Proof of Theorem 2. 
By Theorem 1, we have the following:
n ( E ( T n p ) C * ( α ) 2 ) σ 2 2 tr ( Σ * 2 ) = o ( 1 ) ,
then we only need to prove the following:
n ( T n p E T n p ) σ 2 2 tr ( Σ * 2 ) D N ( 0 , 1 ) .
We denote T n p ( k , l ) = n n 2 − 1 ∑ i > j [ P i j ( k ) L i j ( l ) − E P i j ( k ) L i j ( l ) ] with k , l = 1 , 2 , 3 , 4 . Then, the following decomposition holds:
n ( T n p E T n p ) = k = 1 4 l = 1 4 T n p ( k , l ) .
To derive the asymptotic properties of the above equation, we determine the asymptotic order of each term T n p ( k , l ) . These terms are divided into the following two groups according to how they are treated.
  • Group 1: ( k , l ) = ( 1 , 1 ) , ( 1 , 2 ) , ( 1 , 3 ) , ( 1 , 4 ) , ( 2 , 1 ) , ( 2 , 3 ) , ( 3 , 1 ) , ( 3 , 3 ) , ( 4 , 1 ) , ( 4 , 3 ) .
  • Group 2: ( k , l ) = ( 2 , 2 ) , ( 2 , 4 ) , ( 3 , 2 ) , ( 3 , 4 ) , ( 4 , 2 ) , ( 4 , 4 ) .
Since the methods within each group are similar, we consider the representative cases ( k , l ) = ( 1 , 1 ) and ( k , l ) = ( 2 , 2 ) in detail. Firstly, for T n p ( 1 , 1 ) , we can rewrite the following:
T n p ( 1 , 1 ) T n p , 1 ( 1 , 1 ) + T n p , 2 ( 1 , 1 ) + T n p , 3 ( 1 , 1 ) + T n p , 4 ( 1 , 1 ) + T n p , 5 ( 1 , 1 ) + T n p , 6 ( 1 , 1 ) + T n p , 7 ( 1 , 1 ) + T n p , 8 ( 1 , 1 ) + T n p , 9 ( 1 , 1 ) ,
where
T n p , 1 ( 1 , 1 ) = 2 ( n 1 ) n 2 j < i { X i μ ^ i t , X j μ ^ j t D g ( U i ) D g ( U j ) E [ X i μ ^ i t , X j μ ^ j t D g ( U i ) D g ( U j ) ] } , T n p , 2 ( 1 , 1 ) = 2 ( n 1 ) n 2 j < i { X i μ ^ i t , X j μ ^ j t X j μ ^ j t , α D g ( U i ) E [ X i μ ^ i t , X j μ ^ j t ] X j μ ^ j t , α D g ( U i ) } , T n p , 3 ( 1 , 1 ) = 2 ( n 1 ) n 2 j < i { X i μ ^ i t , X j μ ^ j t X i μ ^ i t , α D g ( U j ) E [ X i μ ^ i t , X j μ ^ j t ] X i μ ^ i t , α D g ( U j ) } , T n p , 4 ( 1 , 1 ) = 2 ( n 1 ) n 2 j < i { X i μ ^ i t , X j μ ^ j t D g ( U i ) ε j E [ X i μ ^ i t , X j μ ^ j t D g ( U i ) ε j ] } , T n p , 5 ( 1 , 1 ) = 2 ( n 1 ) n 2 j < i { X i μ ^ i t , X j μ ^ j t D g ( U j ) ε i E [ X i μ ^ i t , X j μ ^ j t D g ( U j ) ε i ] } , T n p , 6 ( 1 , 1 ) = 2 ( n 1 ) n 2 j < i X i μ ^ i t , X j μ ^ j t X j μ ^ j t , α ε i , T n p , 7 ( 1 , 1 ) = 2 ( n 1 ) n 2 j < i X i μ ^ i t , X j μ ^ j t X i μ ^ i t , α ε j , T n p , 8 ( 1 , 1 ) = 2 ( n 1 ) n 2 j < i { X i μ ^ i t , α X i μ ^ i t , X j μ ^ j t X j μ ^ j t , α E [ X i μ ^ i t , α X i μ ^ i t , X j μ ^ j t X j μ ^ j t , α ] } , T n p , 9 ( 1 , 1 ) = 2 ( n 1 ) n 2 j < i X i μ ^ i t , X j μ ^ j t ε i ε j .
To prove (A1), we shall prove the following:
T n p ( 1 , 1 ) E T n p ( 1 , 1 ) σ 2 2 tr ( Σ * 2 ) = T n p , 91 ( 1 , 1 ) σ 2 2 tr ( Σ * 2 ) + o p ( 1 ) ,
where T n p , 91 ( 1 , 1 ) = 2 ( n 1 ) n 2 j < i X i μ i t , X j μ j t ε i ε j .
It is easy to see that the nine terms on the right-hand side of (A2) all have mean zero. To determine their asymptotic orders, we bound their second moments. Since the calculations for the first eight terms are similar, we take the first term T n p , 1 ( 1 , 1 ) as an example.
E ( T n p , 1 ( 1 , 1 ) ) 2 = n 2 n 2 2 i = 2 n E Q i , 1 ( 1 , 1 ) Q i , 1 ( 1 , 1 ) + n 2 n 2 2 i = 2 n j i E Q i , 1 ( 1 , 1 ) Q j , 1 ( 1 , 1 ) + o ( tr ( Σ * 2 ) ) ,
where
Q i , 1 ( 1 , 1 ) = j = 1 i 1 X i μ ^ i t , X j μ ^ j t D g ( U i ) D g ( U j ) .
For i ≠ j , let us calculate E Q i , 1 ( 1 , 1 ) Q i , 1 ( 1 , 1 ) and E Q i , 1 ( 1 , 1 ) Q j , 1 ( 1 , 1 ) .
E Q i , 1 ( 1 , 1 ) Q i , 1 ( 1 , 1 ) = ( i 1 ) E X 1 μ ^ 1 t , X 2 μ ^ 2 t X 1 μ ^ 1 t , X 2 μ ^ 2 t D g ( U 1 ) 2 D g ( U 2 ) 2 + ( i 1 ) ( i 2 ) E X 1 μ ^ 1 t , X 2 μ ^ 2 t X 1 μ ^ 1 t , X 3 μ ^ 3 t D g ( U 1 ) 2 D g ( U 2 ) D g ( U 3 ) ( i 1 ) B 11 ( 1 , 1 ) + ( i 1 ) ( i 2 ) B 12 ( 1 , 1 ) , E Q i , 1 ( 1 , 1 ) Q j , 1 ( 1 , 1 ) = Ξ 1 E X 1 μ ^ 1 t , X 2 μ ^ 2 t X 1 μ ^ 1 t , X 3 μ ^ 3 t D g ( U 1 ) 2 D g ( U 2 ) D g ( U 3 ) + Ξ 2 E X 1 μ ^ 1 t , X 2 μ ^ 2 t X 3 μ ^ 3 t , X 4 μ ^ 4 t D g ( U 1 ) D g ( U 2 ) D g ( U 3 ) D g ( U 4 ) Ξ 1 B 12 ( 1 , 1 ) + Ξ 2 B 13 ( 1 , 1 ) ,
where Ξ 1 = ( i 1 ) ( j 1 ) , Ξ 2 = ( i 1 ) ( j 1 ) ( i 1 ) ( j 1 ) .
Using the Cauchy–Schwarz inequality and Lemma A2, we can obtain the following:
B 11 ( 1 , 1 ) = O ( C n 4 n tr ( Σ * 2 ) ) .
For B 12 ( 1 , 1 ) and B 13 ( 1 , 1 ) ,
B 12 ( 1 , 1 ) = E X 1 μ ^ 1 t , X 2 μ ^ 2 t X 1 μ ^ 1 t , X 3 μ ^ 3 t D g ( U 1 ) 2 D g ( U 2 ) D g ( U 3 ) = O ( E W 23 tr ( Σ * ( U 1 ) Σ * ( U 3 ) ) D g ( U 1 ) 2 D g ( U 2 ) D g ( U 3 ) + E W 13 W 21 tr ( Σ * ( U 1 ) Σ * ( U 3 ) ) D g ( U 1 ) 2 D g ( U 2 ) D g ( U 3 ) + E W 12 W 13 tr ( Σ * ( U 1 ) ) tr ( Σ * ( U 3 ) ) D g ( U 1 ) 2 D g ( U 2 ) D g ( U 3 ) + E μ T ( U 1 ) μ ( U 2 ) μ T ( U 3 ) μ ( U 1 ) D g ( U 1 ) 2 D g ( U 2 ) D g ( U 3 ) ) = O ( E [ tr ( Σ * ( U 1 ) Σ * ( U 2 ) ) g 2 ( U 1 ) g ( U 2 ) ] / n 2 + E 2 [ tr ( Σ * ( U 1 ) ) g 2 ( U 1 ) ] / n 3 + E [ μ T ( U 2 ) μ ( U 1 ) μ T ( U 1 ) μ ( U 2 ) g ( U 1 ) / n 2 ] + E [ μ T ( U 2 ) μ ( U 1 ) μ T ( U 1 ) μ ( U 2 ) g 2 ( U 1 ) g 2 ( U 2 ) / n 2 ] ) , B 13 ( 1 , 1 ) = O ( E [ D μ T ( U 1 ) D μ ( U 2 ) ( μ ( U 3 ) μ ^ ( U 3 ) ) T ( μ ( U 4 ) μ ^ ( U 4 ) ) D g ( U 1 ) D g ( U 2 ) D g ( U 3 ) D g ( U 4 ) ] ) = O μ T ( U 1 ) μ ( U 2 ) μ T ( U 3 ) μ ( U 4 ) g ( U 1 ) g ( U 2 ) g ( U 3 ) g ( U 4 ) / n 2 .
So, we have T n p , 1 ( 1 , 1 ) = o p tr ( Σ * 2 ) . Applying similar arguments to those for T n p , 1 ( 1 , 1 ) , the terms { T n p , k ( 1 , 1 ) , k = 2 , … , 8 } are all o p ( tr ( Σ * 2 ) ) . For T n p , 9 ( 1 , 1 ) , we rewrite the following:
T n p , 9 ( 1 , 1 ) T n p , 91 ( 1 , 1 ) + T n p , 92 ( 1 , 1 ) + T n p , 93 ( 1 , 1 ) + T n p , 94 ( 1 , 1 ) ,
where
T n p , 91 ( 1 , 1 ) = 2 ( n 1 ) n 2 i = 2 n j = 1 i 1 X i μ i t , X j μ j t ε i ε j , T n p , 92 ( 1 , 1 ) = 2 ( n 1 ) n 2 i = 2 n j = 1 i 1 X i μ i t , D μ j t ε i ε j , T n p , 93 ( 1 , 1 ) = 2 ( n 1 ) n 2 i = 2 n j = 1 i 1 X j μ j t , D μ i t ε i ε j , T n p , 94 ( 1 , 1 ) = 2 ( n 1 ) n 2 i = 2 n j = 1 i 1 D μ i t , D μ j t ε i ε j .
Since the above four terms all have mean zero, to prove that (A3) holds, it suffices to verify that the second moments of T n p , 9 k ( 1 , 1 ) , k = 2 , 3 , 4 , are higher-order infinitesimals of tr ( Σ * 2 ) . In fact,
E [ T n p , 92 ( 1 , 1 ) ] 2 = E [ T n p , 93 ( 1 , 1 ) ] 2 = 4 ( n 1 ) 2 σ 4 n 4 i = 2 n ( i 1 ) E X 1 μ 1 t , D μ 2 t 2 = O E [ X 1 μ 1 t , D μ 2 t X 1 μ 1 t , D μ 2 t ] = O n E [ W 23 2 ξ 3 T Σ * ( U 1 ) ξ 3 ] = o ( tr ( Σ * 2 ) ) ,
E [ T n p , 94 ( 1 , 1 ) ] 2 = O E [ D μ i t , D μ j t D μ i t , D μ j t ] = O E [ tr ( Σ * ( U 3 ) μ ( U 1 ) μ T ( U 1 ) f 1 ( U 3 ) ) ] / n h + O tr 2 ( Σ * ) / n 2 h = o ( tr ( Σ * 2 ) ) .
Then, Equation (A3) holds. The terms in Group 2, i.e., ( k , l ) = ( 2 , 2 ) , ( 2 , 4 ) , ( 3 , 2 ) , ( 3 , 4 ) , ( 4 , 2 ) , ( 4 , 4 ) , can be handled analogously; here, we only consider T n p ( 2 , 2 ) . By careful calculation, we have the following:
E [ ( X ˇ 1 , X ˇ 1 + X ˇ 2 , X ˇ 2 2 E ( X ˇ 1 , X ˇ 1 ) ) 2 ] = O ( E [ X 1 μ 1 t , X 1 μ 1 t 2 ] + E 2 [ X 1 μ 1 t , X 1 μ 1 t ] + E [ X 1 μ 1 t , X 1 μ 1 t X 2 μ 2 t , X 2 μ 2 t ] ) = O ( E [ ( ξ 1 μ ( U 1 ) ) T ( ξ 1 μ ( U 1 ) ) ( ξ 1 μ ( U 1 ) ) T ( ξ 1 μ ( U 1 ) ) ] + E [ ( ξ 1 μ ( U 1 ) ) T ( ξ 1 μ ( U 1 ) ) ( ξ 2 μ ( U 2 ) ) T ( ξ 2 μ ( U 2 ) ) ] + E 2 [ ( ξ 1 μ ( U 1 ) ) T ( ξ 1 μ ( U 1 ) ) ] ) = O ( E [ 2 tr ( Σ * 2 ( U 1 ) ) + tr 2 ( Σ * ) + Δ tr ( diag ( Γ T ( U 1 ) Γ ( U 1 ) ) Γ T ( U 1 ) Γ ( U 1 ) ) ] ) .
Using the fact that
E [ tr ( diag ( Γ T ( U 1 ) Γ ( U 1 ) ) Γ T ( U 1 ) Γ ( U 1 ) ) ] = E [ tr ( Σ * 2 ( U 1 ) ) ] = O ( tr ( Σ * 2 ) ) ,
we have the following:
E [ ( X ˇ 1 , X ˇ 1 + X ˇ 2 , X ˇ 2 2 E ( X ˇ 1 , X ˇ 1 ) ) 2 ] = O ( tr 2 ( Σ * ) + tr ( Σ * 2 ) ) .
In addition, by a simple calculation, we have the following:
E [ ( Y ˇ 1 2 + Y ˇ 2 2 2 E [ Y ˇ 1 2 ] ) 2 ] = 2 E [ ( Y ˇ 1 2 ) 2 ] + 2 E [ Y ˇ 1 2 Y ˇ 2 2 ] 4 E 2 [ Y ˇ 1 2 ] = O ( 1 ) .
Combining (A4) and (A5) with the Cauchy–Schwarz inequality, we have the following:
E | T n p ( 2 , 2 ) | 1 4 n E [ ( X ˇ 1 , X ˇ 1 + X ˇ 2 , X ˇ 2 2 E ( X ˇ 1 , X ˇ 1 ) ) 2 ] E [ Y ˇ 1 2 + Y ˇ 2 2 2 E ( Y ˇ 1 2 ) ] 2 = o tr ( Σ * 2 ) .
We denote T ˇ n p ≜ n 2 − 1 / 2 ∑ i = 2 n ∑ j = 1 i − 1 X i − μ i t , X j − μ j t ε i ε j . By condition (C1), we only need to consider T ˇ n p = n 2 − 1 / 2 ∑ i = 2 n ∑ j = 1 i − 1 ( ξ i − μ ( U i ) ) T ( ξ j − μ ( U j ) ) ε i ε j . Then, by Slutsky's theorem, Theorem 2 will be proved once the following conclusion is established.
T ˇ n p var ( T ˇ n p ) D N ( 0 , 1 ) .
By some simple calculations, we have var ( T ˇ n p ) = σ 4 tr ( Σ * 2 ) . Let Z n i = ∑ j = 1 i − 1 X i − μ i t , X j − μ j t ε i ε j / n 2 1 / 2 , v n i = E [ Z n i 2 | F i − 1 ] , and X u i = ( ξ i , U i , ε i ) T , where F i is the σ-algebra generated by { X u k , k = 1 , … , i } , and V n = ∑ i = 2 n v n i . The condition E [ Z n i | F i − 1 ] = 0 is readily verified, and the sequence { ∑ i = 2 j Z n i , F j : 2 ≤ j ≤ n } constitutes a mean-zero martingale. The martingale central limit theorem applies once the following two conditions are verified:
V n / var ( T ˇ n p ) → P 1 , as n → ∞ ;
∑ i = 2 n σ − 4 tr − 1 ( Σ * 2 ) E { Z n i 2 I ( | Z n i | > η σ 2 tr 1 / 2 ( Σ * 2 ) ) | F i − 1 } → P 0 for any η > 0 .
Note that
v n i = σ 2 n 2 [ ∑ j = 1 i − 1 ε j 2 ( ξ j − μ ( U j ) ) T Σ * ( ξ j − μ ( U j ) ) + 2 ∑ k < l < i ε k ε l ( ξ k − μ ( U k ) ) T Σ * ( ξ l − μ ( U l ) ) ] .
Then, we define the following:
V n / var ( T ˇ n p ) ≜ C n 1 + C n 2 ,
where
C n 1 = 1 n 2 σ 2 tr ( Σ * 2 ) ∑ j < i ε j 2 ( ξ j − μ ( U j ) ) T Σ * ( ξ j − μ ( U j ) ) , C n 2 = 2 n 2 σ 2 tr ( Σ * 2 ) ∑ k < l < i ε k ε l ( ξ k − μ ( U k ) ) T Σ * ( ξ l − μ ( U l ) ) .
The equality E [ C n 1 ] = 1 can be readily confirmed, and we have the following:
var ( C n 1 ) = E [ C n 1 2 ] 1 = 1 n 4 ( σ 4 Σ * 2 ) 2 i = 2 n ( ( i 1 ) E [ ( ξ 1 μ ( U 1 ) ) T Σ * ( ξ 1 μ ( U 1 ) ) ( ξ 1 μ ( U 1 ) ) T Σ * ( ξ 1 μ ( U 1 ) ) σ 4 ε 1 4 ] + ( i 1 ) ( i 2 ) E [ ( ξ 1 μ ( U 1 ) ) T Σ * ( ξ 1 μ ( U 1 ) ) ( ξ 2 μ ( U 2 ) ) T Σ * ( ξ 2 μ ( U 2 ) ) σ 4 ε 1 2 ε 2 2 ] ) + 1 n 4 ( σ 4 Σ * 2 ) 2 i = 2 n j i ( ( i 1 ) ( j 1 ) E [ ( ξ 1 μ ( U 1 ) ) T Σ * ( ξ 1 μ ( U 1 ) ) ( ξ 1 μ ( U 1 ) ) T Σ * ( ξ 1 μ ( U 1 ) ) σ 4 ε 1 4 ] + ( ( i 1 ) ( j 1 ) ( i 1 ) ( j 1 ) ) E [ ( ξ 1 μ ( U 1 ) ) T Σ * ( ξ 1 μ ( U 1 ) ) ( ξ 2 μ ( U 2 ) ) T Σ * ( ξ 2 μ ( U 2 ) ) σ 4 ε 1 2 ε 2 2 ] ) 1 = E [ ε 1 4 ] n σ 4 tr 2 ( Σ * 2 ) O ( E [ ( ξ 1 μ ( U 1 ) ) T Σ * ( ξ 1 μ ( U 1 ) ) ( ξ 1 μ ( U 1 ) ) T Σ * ( ξ 1 μ ( U 1 ) ) ] ) = E [ ε 1 4 ] n σ 4 tr 2 ( Σ * 2 ) O ( E [ tr ( Σ * ( U 1 ) Σ * Σ * ( U 1 ) Σ * ) ] + tr 2 ( Σ * ( U 1 ) Σ * ) + Δ tr ( diag ( Γ T ( U 1 ) Σ * Γ ( U 1 ) ) Γ T ( U 1 ) Σ * Γ ( U 1 ) ) ) = E [ ε 1 4 ] n σ 4 tr 2 ( Σ * 2 ) O ( tr ( Σ * 4 ) + tr 2 ( Σ * 2 ) ) ,
By (C2), we have C n 1 P 1 . Similarly, we can obtain E [ C n 2 ] = 0 , and we have the following:
var ( C n 2 ) = E [ C n 2 2 ] = O ( 2 n 2 2 tr 2 ( Σ * 2 ) i = 2 n ( ( i 1 ) ( i 2 ) E [ ( ξ 1 μ ( U 1 ) ) T Σ * ( ξ 2 μ ( U 2 ) ) ( X 2 μ ( U 2 ) ) T Σ * ( ξ 1 μ ( U 1 ) ) ] + i = 2 n j i ( i 1 ) ( j 1 ) ( ( i 1 ) ( j 1 ) 1 ) E [ ( ξ 1 μ ( U 1 ) ) T Σ * ( ξ 2 μ ( U 2 ) ) ( ξ 2 μ ( U 2 ) ) T Σ * ( ξ 1 μ ( U 1 ) ) ] ) ) = O tr ( Σ * 4 ) tr 2 ( Σ * 2 ) .
Combined with tr ( Σ * 4 ) = o ( tr 2 ( Σ * 2 ) ) , we have C n 2 → P 0 . Thus, Equation (A6) holds. Finally, we only need to prove (A7). Leveraging the law of large numbers and the fact that E [ Z n i 2 I ( | Z n i | > η σ 2 tr 1 / 2 ( Σ * 2 ) ) | F i − 1 ] ≤ E ( Z n i 4 | F i − 1 ) / ( η 2 σ 4 tr ( Σ * 2 ) ) , it suffices to show that ∑ i = 2 n E ( Z n i 4 ) = o ( tr 2 ( Σ * 2 ) ) . Through straightforward computations, we obtain the following result:
i = 2 n E [ Z n i 4 ] = 1 n 2 2 j < i E [ ( ( ξ i μ ( U i ) ) T ( ξ j μ ( U j ) ) ) 4 ε i 4 ε j 4 ] + i = 2 n E [ Z n i 4 ] = 1 n 2 2 j < i k j E [ ( ( ξ i μ ( U i ) ) T ( ξ j μ ( U j ) ) ) 2 ( ( ξ i μ ( U i ) ) T ( ξ k μ ( U k ) ) ) 2 ε i 4 ε j 2 ε k 2 ] = E 2 ( ε 1 4 ) n 2 2 i = 2 n ( i 1 ) E [ ( ξ 1 μ ( U 1 ) ) T ( ξ 2 μ ( U 2 ) ) ] 4 + 3 σ 4 E ( ε 1 4 ) n 2 2 i = 2 n ( i 1 ) ( i 2 ) E ( ( ξ 1 μ ( U 1 ) ) T ( ξ 2 μ ( U 2 ) ) ) 2 ( ( ξ 1 μ ( U 1 ) ) T ( ξ 3 μ ( U 3 ) ) 2 ) = O E [ ( ξ 1 μ ( U 1 ) ) T ( ξ 2 μ ( U 2 ) ) ] 4 n 2 + O E [ ( ξ 1 μ ( U 1 ) ) T Σ * ( ξ 1 μ ( U 1 ) ) ] 2 n
Combining (C2) and Lemma A2, Equation (A7) holds. This completes the proof of Theorem 2. □
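As an illustrative check outside the proof, the asymptotic standard normality of the leading term T ˇ n p / var ( T ˇ n p ) 1 / 2 can be probed by Monte Carlo in a simplified setting with Σ * = I p , μ ≡ 0 , and N(0,1) errors. Everything below is a sketch under those assumptions, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, reps = 100, 50, 500
tr_Sigma2 = float(p)   # tr(Sigma*^2) = p when Sigma* is the identity
sigma2 = 1.0           # error variance

stats = np.empty(reps)
for r in range(reps):
    xi = rng.standard_normal((n, p))   # scores xi_i, already centered
    eps = rng.standard_normal(n)       # i.i.d. N(0,1) errors
    Z = xi * eps[:, None]
    G = Z @ Z.T
    # off-diagonal half-sum = sum over i<j of <xi_i, xi_j> * eps_i * eps_j
    pair_sum = (G.sum() - np.trace(G)) / 2.0
    T = pair_sum / np.sqrt(n * (n - 1) / 2.0)
    stats[r] = T / (sigma2 * np.sqrt(tr_Sigma2))

# mean and sd should be close to 0 and 1, respectively
print(round(stats.mean(), 2), round(stats.std(), 2))
```

With 500 replications, the empirical mean and standard deviation land near 0 and 1, mirroring the null behavior reported in Table 5 for c = 0.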

References

  1. Crainiceanu, C.M.; Staicu, A.M.; Di, C.Z. Generalized multilevel functional regression. J. Am. Stat. Assoc. 2009, 104, 1550–1561.
  2. Wang, J.; Zhou, F.; Li, C.; Yin, N.; Liu, H.; Zhuang, B.; Huang, Q.; Wen, Y. Gene Association Analysis of Quantitative Trait Based on Functional Linear Regression Model with Local Sparse Estimator. Genes 2023, 14, 834.
  3. Kokoszka, P.; Miao, H.; Zhang, X. Functional dynamic factor model for intraday price curves. J. Financ. Econom. 2014, 13, 456–477.
  4. Rigueira, X.; Araújo, M.; Martínez, J.; García-Nieto, P.J.; Ocarranza, I. Functional Data Analysis for the Detection of Outliers and Study of the Effects of the COVID-19 Pandemic on Air Quality: A Case Study in Gijón, Spain. Mathematics 2022, 10, 2374.
  5. Yao, F.; Müller, H.G. Functional quadratic regression. Biometrika 2010, 97, 49–64.
  6. Lian, H. Functional partial linear model. J. Nonparametr. Stat. 2011, 23, 115–128.
  7. Zhou, J.; Chen, M. Spline estimators for semi-functional linear model. Stat. Probab. Lett. 2012, 82, 505–513.
  8. Tang, Q. Estimation for semi-functional linear regression. Statistics 2015, 49, 1262–1278.
  9. Zhang, Y.; Wu, Y. Robust hypothesis testing in functional linear models. J. Stat. Comput. Simul. 2023, 93, 2563–2581.
  10. Kokoszka, P.; Maslova, I.; Sojka, J.; Zhu, L. Testing for lack of dependence in the functional linear model. Can. J. Stat. 2008, 36, 207–222.
  11. James, G.M.; Wang, J.; Zhu, J. Functional linear regression that’s interpretable. Ann. Stat. 2009, 37, 2083–2108.
  12. Shin, H. Partial functional linear regression. J. Stat. Plan. Inference 2009, 139, 3405–3418.
  13. Yu, P.; Zhang, Z.; Du, J. A test of linearity in partial functional linear regression. Metrika 2016, 79, 953–969.
  14. Hu, H.; Zhang, R.; Yu, Z.; Lian, H.; Liu, Y. Estimation and testing for partially functional linear errors-in-variables models. J. Multivar. Anal. 2019, 170, 296–314.
  15. Smaga, Ł. General linear hypothesis testing in functional response model. Commun. Stat.-Theory Methods 2019, 50, 5068–5083.
  16. Zhu, H.; Zhang, R.; Li, H. Estimation on semi-functional linear errors-in-variables models. Commun. Stat.-Theory Methods 2019, 48, 4380–4393.
  17. Zhou, J.; Peng, Q. Estimation for functional partial linear models with missing responses. Stat. Probab. Lett. 2020, 156, 108598.
  18. Zhao, F.; Zhang, B. Testing linearity in functional partially linear models. Acta Math. Appl. Sin. Engl. Ser. 2024, 40, 875–886.
  19. Hu, W.; Lin, N.; Zhang, B. Nonparametric testing of lack of dependence in functional linear models. PLoS ONE 2020, 15, e0234094.
  20. Zhao, F.; Lin, N.; Hu, W.; Zhang, B. A faster U-statistic for testing independence in the functional linear models. J. Stat. Plan. Inference 2022, 217, 188–203.
  21. Zhao, F.; Lin, N.; Zhang, B. A new test for high-dimensional regression coefficients in partially linear models. Can. J. Stat. 2023, 51, 5–18.
  22. Cui, H.; Guo, W.; Zhong, W. Test for high-dimensional regression coefficients using refitted cross-validation variance estimation. Ann. Stat. 2018, 46, 958–988.
  23. Zhong, P.; Chen, S. Tests for high-dimensional regression coefficients with factorial designs. J. Am. Stat. Assoc. 2011, 106, 260–274.
  24. Chen, S.; Zhang, L.; Zhong, P. Tests for high-dimensional covariance matrices. J. Am. Stat. Assoc. 2010, 105, 810–819.
  25. Ferraty, F.; Vieu, P. Nonparametric Functional Data Analysis: Theory and Practice; Springer: New York, NY, USA, 2006.
  26. Shang, H.L. Bayesian bandwidth estimation for a semi-functional partial linear regression model with unknown error density. Comput. Stat. 2014, 29, 829–848.
  27. Yu, P.; Zhang, Z.; Du, J. Estimation in functional partial linear composite quantile regression model. Chin. J. Appl. Probab. Stat. 2017, 33, 170–190.
Figure 1. The null distributions and q-q plots of our proposed test when g ( u ) = 2 u .
Figure 2. The null distributions and the q-q plots of our proposed test when g ( u ) = 2 + sin ( 2 π u ) .
Figure 3. Empirical power functions of our proposed test when g ( u ) = 2 u .
Figure 4. Empirical power functions of our proposed test when g ( u ) = 2 + sin ( 2 π u ) .
Figure 5. Empirical power functions of our proposed test with the Gaussian kernel when g ( u ) = 2 + sin ( 2 π u ) .
Figure 6. (a) The estimator of the slope function in model (10); (b) the estimator of the slope function in model (11).
Table 1. When g ( u ) = 2 u , the empirical size and power for two tests are evaluated.

( n , p )    c      N(0,1)           t(3)             Γ ( 1 , 1 )      lnorm(0,1)
                    T n     T np     T n     T np     T n     T np     T n     T np
(50, 11)     0.00   0.069   0.060    0.074   0.077    0.071   0.071    0.084   0.053
             0.05   0.097   0.101    0.133   0.102    0.135   0.107    0.148   0.111
             0.10   0.250   0.211    0.292   0.244    0.269   0.225    0.337   0.283
             0.15   0.474   0.383    0.555   0.448    0.494   0.415    0.576   0.470
             0.20   0.713   0.584    0.761   0.631    0.738   0.602    0.755   0.656
(100, 11)    0.00   0.049   0.052    0.055   0.052    0.058   0.059    0.049   0.052
             0.05   0.233   0.195    0.275   0.225    0.217   0.195    0.312   0.268
             0.10   0.689   0.603    0.746   0.660    0.715   0.618    0.743   0.652
             0.15   0.961   0.877    0.956   0.913    0.963   0.899    0.931   0.884
             0.20   0.998   0.984    0.986   0.975    0.995   0.975    0.978   0.965
(100, 49)    0.00   0.057   0.060    0.050   0.061    0.051   0.049    0.055   0.047
             0.05   0.224   0.203    0.282   0.255    0.236   0.225    0.305   0.288
             0.10   0.741   0.607    0.757   0.665    0.718   0.615    0.747   0.659
             0.15   0.962   0.900    0.947   0.871    0.950   0.884    0.938   0.886
             0.20   0.998   0.981    0.987   0.977    0.997   0.978    0.988   0.969
Table 2. When g ( u ) = 2 + sin ( 2 π u ) , the empirical size and power for two tests are evaluated.

( n , p )    c      N(0,1)           t(3)             Γ ( 1 , 1 )      lnorm(0,1)
                    T n     T np     T n     T np     T n     T np     T n     T np
(50, 11)     0.00   0.062   0.064    0.072   0.069    0.071   0.072    0.060   0.054
             0.05   0.087   0.091    0.109   0.096    0.112   0.103    0.117   0.106
             0.10   0.235   0.197    0.250   0.219    0.241   0.208    0.293   0.249
             0.15   0.420   0.359    0.502   0.419    0.449   0.389    0.506   0.449
             0.20   0.658   0.545    0.724   0.605    0.694   0.573    0.735   0.638
(100, 11)    0.00   0.062   0.062    0.050   0.046    0.063   0.060    0.054   0.060
             0.05   0.217   0.205    0.239   0.219    0.235   0.208    0.255   0.240
             0.10   0.668   0.568    0.714   0.617    0.696   0.605    0.748   0.644
             0.15   0.946   0.874    0.947   0.883    0.946   0.852    0.927   0.873
             0.20   0.996   0.980    0.993   0.972    1.000   0.979    0.987   0.964
(100, 49)    0.00   0.050   0.062    0.056   0.063    0.047   0.067    0.060   0.056
             0.05   0.226   0.216    0.249   0.202    0.209   0.199    0.277   0.256
             0.10   0.674   0.562    0.734   0.611    0.692   0.582    0.733   0.639
             0.15   0.943   0.873    0.943   0.890    0.941   0.855    0.925   0.889
             0.20   0.997   0.976    0.994   0.981    0.998   0.980    0.982   0.955
Table 3. When g ( u ) = 2 u , the empirical size and power for two tests are evaluated.

( n , p )    c      N(0,1)           t(3)             Γ ( 1 , 1 )      lnorm(0,1)
                    T n     T np     T n     T np     T n     T np     T n     T np
(50, 11)     0.00   0.097   0.066    0.088   0.076    0.096   0.074    0.104   0.059
             0.25   0.378   0.454    0.442   0.518    0.389   0.443    0.487   0.572
             0.50   0.589   0.695    0.669   0.749    0.610   0.702    0.725   0.783
             0.75   0.765   0.832    0.811   0.861    0.769   0.847    0.832   0.882
             1.00   0.865   0.917    0.883   0.925    0.867   0.912    0.889   0.922
(100, 11)    0.00   0.060   0.067    0.086   0.064    0.072   0.058    0.062   0.049
             0.25   0.511   0.711    0.582   0.738    0.569   0.739    0.618   0.760
             0.50   0.820   0.932    0.857   0.932    0.846   0.922    0.841   0.924
             0.75   0.949   0.984    0.943   0.975    0.935   0.973    0.917   0.967
             1.00   0.982   0.998    0.971   0.986    0.971   0.989    0.958   0.984
(100, 49)    0.00   0.245   0.064    0.224   0.058    0.236   0.058    0.202   0.048
             0.25   0.541   0.498    0.570   0.543    0.534   0.501    0.563   0.564
             0.50   0.754   0.771    0.804   0.807    0.767   0.776    0.769   0.798
             0.75   0.884   0.910    0.899   0.936    0.899   0.904    0.879   0.887
             1.00   0.957   0.966    0.949   0.968    0.956   0.964    0.928   0.934
Table 4. When g ( u ) = 2 + sin ( 2 π u ) , the empirical size and power for two tests are evaluated.

( n , p )    c      N(0,1)           t(3)             Γ ( 1 , 1 )      lnorm(0,1)
                    T n     T np     T n     T np     T n     T np     T n     T np
(50, 11)     0.00   0.099   0.069    0.087   0.075    0.086   0.063    0.101   0.064
             0.25   0.353   0.420    0.423   0.484    0.383   0.434    0.482   0.538
             0.50   0.574   0.661    0.640   0.714    0.602   0.677    0.695   0.758
             0.75   0.751   0.806    0.783   0.849    0.750   0.814    0.804   0.860
             1.00   0.841   0.893    0.869   0.913    0.842   0.897    0.874   0.913
(100, 11)    0.00   0.067   0.058    0.060   0.054    0.065   0.055    0.072   0.059
             0.25   0.486   0.662    0.545   0.713    0.537   0.697    0.591   0.742
             0.50   0.783   0.901    0.819   0.906    0.814   0.907    0.832   0.916
             0.75   0.930   0.983    0.918   0.966    0.930   0.980    0.915   0.955
             1.00   0.976   0.996    0.959   0.983    0.977   0.994    0.954   0.972
(100, 49)    0.00   0.236   0.066    0.212   0.066    0.218   0.061    0.243   0.065
             0.25   0.543   0.475    0.540   0.506    0.526   0.480    0.658   0.612
             0.50   0.744   0.745    0.781   0.791    0.747   0.750    0.835   0.815
             0.75   0.885   0.892    0.890   0.911    0.875   0.891    0.913   0.913
             1.00   0.948   0.964    0.937   0.956    0.938   0.955    0.947   0.956
Table 5. Means and standard deviations of our proposed test statistics across different scenarios.

( n , p )    c      N(0,1)             t(3)               Γ ( 1 , 1 )        lnorm(0,1)
                    mean      sd       mean      sd       mean      sd       mean      sd
(50, 11)     0.00   −0.0333   0.9374   −0.0061   0.9418   −0.0043   0.9994   −0.0247   0.9825
             0.25   1.7738    2.2073   1.3510    1.9616   1.2082    1.8621   1.6662    2.2812
             0.50   3.3189    2.7381   2.6000    2.5333   2.3050    2.4013   3.1053    3.0144
             0.75   4.7067    3.0726   3.7388    2.9395   3.3175    2.7857   4.3626    3.5327
             1.00   5.9521    3.2898   4.7717    3.2447   4.2325    3.0603   5.4780    3.9246
(100, 11)    0.00   −0.0200   0.9308   −0.0064   0.9198   −0.0638   0.9361   −0.0263   0.9369
             0.25   3.7480    3.0049   2.6939    2.5895   2.3069    2.4523   3.0643    3.0377
             0.50   7.0041    3.8159   5.1654    3.4800   4.5174    3.2602   5.7955    4.2112
             0.75   9.8990    4.2990   7.4175    4.1336   6.5733    3.8412   8.2343    5.0734
             1.00   12.5121   4.6155   9.4800    4.6283   8.4585    4.2903   10.4237   5.7426
(100, 49)    0.00   −0.0344   0.9647   −0.0853   0.9221   −0.0634   0.9468   −0.0322   0.8838
             0.25   1.3974    1.6203   2.4578    2.0814   2.2854    2.0247   2.7763    2.3084
             0.50   2.6816    2.0302   4.6292    2.7638   4.2755    2.6032   5.0716    3.1856
             0.75   3.8472    2.3259   6.4930    3.2520   6.0145    2.9903   6.9775    3.7744
             1.00   4.9262    2.5550   8.1087    3.5979   7.5483    3.2715   8.6025    4.2086
(100, 201)   0.00   −0.0284   0.9996   −0.0464   0.9366   −0.0799   0.9973   −0.1059   0.9523
             0.25   0.5965    1.1769   0.9410    1.2687   0.8517    1.2541   1.0221    1.3276
             0.50   1.1730    1.3154   1.8179    1.5234   1.6749    1.4624   1.9941    1.6616
             0.75   1.7107    1.4365   2.5847    1.7106   2.4135    1.6295   2.8149    1.9165
             1.00   2.2170    1.5368   3.2742    1.8666   3.0751    1.7569   3.5290    2.1234
(200, 365)   0.00   −0.1197   0.9914   −0.1129   0.9450   −0.0892   0.9681   −0.1104   0.9420
             0.25   0.8032    1.2129   3.3462    1.8678   3.0628    1.6937   3.3950    1.9900
             0.50   1.6756    1.3740   5.9872    2.3996   5.5453    2.0681   5.9918    2.6471
             0.75   2.4867    1.5036   8.0726    2.7188   7.5441    2.3124   8.0182    3.0317
             1.00   3.2467    1.6108   9.7640    2.9315   9.1963    2.4703   9.6570    3.2855
