**5. Simulation Study**

The performance of a control chart is often measured by its power (1 − *β*), defined as the probability of signaling an out-of-control state when the process is actually out of control [**?** ]. Conversely, the type I error (*α*), or false alarm rate, is the probability of obtaining an out-of-control signal when the process is actually under control [**?** ].

When the process is under control, the probability of flagging an observation as out of control should be small enough to prevent an unacceptable number of false alarms. Conversely, when the process is actually out of control, the power should be high enough to detect the change as quickly as possible [**?** ].

Another common index for measuring the performance of a control chart is the Average Run Length (ARL), defined as the average number of observations plotted before an out-of-control signal occurs. If the signals can be assumed independent, so that the run length distribution is geometric, the ARL is equal to 1/*p*, where *p* is the probability of obtaining an out-of-control signal [**?**].

To evaluate the performance of a control chart in Phase II, ARL<sub>0</sub> and ARL<sub>1</sub> are often used. They are the average number of observations until the first out-of-control signal when the process is actually under control (ARL<sub>0</sub> = 1/*α*) and when it is not (ARL<sub>1</sub> = 1/(1 − *β*)) [**?** ]. ARL<sub>1</sub> should be as low as possible, which is equivalent to maximizing the power (1 − *β*), so that events that drive the process out of control are identified quickly [**?** ].
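The relation between the error rates and the ARL can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function name `arl` and the power value of 0.80 are hypothetical choices for illustration, not results from this study.

```python
import numpy as np


def arl(p_signal: float) -> float:
    """Average run length 1/p, valid for independent signals (geometric run length)."""
    if not 0.0 < p_signal <= 1.0:
        raise ValueError("p_signal must lie in (0, 1]")
    return 1.0 / p_signal


alpha = 0.025   # false alarm rate (illustrative value)
power = 0.80    # 1 - beta (hypothetical value, not a result from the study)

arl0 = arl(alpha)   # in-control ARL, 1/alpha
arl1 = arl(power)   # out-of-control ARL, 1/(1 - beta)

# Monte Carlo check: run lengths are geometric with success probability alpha,
# so their sample mean should approach 1/alpha.
rng = np.random.default_rng(0)
simulated_arl0 = rng.geometric(alpha, size=100_000).mean()
```

Under these assumptions, a higher power directly shortens ARL<sub>1</sub>, while a smaller *α* lengthens ARL<sub>0</sub>.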

Since the distribution *F* is unknown, a Monte Carlo simulation is designed to estimate the power of the control charts. The simulated scenarios make it possible to estimate and compare the power of the control charts for different data depth measures, for both independent and dependent functional data.

In this section, the performance of the control charts proposed for Phases I and II is evaluated. First, the simulation scheme designed in Febrero et al. [**?** ] is used to evaluate the performance of the control chart proposed for Phase I. Realizations of a Gaussian stochastic process are generated according to the following expression [**? ?** ]:

$$\mathcal{X}(t) = \mu(t) + \sigma(t) \cdot \epsilon(t),\tag{3}$$

whereby *σ*<sup>2</sup>(*t*) = 0.5 and

$$\mu(t) = \mathbf{E}\left(\mathcal{X}(t)\right) = 30t(1-t)^{3/2},\tag{4}$$

and *ε*(*t*) is a Gaussian process, *ε*(*t*) ∼ *GP*(**0**, Σ), with mean **0** and variance-covariance matrix given by

$$\mathbb{E}\left[\epsilon(t_i) \times \epsilon(t_j)\right] = e^{-\frac{\left|t_i - t_j\right|}{0.3}}.$$

Additionally, Reference [**?** ] proposed an alternative model to generate atypical curves, with mean *μ*(*t*) = 30*t*<sup>3/2</sup>(1 − *t*). Figure **??**a shows the two functional means: the black curve is the process mean without atypical curves, while the red curve is the mean of the process that generates the atypical curves.
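The generating model above can be sketched in a few lines. This is only an illustrative sketch, assuming NumPy; the function names are hypothetical, the correlation length `scale=0.3` is an assumption read from the covariance expression, and `sigma` defaults to the square root of 0.5 because *σ*<sup>2</sup>(*t*) = 0.5 is stated for this model.

```python
import numpy as np


def mean_in_control(t):
    """Functional mean of the in-control process: 30 t (1 - t)^(3/2)."""
    return 30.0 * t * (1.0 - t) ** 1.5


def mean_atypical(t):
    """Functional mean of the process that generates atypical curves: 30 t^(3/2) (1 - t)."""
    return 30.0 * t ** 1.5 * (1.0 - t)


def simulate_curves(n, t, mean_fn, sigma=np.sqrt(0.5), scale=0.3, rng=None):
    """Draw n curves X(t) = mu(t) + sigma * eps(t) on the grid t.

    eps is a zero-mean Gaussian process with covariance exp(-|t_i - t_j| / scale);
    scale = 0.3 is an assumption, not a value confirmed by the text.
    """
    rng = np.random.default_rng() if rng is None else rng
    cov = np.exp(-np.abs(t[:, None] - t[None, :]) / scale)
    eps = rng.multivariate_normal(np.zeros(t.size), cov, size=n)
    return mean_fn(t) + sigma * eps
```

For example, `simulate_curves(50, np.linspace(0.0, 1.0, 51), mean_in_control)` returns an array with one simulated curve per row.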

The control charts proposed in this work are designed to monitor the functional mean and to detect two types of events, namely changes in the magnitude and changes in the shape of the process mean, either of which reveals that the process is not under control. For the design of the Phase II control charts, it is assumed that the process is under control, that is, that no outliers are detected. To generate simulation scenarios for each of these events, the following functional means are considered:

• Mean of the model with a change in the magnitude:

$$
\mu(t) = 30t(1-t)^{3/2} + \delta,\tag{5}
$$

where *δ* denotes the size of the change, ranging from 0.4 to 2 in steps of 0.4.

• Mean of the model with a change in shape:

$$
\mu(t) = (1 - \eta) \cdot 30t (1 - t)^{3/2} + \eta \cdot 30t^{3/2} (1 - t), \tag{6}
$$

where *η* controls the size of the change, ranging from 0.2 to 1 in steps of 0.2.

In Figure **??**b, the green curve is the functional mean of a process with a change in magnitude (*δ* = 0.7), while the blue curve is the mean of a process with a change in shape (*η* = 0.3).
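The two families of out-of-control means can be written directly as functions of *δ* and *η*. The following is a minimal sketch, assuming NumPy; the function names are hypothetical, and the shape-change mean is taken to be a mixture of the in-control mean 30*t*(1 − *t*)<sup>3/2</sup> and the atypical mean 30*t*<sup>3/2</sup>(1 − *t*).

```python
import numpy as np


def mean_magnitude_shift(t, delta):
    """In-control mean shifted vertically by delta (change in magnitude)."""
    return 30.0 * t * (1.0 - t) ** 1.5 + delta


def mean_shape_shift(t, eta):
    """Mixture of the in-control and atypical means (change in shape)."""
    in_control = 30.0 * t * (1.0 - t) ** 1.5
    atypical = 30.0 * t ** 1.5 * (1.0 - t)
    return (1.0 - eta) * in_control + eta * atypical


# Grids of change sizes used in the scenarios: 0.4, ..., 2 and 0.2, ..., 1
deltas = 0.4 * np.arange(1, 6)
etas = 0.2 * np.arange(1, 6)
```

With *η* = 0 the shape-change mean reduces to the in-control mean, and with *η* = 1 it coincides with the mean of the atypical curves.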

In Febrero et al. [**?** ], the functional data *X*<sub>1</sub>, ..., *X*<sub>*n*</sub> denote realizations of a stochastic process *X*(·), assuming continuous trajectories on the interval [*a*, *b*] = [0, 1] and independence between the curves. However, simulation scenarios in which the curves exhibit a variable degree of dependence have also been considered here, because several practical applications of this type of chart involve data monitored continuously over time, which form functional time series, such as the curves of daily energy consumption in commercial areas. Dependent curves are therefore generated from the model

$$\tilde{Y}_i(t) = \mu(t) + \sigma(t) \cdot \tilde{\epsilon}_i(t), \qquad \tilde{\epsilon}_i(t) = \rho \cdot \tilde{\epsilon}_{i-1}(t) + (1 - \rho) \cdot \epsilon_i(t),$$

where *ρ* is the correlation measure between curves, *σ*(*t*) = 0.5, and both *ε*(*t*) and *ε̃*(*t*) are Gaussian processes [**?** ].

To make the simulation results comparable between the scenarios with independence and dependence between curves, the variance of *ε* is rescaled so that the variance of the error *ε̃* equals one. Since the stationary variance of *ε̃* is (1 − *ρ*)<sup>2</sup> *σ*<sup>2</sup><sub>*ε*</sub> / (1 − *ρ*<sup>2</sup>), taking

$$\sigma_{\epsilon}^{2} = \frac{1-\rho^{2}}{(1-\rho)^{2}} = \frac{1+\rho}{1-\rho}$$

yields *σ*<sup>2</sup><sub>*ε̃*</sub> = 1.
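The autoregressive error and its variance rescaling can be checked by simulation. This is a sketch under stated assumptions, using NumPy: the function name is hypothetical, the innovations are assumed to follow a zero-mean Gaussian process with exponential covariance and correlation length `scale=0.3`, and the first error curve is drawn from the stationary law so that no burn-in is required.

```python
import numpy as np


def simulate_dependent_errors(n, t, rho, scale=0.3, rng=None):
    """Simulate n curves of the autoregressive error with unit pointwise variance."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must lie in [0, 1)")
    rng = np.random.default_rng() if rng is None else rng
    zero = np.zeros(t.size)
    cov = np.exp(-np.abs(t[:, None] - t[None, :]) / scale)
    # Rescaled innovation standard deviation: sqrt((1 + rho) / (1 - rho))
    innov_sd = np.sqrt((1.0 + rho) / (1.0 - rho))
    innovations = innov_sd * rng.multivariate_normal(zero, cov, size=n)
    errors = np.empty((n, t.size))
    # Start from the stationary law (unit variance), so no burn-in is needed
    errors[0] = rng.multivariate_normal(zero, cov)
    for i in range(1, n):
        errors[i] = rho * errors[i - 1] + (1.0 - rho) * innovations[i]
    return errors


t = np.linspace(0.0, 1.0, 51)
errors = simulate_dependent_errors(2000, t, rho=0.5, rng=np.random.default_rng(1))
curves = 30.0 * t * (1.0 - t) ** 1.5 + 0.5 * errors   # sigma(t) = 0.5 as in the text
pointwise_var = errors.var(axis=0).mean()              # should be close to 1
lag1_corr = np.corrcoef(errors[1:, 25], errors[:-1, 25])[0, 1]   # should be close to rho
```

With this rescaling, the dependent curves have the same pointwise variability as the independent ones, so differences in chart performance can be attributed to the dependence itself.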

Figure **??** presents different scenarios with changes in the shape and magnitude of the functional mean of the process, under both independence and dependence between curves. The gray curves are realizations of the process when it is under control (with mean given by Equation (**??**)), whereas the red curve in each panel corresponds to a scenario in which an event destabilizes the process, that is, the process is not under control. Figure **??**a,b show the cases of independence between curves with changes in the magnitude and in the shape of the functional mean, respectively. Figure **??**c,d show the corresponding cases with dependence between curves, with changes in the magnitude of the mean (panel c) and in its shape (panel d).

**Figure 9.** (**a**) Functional means and (**b**) changes in the shape and the magnitude in the mean of the process.

In the building energy efficiency domain, out-of-control signals of energy consumption, temperature, CO2 proportion and humidity, among others, can be characterized by a change of shape and/or magnitude with respect to the under-control signals, analogous to the changes analyzed in the simulation study. A change in magnitude corresponds to a change in the scale of the studied process, for example, an increase in energy consumption, temperature, humidity or CO2 concentration across all the hours of a specific day with respect to the otherwise normal pattern. This type of change is introduced by adding the *δ* term to Equation (**??**) to obtain Equation (**??**); thus, the size of the change is controlled by the *δ* parameter. The green curve of Figure **??** is an example of a magnitude change: compared with the black curve, the two curves have the same shape and differ only in scale. On the other hand, changes in the shape of the curves are introduced by modifying Equation (**??**) through the *η* parameter, which results in Equation (**??**). In the energy efficiency domain, this type of change can be related to a change in the programming of the HVAC facilities (changes in temperature regulation or in the time schedule), a failure of the HVAC in just one interval of the day, an extremely high or low level of occupancy in a building, or extreme changes in the weather, among other causes. The proposed simulation study has been designed taking into account the specific domain in which the FDA control chart approach is applied, namely energy efficiency in buildings. Thus, the shape of the simulated profiles is similar to that of CO2, temperature, energy consumption and humidity curves in buildings. New studies may be necessary to measure the performance of the proposed control charts for very different types of profiles.

In the following section, a simulation study is performed to determine the conditions under which the smoothed bootstrap procedure works when there is independence and dependence between curves.

**Figure 10.** Scenarios in which independence between curves is studied. Changes in the functional mean with respect to its magnitude (**a**) and shape (**b**) are shown. In the case of dependence, panels (**c**) and (**d**) show the simulation scenarios in which changes in the magnitude and shape, respectively, are observed in the functional mean.

### *5.1. Measurement and Comparison of the Performance of the Control Chart Proposed for Phase I*

The performance of the control chart is estimated and compared using calibration samples of size 50 and 100 curves. For each sample, the functional depth measures described in Section **??** are calculated, and the robust outlier detection procedures (weighted and trimmed) are applied to estimate the type I error when the process is under control and the power of the test when the process is out of control. To estimate the type I error, each scenario (*n* = 50, 100, assuming independence and dependence between curves) is replicated 1000 times. To estimate the power of the test, a curve from the alternative hypothesis is generated in each scenario (assuming independence and dependence); this procedure is also repeated 1000 times.

Following the scheme described in Reference [**?** ], the curves are observed at 51 equidistant points in the interval [0, 1]. Using 1000 resamples (*B* = 1000) and a 2.5% trimming procedure (removing the least deep curves), a smoothed bootstrap procedure with smoothing factor *γ* = 0.05 is applied to estimate the *C* = 0.01 quantile, which represents the LCL.
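One possible reading of the trimmed smoothed bootstrap described above is sketched below, assuming NumPy. It should not be taken as the authors' implementation: the function names are hypothetical, a simple Fraiman-Muniz-type depth stands in for the depth measures, each bootstrap sample is assumed to have the same size as the original sample, and the LCL is taken as the median of the bootstrap quantiles.

```python
import numpy as np


def fm_depth(curves):
    """Fraiman-Muniz-type depth of each curve within its own sample."""
    n = curves.shape[0]
    # Pointwise ranks (1..n) give the empirical CDF value rank / n at each grid point
    ranks = np.argsort(np.argsort(curves, axis=0), axis=0) + 1
    return np.mean(1.0 - np.abs(0.5 - ranks / n), axis=1)


def smoothed_bootstrap_lcl(curves, alpha=0.01, trim=0.025, gamma=0.05,
                           n_boot=1000, rng=None):
    """Estimate the lower control limit as a bootstrap quantile of the depths."""
    rng = np.random.default_rng() if rng is None else rng
    n = curves.shape[0]
    # Trimming step: drop the least deep curves
    n_drop = int(np.ceil(trim * n))
    keep = np.argsort(fm_depth(curves))[n_drop:]
    trimmed = curves[keep]
    k = trimmed.shape[0]
    centered = trimmed - trimmed.mean(axis=0)
    cutoffs = np.empty(n_boot)
    for b in range(n_boot):
        sample = trimmed[rng.integers(0, k, size=n)]
        # Gaussian noise with covariance gamma * (sample covariance of trimmed curves)
        noise = np.sqrt(gamma / (k - 1)) * rng.standard_normal((n, k)) @ centered
        cutoffs[b] = np.quantile(fm_depth(sample + noise), alpha)
    return float(np.median(cutoffs))
```

Curves whose depth falls below the returned limit would be flagged as potential outliers. The smoothing noise is generated from the centered curves themselves, which avoids forming and factorizing a rank-deficient covariance matrix when the number of grid points exceeds the number of curves.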

First, a simulation study is performed to estimate and compare the type I error (*α* = 0.01 is fixed) of the proposed control chart, assuming scenarios with independence and dependence between curves. Subsequently, a similar study is carried out to estimate and evaluate the power of the control chart to detect out-of-control signals in different situations (independence, dependence, different sample size and a change in the shape or magnitude).

Table **??** shows the estimated false alarm rate (type I error) in the independence scenario. The average percentage of false out-of-control signals detected by the procedure described above is very close to the nominal 1% for the two sample sizes considered. Furthermore, as *n* increases, the type I error percentages move closer to the nominal level. In general, for a sample of size *n* = 100, the results of the weighted method are closer to *α*, especially when the mode depth measure is used. The results obtained in the simulations are similar to those presented in Reference [**?** ].

**Table 1.** Estimation of the false alarm rate (%) for the case of independence between curves using a nominal type I error of 1%.


In any process, type I error increases the production cost. Hence, it is essential not to overestimate this error rate when managing the quality.

Table **??** shows the results of the simulation carried out to evaluate the ability of the control chart to detect a change in the shape or magnitude of the functional mean of the process through the estimation of its power (1 − *β*). The percentage of out-of-control signals (outliers) correctly detected when the population defined by Equation (**??**) is contaminated with curves from the *M*1 model (Equation (**??**)) or the *M*2 model (Equation (**??**)) is denoted by *p*<sub>*c*</sub>, whereas the percentage of false alarms (false out-of-control states) is denoted by *p*<sub>*f*</sub>. In all the scenarios assuming independence between curves, these parameters have been estimated by the average of the corresponding empirical values, *p̂*<sub>*c*</sub> and *p̂*<sub>*f*</sub>.
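The two empirical percentages can be computed from the flags returned by an outlier detection procedure and the known labels of the simulated curves. The following is a minimal sketch, assuming NumPy; the function name and the toy flags are hypothetical and only illustrate the bookkeeping.

```python
import numpy as np


def detection_rates(flagged, is_outlier):
    """Return (p_c, p_f) in percent for one simulated sample.

    p_c: share of true outliers that were flagged (correct detections).
    p_f: share of regular curves that were flagged (false alarms).
    """
    flagged = np.asarray(flagged, dtype=bool)
    is_outlier = np.asarray(is_outlier, dtype=bool)
    p_c = 100.0 * flagged[is_outlier].mean() if is_outlier.any() else float("nan")
    p_f = 100.0 * flagged[~is_outlier].mean() if (~is_outlier).any() else float("nan")
    return p_c, p_f


# Toy example with two replications of four curves each (hypothetical flags)
truth = [True, False, False, False]
replications = [
    detection_rates([True, False, False, True], truth),
    detection_rates([False, False, False, False], truth),
]
p_c_hat, p_f_hat = np.nanmean(np.array(replications), axis=0)
```

Averaging over replications, as in the last line, gives the Monte Carlo estimates reported per scenario.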

Table **??** shows that a better performance is achieved for the curves of model *M*1 (which simulates changes in the magnitude of the mean): *p̂*<sub>*f*</sub> and *p̂*<sub>*c*</sub> are closer to the nominal *α* and (1 − *β*), respectively. For changes in the shape of the process mean (*M*2), the mode depth provides the highest percentages of correctly detected out-of-control signals. In the case of the *M*1 model, however, the RP depth yields lower percentages of correctly detected out-of-control states than the FM and mode depths. With respect to the robust method for outlier detection, the performance is similar in all the scenarios, except when the RP depth is used, in which case the control chart performs poorly in detecting observations that correspond to actual out-of-control states.

In summary, the rate of false out-of-control signals in the independence scenario is close to 1%. When the trimmed method is used, this rate is overestimated, although the overestimation decreases as the sample size increases.

Table **??** shows the false alarm rates (type I error) for the scenarios with dependence between curves. For the different values of *ρ*, the results are very similar to those of the independence scenarios: the average percentage of false out-of-control signals is close to the nominal 1% for the two sample sizes studied and, as *n* increases, the type I error percentages move closer to the nominal level. However, some differences are observed when the RP data depth measure is used to build the control chart; in this case, the percentage of false out-of-control signals is overestimated.


**Table 2.** Percentages of *p̂*<sub>*c*</sub> and *p̂*<sub>*f*</sub> for curves simulated with the *M*1 (Equation (**??**)) and *M*2 (Equation (**??**)) models, assuming independence between curves.



Tables **??**–**??** show the empirical estimates of *p*<sub>*c*</sub> and *p*<sub>*f*</sub> for different values of *ρ* (from 0.3 to 0.7). The power (estimated by *p̂*<sub>*c*</sub>) of the control chart for the model *M*1 (Equation (**??**)) is higher when the weighted method is applied and when the sample size is increased. For this model, the performance of the control chart also tends to be the same regardless of the data depth measure used. Overall, the ability of the control charts to detect real changes in the process, related to differences in the shape and mean, is better when the mode depth is used.


**Table 4.** Empirical values of *p̂*<sub>*f*</sub> and *p̂*<sub>*c*</sub>, with *ρ* = 0.3 (assuming dependence between curves).

**Table 5.** Empirical values of *p̂*<sub>*f*</sub> and *p̂*<sub>*c*</sub>, with *ρ* = 0.5 (assuming dependence between curves).



**Table 6.** Empirical values of *p̂*<sub>*f*</sub> and *p̂*<sub>*c*</sub>, with *ρ* = 0.7 (assuming dependence between curves).

With respect to the false out-of-control rate *p̂*<sub>*f*</sub>, in the scenarios that use the *M*1 model, a lower rate is obtained when the trimmed method is used. For the *M*2 model, the results are similar to those of the scenarios with independence between curves, that is, *p̂*<sub>*f*</sub> is lower when the trimmed method for outlier detection is used.

In Reference [**?** ], new methods for the detection of outliers were proposed for the case in which there is dependence between curves. From the simulation studies carried out in this work at different degrees of dependence, the outlier detection method proposed in Reference [**?** ] appears to be relatively robust against dependence between curves. The simulation study performed in this section supports the results obtained in Reference [**?** ] and justifies the use of this method within the new control charts proposed for Phase I, even in scenarios with dependence between curves.

Although applying the weighted outlier detection method to the mode data depth has generally provided the Phase I control charts with the best performance, the false alarm rates in Tables **??** and **??** show that the trimmed outlier detection method tends to provide values of *p̂*<sub>*f*</sub> slightly closer to *α* = 1% (compared with the weighted method) when the process is actually under control, the curves are independent and the sample is relatively small (*n* = 50). In addition, if the process is out of control, the curves are independent and the outliers are generated by Equation (**??**) (changes in magnitude), the trimmed method applied to the FM data depth provides a *p̂*<sub>*f*</sub> close to *α* = 1% and the highest *p̂*<sub>*c*</sub>, as shown in Table **??**. In all the remaining scenarios, the weighted method applied to the mode data depth tends to provide the *p̂*<sub>*f*</sub> closest to *α* and the highest *p̂*<sub>*c*</sub> (see Tables **??**–**??**).

### *5.2. Measurement and Comparison of the Performance of the Control Chart Proposed for Phase II*

For Phase II, the monitoring stage, the use of the rank control chart has been proposed. The application of the rank control chart allows simultaneous monitoring of changes in the mean and variability of a process. In the functional case, in order to calculate the rank statistic, the functional FM, RP and mode depths are used.
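A rank statistic of this kind can be sketched as follows, assuming NumPy. The sketch follows the usual depth-based rank chart, in which the rank of a new curve is the proportion of calibration curves that are no deeper than it and a signal is raised when the rank falls below *α*. The exact statistic used in this work is defined elsewhere in the paper, so the function names, the Fraiman-Muniz-type depth and the decision rule here are assumptions.

```python
import numpy as np


def fm_depth(curves, reference):
    """Fraiman-Muniz-type depth of each curve with respect to a reference sample."""
    n_ref = reference.shape[0]
    sorted_ref = np.sort(reference, axis=0)
    cdf = np.empty(curves.shape)
    for j in range(curves.shape[1]):
        # Proportion of reference values <= curve value at grid point j
        cdf[:, j] = np.searchsorted(sorted_ref[:, j], curves[:, j], side="right") / n_ref
    return np.mean(1.0 - np.abs(0.5 - cdf), axis=1)


def rank_statistic(new_curves, calibration):
    """Proportion of calibration curves whose depth is <= the depth of each new curve."""
    ref_depths = fm_depth(calibration, calibration)
    new_depths = fm_depth(new_curves, calibration)
    return np.array([np.mean(ref_depths <= d) for d in new_depths])


def out_of_control(new_curves, calibration, alpha=0.025):
    """Flag new curves whose rank statistic falls below the lower control limit alpha."""
    return rank_statistic(new_curves, calibration) < alpha
```

Because the statistic depends on the data only through the depths, the same chart can be built with the FM, RP or mode depth by swapping the depth function.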

An ARL<sub>0</sub> = 1/*α* with *α* = 0.025, that is, ARL<sub>0</sub> = 40, is assumed to evaluate the performance of the control chart (the monitoring sample is assumed to be under control). The power of the control chart is estimated and compared based on a calibration sample of size *n* = 50 from an under-control process, generated by a Monte Carlo procedure.

Following the simulation scheme of Phase I, the curves are observed at 51 equidistant points in the interval [0, 1]. A smoothed bootstrap with smoothing factor *γ* = 0.05 and 1000 resamples (*B* = 1000), using a 2.5% trimming procedure (removing the shallowest curves), is applied to estimate and compare the power of the control chart to detect out-of-control signals at a significance level of *α* = 0.025. Additionally, as in Phase I, scenarios with independence and dependence between curves are simulated.

Table **??** shows the estimates of the power (%) of the control charts for the scenario of independence between curves, whereas Table **??** shows power of the control chart for the scenario with dependence. In both cases, the ability of the control chart to detect a change in the magnitude of the functional mean of the process is evaluated by the estimation of its power (1 − *β*).

**Table 7.** Power of the control chart, 1 − *β*, for the *M*1 model (Equation (**??**)) in the scenario of independence between curves.


According to the results in Table **??**, any of the depth measures can be used to detect a shift in the process mean, since they all provide the same performance in terms of power.

**Table 8.** Power of the control chart, 1 − *β*, for the scenarios defined by the *M*1 (**??**) model, assuming dependence between curves.


The results of the detection of a shift in the process mean are shown in Table **??**. A similar performance is observed when using any depth measure for different values of *ρ*. Apparently, the control chart for Phase II is robust against the existence of dependence between curves.

As observed in the simulation study and in the analysis of the case study with real data, the present proposal of control charts for functional data, including Phase I and II control charts, can be useful to detect anomalies in diverse scenarios. In the case of its application to real data, the set of proposed techniques is being examined for implementation in the web platform Σqus and for its use by the company Nerxus for detecting false alarms in facilities in commercial areas. The present control chart methodology can be used for control tasks, monitoring, anomaly detection and continuous improvement in diverse industrial processes, monitoring of environmental variables, chemical industry and, in general, any process involving continuous monitoring of functional data over time.

Regarding the use of our methodology in more complex case studies defined by different operation modes, applying the multi-modelling framework methodology in combination with our proposal could be useful. Indeed, in the building energy efficiency domain, installations have many different operation modes, each defined by a specific operation pattern. For example, HVAC installations can be operated in heating or ventilation modes (and there are even different modes within ventilation or heating). The automatic classification of each profile into the corresponding profile pattern, prior to the application of our control chart proposal for Phase I and Phase II, could be very useful in the building energy efficiency field. With respect to the work of Grasso et al. [**?** ], it is also worth mentioning that the proposed profile monitoring control charting scheme is the one based on functional PCA described in Colosimo and Pacella [**?** ].
