Article

Variance Feedback Drift Detection Method for Evolving Data Streams Mining

1 School of Computer Science and Engineering, North Minzu University, Yinchuan 750021, China
2 The Key Laboratory of Images and Graphics Intelligent Processing of State Ethnic Affairs Commission: IGIPLab, North Minzu University, Yinchuan 750021, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(16), 7157; https://doi.org/10.3390/app14167157
Submission received: 8 July 2024 / Revised: 11 August 2024 / Accepted: 12 August 2024 / Published: 15 August 2024
(This article belongs to the Section Computing and Artificial Intelligence)

Abstract

Learning from changing data streams is one of the important tasks of data mining. The phenomenon in which the underlying distribution of a data stream changes over time is called concept drift. In classification decision-making, the occurrence of concept drift greatly affects the classification efficiency of the original classifier; that is, the old decision model is no longer suited to the new data environment. Therefore, dealing with concept drift in changing data streams is crucial to guarantee classifier performance. Currently, most concept drift detection methods apply the same detection strategy to different data streams, with little attention to the uniqueness of each data stream. This limits the adaptability of drift detectors to different environments. In our research, we designed a unique solution to address this issue. First, we proposed a variance estimation strategy and a variance feedback strategy to characterize a data stream’s properties through its variance. Based on this variance, we developed personalized drift detection schemes for different data streams, thereby enhancing the adaptability of drift detection in various environments. We conducted experiments on data streams with various types of drift. The experimental results show that our algorithm achieves the best average ranking for accuracy on the synthetic datasets, with an overall ranking 1.12 to 1.5 higher than the next-best algorithm. In comparison with algorithms using the same tests, our method improves the ranking by 3 to 3.5 for the Hoeffding test and by 1.12 to 2.25 for the McDiarmid test. In addition, it achieves a good balance between detection delay and false positive rate. Finally, our algorithm ranks higher than existing drift detection methods across the four key metrics of accuracy, CPU time, false positives, and detection delay, meeting our expectations.

1. Introduction

Concept drift is a phenomenon in which the data distribution changes over time. It often appears in time series data. In today’s world, a large amount of data is generated every moment, and concept drift phenomena emerge endlessly, such as a decline in product reputation caused by emergencies, drift in public opinion, or a decrease in sales of certain goods due to changes in living habits. Traditional machine learning models assume that the data distribution is stable. The occurrence of concept drift greatly reduces the performance of the original learning model. Therefore, rapidly detecting or responding to concept drift is crucial to ensure the high learning efficiency of the classifier.
Active detection mechanisms represent one of the effective strategies for addressing concept drift. In this approach, a classifier employs a drift detector to discern shifts in the data stream’s underlying concepts. Upon detecting a drift risk, the drift detector promptly alerts the classifier, enabling timely adjustments. A prevalent approach for drift detection involves monitoring the classification performance of the model. According to the Probably Approximately Correct (PAC) theory [1], classifier performance tends to stabilize with increased learning time under stable conditions. Therefore, significant changes in the error rate of the base classifier can be viewed as indicators of concept drift. Concept drift typically arises from changes in the underlying data distribution. While monitoring the underlying data distribution facilitates a deeper understanding of concept drift, such methods are often complex and unsuitable for online environments dealing with massive data scales [2]. Consequently, drift detection methods based on classifier performance monitoring are frequently favored for their expedited processing speed and practicality.
To date, numerous drift detection algorithms have been proposed, encompassing both classic and more recent methods. Classic approaches include the Drift Detection Method (DDM) [3], Early Drift Detection Method (EDDM) [4], and Adaptive Sliding Window (ADWIN) [5]. Additionally, newer algorithms have emerged with notable efficacy, such as the Online and Non-Parametric Drift Detection Methods Based on Hoeffding’s Bounds (HDDM) [2] and the Fast Hoeffding Drift Detection Method (FHDDM) [6]. Recent advancements include the Multi-level Weighted Drift Detection Method (MWDDM) [7], and the Group Drift Detection Method (GDDM) [8], among others.
However, many existing drift detection methods encounter several common challenges: (1) Most current drift detection methods use the same detection approach for different data streams, which limits the flexibility and generalizability of drift detection; (2) Most current drift detection methods face a trade-off between false positives and detection speed, which affects the classification performance of base classifiers. Thus, to address these key concerns and optimize the performance of drift detection, we introduced a unique solution and a novel drift detection method. This method tailors its strategies to different data streams, considering their unique characteristics. By doing so, it aids classifiers in achieving superior classification performance.
The main contributions of this article include four aspects:
  • Fast Variance Estimation Strategy for Small-scale Sliding Windows: We introduce a swift variance estimation strategy tailored for small-scale sliding windows that addresses the issue of small sample effects. This approach ensures both speed and flexibility in mutation drift detection.
  • Variance Feedback Strategy: We propose a variance feedback strategy that utilizes the variance calculated through the estimation strategy. This feedback dynamically adjusts the weighting of window data instances within the drift detector, amplifying the window mean. This enhancement significantly boosts detection speed while maintaining low false positive rates.
  • Concept Drift Detection Method Based on Variance Feedback (VFDDM): We develop the Concept Drift Detection Method Based on Variance Feedback (VFDDM). This method can be viewed as a drift detection framework that can be combined with various drift tests. It provides more suitable statistical tests and relevant parameters for different data streams based on the estimated variance, which is a relatively unique solution in the current research.
  • Design of Drift Detectors and Experimental Validation: Building upon VFDDM, the article presents three distinct drift detectors—VFDDM_H, VFDDM_M, and VFDDM_K—each employing different statistical testing methods. Comprehensive experiments conducted on synthetic and real datasets demonstrate the efficacy of the proposed drift detection method, showcasing its strong performance across various metrics.
The rest of the article is organized as follows: Section 2 provides a brief overview of related work, Section 3 describes the proposed algorithm in detail, Section 4 describes the experimental setup and analyzes the experimental results, and Section 5 summarizes the paper and proposes future work directions.

2. Related Works

In the field of machine learning, there are two primary approaches to addressing concept drift: active detection and passive adaptation. Active detection involves utilizing a drift detector to identify concept drift and subsequently triggering the classifier’s incremental mechanism. On the other hand, passive adaptation involves adjusting the model to cope with concept drift without employing a separate detection mechanism. This research focuses on the exploration of active detection mechanisms. This section begins by examining the definition and manifestation of concept drift and subsequently provides a comprehensive overview of classic and cutting-edge drift detection methods.

2.1. The Definition of Concept Drift

Concept drift is an abnormal phenomenon that appears in data streams. Assume that in a supervised learning environment, the data item arriving in the data stream at time t is {xt, yt}, where xt is the feature vector, xt = {x1, x2,…}, and yt is the label of the data item. The concept of the data at time t is defined as Pt(xt, yt), and concept drift can be defined as the concept difference between time T (T > t) and time t, as shown in Equation (1) [9]:
$\exists x_t : P_t(x_t, y_t) \neq P_T(x_T, y_T)$
Concept drift is divided into two types: virtual concept drift and real concept drift [10]. Virtual concept drift refers to the change of P(x), that is, the decision boundary remains unchanged, and the data distribution within the boundary changes; real concept drift refers to the change of P(y|x), which is intended to indicate that the decision boundary of the classifier has changed [11]. Figure 1 illustrates the difference between virtual concept drift and real concept drift, where the dotted line is the decision boundary. The difference between virtual concept drift and real concept drift lies in whether the actual decision boundary undergoes a fundamental change. As shown in Figure 1, the decision boundary for virtual concept drift (Figure 1b) and the decision boundary for the original distribution (Figure 1a) are essentially the same. The distinction is that in virtual concept drift, the internal distribution of the data types has changed compared to the original distribution. On the other hand, real concept drift represents a situation where the decision boundary undergoes a fundamental change, as depicted in Figure 1c. The content of this study falls within the category of real concept drift.
Real concept drift can be divided into four categories according to different change characteristics: Abrupt, Gradual, Incremental, and Recurring [12]. Abrupt drift refers to the sudden generation of a new concept that replaces the original concept at a certain time; Gradual drift refers to data instances with a new concept slowly occupying the overall sample space within a certain period of time; Incremental drift refers to the gradual transformation of an old concept into a new one over a certain period of time; Recurring drift refers to the regular alternation of two concepts over time. The schematic diagram of the four drift types is shown in Figure 2. Abrupt and gradual concept drift, being the most common types, are the focal points of our research.

2.2. Concept Drift Detection Methods

Concept drift is an important factor affecting the performance of classification models. The literature [13] divides concept drift detectors into three categories: (1) Error rate-based methods: These methods monitor the online error rate of the base classifier; a significant change in the error rate triggers a drift warning. DDM [3] and EDDM [4] are representatives of this group; (2) Distribution monitoring methods: This type of algorithm mainly monitors changes in the data distribution. Significant differences in data distribution are labeled as concept drift, and most unsupervised methods fall into this category; (3) Multiple hypothesis testing: This type of method uses multiple hypothesis tests to detect concept drift in different ways.
Error rate is a commonly used metric in concept drift detection research. This type of drift detector uses the classifier’s accuracy results as the basis for judging the occurrence of drift, and uses experience or statistical theory to judge significant changes in accuracy and issue a drift alarm. This idea is also the focus of this study.
DDM [3] is a classic error rate-based drift detection method. The algorithm is based on the binomial distribution and defines two thresholds: warning and drift. The online error rate p and its standard deviation s serve as detection metrics, and concept drift is detected by adjusting confidence intervals. EDDM [4] uses the distance between error rates as a metric to enhance the detection of gradual drift. The Reactive Drift Detection Method (RDDM) [14] adds to DDM an explicit mechanism to discard older instances, thus mitigating DDM’s performance loss. Experimental verification shows that RDDM has a faster detection speed and higher overall accuracy than DDM. Following a similar idea of error rate monitoring, CusumDM [15] is based on the principle of the Cumulative Sum (Cusum); the algorithm monitors changes in the cumulative sum and uses a significant change in it as the basis for judging the occurrence of drift.
Drift detection methods based on error rates are often combined with sliding window mechanisms. ADWIN [5] is a classic window-based detector. The algorithm combines adjoining windows with a sub-window mechanism: when a significant difference appears between the means of the sub-windows, the older sub-window is discarded. FHDDM [6] is based on a single window mechanism and judges concept drift by comparing the window’s observed mean with the expected mean; a significant difference between the two means indicates the occurrence of concept drift. The algorithm determines the significant difference based on Hoeffding’s inequality. MWDDM_H [7] is also based on Hoeffding’s inequality and extends the window into a superimposed window to improve the speed of gradual drift detection. In addition, the authors propose a multi-level weighted strategy, which adopts weak weighting in the stable phase and strong weighting in the warning phase, thereby further improving the speed of drift detection. MWDDM_H ranks among the fastest current drift detectors, but this comes at the cost of a higher false positive rate. The primary goal of drift detection research is to help classification models maintain high accuracy. Both slower detection and higher false positive rates adversely affect the performance of classification models. Therefore, balancing detection speed and false positive rate is crucial for drift detectors. Achieving a good balance is essential to ensure the optimal performance of classification models, and addressing this issue is one of the key focuses of our research.
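To make this family of detectors concrete, the following minimal sketch flags drift when the accuracy over a small sliding window falls below the historical maximum mean by more than the Hoeffding bound, in the spirit of FHDDM. This is not the cited authors' implementation; the class name, window size, and δ are our own assumptions.

```python
import math
from collections import deque

def hoeffding_bound(n: int, delta: float) -> float:
    """Significance threshold from Hoeffding's inequality for n samples."""
    return math.sqrt(math.log(1.0 / delta) / (2.0 * n))

class SimpleWindowDetector:
    """FHDDM-style sketch: drift is flagged when the window accuracy drops
    below the historical maximum mean by more than the Hoeffding bound."""

    def __init__(self, window_size: int = 25, delta: float = 1e-7):
        self.window = deque(maxlen=window_size)  # stores 1 (correct) / 0 (wrong)
        self.delta = delta
        self.p_max = 0.0  # historical maximum window mean

    def add(self, correct: bool) -> bool:
        """Feed one prediction outcome; return True if drift is detected."""
        self.window.append(1 if correct else 0)
        if len(self.window) < self.window.maxlen:
            return False  # window not yet full
        p = sum(self.window) / len(self.window)
        self.p_max = max(self.p_max, p)
        eps = hoeffding_bound(len(self.window), self.delta)
        return (self.p_max - p) > eps
```

For example, after a long run of correct predictions, a burst of errors pushes the window mean far enough below `p_max` to cross the bound and trigger detection.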
Error rate-based drift detection methods often build on various statistical tests; for example, FHDDM and MWDDM_H are based on the Hoeffding test. HDDM [2] proposed two online non-parametric drift detection methods, HDDM_A and HDDM_W, based on Hoeffding bounds. HDDM_A is based on moving averages and is suitable for detecting abrupt drift; HDDM_W uses weighted moving averages to enhance the ability to handle gradual drift. The McDiarmid Drift Detection Method (MDDM) [16] builds on FHDDM and uses McDiarmid’s inequality to judge significant differences. Experiments have shown that MDDM has a faster drift detection speed and a lower false negative rate. WMDDM [17] is also based on McDiarmid’s inequality. The algorithm uses a sigmoid weighting function to implement instance weighting, and after entering the warning level it adjusts an adaptive factor to increase the weight, thereby improving the speed of drift detection. Although WMDDM introduces the concept of dynamic weighting, its weight adjustment is based on the current phase, meaning that the weight coefficient increases with greater drift risk. While this strategy improves drift detection speed, it does not take into account the characteristics of different data streams. This aspect has not been addressed by current state-of-the-art methods. Therefore, providing personalized detection solutions based on data stream characteristics is a key focus of our research. The Bhattacharyya distance-based Drift Detection Method (BDDM) [18] uses the Bhattacharyya distance to detect concept drift. The authors propose a significance difference threshold suitable for weighted instances. Experiments have shown that BDDM outperforms FHDDM in terms of accuracy and detection performance. Although BDDM demonstrates significant advantages on abrupt drift datasets, its performance in gradual drift environments still needs improvement.
In our research, enhancing the adaptability of drift detectors in different data stream environments is also one of the key objectives. SeqDrift2 [19] uses Bernstein’s inequality to detect concept drift. The author believes that Bernstein’s inequality is suitable for low-variance environments. The literature [16] believes that SeqDrift2’s assumptions about the data environment are too strict.

3. Proposed Algorithms

This paper focuses on abrupt and gradual drift as the main research subjects. It proposes a small sample variance estimation strategy tailored for data streams. The estimated variance is used to assess the stability of the data stream, which enables the provision of different drift detection strategies for different data streams. On this basis, this paper proposes a Variance Feedback Drift Detection Method (VFDDM), which can work with a variety of statistical testing methods. This article implements three drift detectors, VFDDM_H, VFDDM_M, and VFDDM_K, based on this method.

3.1. Variance Estimation

Among concept drift detection methods, small-scale windows have greater advantages in abrupt drift detection. For example, FHDDM [6], MDDM [16] and BDDM [18] all use smaller windows to reduce detection delays. This research hopes to obtain the data stream variance on a small-scale window and characterize the data stream stability to a certain extent, so as to adopt personalized drift detection solutions for data streams with different characteristics. However, under small-scale window conditions, small sample effects will be faced. That is, when the window saves the classification results, if the window size is small, the prediction results in the window are likely to be all correct, and the window sample variance is 0 at this time. Obviously, the window variance cannot represent the overall variance of the data stream at this time. This is the problem that small-scale samples bring to variance estimation.
In response to the above problem, this study proposes a fast variance estimation strategy under small-scale window conditions to estimate the overall variance of the data stream. This strategy is divided into two phases: Variance sample collection and Rapid variance estimation.
Phase 1. Variance sample collection
In the sample collection stage, the Bernstein test is used as the basis for collecting variance sample points. The Bernstein test is a test method that depicts the upper bound of the difference between the observed mean and the actual mean, and also takes into account the impact of sample variance. Therefore, this test is suitable for environments involving sample variance. This test is based on Bernstein’s inequality [19]:
Theorem 1.
Bernstein’s inequality—Let X1, X2,…, Xn be n independent random variables located in the interval [0,1], with variance σ², and let εB > 0 be arbitrary. Then,
$\Pr\left[\left|\frac{1}{n}\sum_{i=1}^{n}X_i - E[X]\right| > \varepsilon_B\right] \le 2\exp\left(-\frac{n\varepsilon_B^2}{2\sigma^2 + 2\varepsilon_B/3}\right)$
The significance confidence level δ generated by this inequality can be expressed as Equation (3).
$\delta = \exp\left(-\frac{n\varepsilon_B^2}{2\sigma^2 + 2\varepsilon_B/3}\right)$
Therefore, when the significance confidence level δ is given, the mean significant difference threshold εB can be expressed as Equation (4).
$\varepsilon_B = \frac{1}{3}t + \sqrt{\frac{1}{9}t^2 + 2t\sigma^2}$
$t = \frac{1}{n}\ln\frac{1}{\delta}$
Equation (4) gives the threshold calculation method of the Bernstein test. The threshold is affected by the real-time variance of the sample and can be used as the basis for detecting significant changes in the sample.
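Under the definitions above, the Bernstein threshold can be computed directly. The sketch below follows Equations (4) and (5); the function name is our own.

```python
import math

def bernstein_threshold(n: int, delta: float, variance: float) -> float:
    """Bernstein significance threshold, Equations (4)-(5):
    eps_B = t/3 + sqrt(t^2/9 + 2*t*sigma^2), where t = ln(1/delta)/n."""
    t = math.log(1.0 / delta) / n
    return t / 3.0 + math.sqrt(t * t / 9.0 + 2.0 * t * variance)
```

At σ² = 0 the square root collapses to t/3, so the threshold reduces to 2t/3, i.e. the minimum Bernstein bound of Equation (6); larger variance yields a larger threshold.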
In the variance sample collection stage, the minimum Bernstein bound is used as the minimum standard for variance sample collection. The minimum Bernstein bound can be calculated by Equation (4) and expressed as Equation (6).
$\left(\varepsilon_B\right)_{\min} = \varepsilon_B\Big|_{\sigma^2=0} = \frac{2\ln\delta^{-1}}{3n}$
The variance sample collection we proposed is based on the minimum Bernstein bound. When the difference between the current window mean p and the historical maximum mean pmax is within the minimum Bernstein bound (εB)min, the data stream is considered stable and meets the variance sample collection conditions. This condition can be described as the Equation (7).
$p > p_{\max} - \left(\varepsilon_B\right)_{\min}$
When the window mean p satisfies Equation (7), the variance estimation strategy collects the latest data point Xn−1 in the current window as a sample for variance calculation. In practical situations, the probability of a data stream variance of exactly 0 is virtually nonexistent. Therefore, the Bernstein bound εB derived from the actual data stream variance will always be greater than (εB)min. Consequently, when concept drift occurs, the mean p will satisfy p < pmax − εB and will not satisfy Equation (7). This ensures that the collected samples are not from the concept drift period, thereby guaranteeing that our sample collection phase lies within the data stream’s stable phase as determined by the Bernstein test. The schematic diagram of this process is shown in Figure 3.
Discussion—The conditions for variance sample collection are set so as to obtain each data stream’s variance in its stable state, which helps characterize the data stream’s features. If no condition were imposed on variance collection, samples would also be gathered while drift occurs, when the sliding-window mean fluctuates greatly; collecting data at risk of drift as a basis for variance estimation is unreliable. Regardless of the data stream, once concept drift occurs, the window instance values fluctuate significantly.
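The collection rule of Phase 1 can be sketched as follows; this is a minimal illustration under our own naming, with Equation (6) giving the minimum Bernstein bound and Equation (7) the collection condition.

```python
import math

def min_bernstein_bound(n: int, delta: float) -> float:
    """Minimum Bernstein bound, Equation (6): the Bernstein threshold at sigma^2 = 0."""
    return 2.0 * math.log(1.0 / delta) / (3.0 * n)

def can_collect(p: float, p_max: float, n: int, delta: float) -> bool:
    """Collection condition, Equation (7): sample only while the window mean p
    stays within the minimum Bernstein bound of the historical maximum p_max."""
    return p > p_max - min_bernstein_bound(n, delta)
```

For instance, with n = 25 and δ = 1e-7 (values we picked for illustration), the bound is roughly 0.43, so a window mean of 0.9 against a historical maximum of 1.0 still qualifies for sampling, while a mean of 0.5 does not.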
Phase 2. Rapid variance estimation
According to the description of Phase 1, it can be seen that the sample collection behavior of the variance estimation strategy is carried out in a relatively stable stage considered by the Bernstein test. Therefore, we believe that the variance of the collected samples can be expressed as the variance of the data stream in the stable stage. Assuming that the samples collected in Phase 1 are Y1,…, Ym, the stable state variance of the data stream can be estimated as Equation (8).
$\sigma^2 = \frac{1}{m}\sum_{i=1}^{m}\left(Y_i - \frac{1}{m}\sum_{j=1}^{m}Y_j\right)^2$
Note that the samples in the stable state obey the binomial distribution, so we optimized the variance calculation into a cumulative form, as shown in Equation (9).
$\sigma^2 = \frac{1}{m}\left[N_1\left(1 - \frac{N_1}{m}\right)^2 + \left(m - N_1\right)\left(\frac{N_1}{m}\right)^2\right]$
Among them, N1 is the number of correct predictions in the sample, and N1 can be obtained through cumulative calculation. Therefore, every time the window is slid, the latest estimated variance value can be obtained by simply recording the predicted value of the new sample and accumulating it. No additional calculations are required, which improves work efficiency.
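Since the collected samples are 0/1 prediction outcomes, the estimate only needs the running counts m and N1. A sketch of Equation (9) under our own naming:

```python
def binomial_variance_estimate(n1: int, m: int) -> float:
    """Accumulated-form variance estimate, Equation (9): m collected 0/1
    samples, n1 of them correct predictions. Algebraically this reduces
    to p * (1 - p) with p = n1 / m."""
    if m == 0:
        return 0.0  # no samples collected yet
    p = n1 / m
    return (n1 * (1.0 - p) ** 2 + (m - n1) * p ** 2) / m
```

Each window slide only increments m (and N1 when the new sample is correct), so the latest estimate is available with no pass over the sample pool.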
Figure 4 illustrates the overall workflow of the variance estimation strategy. First, in the variance sampling phase (Phase 1), the current sliding window mean p and the historical maximum mean pmax are used. Then, the minimum Bernstein bound is applied to determine whether the current state is suitable for sampling. If it is, the latest instance in the window is added to the sample pool. The sample pool accumulates to yield N1. The estimated variance of the current data stream is then generated using the sample pool’s capacity m and the accumulated number of correct instances N1.

3.2. Variance Feedback

The variance feedback strategy we propose provides personalized drift detection strategies for different data streams based on the variance estimated by the variance estimation strategy. The strategy supplies different mean calculation methods and statistical testing methods for drift detection depending on the variance, improving the pertinence of drift detection and thereby the detection speed and accuracy of the drift detector, as well as the performance of the base classifier. The work of the variance feedback strategy is divided into two phases: Mean generation and Statistical test selection.
Phase 1. Mean generation
First, in this stage, window instances will be weighted to reflect the difference between old and new data. We designed a weight adjustment strategy based on variance feedback. We believe that different variances represent different stability of data streams, so they should also be treated differently in weighting methods. In the field of concept drift detection, the purpose of weighting is to increase detection speed and reduce detection delay. Therefore, environments with lower variance can adopt a more relaxed weighting method than environments with higher variance, that is, the weight can be appropriately increased. This study uses a Sigmoid-like function to adjust the weight, and the weight adjustment strategy based on variance feedback can be expressed as Equation (10).
$\mathit{diff} = \mathit{diff}_0\left(1 + \frac{1}{1 + e^{\sigma^2}}\right)$
Among them, diff0 is the original weight factor. This formula reflects the design idea of using high weight in a more stable environment and low weight in a less stable environment.
Secondly, we designed a mean adjustment strategy based on variance feedback, which has the same theoretical basis as the weight adjustment strategy. We believe that under lower variance conditions, the data stream is relatively stable and there are fewer influencing factors such as noise, so the window state is relatively stable. The drift detection strategy in this case can be more relaxed, the opposite is true in the case of higher variance. We used the variance σ2 to appropriately adjust the original observation mean p0 of the window, which is specifically expressed as Equation (11). β is the mean influence factor.
$p = p_0\left(1 - \frac{\beta}{1 + e^{\sigma^2}}\right)$
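As we read them, Equations (10) and (11) can be sketched as below; the function names, and the example values of diff0 and β, are our own assumptions rather than the paper's settings.

```python
import math

def adjusted_weight(diff0: float, variance: float) -> float:
    """Variance-feedback weight factor, Equation (10):
    diff = diff0 * (1 + 1/(1 + e^{sigma^2})); lower variance -> larger weight."""
    return diff0 * (1.0 + 1.0 / (1.0 + math.exp(variance)))

def adjusted_mean(p0: float, variance: float, beta: float) -> float:
    """Variance-feedback mean adjustment, Equation (11):
    p = p0 * (1 - beta / (1 + e^{sigma^2}))."""
    return p0 * (1.0 - beta / (1.0 + math.exp(variance)))
```

Both factors depend on σ² only through the sigmoid-like term 1/(1 + e^{σ²}), so a more stable stream (smaller σ²) receives a larger weight factor and a larger adjustment of the observed mean.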
Phase 2. Statistical test selection
In this phase, different statistical testing methods are provided to detect concept drift based on the variance of the data stream in its stable state. The selection strategy for statistical testing is based on the critical variance σc² between the Bernstein bound and the Hoeffding bound, and it provides different concept drift statistical testing strategies for data streams through the relationship between the estimated variance σ² and σc².
Discussion—The above approach principle is as follows: The Bernstein bound can be represented by Equation (4), while the Hoeffding bound under the same conditions can be represented by Equation (12).
$\varepsilon_H = \sqrt{\frac{1}{2n}\ln\frac{1}{\delta}}$
Comparing the Bernstein bound and the Hoeffding bound, we refer to the variance of the data stream at which the two bounds are equal as the critical variance, which can be expressed as Equation (13).
$\sigma_c^2 = \frac{1}{4} - \frac{1}{3}\varepsilon_H$
The critical variance σc² reflects the applicability range of the Bernstein bound. When σ² = σc², the Bernstein bound equals the Hoeffding bound, meaning that the Bernstein test and the Hoeffding test have the same theoretical delay for drift detection. When σ² < σc², the Bernstein bound is smaller than the Hoeffding bound [19], resulting in a lower drift detection delay compared to the Hoeffding test; as the variance decreases further, the drift detection delay of the Bernstein test decreases as well. Therefore, in environments where σ² ≪ σc², the Bernstein test shows a more pronounced advantage in detection speed. Reference [5] suggests that for low-variance distributions, the Hoeffding bound overestimates the probability of large deviations; using the Bernstein test in such cases therefore theoretically provides a faster drift detection speed. Conversely, when σ² > σc², the Bernstein bound is greater than the Hoeffding bound and the Bernstein test lags behind the Hoeffding test in detection speed; as the variance increases, this disadvantage becomes more pronounced. In this case, the Bernstein test theoretically becomes suboptimal for drift detection. Based on the above analysis, the statistical test selection strategy can be expressed as Equation (14).
$\begin{cases} \text{Bernstein Test}, & \text{if } \sigma^2 \le \sigma_c^2 \\ \text{Other Statistical Test}, & \text{if } \sigma^2 > \sigma_c^2 \end{cases}$
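Combining Equations (12) through (14), the selection rule can be sketched as follows; the function names are our own.

```python
import math

def hoeffding_threshold(n: int, delta: float) -> float:
    """Hoeffding significance threshold, Equation (12)."""
    return math.sqrt(math.log(1.0 / delta) / (2.0 * n))

def critical_variance(n: int, delta: float) -> float:
    """Critical variance, Equation (13): the sigma^2 at which the
    Bernstein and Hoeffding bounds coincide."""
    return 0.25 - hoeffding_threshold(n, delta) / 3.0

def select_test(variance: float, n: int, delta: float) -> str:
    """Statistical test selection, Equation (14)."""
    return "bernstein" if variance <= critical_variance(n, delta) else "other"
```

For example, with n = 100 and δ = 0.05 (illustrative values), the critical variance is about 0.209, so an estimated variance of 0.1 selects the Bernstein test while 0.24 selects one of the alternative tests described next.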
Figure 5 illustrates the overall workflow of the variance feedback strategy. First, in Phase 1, the strategy uses the estimated variance σ2 to generate the instance weight factor diff. Then, the initial mean p0 of the window instances is generated. Further, the final mean p is generated based on the estimated variance and p0, and the maximum mean pmax is updated. In Phase 2, the estimated variance σ2 is used to determine the statistical test method for drift detection, thus completing the drift detection process.
From Equation (14), it can be seen that when σ² > σc², the statistical test selection strategy opts for other, more compact statistical testing methods for drift detection. However, not all testing methods are suitable, owing to considerations of detection speed. Reviewing the relationship between the Bernstein bound and the Hoeffding bound, the main reason for rejecting the Bernstein test under the condition σ² > σc² is that the Bernstein bound is larger in this scenario, theoretically decreasing the drift detection speed. Therefore, the alternative statistical testing methods chosen in this case should have faster detection speeds, namely, lower testing bounds. Based on these requirements, this section introduces some statistical tests suitable for environments where σ² > σc².
Statistical Test 1. Hoeffding’s Test for Drift Detection
The Hoeffding test is a commonly used statistical testing method in the field of concept drift detection. This test relies on Hoeffding’s inequality.
Theorem 2.
Hoeffding’s inequality—Let X1, X2,…, Xn be n independent random variables located in the interval [0,1], and let εH > 0 be arbitrary. Then,
$\Pr\left[\left|\frac{1}{n}\sum_{i=1}^{n}X_i - E[X]\right| > \varepsilon_H\right] \le 2\exp\left(-2n\varepsilon_H^2\right)$
Therefore, when a significance confidence level δ is given, the significance threshold εH required for the Hoeffding test can be expressed as Equation (12). Reviewing Equations (4) and (12), it can be observed that when σ² > σc², εH < εB. Therefore, the Hoeffding test is a statistical testing method suitable for environments where σ² > σc², possessing drift detection performance theoretically at least as good as that of the Bernstein test.
Statistical Test 2. McDiarmid’s Test for Drift Detection
The experiments in MDDM [16] demonstrated that under identical conditions, McDiarmid’s inequality exhibits lower detection latency than Hoeffding’s inequality in drift detection tasks. Therefore, we believe the McDiarmid test is also applicable in situations where σ² > σc². The McDiarmid test is based on McDiarmid’s inequality.
Theorem 3.
McDiarmid’s inequality—Let X1, X2,…, Xn be n independent random variables. Suppose there exists a function f such that for any i and any X′i ∈ R:
$\left|f\left(X_1, \ldots, X_i, \ldots, X_n\right) - f\left(X_1, \ldots, X_i', \ldots, X_n\right)\right| \le c_i$
This means that substituting any other value X′i for Xi changes the value of f by at most ci. Then, for all εM > 0, we have:
$\Pr\left[E[f] - f \ge \varepsilon_M\right] \le \exp\left(-\frac{2\varepsilon_M^2}{\sum_{i=1}^{n}c_i^2}\right)$
The significance threshold derived from the McDiarmid inequality can be expressed as Equation (18).
ε M = i = 1 n c i 2 2 ln 1 δ
$$c_i = \frac{w_i}{\sum_{i=1}^{n}w_i}$$
where wi represents the weight of data instances.
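The McDiarmid threshold of Equations (18) and (19) can be sketched as follows. The linear weighting scheme w_i = 1 + i·diff used in the example is a hypothetical stand-in for Equation (10), which is not reproduced here; note that with uniform weights the threshold reduces exactly to the Hoeffding threshold.

```python
import math

def mcdiarmid_threshold(weights, delta):
    """eps_M = sqrt((sum c_i^2 / 2) * ln(1/delta)), with c_i = w_i / sum(w),
    following Equations (18) and (19)."""
    total = sum(weights)
    c_sq = sum((w / total) ** 2 for w in weights)
    return math.sqrt(c_sq / 2.0 * math.log(1.0 / delta))

# Hypothetical linear weighting: newer instances weighted slightly higher
n, diff = 50, 0.01
weights = [1.0 + i * diff for i in range(n)]
eps_m = mcdiarmid_threshold(weights, 1e-7)
```

Unequal weights concentrate mass on fewer instances, which slightly enlarges Σci² and thus the threshold; the benefit of weighting lies in the mean statistic reacting faster to recent errors, not in the threshold itself.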
Statistical Test 3. Kolmogorov’s Test for Drift Detection
The work in [20] derived a partial-sum Kolmogorov inequality under the Bernoulli distribution. This inequality can be proven to generate significance thresholds tighter than the Hoeffding bound. We introduce it below; for convenience, this form will simply be referred to as the Kolmogorov inequality.
Theorem 4.
Kolmogorov’s inequality—Let X1, X2, … be a sequence of independent Bernoulli random variables with E(Xi) = pi, not necessarily identically distributed. Suppose εK < 1 and either p̄k + εK > 1/2 or p̄k < 1/2. Then
$$\Pr\left(\sup_{k\le n}\left(\bar{X}_k - \bar{p}_k\right) > \varepsilon_K\right) \le \exp\left(-n\left(2\varepsilon_K^2 + \frac{1}{3}\varepsilon_K^4\right)\right)e^{\varepsilon_K^2/4}$$
where $\bar{X}_k = (1/k)\sum_{i=1}^{k}X_i$ and $\bar{p}_k = (1/k)\sum_{i=1}^{k}p_i$.
Research findings indicate that when k = n, this inequality can be transformed into a statistical testing method applicable to drift detection, termed Kolmogorov’s test, as shown in Equation (21).
$$\Pr\left(p_{max} - p > \varepsilon_K\right) \le \exp\left(-n\left(2\varepsilon_K^2 + \frac{1}{3}\varepsilon_K^4\right)\right)e^{\varepsilon_K^2/4}$$
It is easy to prove that the significance threshold εK generated by the Kolmogorov test is smaller than the threshold εH of the Hoeffding test. Therefore, under the condition σ² > σc², it also theoretically offers faster drift detection.
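Because the Kolmogorov bound cannot be inverted in closed form, the threshold εK must be approximated numerically, e.g., by binary search. The sketch below assumes the bound takes the form exp(−n(2εK² + εK⁴/3))·e^{εK²/4} (our reading of Equations (20) and (21)); treat that form, and the precision parameter ap, as assumptions.

```python
import math

def kolmogorov_threshold(n, delta, ap=1e-9):
    """Binary-search eps_K so the (assumed) Kolmogorov bound equals delta.

    bound(eps) = exp(-n*(2*eps^2 + eps^4/3)) * exp(eps^2/4) is strictly
    decreasing in eps on (0, 1) for the window sizes used here, so a
    simple bisection to precision `ap` suffices."""
    def bound(eps):
        return math.exp(-n * (2 * eps ** 2 + eps ** 4 / 3) + eps ** 2 / 4)

    lo, hi = 0.0, 1.0
    while hi - lo > ap:
        mid = (lo + hi) / 2
        if bound(mid) > delta:   # bound still above delta: need larger eps
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

For n = 50 and δ = 10⁻⁷ this yields a threshold slightly below the Hoeffding value, consistent with the claim that the Kolmogorov test is tighter.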
All three statistical tests above are applicable when σ² > σc². In the experimental section (Section 4), we employ each of them for drift detection to validate the effectiveness of the variance feedback strategy proposed in this paper.

3.3. Variance Feedback Drift Detection Method

We devised and implemented the Variance Feedback Drift Detection Method (VFDDM), which relies on a variance feedback mechanism. First, our variance estimation strategy estimates the variance of the data stream in its stable state. Once the variance estimate is obtained, our variance feedback strategy selects a personalized drift detection scheme for the data stream, including setting up the window mean calculation method and choosing the statistical test. The overall schematic of the method is illustrated in Figure 6.
We now illustrate the workflow of VFDDM using the Hoeffding-test-based drift detector VFDDM_H as an example. VFDDM_H employs a single sliding window mechanism [21] with a window size of n. The workflow of VFDDM_H is as follows:
Step. 1. The prediction results of the data stream instances (1 for correct, 0 for incorrect) will sequentially enter the window.
Step. 2. When the window is full, VFDDM uses the estimated variance σ² to generate the weight diff and the mean p for the current window instances, and updates the historical maximum mean pmax, as shown in Equation (22) [6].
$$\text{if } p_{max} < p \text{, then } p_{max} \leftarrow p$$
Step. 3. Based on the variance σ², the statistical test is selected for the current situation: when σ² ≤ σc², the Bernstein test is employed; when σ² > σc², the Hoeffding test is used. This process is illustrated in Equation (23).
$$\begin{cases}\text{if } \sigma^2 \le \sigma_c^2: & p_{max} - p > \varepsilon_B\\[2pt] \text{if } \sigma^2 > \sigma_c^2: & p_{max} - p > \varepsilon_H\end{cases} \;\Rightarrow\; Drift = True$$
Meanwhile, based on p, pmax, and Equation (7), it is determined whether the current state meets the variance sampling conditions. If it does, the latest instance prediction result is added to the variance sample pool, and the estimated variance σ2 will be updated.
Step. 4. When the next instance prediction result enters the window, repeat Step 1 to Step 3.
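The four steps above can be sketched as a minimal loop. This is an illustrative simplification, not the full VFDDM_H: plain window means replace the weighted means of Equations (10) and (11), only the Hoeffding branch of the test selection is shown, and variance sampling is omitted.

```python
from collections import deque
import math

def run_vfddm_h(predictions, n=50, delta=1e-7):
    """Schematic VFDDM_H loop over 0/1 prediction outcomes (Steps 1-4).

    Assumptions: unweighted window means stand in for Equations (10)-(11);
    the Bernstein branch (Equation (4)) and the variance sample pool are
    left out, so only the Hoeffding test is applied."""
    eps_h = math.sqrt(math.log(1.0 / delta) / (2.0 * n))  # Hoeffding threshold
    win, p_max, drifts = deque(maxlen=n), 0.0, []
    for i, x in enumerate(predictions):
        win.append(x)                  # Step 1: result enters the window
        if len(win) < n:
            continue
        p = sum(win) / n               # Step 2: window mean
        p_max = max(p_max, p)          #         update historical maximum
        if p_max - p > eps_h:          # Step 3: significance test
            drifts.append(i)
            win.clear()                # reset, as in Lines 19-23 of Algorithm 1
            p_max = 0.0
    return drifts
```

On a stream whose accuracy collapses from 1 to 0, the sketch raises a single alert roughly 20 instances after the change point, matching the intuition that the window mean must drop below pmax − εH before drift is declared.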
In addition to VFDDM_H, we also implemented two other drift detectors: VFDDM_M based on the McDiarmid test and VFDDM_K based on the Kolmogorov test. For ease of exposition, we collectively refer to these three drift detectors, VFDDM_H, VFDDM_M, and VFDDM_K, as VFDDMs.
Algorithm 1 provides a pseudocode summary of VFDDM. The inputs to Algorithm 1 include the data stream S, the window win, and the variance sample pool vp, along with two parameters—the confidence level δ and the initial weighting factor diff0. Lines 2–3 describe the sliding window insertion and eviction operations when a new instance arrives. Lines 4–5 describe the operation of the variance feedback strategy, including weight calculation (Equation (10)) and mean calculation (Equation (11)). Line 6 represents the update process for the maximum mean of the window (Equation (22)). Lines 8–10 describe the mean generation process in the variance estimation strategy: when the current condition satisfies the variance sample collection condition (Equation (7)), the most recent predicted instance is added to the sample pool vp, and the variance σ² is updated according to Equation (9). Lines 11–18 describe the process of selecting the statistical test method in the variance feedback strategy. First, the critical variance σc² is calculated based on Equation (13) (Line 11). Then a decision is made: when the variance is small (σ² ≤ σc²), VFDDM adopts the Bernstein test; when the variance is large (σ² > σc²), VFDDM adopts one of the other test methods. Lines 16–18 describe the drift detection process: when the difference between the current mean p and the maximum mean pmax is significant, VFDDM issues a drift warning and performs a reset operation. Lines 19–23 describe the reset operation of VFDDM when a drift warning occurs, including mean reset, variance reset, base classifier reset, and drift status reset.
Algorithm 1 Variance Feedback Drift Detection Method (VFDDM)
Input: Stream—S, Window—win, Variance pool—vp, Confidence level— δ , Initial weighting factor— d i f f 0
Output: Drift_Status
1while S has a new instance Si do
2 Insert prediction(Si) into win
3 Forget the oldest instance in win
4 Calculate diff by Equation (10)
5 Calculate p by Equation (11)
6 Update pmax by Equation (22)
7 Calculate (εB)min by Equation (6)
8 if pmax − p ≤ (εB)min do
9 Add prediction(Si) to vp
10 Update variance σ2 by Equation (9)
11 Calculate critical variance σ c 2 by Equation (13)
12 if σ² ≤ σc² do
13 εBernstein Test()
14 else do
15 εother Test
16 if pmax − p > ε do
17 Drift_StatusTrue
18 Reset()
19def Reset()
20 Reset p, pmax ← 0
21 Reset σ², σc²
22 Reset Base classifier
23 Reset Drift_StatusFalse
Algorithm 2 provides the statistical test methods required by VFDDM. These methods all take two parameters as input, the confidence level δ and the window size n, and return a significance threshold (Lines 4, 7, 12, and 19) for drift detection. Lines 1–4 describe the Bernstein test, used for cases with smaller variances. This test requires the data stream variance as an additional input parameter (line 2), and then calculates the significance threshold according to Equation (4) and returns it (lines 3–4). Lines 5–19 introduce the statistical test methods applicable for situations with larger variances, including the Hoeffding test (lines 5–7), the McDiarmid test (lines 8–12), and the Kolmogorov test (lines 13–19). The McDiarmid test additionally requires the sum of squared weights (line 10), while the significance threshold for the Kolmogorov test cannot be computed directly and must be approximated via binary search (lines 17–18). Therefore, an approximate precision ap is required as an additional input (line 14). Moreover, since the Kolmogorov test operates on Bernoulli random variables, the weighted mean must be converted back to the unweighted Bernoulli mean (Line 15 and Equation (11)).
Algorithm 2 Statistical Tests for VFDDM
Input: Confidence level— δ , Size of window—n
Output: Significance threshold—ε
Lower VarianceBernstein Test
1def Bernstein Test()
2 Extra input: Variance—σ2
3 Calculate εB by Equation (4)
4 Return εB
Higher VarianceOther Tests
5def Hoeffding Test()
6 Calculate εH by Equation (12)
7 Return εH
8def McDiarmid Test()
9 Extra input: Weighting factor—diff
10 Calculate Σ ci² by Equation (19)
11 Calculate εM by Equation (18)
12 Return εM
13def Kolmogorov Test()
14 Extra input: Approximate precision—ap, Mean—p, Max of mean—pmax
15 Recover the original p and pmax by inverting Equation (11)
16 Find the equation E which satisfied εK according to Equation (21).
17 while precision(εK) > ap do
18 Binary search approximation for solving E
19 Return εK

3.4. Space and Time Complexity Analysis

This section provides a space and time complexity analysis of VFDDM. Firstly, VFDDM requires maintaining a sliding window of size n, where n is a fixed value during the algorithm’s operation. Therefore, the size of the sliding window is constant, resulting in a space complexity of O(1). Additionally, VFDDM requires several registers to store parameters such as the weight factor diff, the mean influence factor β, the variance σ2, and the significance threshold ε. However, the storage of these parameters only requires a fixed number of registers, and the storage space does not increase with the number of window traversals. Thus, the space resources required by VFDDM are constant, with a space complexity of O(1).
The sliding window maintained by VFDDM moves 1 instance at a time, assuming there are S data instances. Each time the window slides, VFDDM operates according to the process outlined in Algorithm 1, with the main time expense being the traversal of the window, which has a time complexity of O(n). Therefore, the overall time complexity is O(n·S).
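The per-slide cost contrast discussed above can be sketched as follows: an unweighted mean can be maintained incrementally in O(1) per slide (the FHDDM-style accumulation, also available to the Kolmogorov branch), whereas a weighted mean forces an O(n) traversal of the window. The linear weight scheme shown is a hypothetical stand-in for Equations (10) and (11).

```python
from collections import deque

class IncrementalMeanWindow:
    """O(1) per-slide mean maintenance via a running total."""
    def __init__(self, n):
        self.win, self.total, self.n = deque(maxlen=n), 0.0, n

    def slide(self, x):
        if len(self.win) == self.n:      # subtract the soon-to-be-evicted value
            self.total -= self.win[0]
        self.win.append(x)               # deque(maxlen=n) evicts automatically
        self.total += x
        return self.total / len(self.win)

def weighted_mean(window, diff=0.01):
    """O(n) per-slide weighted mean under a hypothetical linear scheme
    w_i = 1 + i*diff (newer instances weighted higher)."""
    weights = [1.0 + i * diff for i in range(len(window))]
    return sum(w * x for w, x in zip(weights, window)) / sum(weights)
```

This is why FHDDM and VFDDM_K show the best time consumption in the experiments, while the weighted variants pay the traversal cost on every slide.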

4. Experiment Evaluation

4.1. Datasets

To validate the effectiveness of VFDDM, we conduct experimental evaluations on both synthetic and real datasets. This section provides detailed information about the datasets used in the experiments. The synthetic data streams used in this study are based on the MOA framework [22], which allows the positions, frequencies, and durations of concept drift to be defined under various conditions [23]. In the experiments, the synthetic data streams are uniformly set to a scale of 100,000 instances, with 10% noise included. For abrupt concept drift data streams, drift positions are set every 20,000 instances, occurring at positions 20,000, 40,000, 60,000, and 80,000. For gradual concept drift data streams, drifts occur every 25,000 instances, at positions 25,000, 50,000, and 75,000. Each type of data stream generator follows the aforementioned rules for random generation. The four synthetic data stream generators used in our research are as follows:
SineAbrupt drift. This generator sets two attributes, x and y, distributed uniformly in the interval [0,1]. Classification is performed based on the function y = sin(x), where instances below the curve are labeled as positive class, and instances above are labeled as negative class. During concept drift, the decision boundary reverses.
MixedAbrupt drift. This generator sets two attributes, x and y, uniformly distributed in the interval [0,1], along with two additional bool variables, v and w. Instances are labeled as positive class if they satisfy at least two out of the following three conditions: v = true, w = true, y < 0.5 + 0.3sin(2πx). During concept drift, the decision boundary reverses.
CirclesGradual drift. This generator sets two attributes, x and y, uniformly distributed in the interval [0,1], and incorporates four types of circle equations, representing four concepts. Label categories are determined based on whether the point (x, y) lies inside or outside the circle, assigning positive or negative classes accordingly. Concept drift is induced by altering the circle equations.
LEDGradual drift. The classification model aims to predict digits displayed on a 7-segment display, with each digit having a 10% chance of being displayed. The generator sets 7 relevant attributes along with 17 irrelevant attributes. Concept drift is simulated by swapping relevant attributes.
The Sine, Mixed, LED, and Circles synthetic dataset generators we used are widely applicable in the field of concept drift detection and have been extensively used in studies such as [6,16,17,24]. These generators can define the position and length of concept drift, making them suitable for evaluating the performance of drift detectors. Additionally, these simulation data generators are also provided within the MOA [22], ensuring high applicability for the datasets used in our experiments.
To evaluate VFDDM’s performance across different types of concept drift, we specifically selected two abrupt drift data stream generators (Sine and Mixed) and two gradual drift data stream generators (LED and Circles). This selection ensures the comprehensiveness of our experiments.
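As an illustration of the generator rules above, the Sine abrupt-drift stream can be sketched as follows. This is a simplified stand-in written for clarity, not MOA's SineGenerator; the function name and parameters are our own.

```python
import math
import random

def sine_stream(n_instances=100_000,
                drift_points=(20_000, 40_000, 60_000, 80_000),
                noise=0.10, seed=1):
    """Schematic Sine abrupt-drift generator (illustrative sketch only).

    Points below y = sin(x) are labeled positive; each drift point
    reverses the decision boundary; `noise` flips labels with that
    probability, mirroring the 10% noise used in the experiments."""
    rng = random.Random(seed)
    reversed_concept = False
    for i in range(n_instances):
        if i in drift_points:            # abrupt drift: reverse the boundary
            reversed_concept = not reversed_concept
        x, y = rng.random(), rng.random()
        label = (y < math.sin(x)) != reversed_concept
        if rng.random() < noise:         # label noise
            label = not label
        yield x, y, int(label)
```

The Mixed, Circles, and LED streams follow the same pattern: a labeling rule plus a scheduled change to that rule at the configured drift positions.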
Meanwhile, we also utilize several classic real-world datasets to validate the performance of VFDDM. The descriptions of these real datasets are provided below:
Forest Covertype [25]—This dataset consists of 54 attributes and 581,012 instances, describing the forest cover types of 30 m × 30 m wilderness areas in four regions of the Roosevelt National Forest, obtained from the United States Forest Service Information Systems.
Electricity [25]—Originating from the New South Wales electricity market in Australia, this dataset comprises 45,312 instances and 8 attributes. The task of the classifier is to predict whether the electricity price will rise or fall. Concept drift mainly arises from the occurrence of sudden events (abrupt) or changes in consumer habits (gradual).
Pokerhand [25]—This dataset describes a poker game where 5 cards are randomly drawn from a standard deck of 52 cards to form various poker hands, and the task is to predict the value of these hands. The cards are ranked by suit and rank, and the hand value is a class consisting of 10 possible values. We used the normalized version of the dataset. The dataset comprises 829,201 instances with 5 numeric attributes and 5 categorical attributes.
Spam—This dataset is a spam email dataset containing both legitimate and spam emails. It consists of 10,000 instances, with each sample having 500 attributes and belonging to one of two classes.
The detailed information on the datasets is shown in Table 1.

4.2. Experimental Settings

This section describes the relevant settings of the experiments, including the comparison algorithms, experimental environment, and evaluation metrics. In this study, the data stream mining framework MOA [22] is utilized to implement the proposed drift detectors VFDDMs. The performance of VFDDMs is discussed by comparing them with classic or state-of-the-art drift detection algorithms such as FHDDM, MWDDM(MWDDM_H), MDDM, WMDDM, BDDM, RDDM, DDM, HDDM_A, and HDDM_W. For ease of comparison, we conducted the following settings:
(1) The window size n is set to 50, and the experiments are conducted under single-window conditions. The window size will affect the detection of different types of drifts. Smaller windows are suitable for detecting abrupt drifts, while larger windows are better for detecting gradual drifts. In literature [7], the authors used two sliding windows of sizes 25 and 100 to detect abrupt and gradual drifts, respectively. In our study, we adopted a compromise in the parameter settings, allowing us to evaluate both types of drifts simultaneously. In all testing methods, the confidence level δ was uniformly set to 10−7, a commonly used approach in the field of drift detection. The confidence level δ is a parameter that adjusts the strictness of the drift detection method. With a given confidence level δ, the detected concept drift theoretically has a confidence of 1 − δ. Therefore, the smaller the δ, the greater the confidence in the drift alert, and vice versa. Regarding algorithm performance, a smaller δ means stricter detection criteria and more precise detection results, and vice versa.
(2) Unless otherwise specified in the respective papers of the comparison algorithms, the weighting method is uniformly set to linear weighting. Additionally, the base weighting coefficient is uniformly set to 0.01. This weighting factor has been adopted by several studies, such as in references [7,16]. The weighting factor affects the detection speed of the drift detector. Generally, a higher weighting factor reduces the detection delay but comes with the drawback of a higher false positive rate. Therefore, the weighting factor should not be set too high.
(3) The parameter for the transition of the warning phase in BDDM under the NB classifier is adjusted to 0.5, meaning that drift detection is triggered when pmaxp ≥ 0.5pmax, aiming to achieve a higher level of drift detection performance.
(4) For WMDDM, the decay of the impact factor λ of its weight setting after entering the warning phase is set to linear decay.
(5) All other parameters of the comparison algorithms, not mentioned explicitly, are kept at their default values.
This study considers two models, Naïve Bayes (NB) and Hoeffding Tree (HT), as the base classifiers for the experiments. Both base classifiers mentioned above use default parameters provided by MOA. Experiments are run on Intel Core i5 (Dell Inspiron 5402 made in Beijing, China) @ 2.4 GHz with 16 GB of RAM running Windows 10.
This experiment will introduce the evaluation metrics used: Firstly, the classification accuracy of the base classifier is employed to assess the performance of the drift detection algorithm. The purpose of setting up drift detectors is to mitigate the impact of concept drift by alerting to drift risks, thus assisting the base classifier in achieving superior classification performance. Secondly, the performance of drift detection algorithms is analyzed using four specific metrics: True Positive (TP), False Positive (FP), False Negative (FN), and Detection Delay (DD). These metrics serve as crucial dimensions for evaluating the performance of concept drift detectors. To quantify these metrics, this study incorporates the concept of an acceptable delay length Δ [6]. Here, Δ represents the maximum tolerable delay for detecting drift in this experiment. The following will provide detailed explanations for these four metrics, with t representing the time of drift occurrence and T indicating the time when the drift detector issues an alert.
True Positive (TP)—When the drift warning occurs within the time interval [t, t + Δ], the drift identification is considered effective, and this interval is referred to as the True Positive Effective Interval. The True Positive Rate (TPR), also known as Recall [18], is obtained by dividing the total number of true positives by the number of injected drifts.
False Positive (FP)—When the drift warning occurs outside the True Positive Effective Interval, it is recorded as a false positive, indicating a detection error. The number of false positives can reflect the level of false detection of a drift detector.
False Negative (FN)—At time t, concept drift indeed occurs, but the drift detector fails to report it within the acceptable interval [t, t + Δ]. In this case, the drift is considered missed by the detector. False negatives mirror true positives: the higher the TP count, the lower the FN count, and the true positive and false negative rates sum to 1 (TPR + FNR = 1).
Detection Delay (DD)—Defined as the difference between the time T when the drift warning is issued and the time t when the drift occurs. The detection delay must fall within the acceptable delay length Δ to be counted; drift warnings outside this range are recorded as false positives (FP).
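Given the injected drift positions and the detector's alarm times, the four metrics can be computed mechanically. The matching rule below, where each drift is consumed by at most one alarm and every unmatched alarm is a false positive, is one reasonable reading of the definitions above rather than the paper's exact scoring script.

```python
def drift_metrics(true_drifts, alarms, delay_limit):
    """TP / FP / FN counts and mean detection delay under delay budget Delta.

    An alarm inside [t, t + delay_limit] of a not-yet-detected drift t is a
    true positive with delay (alarm - t); every other alarm is a false
    positive; drifts with no matching alarm are false negatives."""
    tp, fp, delays, detected = 0, 0, [], set()
    for a in sorted(alarms):
        hit = next((t for t in true_drifts
                    if t <= a <= t + delay_limit and t not in detected), None)
        if hit is None:
            fp += 1
        else:
            detected.add(hit)
            tp += 1
            delays.append(a - hit)
    fn = len(true_drifts) - tp
    mean_dd = sum(delays) / len(delays) if delays else None
    return tp, fp, fn, mean_dd
```

For example, with drifts at 20,000 and 40,000, alarms at 20,100, 25,000, and 40,050, and Δ = 1000, this yields TP = 2, FP = 1, FN = 0, and a mean delay of 75.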

4.3. Experimental Analysis

4.3.1. Base Classifiers Performance Analysis

This section provides a detailed analysis of the classification performance of various drift detection algorithms, including the algorithms proposed in this paper, in terms of both accuracy and time consumption.
Table 2 presents the classification accuracy performance of NB and HT under different drift detectors on synthetic datasets. From Table 2, it can be observed that the classification models using VFDDMs demonstrate good classification accuracy on artificial datasets. In overall accuracy comparison, VFDDMs achieve an average accuracy ranking among the top three, indicating the excellent classification performance of VFDDMs.
In the comparison among the same base classifiers, VFDDMs also achieved optimal classification performance. In NB, the average rankings of VFDDM_K, VFDDM_H, and VFDDM_M were 2.5, 3.25, and 3.75, respectively. Additionally, the best-performing algorithms besides VFDDMs were MDDM and WMDDM (ranked 4.75 and 5.5). Furthermore, the lowest ranking of the proposed algorithms on NB was 7 (see the performance of VFDDM_M on the Mixed dataset), while the rankings of the other situations for the three algorithms were all in the top 50%. Similarly, in HT, all three variants of VFDDM showed the best average performance, followed by MDDM and MWDDM. Moreover, their accuracy rankings were all in the top 50% among all artificial datasets. This analysis demonstrates that the proposed algorithm exhibits strong adaptability and can provide excellent classification performance for different base classifiers on various datasets.
Furthermore, this paper will conduct a horizontal comparison of drift detection methods using the same statistical test method and demonstrate the advantages of the proposed algorithm. In drift detection methods using the Hoeffding test, FHDDM treats window instances following a binomial distribution without distinguishing the importance of instances within the window. Therefore, its performance is not as good as the MWDDM with a weighted strategy, with a difference in overall ranking of 0.5. On the other hand, the proposed Hoeffding test drift detector VFDDM_H significantly outperforms MWDDM in terms of accuracy. This is because the VFDDM adopts a dynamic weighting strategy, where the importance difference of instances is influenced by the overall variance of the data stream. Thus, its adaptability is relatively stronger compared to MWDDM. Additionally, MWDDM employs a conservative fixed weighting strategy in the warning phase, leading to an increase in the amount of retraining, which in turn affects the accuracy performance of the base classifier.
In the comparison of drift detection methods based on the McDiarmid test, VFDDM_M and MDDM both used the same weighting strategy in the experiment. However, VFDDM_M’s weighting conservatism can be self-adjusted according to the data stream environment, resulting in superior accuracy performance. Additionally, VFDDM_M also outperforms the adaptive adjustment strategy-based WMDDM. This is because WMDDM’s adaptive adjustment strategy involves adjusting decay parameters after the warning phase to accelerate detection speed without tracking the characteristics of the data stream itself.
Additionally, the Kolmogorov test can only handle binomially distributed data instances in drift detection due to the Kolmogorov inequality constraints. Weighting cannot be applied in this part, thus failing to reflect the differences in importance among instances. Thus, only VFDDM_K with semi-stage weighting outperforms the fully weighted MWDDM, MDDM, and WMDDM algorithms, further proving the effectiveness of our proposed strategy.
Figure 7 and Figure 8 illustrate the accuracy trends of the tested algorithms on NB and HT, respectively. The dashed vertical lines in the graphs represent drift points, with 4 drift points in the abrupt drift dataset and 3 drift points in the gradual drift dataset. The performance variation of base classifiers using different drift detectors at drift positions can be clearly compared from the graphs.
Figure 7 illustrates the accuracy trends of NB using different drift detectors on four synthetic datasets. From Figure 7a,b, it can be observed that the performance of VFDDM remains consistently high, with a faster response to abrupt drift and minimal performance loss at most drift points, highlighting the advantage of VFDDM in detecting abrupt drifts swiftly. In Figure 7c,d, the accuracy curve of VFDDM also ranks prominently. Although in the LED dataset, VFDDMs’ performance falls short compared to RDDM and HDDM_A, it outperforms these algorithms in other datasets.
Figure 8 illustrates the accuracy trends of HT using different drift detectors on four synthetic datasets. In Sine (Figure 8a), it can be observed that VFDDM maintains consistently high classification accuracy, with minimal performance loss at drift points compared to other drift detectors. In LED (Figure 8c), its performance is similar to that in Figure 7c. In the Mixed dataset, drift detectors with conservative detection tendencies perform better, such as FHDDM and WMDDM. FHDDM, which does not use a weighting strategy, outperforms MWDDM with a strong weighting strategy. WMDDM, based on the McDiarmid inequality, uses a Sigmoid function to dynamically generate instance weights based on the current classification accuracy, achieving adaptive weighting. This approach results in a higher overall accuracy compared to MDDM. This phenomenon in HT demonstrates its inherent adaptability in handling data stream classification tasks. The stable accuracy performance of most drift detectors in the Mixed dataset further confirms this observation.
Combining observations from Figure 7 and Figure 8, HT clearly outperforms NB in overall accuracy, highlighting HT’s advantage in data stream classification tasks. In vertical comparisons, VFDDM_H generally outperforms FHDDM and MWDDM in most cases; similarly, VFDDM_M generally outperforms MDDM and WMDDM in most cases, validating the effectiveness of the proposed VFDDM. This framework can enhance detection performance for original drift detection methods in most scenarios. In summary, VFDDM demonstrates excellent classification accuracy in most cases, with minimal performance loss at most drift points. While it may not achieve optimal performance in certain scenarios, its performance remains consistently high across all scenarios, indicating its significant overall advantage. This underscores our VFDDM’s strong generality and competitiveness.
Table 3 presents the time consumption results of the proposed algorithm compared to other benchmark algorithms. From Table 3, it can be observed that VFDDMs exhibit good time consumption performance in both NB and HT, comparable to FHDDM, BDDM, HDDM_A, and HDDM_W. Theoretically, for any given moment, FHDDM requires only cumulative operations to obtain the window mean, resulting in a time complexity of O(1), which is independent of the window size. Therefore, FHDDM consistently exhibits optimal time consumption both theoretically and experimentally. VFDDM_H employs a weighting strategy, which requires a traversal of the window each time it slides, resulting in a time complexity of O(n). Consequently, VFDDM_H has higher time consumption compared to FHDDM. VFDDM_M exhibits similar characteristics. In contrast, VFDDM_K, which uses the Kolmogorov test in high-variance data streams, does not rely on a weighting strategy and can also use cumulative calculations to obtain the mean. Thus, VFDDM_K achieves the same theoretical time complexity as FHDDM, which is O(1), in high-variance data stream environments. This explains why VFDDM_K performs better in terms of time consumption compared to VFDDM_H and VFDDM_M. However, the time consumption differences among the three variant algorithms are relatively low. MWDDM uses a multi-stage weighting approach, resulting in a theoretical time complexity of O(n) for each window slide. WMDDM, based on the McDiarmid inequality and instance weighting, also has a theoretical time complexity of O(n) per window slide. However, in practical implementation, WMDDM incurs additional computational overhead for calculating the McDiarmid bound and dynamically generating weight factors during each window slide. Therefore, the theoretical time complexity of WMDDM is higher than that of MWDDM. 
It is notable that the overall time efficiency of VFDDM_H is higher than the MWDDM algorithm, belonging to the same Hoeffding test category, demonstrating VFDDM_H’s good time performance. Furthermore, VFDDM_M exhibits better time consumption performance compared to the WMDDM based on similar tests. WMDDM’s weighting method uses the Sigmoid function, controlled by parameters λ and θ, resulting in higher computational overhead than VFDDM_M. This analysis demonstrates that the proposed VFDDM possesses excellent time consumption characteristics.
Finally, we compiled the ranking statistics for all algorithms on accuracy and CPU Times across all synthetic datasets, as shown in Figure 9. In Figure 9, the horizontal axis represents the ranking. Under the same horizontal coordinate, a larger area occupied by an algorithm indicates that the algorithm achieved that ranking more frequently in that metric. From the left graph, it can be observed that VFDDM demonstrates outstanding classification accuracy. Among the top ranks, VFDDMs’ classification accuracy occupies a significant proportion, indicating the general applicability of VFDDM across datasets. Additionally, from the right graph, it can be seen that VFDDMs also perform well in terms of time consumption, with a considerable proportion of cases showing high time efficiency rankings. Compared to other methods using the same test, VFDDMs also demonstrate strong competitiveness. This part of the results demonstrates the high efficiency of our variance quick estimation strategy. In summary, we assert that VFDDM possesses outstanding classification accuracy and time efficiency.
Table 4 shows the performance of accuracy and average drift alert count for four real datasets using NB and HT classifiers. Numerous studies indicate that concept drift in real datasets is unknown [16], which also limits the evaluation metrics for assessing drift detectors on real datasets. Due to the unknown drift locations, metrics such as delay, true positives, false positives, and false negatives, based on drift positions, cannot be determined. Therefore, this study focuses on discussing two metrics: classification accuracy and drift warning count, and analyzes the performance of drift detectors using these two metrics, while exploring the factors affecting classification accuracy on real datasets.
From Table 4, it can be observed that the DDM and HDDM series algorithms perform well in both NB and HT classifiers. The DDM algorithm performs best in NB-Covertype, HT-Covertype, and HT-Spam, but this performance may not solely be attributed to the drift detector. In fact, in a study by Bifet et al. [26], experiments were conducted on real datasets Electricity and Covertype, testing a learning strategy of periodically restarting the classification model, which resulted in better classification accuracy compared to drift detectors. Bifet et al. explained this phenomenon as being due to the existence of time dependency between data stream instances, as also mentioned in the literature [16]. Therefore, based on this conclusion, increasing the number of model restarts is often beneficial for the classification accuracy of data streams with time dependency. This also explains the phenomenon where DDM and HDDM series algorithms simultaneously achieve high levels of accuracy and average warning count on real datasets.
Based on our findings, relying solely on accuracy to evaluate drift detectors on real datasets is unconvincing. Therefore, when assessing the performance of drift detectors on real datasets, it is essential to consider both accuracy and the number of drift warnings. Adopting a controlled variable approach would be more convincing than solely focusing on accuracy. In other words, while DDM and HDDM series algorithms demonstrate excellent accuracy, this does not necessarily imply that they are the optimal drift detectors for real datasets.
In the NB classifier, BDDM and VFDDMs report almost the same number of drift risk instances in the Covertype dataset. However, VFDDMs exhibit approximately a 0.5 advantage in accuracy performance compared to BDDM, indicating that VFDDM is stronger in drift detection capability than BDDM. The performance of both algorithms in HT-Covertype further confirms this point. In the Spam dataset, although VFDDMs and DDM report the same number of drift warnings, VFDDMs achieve significantly higher accuracy than DDM, demonstrating the advantage of VFDDMs over DDM. Similarly, in the Spam dataset, it can be observed that the accuracy performance of VFDDMs is comparable to HDDM_W, while HDDM_W reports a higher number of drift warnings than VFDDMs. As previously analyzed in drift detection performance, HDDM_W has a relatively high false positive rate, which is further evidenced by its performance in NB-Spam, emphasizing the advantage of VFDDMs in having a low false positive rate.
With the HT classifier, the contrast among drift detectors is most pronounced on the Spam dataset. VFDDMs outperform MWDDM with lower average warning counts, indicating higher precision in drift detection; similar observations hold for HDDM_A and HDDM_W. Although DDM performs best on this dataset, its accuracy advantage over VFDDMs is only a slight 0.04, while the average warning counts are 0.85 for DDM versus 0.09 for VFDDMs. Given such similar accuracy, it can be inferred that DDM operates with a higher false positive rate, further demonstrating the advantage of VFDDMs.
In conclusion, based on the above analysis, we believe that the VFDDM demonstrates outstanding performance in drift detection on real datasets, with high precision in detecting real concept drifts.

4.3.2. Drift Detection Performance Analysis

This section will provide a detailed analysis of the drift detection performance of various drift detection algorithms, including VFDDMs, focusing on detection delay, true positives, and false positives.
Table 5 presents the performance of VFDDMs and the other comparison algorithms on the sudden drift datasets. Regardless of the base classifier or the type of sudden drift dataset, VFDDMs successfully detected all configured drift positions, achieving a true positive rate of 100%, while their false positive rates and detection delays remained low. MWDDM, HDDM_W, BDDM, and VFDDMs exhibited shorter detection delays for concept drift on the Sine and Mixed streams, followed by MDDM, WMDDM, and FHDDM. Notably, the drift detectors with shorter detection delays are generally based on instance weighting strategies. Among these, MWDDM applies the strongest weighting and therefore attains the lowest detection delay on abrupt drift streams. However, excessive weighting also raises the false positive rate, as the table shows for MWDDM. An excessive focus on detection speed inevitably compromises the false positive rate, a point also demonstrated by HDDM_W. Among the faster detectors, VFDDMs show comparatively better false positive rates; for example, on the Mixed dataset, VFDDMs and BDDM produced significantly fewer false positives than MWDDM and HDDM_W. BDDM issues a drift alert when the Bhattacharyya distance between the instance distributions of the current window and the historical optimal window exceeds a threshold, and it also employs an instance weighting strategy to further increase detection speed. Nevertheless, VFDDMs outperformed BDDM in accuracy, highlighting their superiority.
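As a rough illustration of the distance test behind BDDM, the sketch below computes the Bhattacharyya distance between two windows; the histogram binning and the window contents are our own illustrative choices, not BDDM's actual parameters:

```python
import math

def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two discrete distributions:
    BC = sum_i sqrt(p_i * q_i); distance = -ln(BC). Zero means identical."""
    bc = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    return -math.log(max(bc, 1e-12))  # guard against log(0) for disjoint supports

def histogram(window, n_bins=10):
    """Normalized histogram of values in [0, 1] from a sliding window."""
    counts = [0] * n_bins
    for x in window:
        counts[min(int(x * n_bins), n_bins - 1)] += 1
    return [c / len(window) for c in counts]

# Compare a historical window against a current one; a distance above some
# threshold would raise a drift alert in a BDDM-style detector.
hist_window = [0.1, 0.15, 0.12, 0.11, 0.14, 0.13, 0.12, 0.1]
curr_window = [0.8, 0.85, 0.82, 0.81, 0.84, 0.83, 0.82, 0.8]
d_same = bhattacharyya_distance(histogram(hist_window), histogram(hist_window))
d_drift = bhattacharyya_distance(histogram(hist_window), histogram(curr_window))
```

Identical windows yield a distance of zero, while the shifted window produces a large distance, which is the signal BDDM thresholds.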
In the comparison among algorithms based on the same statistical test, VFDDM_H exhibits a larger detection delay than MWDDM. This is because MWDDM uses a relatively large weighting parameter during the warning phase; in this study, the instance weight in the warning phase is set to 5, producing a large gap between the weights of new and old instances. Consequently, when the data stream encounters drift risks, detection speed increases, but at the cost of a higher false positive rate: on the Mixed dataset, MWDDM's false positive rate is 9 times that of VFDDM_H. VFDDM_H thus offers a dual advantage of low false positive rates and low latency in detecting sudden drifts. VFDDM_M and MDDM achieve nearly identical true positive and false positive rates, but VFDDM_M surpasses MDDM in detection delay. The reason is that MDDM uses fixed weighting parameters, whereas VFDDM_M adjusts its weighting strategy based on variance: in relatively stable, non-drift states, VFDDM_M adopts a more lenient weighting strategy, thereby enhancing drift detection speed. VFDDM_M also outperforms MDDM in average accuracy ranking on the sudden drift datasets. In summary, VFDDM_M outperforms MDDM in detecting sudden drifts. The same rationale applies to WMDDM: VFDDM_M exhibits lower latency under all four sudden drift conditions and only has a slightly higher false positive rate than WMDDM on the HT-Mixed dataset.
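The effect of warning-phase instance weighting on detection speed can be seen from a weighted mean of the prediction-error stream. The weight of 5 mirrors the MWDDM setting quoted above, but the stream and window lengths below are our own illustrative choices:

```python
def weighted_error_mean(errors, warn_len, warn_weight=5.0):
    """Weighted mean of a binary error stream where the last `warn_len`
    instances (the warning phase) receive weight `warn_weight`."""
    weights = [1.0] * (len(errors) - warn_len) + [warn_weight] * warn_len
    return sum(w * e for w, e in zip(weights, errors)) / sum(weights)

# 90 error-free instances followed by 10 misclassifications after a drift.
errors = [0] * 90 + [1] * 10
plain = sum(errors) / len(errors)                   # unweighted mean: 0.10
boosted = weighted_error_mean(errors, warn_len=10)  # recent errors weigh 5x
```

With the weighted mean rising from 0.10 to roughly 0.36 on the same stream, a fixed drift threshold is crossed much sooner; the flip side is that a short burst of noisy errors is amplified the same way, which is exactly the delay/false-positive trade-off discussed above.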
Additionally, VFDDM_K exhibits the highest detection delay among VFDDMs. This is because the Kolmogorov test cannot weight data instances, thus failing to reflect differences in importance among them. As a result, its performance in terms of detection delay is not as good as VFDDM_H and VFDDM_M. However, its advantage lies in relatively lower false positive rates, and the delay rate is significantly improved compared to FHDDM, which also lacks weighting. Overall, in the context of sudden drift experiments, all three drift detectors demonstrate excellent performance.
Table 6 illustrates the performance of the drift detectors on the gradual drift datasets. On the LED dataset, regardless of whether the base classifier is NB or HT, VFDDMs achieve the best detection delay, surpassing even MWDDM with its increased weighting factor. This indicates that the Bernstein test significantly speeds up drift detection on datasets with small variances. Moreover, VFDDMs outperform MWDDM in accuracy and exhibit the highest true positive rate, demonstrating the advantages of our variance feedback strategy. It is worth noting that MWDDM also achieves top detection speed on the gradual drift datasets, indicating that stronger instance weighting improves drift detection speed in both abrupt and gradual drift environments, although at the cost of higher false positive rates, as shown by the performance of HDDM_W and MDDM. The performance of BDDM reveals that its detection speed advantage is less pronounced in gradual drift environments, suggesting that the Bhattacharyya distance is better suited to detecting concept drift in abrupt drift scenarios.
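The claim that the Bernstein test helps on small-variance streams such as LED follows from the bounds themselves. The quick numerical comparison below uses standard textbook forms of the two confidence radii; the values of n, δ, and the variances are arbitrary illustrative choices:

```python
import math

def hoeffding_eps(n, delta):
    """Hoeffding confidence radius for the mean of n values in [0, 1]."""
    return math.sqrt(math.log(1 / delta) / (2 * n))

def bernstein_eps(n, delta, var):
    """Bernstein-style confidence radius: depends on the variance, so it is
    tighter than Hoeffding's when the variance is small."""
    log_term = math.log(1 / delta)
    return math.sqrt(2 * var * log_term / n) + 2 * log_term / (3 * n)

n, delta = 500, 0.05
eps_h = hoeffding_eps(n, delta)
eps_b_small = bernstein_eps(n, delta, var=0.01)  # low-variance stream (LED-like)
eps_b_large = bernstein_eps(n, delta, var=0.25)  # worst-case Bernoulli variance
```

Since a smaller radius lets a smaller shift in the error mean become statistically significant, the tighter Bernstein radius on low-variance streams translates directly into shorter detection delays, while on high-variance streams the variance-free Hoeffding radius remains preferable.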
In the Circle dataset, FHDDM, MDDM, WMDDM, and VFDDMs demonstrate good performance in terms of false positive rates. In the Hoeffding test, VFDDM_H simultaneously achieves low false positive rates and detection delay rates. The McDiarmid test improves the speed of drift detection while maintaining the same false positive rate as MDDM.
Finally, we conducted a statistical analysis of the ranking distribution of various drift detectors based on two drift detection metrics: detection delay and false positive rate, as shown in Figure 10. From the left graph, it is evident that VFDDMs generally exhibit good drift detection delay performance. In comparison with many advanced drift detection algorithms, VFDDMs rank in the top 50% in most cases. The proportion of first-place rankings for VFDDMs is second only to MWDDM, primarily due to MWDDM’s high weighting coefficient. Moreover, VFDDMs outperform other drift detectors in terms of first-place rankings.
From the right graph, it can be observed that VFDDMs maintain a low false positive rate, meaning fewer false alarms and fewer unnecessary restarts, thereby preserving classifier accuracy. Hence, VFDDMs perform best in classification accuracy, consistent with the findings of the classification accuracy experiments in this paper. It is noteworthy that second place is occupied by WMDDM, but with a frequency of only 1, meaning it achieved second place only in the NB-Mixed scenario.
In terms of false positive rate ranking analysis, FHDDM has the lowest overall false positive rate. However, in terms of detection delay ranking, FHDDM ranks lower, indicating its conservative detection strategy. While a conservative strategy is important, excessively conservative strategies can lead to less-than-ideal detection efficiency. Conversely, MWDDM has high detection efficiency but suboptimal false positive rate performance. Although its detection delay ranks among the top, it comes at the cost of a higher false positive rate.
In summary, VFDDMs demonstrate excellent performance in balancing detection delay and false positive rate. As shown in Figure 10, VFDDMs maintain a low false positive rate while ensuring high-ranking detection delay, placing them at the forefront among all compared algorithms. Experimental results confirm that our proposed VFDDMs possess good adaptability and can provide suitable drift detection strategies for different types of data streams, meeting our expectations.
Figure 11 illustrates the comprehensive performance of the drift detectors tested in the experiment across four key metrics: classification accuracy, CPU time consumption, detection delay, and false positive rate, under different base classifier scenarios. The algorithms extending further toward the edges of the graph exhibit superior overall performance. From Figure 11, it can be observed that VFDDMs consistently rank within the top three in both NB and HT scenarios, demonstrating the excellent overall performance of our VFDDM.

5. Conclusions

Concept drift significantly impacts decision-making efficiency, making it crucial to capture concept drift quickly and accurately. In current drift detection research, there are few approaches that dynamically adjust drift detection methods based on the characteristics of the data stream. Our research utilizes the variance of data streams to dynamically adjust drift detection strategies, thereby providing personalized drift detection strategies for different data streams.
Firstly, to estimate the variance in data streams under small window conditions, we propose a rapid variance estimation strategy, using the minimum Bernstein bound as the condition for variance sampling. Secondly, we introduce a variance feedback strategy that adjusts the weighting and mean calculation methods to provide targeted detection strategies for each data stream, improving detection speed and classifier accuracy. Lastly, based on these two strategies, we design a variance feedback-based drift detection method, VFDDM, and implement three drift detectors based on different testing methods: VFDDM_H, VFDDM_M, and VFDDM_K.
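The variance feedback idea summarized above can be sketched as a dispatch rule that maps the estimated variance to a detection configuration. The thresholds, weights, and dictionary layout below are illustrative placeholders, not the actual settings used in VFDDM:

```python
import statistics

def variance_feedback(window, low=0.05, high=0.15):
    """Pick a (statistical test, weighting strength) pair from the estimated
    variance of a window. Thresholds and weights are illustrative only."""
    var = statistics.pvariance(window)
    if var < low:      # stable stream: lean on the variance-aware Bernstein test
        return {"test": "bernstein", "weight": 3.0, "variance": var}
    elif var < high:   # moderate variability: McDiarmid test with mild weighting
        return {"test": "mcdiarmid", "weight": 2.0, "variance": var}
    else:              # noisy stream: conservative Hoeffding test, no weighting
        return {"test": "hoeffding", "weight": 1.0, "variance": var}

stable = [0.10, 0.11, 0.09, 0.10, 0.12, 0.10]  # low-variance error stream
noisy = [0.0, 1.0, 0.0, 1.0, 1.0, 0.0]         # high-variance error stream
plan_stable = variance_feedback(stable)
plan_noisy = variance_feedback(noisy)
```

The point of the dispatch is that each stream receives a personalized detection strategy rather than one fixed test, which is what the feedback stage of VFDDM provides.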
Experiments show that base classifiers based on VFDDMs achieve optimal overall classification accuracy. The three drift detectors based on VFDDM rank in the top three for accuracy among all compared algorithms. For drift detectors using the same statistical testing methods, there is a significant improvement in base classifier accuracy. The average ranking improvement for Hoeffding’s test ranges from 3 to 3.5, and for McDiarmid’s test, it ranges from 1.12 to 2.25. For weighted strategy drift detection algorithms, the improvement ranges from a minimum of 1.12 to a maximum of 4.25. These results thoroughly demonstrate the effectiveness of the proposed variance feedback strategy. This strategy allows for the setting of appropriate drift detection strategies for each data stream to some extent. In drift detection performance comparisons, VFDDM exhibits a low false positive rate while maintaining low detection delay, balancing these two metrics effectively without excessively sacrificing one for the other. Based on the accuracy performance of base classifiers using VFDDMs, we can conclude that balancing false positives and detection delay is beneficial for improving base classifier accuracy. This is precisely the advantage of the proposed methods.
In a comprehensive ranking analysis based on all evaluation metrics (accuracy, runtime, false positive rate, and detection delay), VFDDMs outperform other comparison algorithms, indicating the effectiveness of the VFDDM algorithm. This also demonstrates the effectiveness of our approach, which characterizes data stream features using variance and dynamically adjusts drift detection methods based on estimated variance.
Future work will focus on three aspects: Firstly, we will explore more statistical test methods and investigate their potential for concept drift detection, integrating them into the VFDDM. Secondly, we plan to incorporate a stacked window strategy into the VFDDM to further improve drift detection efficiency. Lastly, we will consider the feasibility of implementing a dynamic window in the VFDDM.

Author Contributions

F.M. completed the main work, the coding of the model, experiments and the writing of the main paper; M.H. reviewed the paper, and provided guidance on experiments and funding support; C.L. completed some experiments and the production of experimental diagrams. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (62062004), the Natural Science Foundation of Ningxia Province (2022AAC03279), and the Central Universities Foundation of North Minzu University (2021KJCX10).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data is contained within the article.

Conflicts of Interest

No potential conflict of interest was reported by the authors.

References

  1. Perez, M.; Somenzi, F.; Trivedi, A. A PAC learning algorithm for LTL and omega-regular objectives in MDPs. In Proceedings of the AAAI Conference on Artificial Intelligence, Lexington, KY, USA, 18–22 November 2024; Volume 38, pp. 21510–21517. [Google Scholar]
  2. Frias-Blanco, I.; del Campo-Avila, J.; Ramos-Jimenez, G.; Morales-Bueno, R.; Ortiz-Diaz, A.; Caballero-Mota, Y. Online and non-parametric drift detection methods based on Hoeffding’s bounds. IEEE Trans. Knowl. Data Eng. 2014, 27, 810–823. [Google Scholar] [CrossRef]
  3. Gama, J.; Medas, P.; Castillo, G.; Rodrigues, P. Learning with drift detection. In Advances in Artificial Intelligence–SBIA 2004: Proceedings of the 17th Brazilian Symposium on Artificial Intelligence, Sao Luis, Maranhao, Brazil, 29 September–1 October 2004; Proceedings 17; Springer: Berlin, Germany, 2004; pp. 286–295. [Google Scholar]
  4. Baena-García, M.; del Campo-Ávila, J.; Fidalgo, R.; Bifet, A.; Gavaldà, R.; Morales-Bueno, R. Early drift detection method. In Proceedings of the Fourth International Workshop on Knowledge Discovery from Data Streams, Philadelphia, PA, USA, 20 August 2006; Volume 6, pp. 77–86. [Google Scholar]
  5. Bifet, A.; Gavalda, R. Learning from time-changing data with adaptive windowing. In Proceedings of the 2007 SIAM International Conference on Data Mining: Society for Industrial and Applied Mathematics, Minneapolis, MN, USA, 26–28 April 2007; pp. 443–448. [Google Scholar]
  6. Pesaranghader, A.; Viktor, H.L. Fast Hoeffding drift detection method for evolving data streams. In Proceedings of the Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2016, Riva del Garda, Italy, 19–23 September 2016; Proceedings, Part II 16; Springer International Publishing: Berlin/Heidelberg, Germany, 2016; pp. 96–111. [Google Scholar]
  7. Chen, Z.; Han, M.; Wu, H.; Li, M.; Zhang, X. A multi-level weighted concept drift detection method. J. Supercomput. 2023, 79, 5154–5180. [Google Scholar] [CrossRef]
  8. Yu, H.; Liu, W.; Lu, J.; Wen, Y.; Luo, X.; Zhang, G. Detecting group concept drift from multiple data streams. Pattern Recognit. 2023, 134, 109113. [Google Scholar] [CrossRef]
  9. Guo, H.; Li, H.; Ren, Q.; Wang, W. Concept drift type identification based on multi-sliding windows. Inf. Sci. 2022, 585, 1–23. [Google Scholar] [CrossRef]
  10. Wang, K.; Xiong, L.; Liu, A.; Zhang, G.; Lu, J. A self-adaptive ensemble for user interest drift learning. Neurocomputing 2024, 577, 127308. [Google Scholar] [CrossRef]
  11. Usman, M.; Chen, H. Intensive Class Imbalance Learning in Drifting Data Streams. IEEE Trans. Emerg. Top. Comput. Intell. 2024. early access. [Google Scholar] [CrossRef]
  12. Moradi, M.; Rahmanimanesh, M.; Shahzadi, A. Unsupervised domain adaptation by incremental learning for concept drifting data streams. Int. J. Mach. Learn. Cybern. 2024, 1–24. [Google Scholar] [CrossRef]
  13. Lu, J.; Liu, A.; Dong, F.; Gu, F.; Gama, J.; Zhang, G. Learning under concept drift: A review. IEEE Trans. Knowl. Data Eng. 2018, 31, 2346–2363. [Google Scholar] [CrossRef]
  14. Barros, R.S.; Cabral, D.R.; Gonçalves, P.M., Jr.; Santos, S.G. RDDM: Reactive drift detection method. Expert Syst. Appl. 2017, 90, 344–355. [Google Scholar] [CrossRef]
  15. Basseville, M.; Nikiforov, I.V. Detection of Abrupt Changes: Theory and Application. Prentice Hall: Englewood Cliffs, NJ, USA, 1993. [Google Scholar]
  16. Pesaranghader, A.; Viktor, H.L.; Paquet, E. McDiarmid drift detection methods for evolving data streams. In Proceedings of the 2018 International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil, 8–13 July 2018; pp. 1–9. [Google Scholar]
  17. Hu, Y.; Sun, Z. Weight adaptive concept drift detection method based on McDiarmid boundary. J. East China Univ. Sci. Technol. 2023, 49, 419–428. [Google Scholar]
  18. Baidari, I.; Honnikoll, N. Bhattacharyya distance based concept drift detection method for evolving data stream. Expert Syst. Appl. 2021, 183, 115303. [Google Scholar] [CrossRef]
  19. Pears, R.; Sakthithasan, S.; Koh, Y.S. Detecting concept change in dynamic data streams: A sequential approach based on reservoir sampling. Mach. Learn. 2014, 97, 259–293. [Google Scholar] [CrossRef]
  20. Mavrikiou, P.M. Kolmogorov inequalities for the partial sum of independent Bernoulli random variables. Stat. Probab. Lett. 2007, 77, 1117–1122. [Google Scholar] [CrossRef]
  21. Khamassi, I.; Sayed-Mouchaweh, M.; Hammami, M.; Ghédira, K. Discussion and review on evolving data streams and concept drift adapting. Evol. Syst. 2018, 9, 1–23. [Google Scholar] [CrossRef]
  22. Bifet, A.; Holmes, G.; Pfahringer, B.; Kranen, P.; Kremer, H.; Jansen, T.; Seidl, T. Moa: Massive online analysis, a framework for stream classification and clustering. In Proceedings of the First Workshop on Applications of Pattern Analysis, Windsor, UK, 1–3 September 2010; pp. 44–50. [Google Scholar]
  23. Tosi, M.D.L.; Theobald, M. Optwin: Drift identification with optimal sub-windows. In Proceedings of the 2024 IEEE 40th International Conference on Data Engineering Workshops (ICDEW), Utrecht, The Netherlands, 13–16 May 2024; pp. 331–337. [Google Scholar]
  24. Han, M.; Mu, D.; Li, A.; Liu, S.; Gao, Z. Concept drift detection methods based on different weighting strategies. Int. J. Mach. Learn. Cybern. 2024, 1–24. [Google Scholar] [CrossRef]
  25. “Forest Covertype”, “Electricity,” and “Pokerhand” Datasets. Available online: https://moa.cms.waikato.ac.nz/datasets (accessed on 1 July 2024).
  26. Bifet, A. Classifier concept drift detection and the illusion of progress. In Proceedings of the Artificial Intelligence and Soft Computing: 16th International Conference, ICAISC 2017, Zakopane, Poland, 11–15 June 2017; Proceedings, Part II 16. Springer International Publishing: Berlin/Heidelberg, Germany, 2017; pp. 715–725. [Google Scholar]
Figure 1. The difference between virtual concept drift and real concept drift (The dashed line represents the decision boundary).
Figure 2. The process of data distribution changes in four types of concept drift.
Figure 3. Variance sample collection process (Including the evaluation of variance sampling conditions and the sampling process).
Figure 4. Workflow of Variance estimation (Including the sampling process and the variance estimation process).
Figure 5. Workflow of Variance feedback (Including weight generation, mean generation, and statistical test selection).
Figure 6. Workflow of VFDDM (Including variance estimation, variance feedback, and drift detection stages).
Figure 7. The accuracy trend of NB using different drift detectors on the synthetic datasets.
Figure 8. The accuracy trend of HT using different drift detectors on the synthetic datasets.
Figure 9. The ranking frequency of accuracy and CPU time for base classifiers using different drift detectors on the synthetic datasets.
Figure 10. The ranking frequency of detection delay and false positives for different drift detectors.
Figure 11. The overall ranking of different drift detectors based on four key metrics: accuracy, CPU time, detection delay, and false positives.
Table 1. Detailed information about the synthetic and real datasets.
| Category | Name | Instances | Attributes | Labels | Noise | Drifts | Type |
|---|---|---|---|---|---|---|---|
| Synthetic | Sine | 100,000 | 2 | 2 | 10% | 4 | Abrupt |
| Synthetic | Mixed | 100,000 | 4 | 2 | 10% | 4 | Abrupt |
| Synthetic | Circles | 100,000 | 2 | 2 | 10% | 3 | Gradual |
| Synthetic | LED | 100,000 | 24 | 10 | 10% | 3 | Gradual |
| Real | Covertype | 581,012 | 53 | 7 | Unknown | Unknown | Unknown |
| Real | Electricity | 45,312 | 9 | 2 | Unknown | Unknown | Unknown |
| Real | Pokerhand | 829,201 | 10 | 10 | Unknown | Unknown | Unknown |
| Real | Spam | 10,000 | 500 | 2 | Unknown | Unknown | Unknown |
Table 2. The average accuracy performance of base classifiers using different drift detectors on the synthetic datasets (The best results are highlighted in bold).
| Detector | Classifier | Sine | Mixed | LED | Circles | Rank |
|---|---|---|---|---|---|---|
| FHDDM | NB | 86.5915 ± 0.5237 | 83.2989 ± 0.0728 | 89.3133 ± 0.0813 | 83.7427 ± 0.1678 | 7.13 |
| FHDDM | HT | 86.8120 ± 0.1980 | 83.2227 ± 0.2028 | 89.3037 ± 0.0806 | 86.5965 ± 0.1115 | |
| MWDDM | NB | 85.9646 ± 0.5256 | 83.2366 ± 0.0831 | 89.3212 ± 0.0907 | 83.7970 ± 0.1623 | 6.63 |
| MWDDM | HT | 86.8369 ± 0.2981 | 83.0172 ± 0.1066 | 89.3110 ± 0.0885 | 86.5915 ± 0.0640 | |
| MDDM | NB | 85.9628 ± 0.5300 | 83.2957 ± 0.0743 | 89.3206 ± 0.0767 | 83.7625 ± 0.1770 | 5 |
| MDDM | HT | 86.8190 ± 0.1840 | 83.1730 ± 0.1798 | 89.3110 ± 0.0759 | 86.6351 ± 0.0598 | |
| WMDDM | NB | 85.9625 ± 0.1695 | 83.2989 ± 0.0728 | 89.3135 ± 0.0813 | 83.7444 ± 0.1695 | 6.13 |
| WMDDM | HT | 86.8614 ± 0.2239 | 83.1991 ± 0.1754 | 89.3040 ± 0.0805 | 86.6007 ± 0.1122 | |
| BDDM | NB | 85.9689 ± 0.5286 | 83.2922 ± 0.0741 | 89.3212 ± 0.0907 | 83.7465 ± 0.1520 | 7.88 |
| BDDM | HT | 86.8305 ± 0.1907 | 83.1371 ± 0.2801 | 89.2979 ± 0.0849 | 86.1249 ± 0.1678 | |
| RDDM | NB | 85.8188 ± 0.1660 | 83.1541 ± 0.1133 | 89.3839 ± 0.0894 | 83.6955 ± 0.1660 | 8.38 |
| RDDM | HT | 86.5227 ± 0.2134 | 82.9513 ± 0.1341 | 89.3752 ± 0.0855 | 86.5066 ± 0.0397 | |
| HDDM_A | NB | 85.8211 ± 0.4886 | 83.2351 ± 0.0453 | 89.3482 ± 0.0949 | 83.7432 ± 0.1408 | 7.38 |
| HDDM_A | HT | 86.7317 ± 0.1770 | 83.1645 ± 0.0627 | 89.3393 ± 0.0941 | 86.4497 ± 0.1678 | |
| HDDM_W | NB | 85.9647 ± 0.5331 | 83.2400 ± 0.0617 | 89.3176 ± 0.0890 | 83.7967 ± 0.1614 | 7 |
| HDDM_W | HT | 86.7525 ± 0.2781 | 83.1022 ± 0.1254 | 89.3083 ± 0.0881 | 86.6021 ± 0.0424 | |
| DDM | NB | 83.7927 ± 2.3258 | 81.0950 ± 3.5314 | 89.3312 ± 0.0876 | 83.1943 ± 0.6260 | 10.12 |
| DDM | HT | 86.4267 ± 0.1591 | 82.6847 ± 0.2447 | 89.3130 ± 0.0902 | 86.4353 ± 0.0710 | |
| VFDDM_K | NB | 85.9654 ± 0.5331 | 83.2951 ± 0.0735 | 89.3246 ± 0.0927 | 83.7575 ± 0.1732 | 3.5 |
| VFDDM_K | HT | 86.8762 ± 0.1915 | 83.1720 ± 0.1888 | 89.3147 ± 0.0904 | 86.6171 ± 0.0455 | |
| VFDDM_H | NB | 85.9695 ± 0.5295 | 83.2937 ± 0.0753 | 89.3246 ± 0.0927 | 83.7601 ± 0.1729 | 3.63 |
| VFDDM_H | HT | 86.8762 ± 0.1915 | 83.1583 ± 0.1491 | 89.3147 ± 0.0904 | 86.6155 ± 0.0448 | |
| VFDDM_M | NB | 85.9684 ± 0.5302 | 83.2938 ± 0.0755 | 89.3246 ± 0.0927 | 83.7619 ± 0.1763 | 3.88 |
| VFDDM_M | HT | 86.8762 ± 0.1915 | 83.1550 ± 0.1530 | 89.3147 ± 0.0904 | 86.6153 ± 0.0449 | |
Table 3. The CPU Time performance of base classifiers using different drift detectors on the synthetic datasets (The best results are highlighted in bold).
| Classifier | Detector | Sine | Mixed | LED | Circles |
|---|---|---|---|---|---|
| NB | FHDDM | 259.66 ± 18.26 | 298.94 ± 23.69 | 524.63 ± 15.65 | 254.44 ± 12.90 |
| NB | MWDDM | 485.22 ± 49.33 | 528.56 ± 15.89 | 820.81 ± 61.20 | 460.50 ± 18.92 |
| NB | MDDM | 296.00 ± 33.54 | 341.88 ± 28.53 | 601.09 ± 21.79 | 293.41 ± 35.67 |
| NB | WMDDM | 484.13 ± 41.92 | 508.59 ± 29.46 | 832.38 ± 36.51 | 465.44 ± 14.01 |
| NB | BDDM | 259.44 ± 20.76 | 318.53 ± 47.17 | 639.06 ± 42.52 | 281.09 ± 16.64 |
| NB | RDDM | 257.56 ± 33.44 | 285.88 ± 29.06 | 611.66 ± 23.60 | 275.88 ± 13.73 |
| NB | HDDM_A | 255.75 ± 17.41 | 301.88 ± 18.29 | 596.53 ± 29.43 | 293.13 ± 27.08 |
| NB | HDDM_W | 264.97 ± 38.20 | 306.81 ± 28.14 | 613.78 ± 17.44 | 261.66 ± 19.20 |
| NB | DDM | 270.91 ± 16.82 | 298.63 ± 24.22 | 605.72 ± 22.39 | 243.38 ± 13.15 |
| NB | VFDDM_K | 267.19 ± 5.72 | 316.22 ± 38.12 | 567.22 ± 26.19 | 275.88 ± 13.73 |
| NB | VFDDM_H | 282.56 ± 18.21 | 318.03 ± 17.00 | 577.69 ± 22.64 | 273.13 ± 31.27 |
| NB | VFDDM_M | 281.72 ± 9.50 | 336.41 ± 23.43 | 592.34 ± 14.34 | 281.38 ± 15.72 |
| HT | FHDDM | 372.84 ± 28.72 | 463.16 ± 12.11 | 723.31 ± 27.20 | 368.44 ± 19.01 |
| HT | MWDDM | 609.03 ± 47.17 | 664.28 ± 27.66 | 975.25 ± 38.32 | 617.97 ± 27.74 |
| HT | MDDM | 409.53 ± 18.31 | 479.13 ± 32.01 | 832.66 ± 46.22 | 500.50 ± 75.78 |
| HT | WMDDM | 593.34 ± 27.18 | 663.94 ± 27.89 | 973.31 ± 41.36 | 620.38 ± 38.44 |
| HT | BDDM | 406.38 ± 48.70 | 460.63 ± 35.50 | 818.31 ± 57.38 | 386.72 ± 19.77 |
| HT | RDDM | 431.97 ± 60.63 | 463.78 ± 34.48 | 874.22 ± 65.95 | 403.75 ± 25.55 |
| HT | HDDM_A | 372.59 ± 26.91 | 427.69 ± 22.52 | 786.38 ± 35.52 | 371.53 ± 36.49 |
| HT | HDDM_W | 387.78 ± 7.82 | 455.69 ± 24.41 | 749.69 ± 37.15 | 371.53 ± 36.49 |
| HT | DDM | 406.63 ± 19.38 | 527.25 ± 35.04 | 862.50 ± 33.15 | 410.72 ± 41.07 |
| HT | VFDDM_K | 379.91 ± 17.17 | 456.50 ± 28.64 | 749.06 ± 35.80 | 388.19 ± 19.37 |
| HT | VFDDM_H | 382.44 ± 23.72 | 462.22 ± 8.17 | 747.75 ± 31.34 | 402.84 ± 17.09 |
| HT | VFDDM_M | 397.00 ± 16.95 | 483.88 ± 10.67 | 735.09 ± 7.08 | 411.28 ± 14.35 |
Table 4. The accuracy and average number of drift alerts of drift detectors for base classifiers on the real datasets (The best results are highlighted in bold).
| Classifier | Detector | Covertype Acc | Covertype Alert | Electricity Acc | Electricity Alert | PokerHand Acc | PokerHand Alert | Spam Acc | Spam Alert |
|---|---|---|---|---|---|---|---|---|---|
| NB | FHDDM | 83.83 | 0.49 | 82.87 | 0.37 | 76.72 | 0.37 | 88.77 | 0.02 |
| NB | MWDDM | 85.16 | 0.78 | 84.76 | 0.55 | 77.47 | 0.51 | 89.59 | 0.11 |
| NB | MDDM | 84.04 | 0.55 | 83.19 | 0.41 | 76.90 | 0.39 | 88.78 | 0.02 |
| NB | WMDDM | 83.92 | 0.50 | 82.89 | 0.39 | 76.70 | 0.38 | 88.96 | 0.04 |
| NB | BDDM | 83.76 | 0.62 | 83.64 | 0.46 | 75.37 | 0.28 | 88.89 | 0.02 |
| NB | RDDM | 86.16 | 0.94 | 85.09 | 0.72 | 77.27 | 0.62 | 88.79 | 0.40 |
| NB | HDDM_A | 86.59 | 1.13 | 85.58 | 0.93 | 77.08 | 0.63 | 88.87 | 0.19 |
| NB | HDDM_W | 85.66 | 0.82 | 85.13 | 0.58 | 77.76 | 0.56 | 89.12 | 0.11 |
| NB | DDM | 86.61 | 1.59 | 82.57 | 0.63 | 65.84 | 0.10 | 84.09 | 0.09 |
| NB | VFDDM_K | 84.37 | 0.60 | 83.31 | 0.44 | 76.98 | 0.40 | 89.12 | 0.09 |
| NB | VFDDM_H | 84.42 | 0.62 | 83.41 | 0.46 | 77.01 | 0.40 | 89.12 | 0.09 |
| NB | VFDDM_M | 84.41 | 0.61 | 83.40 | 0.46 | 76.97 | 0.40 | 89.12 | 0.09 |
| HT | FHDDM | 83.98 | 0.47 | 86.20 | 0.27 | 76.88 | 0.35 | 90.46 | 0.04 |
| HT | MWDDM | 85.21 | 0.75 | 86.37 | 0.47 | 77.50 | 0.49 | 90.66 | 0.13 |
| HT | MDDM | 84.08 | 0.53 | 86.22 | 0.33 | 77.05 | 0.37 | 90.54 | 0.04 |
| HT | WMDDM | 84.07 | 0.49 | 86.56 | 0.29 | 76.86 | 0.36 | 90.62 | 0.06 |
| HT | BDDM | 83.93 | 0.59 | 86.56 | 0.38 | 75.48 | 0.27 | 90.74 | 0.04 |
| HT | RDDM | 85.81 | 0.92 | 86.94 | 0.63 | 77.40 | 0.61 | 90.89 | 0.38 |
| HT | HDDM_A | 86.52 | 1.27 | 86.94 | 0.93 | 77.09 | 0.62 | 90.78 | 0.21 |
| HT | HDDM_W | 85.43 | 0.81 | 86.81 | 0.52 | 77.75 | 0.53 | 90.51 | 0.17 |
| HT | DDM | 87.22 | 1.48 | 86.71 | 0.74 | 73.68 | 0.25 | 90.89 | 0.85 |
| HT | VFDDM_K | 84.35 | 0.59 | 86.28 | 0.37 | 77.07 | 0.38 | 90.85 | 0.09 |
| HT | VFDDM_H | 84.42 | 0.59 | 86.29 | 0.38 | 77.15 | 0.39 | 90.85 | 0.09 |
| HT | VFDDM_M | 84.39 | 0.59 | 86.29 | 0.38 | 77.13 | 0.39 | 90.85 | 0.09 |
Table 5. The true positives, false positives, and detection delay performance of different drift detectors on abrupt drift datasets (The best results are highlighted in bold).
| Classifier | Detector | Sine TP | Sine FP | Sine DD | Mixed TP | Mixed FP | Mixed DD |
|---|---|---|---|---|---|---|---|
| NB | FHDDM | 4.00 | 0.00 | 43.50 ± 4.28 | 4.00 | 0.20 ± 0.45 | 51.35 ± 17.46 |
| NB | MWDDM | 4.00 | 0.40 ± 0.55 | 34.00 ± 4.65 | 4.00 | 3.60 ± 0.89 | 33.25 ± 1.16 |
| NB | MDDM | 4.00 | 0.00 | 40.90 ± 4.51 | 4.00 | 0.40 ± 0.55 | 41.45 ± 0.74 |
| NB | WMDDM | 4.00 | 0.00 | 43.50 ± 4.28 | 4.00 | 0.80 ± 0.45 | 43.50 ± 1.31 |
| NB | BDDM | 4.00 | 0.00 | 38.00 ± 4.69 | 4.00 | 0.20 ± 0.45 | 37.45 ± 1.02 |
| NB | RDDM | 4.00 | 2.60 ± 3.13 | 86.10 ± 22.02 | 4.00 | 2.20 ± 1.48 | 98.55 ± 13.00 |
| NB | HDDM_A | 4.00 | 0.40 ± 0.55 | 88.70 ± 15.83 | 4.00 | 0.20 ± 0.45 | 72.80 ± 23.88 |
| NB | HDDM_W | 4.00 | 0.00 | 34.85 ± 4.84 | 4.00 | 2.40 ± 0.55 | 35.85 ± 5.80 |
| NB | DDM | 2.60 ± 0.89 | 2.60 ± 0.55 | 155.02 ± 23.98 | 3.20 ± 0.84 | 2.20 ± 2.39 | 162.55 ± 8.49 |
| NB | VFDDM_K | 4.00 | 0.00 | 40.85 ± 5.06 | 4.00 | 0.40 ± 0.55 | 41.85 ± 0.88 |
| NB | VFDDM_H | 4.00 | 0.00 | 39.80 ± 4.56 | 4.00 | 0.40 ± 0.55 | 39.25 ± 1.02 |
| NB | VFDDM_M | 4.00 | 0.00 | 40.15 ± 4.51 | 4.00 | 0.40 ± 0.55 | 39.70 ± 1.05 |
| HT | FHDDM | 4.00 | 0.00 | 44.10 ± 3.69 | 4.00 | 0.40 ± 0.55 | 43.95 ± 1.11 |
| HT | MWDDM | 4.00 | 0.60 ± 0.55 | 33.50 ± 4.25 | 4.00 | 6.60 ± 2.07 | 32.65 ± 1.43 |
| HT | MDDM | 4.00 | 0.00 | 41.10 ± 3.97 | 4.00 | 1.00 ± 0.71 | 41.10 ± 0.68 |
| HT | WMDDM | 4.00 | 0.40 ± 0.55 | 44.25 ± 3.55 | 4.00 | 0.60 ± 0.55 | 44.45 ± 0.87 |
| HT | BDDM | 4.00 | 0.00 | 37.85 ± 4.91 | 4.00 | 2.20 ± 2.39 | 37.95 ± 1.62 |
| HT | RDDM | 3.60 ± 0.55 | 3.40 ± 3.65 | 97.60 ± 11.00 | 3.00 ± 0.71 | 5.20 ± 2.17 | 101.35 ± 10.11 |
| HT | HDDM_A | 4.00 | 0.80 ± 0.84 | 57.60 ± 11.88 | 4.00 | 2.00 ± 1.58 | 66.55 ± 14.72 |
| HT | HDDM_W | 4.00 | 0.20 ± 0.45 | 35.25 ± 3.73 | 4.00 | 3.40 ± 1.52 | 33.80 ± 1.90 |
| HT | DDM | 3.60 ± 0.55 | 2.40 ± 2.79 | 150.68 ± 11.34 | 3.00 ± 0.71 | 1.40 ± 0.89 | 172.03 ± 16.76 |
| HT | VFDDM_K | 4.00 | 0.20 ± 0.45 | 39.00 ± 4.76 | 4.00 | 1.00 ± 0.71 | 41.95 ± 0.33 |
| HT | VFDDM_H | 4.00 | 0.20 ± 0.45 | 39.00 ± 4.76 | 4.00 | 1.20 ± 1.10 | 39.55 ± 1.72 |
| HT | VFDDM_M | 4.00 | 0.20 ± 0.45 | 39.00 ± 4.76 | 4.00 | 1.20 ± 0.84 | 39.95 ± 1.56 |
Table 6. The true positives, false positives, and detection delay performance of different drift detectors on gradual drift datasets (The best results are highlighted in bold).
| Classifier | Detector | LED TP | LED FP | LED DD | Circles TP | Circles FP | Circles DD |
|---|---|---|---|---|---|---|---|
| NB | FHDDM | 3.00 | 0.00 | 351.13 ± 72.72 | 2.60 ± 0.55 | 0.20 ± 0.45 | 172.47 ± 82.56 |
| NB | MWDDM | 3.00 | 0.00 | 253.87 ± 46.63 | 3.00 | 2.20 ± 1.64 | 132.87 ± 64.91 |
| NB | MDDM | 3.00 | 0.00 | 321.53 ± 67.64 | 2.60 ± 0.55 | 0.40 ± 0.55 | 144.90 ± 66.02 |
| NB | WMDDM | 3.00 | 0.00 | 329.07 ± 66.56 | 2.60 ± 0.55 | 0.20 ± 0.45 | 164.30 ± 78.32 |
| NB | BDDM | 3.00 | 0.00 | 340.93 ± 51.99 | 2.40 ± 0.55 | 0.40 ± 0.55 | 185.67 ± 88.92 |
| NB | RDDM | 3.00 | 0.20 ± 0.45 | 337.66 ± 26.70 | 3.00 | 2.20 ± 2.17 | 403.60 ± 57.38 |
| NB | HDDM_A | 3.00 | 0.00 | 310.26 ± 53.62 | 3.00 | 0.40 ± 0.55 | 237.46 ± 115.83 |
| NB | HDDM_W | 3.00 | 0.00 | 287.53 ± 72.13 | 3.00 | 1.40 ± 1.34 | 156.93 ± 75.61 |
| NB | DDM | 3.00 | 0.40 ± 0.55 | 432.80 ± 44.56 | 1.80 ± 1.10 | 2.20 ± 1.48 | 443.80 ± 262.17 |
| NB | VFDDM_K | 3.00 | 0.00 | 253.67 ± 47.96 | 2.60 ± 0.55 | 0.40 ± 0.55 | 157.73 ± 81.61 |
| NB | VFDDM_H | 3.00 | 0.00 | 253.67 ± 47.96 | 2.60 ± 0.55 | 0.60 ± 0.89 | 155.97 ± 81.94 |
| NB | VFDDM_M | 3.00 | 0.00 | 253.67 ± 47.96 | 2.60 ± 0.55 | 0.40 ± 0.55 | 143.63 ± 64.23 |
| HT | FHDDM | 3.00 | 0.00 | 336.93 ± 62.91 | 3.00 | 0.00 | 95.87 ± 50.92 |
| HT | MWDDM | 3.00 | 0.00 | 253.87 ± 46.63 | 3.00 | 1.20 ± 1.30 | 50.93 ± 29.11 |
| HT | MDDM | 3.00 | 0.00 | 328.20 ± 67.41 | 3.00 | 0.00 | 103.27 ± 46.83 |
| HT | WMDDM | 3.00 | 0.00 | 335.73 ± 64.82 | 3.00 | 0.00 | 90.07 ± 45.19 |
| HT | BDDM | 3.00 | 0.00 | 340.93 ± 51.99 | 3.00 | 0.60 ± 0.89 | 132.47 ± 73.71 |
| HT | RDDM | 2.80 ± 0.45 | 0.00 | 338.13 ± 26.04 | 3.00 | 0.40 ± 0.55 | 279.07 ± 6.66 |
| HT | HDDM_A | 3.00 | 0.00 | 299.80 ± 49.30 | 3.00 | 1.20 ± 1.30 | 73.80 ± 55.97 |
| HT | HDDM_W | 3.00 | 0.00 | 287.47 ± 72.14 | 3.00 | 0.80 ± 0.84 | 76.27 ± 30.30 |
| HT | DDM | 2.80 ± 0.45 | 0.20 ± 0.45 | 432.10 ± 45.92 | 3.00 | 0.60 ± 0.89 | 433.13 ± 19.25 |
| HT | VFDDM_K | 3.00 | 0.00 | 253.67 ± 47.96 | 3.00 | 0.00 | 79.07 ± 37.00 |
| HT | VFDDM_H | 3.00 | 0.00 | 253.67 ± 47.96 | 3.00 | 0.00 | 93.33 ± 30.81 |
| HT | VFDDM_M | 3.00 | 0.00 | 253.67 ± 47.96 | 3.00 | 0.00 | 97.47 ± 38.61 |
Share and Cite

MDPI and ACS Style

Han, M.; Meng, F.; Li, C. Variance Feedback Drift Detection Method for Evolving Data Streams Mining. Appl. Sci. 2024, 14, 7157. https://doi.org/10.3390/app14167157
