Article

SCALE-BOSS-MR: Scalable Time Series Classification Using Multiple Symbolic Representations †

Department of Digital Systems, University of Piraeus, 185 34 Piraeus, Greece
*
Author to whom correspondence should be addressed.
This is the extended version of the conference paper “SCALE-BOSS: A framework for scalable time-series classification using symbolic representations”, published in 2022 in the Proceedings of the 12th Hellenic Conference on Artificial Intelligence.
Appl. Sci. 2024, 14(2), 689; https://doi.org/10.3390/app14020689
Submission received: 6 December 2023 / Revised: 6 January 2024 / Accepted: 9 January 2024 / Published: 13 January 2024
(This article belongs to the Topic Artificial Intelligence Models, Tools and Applications)

Abstract

Time Series Classification (TSC) is an important machine learning task for many branches of science. Symbolic representations of time series, especially Symbolic Fourier Approximation (SFA), have proven very effective for this task, given their ability to reduce noise. In this paper, we improve upon the SCALE-BOSS framework by using multiple symbolic representations of time series. More specifically, the proposed SCALE-BOSS-MR incorporates into the process a variety of window sizes combined with multiple dilation parameters, applied both to the original time series and to their first-order differences, with the latter modeling trend information. SCALE-BOSS-MR has been evaluated on the eight datasets of the UCR time series repository with the largest training sets. The results indicate that SCALE-BOSS-MR can be instantiated into classifiers that achieve state-of-the-art accuracy and can be tuned for scalability.

1. Introduction

Time Series Classification (TSC) is an important machine learning task for many branches of science. Time series classifiers combine intertwined components for building time series representations, learning predictive models that exploit those representations, and measuring similarities between time series. These components implement laborious processes, making it challenging to improve computational efficiency, especially for training TSC models, while achieving accurate predictions.
TSC can be applied to many types of data, such as ECGs in medicine; sensor data in diverse fields, including the Internet of Things (IoT); and even imaging data. Indicative applications include seizure detection [1], earthquake monitoring [2], insect classification [3], as well as applications in power systems [4]. In many of these applications, having an algorithm that responds quickly and accurately is important. Many algorithms have been proposed to tackle the scalability issue [5,6,7,8,9,10,11]; these are reviewed subsequently.
This work aims to contribute to this objective by proposing SCALE-BOSS-MR, a framework for symbolic TSC whose instantiations are able to achieve state-of-the-art accuracy with low execution time.
Indeed, the objective of this work is to provide a generic framework for building TSC algorithms that learn time series models efficiently, while achieving the accuracy reported by state-of-the-art algorithms. The main ideas behind the proposed framework are as follows: (a) exploit symbolic representations of time series; (b) leverage TSC with efficient machine learning algorithms; and (c) incorporate techniques that increase efficiency without compromising accuracy. Specifically, we have chosen to use the state-of-the-art SFA symbolic representation of time series, together with the Bag-of-SFA-Symbols (BOSS) approach for encoding the training and test sets.
In previous work, we proposed the SCALE-BOSS framework [9] to balance efficiency and accuracy for TSC. The SCALE-BOSS framework has two main variations: one using clustering and the other using classification algorithms for TSC along a pipeline. These variations exploit symbolic time series representations, resulting in concrete TSC algorithms. Specifically, we can build different TSC classifiers, with the only requirement being that the clustering algorithm uses some form of cluster representatives. Then, using a simple 1-NN classifier, we can classify time series by comparing them only to these representatives, which are few compared to the series in the training set. Alternatively, we can use classification methods that work on the term-frequency vectors of the symbolic representation. One of the main conclusions of that study is that very efficient TSC methods can be built using clustering and classification methods that exploit time series class representatives, such as k-means, mini-batch k-means, or random forests. We have shown that SCALE-BOSS [9] achieves a significant improvement in the efficiency of training and testing TSC models, but leaves room to improve the accuracy of predictions.
Here, we leverage ideas from state-of-the-art algorithms, extending the SCALE-BOSS framework [9] using multiple time series representations to achieve state-of-the-art accuracy, while maintaining high computational efficiency.
Specifically, the contributions of this work are as follows:
  • We incorporate into SCALE-BOSS-MR and explore the use of
    • multiple window sizes;
    • multiple dilation parameters;
    • trend information by means of time series’ first-order differences.
    to balance accuracy with the scalability of TSC algorithms exploiting symbolic time series representations.
  • We present extensive results from a multitude of classifiers built using the SCALE-BOSS-MR framework. The proposed classifiers can provide either state-of-the-art accuracy or state-of-the-art scalability. Some of the proposed classifiers provide a balance between accuracy and scalability, meaning that they retain state-of-the-art accuracy while also being significantly faster than other state-of-the-art classifiers that use symbolic representations.
The paper is organized as follows: Section 2 provides an overview of the literature and describes how it relates to SCALE-BOSS-MR. Section 3 describes the pipeline of the framework and its instantiations. Section 4 evaluates different TSC algorithms created using the proposed framework, and finally, Section 5 concludes the paper.

2. Literature Review

As pointed out in the introductory part of this article, one of the main ideas of the proposed approach is to use a symbolic representation of time series. The following paragraphs elaborate on such representations and corresponding TSC methods.
In [12], the authors present the SAX-VSM classification algorithm, which uses the SAX representation and Term Frequency-Inverse Document Frequency (TF-IDF) weighting. A TF-IDF vector is created for each class after the training set has been transformed into SAX words. Then, the set of target time series is transformed into SAX words and a Term-Frequency (TF) vector is created for each time series in the target set. To classify a time series, the cosine similarity between the TF vector of that series and each class’s TF-IDF vector is computed.
In [13], the authors introduce the Bag-of-SFA-Symbols (BOSS) ensemble classifier, which uses Symbolic Fourier Approximation (SFA) [14] to classify a time series. To compute SFA words, a number of Fourier coefficients are computed; these are grouped based on common prefixes, histograms are built per group, and the coefficients are discretized and mapped to an alphabet. The SFA approximation, and thus BOSS, uses a symbolic representation based on the frequency domain, providing information about the whole series. Properties of this representation lead to significantly lower training times compared to using the SAX representation. The BOSS ensemble classifier is based on 1-NN classification using multiple BOSS models at different time series substructural sizes. However, BOSS requires the entire training set to be available while classifying target time series.
In [5], the authors present the BOSS-VS classification algorithm, where each data point in the training set is transformed into SFA words. Then, a centroid is created for each class and the cosine similarity is computed between the centroids and the target series, as in [12]. This significantly reduces the computational complexity and memory footprint of the classification algorithm, since each target time series is now only compared to the class centroids. While BOSS-VS improves scalability, it does so at the expense of accuracy.
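To make the centroid-plus-cosine-similarity scheme shared by SAX-VSM and BOSS-VS concrete, the following sketch (illustrative only; the symbolic transformation is assumed to have already produced the term-frequency matrices, and names are hypothetical) assigns each test series to the class whose centroid is most similar:

```python
import numpy as np

def classify_by_class_centroids(tf_train, y_train, tf_test):
    """Assign each test term-frequency vector to the class with the most similar centroid."""
    y_train = np.asarray(y_train)
    classes = np.unique(y_train)
    # One centroid per class: the summed term-frequency vector of its training series.
    centroids = np.vstack([tf_train[y_train == c].sum(axis=0) for c in classes])
    # Cosine similarity between every test vector and every class centroid.
    centroids = centroids / (np.linalg.norm(centroids, axis=1, keepdims=True) + 1e-12)
    tf_test = tf_test / (np.linalg.norm(tf_test, axis=1, keepdims=True) + 1e-12)
    similarities = tf_test @ centroids.T          # shape: (n_test, n_classes)
    return classes[similarities.argmax(axis=1)]
```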
In [15], the authors present the K-BOSS-VS algorithm for time series classification, which is based on state-of-the-art symbolic time series classification algorithms such as BOSS-VS and BOSS [5,13]. The main intuition behind the K-BOSS-VS method is that, by exploiting SFA representations, instead of having a single centroid to represent each class label, we can have K representatives per class. K-BOSS-VS applies the K-means clustering algorithm to each class label of the training set to obtain the K representatives per class. In so doing, it lies between using a single representative per class, as in [5], and computing similarities with the entire training set, as in [13]. K-BOSS-VS addresses the scalability problem while achieving an accuracy greater than BOSS-VS and close to BOSS.
cBOSS [16] aims to speed up BOSS. Due to its grid-search method and its method of retaining ensemble members, BOSS is unpredictable in its time and memory usage. cBOSS uses an altered random selection of the parameters of its ensemble members, allowing the user to control the build using a time contract. In doing so, it manages to significantly speed up BOSS while retaining accuracy.
S-BOSS [17] is a variation of BOSS that takes into account the location of the symbolic words in a series. The intuition is that in some datasets, the locations of certain discriminatory subsequences are important. Some patterns may gain importance only when they occur in a particular location, or a frequently occurring word may be indicative of different classes depending on when/where it occurs.
In [18], the authors propose WEASEL as a middle ground between BOSS-VS [5] and BOSS [13] for time series classification, balancing accuracy and scalability. It uses SFA, but adds a few novel elements: first, WEASEL performs feature discretization to reveal differences between classes; second, it uses windows of variable lengths, also considering the order of windows; third, it uses statistical feature selection, leading to significantly reduced runtime; and finally, it uses unigram and bigram symbolic representations of the time series. WEASEL is more scalable and accurate than BOSS, but it is not as scalable as BOSS-VS, as shown in [9].
ROCKET [7] is a non-symbolic time series classification algorithm that uses numerous (10,000) convolution kernels together with a linear classifier (ridge regression or logistic regression). A significantly faster version of ROCKET, called MiniRocket, is presented in [8]; it uses fewer parameters than ROCKET to extract features, without sacrificing accuracy. MultiRocket [10] uses first-order differences to further improve accuracy compared to MiniRocket: given a time series x with values x_i, i = 0, 1, ..., the first-order differences are computed as out_i = x_{i+1} - x_i, revealing trend information. Dilation downsamples the input time series by a factor d, by including every d-th time series value. The ROCKET family of algorithms does not employ a symbolic representation of time series, and it proves to be very scalable while maintaining state-of-the-art accuracy. However, first-order differences and dilation can also be applied on top of algorithms that use symbolic representations to increase their accuracy.
Leveraging these ideas, WEASEL 2.0 [11] builds an ensemble of classifiers, choosing window sizes, dilation, and the use of first-order differences at random. Combining windowing and dilation, each window at offset i and of length l includes l time series elements, downsampled starting from i by a factor d: this combination increases the receptive field of the algorithm with respect to the windowing process. Using these techniques, WEASEL 2.0 manages to provide state-of-the-art accuracy while maintaining computational efficiency.
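To make these transformations concrete, the toy sketch below (not the ROCKET or WEASEL 2.0 implementation; the data are made up) shows first-order differences, dilation of a whole series by a factor d, and a dilated window of length l starting at offset i:

```python
import numpy as np

x = np.array([3.0, 5.0, 4.0, 7.0, 6.0, 9.0, 8.0, 11.0])

# First-order differences reveal trend information: out_i = x_{i+1} - x_i.
first_diff = np.diff(x)            # [ 2. -1.  3. -1.  3. -1.  3.]

# Dilation by a factor d keeps every d-th value, widening the receptive field.
d = 2
dilated = x[::d]                   # [ 3.  4.  6.  8.]

# A dilated window of length l at offset i spans l*d points of the original series.
i, l = 1, 3
dilated_window = x[i:i + l * d:d]  # [ 5.  7.  9.]
```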
Similarly, MR-SQM [19,20] uses different kinds of symbolic representations to improve accuracy.
TDE [21] is a classification algorithm that combines design features of four classifiers (BOSS, WEASEL, S-BOSS, and cBOSS). Like BOSS, TDE is a homogeneous ensemble of nearest neighbor classifiers that uses distance between histograms of word counts and injects diversity via parameter variation. TDE takes the ensemble structure from cBOSS, which is more robust and scalable. The use of spatial pyramids is adapted from S-BOSS and it uses bi-grams like WEASEL. TDE is significantly more accurate than WEASEL and S-BOSS while retaining the scalability of cBOSS.
In [22], the authors present Contracted Shapelets, a method to speed up shapelet-based time series classification by performing early abandoning. The authors also present binary shapelets to address three problems of the shapelet transform: loss of class information, managing easy vs. hard-to-classify classes, and addressing multi-class problems.
In [23], the authors propose Proximity Forest, an algorithm that learns accurate models from datasets with millions of time series and classifies a time series in milliseconds. The models are ensembles of highly randomized Proximity Trees. Whereas conventional decision trees branch on attribute values (and usually perform poorly on time series), Proximity Trees branch on the proximity of time series to exemplar time series, leveraging the decades of work into developing relevant measures for time series. The authors show that their multi-resolution multi-domain linear classifier achieves a similar accuracy to the state-of-the-art COTE ensemble, as well as to recent deep learning methods (FCN, ResNet).
TS-CHIEF (Time Series Combination of Heterogeneous and Integrated Embedding Forest) [6] rivals HIVE-COTE in accuracy, but it is significantly faster. TS-CHIEF constructs an ensemble classifier that integrates the most effective methods in the TSC literature such as BOSS and Proximity Forest.
In this paper, we present an improved version of SCALE-BOSS [9], called SCALE-BOSS-MR (i.e., SCALE-BOSS with Multiple Representations). SCALE-BOSS-MR draws inspiration from many diverse state-of-the-art TSC algorithms in order to improve accuracy and bring the framework closer to the state-of-the-art algorithms to which it is compared in Section 4, without increasing the computational cost of TSC methods. More concretely, in SCALE-BOSS-MR, we explore (a) the use of first-order differences to encode trend information, similarly to MultiRocket [10]; (b) the use of different window sizes over the time series, similarly to WEASEL [18]; and finally, (c) the use of dilation, similarly to ROCKET [7] and WEASEL 2.0 [11].

3. Framework Description: From SCALE-BOSS to SCALE-BOSS-MR

As already pointed out, the two main ideas behind the SCALE-BOSS framework are as follows: (a) exploit symbolic representations of time series, and (b) leverage efficient machine learning algorithms for clustering and classification. In particular, we focus on models that exploit representatives of time series, so as to balance computational efficiency and prediction accuracy.
As shown in Figure 1, the first step of this framework is to compute the symbolic representation of the training set. Any symbolic representation such as SFA or SAX can be used here.
Then, the second step computes the Term-Frequency Vectors for the training set, thus creating a Bag-Of-SFA Symbols (BOSS) for the training set.
The third step computes the models of time series that will subsequently be used for making label predictions: to construct such a model, we can use either a clustering or a classification mechanism that exploits the term-frequency vectors. In any case, our interest is in models that learn class/cluster representatives to which test cases will be compared, without excluding other kinds of models.
Having these models, the fourth step computes the symbolic representation for the test set, while the fifth step computes the term-frequency vectors for the test set.
The sixth and final step predicts the label of a test case using the learned model. Specifically, in this work, when a time series clustering mechanism is used, a 1-NN classifier classifies the test time series by comparing it to the representatives of each cluster. If a classifier is used instead, the test time series is classified using the trained classifier model.
For the different instantiations of the framework, here we succinctly mention all the choices made and alternatives in the TSC pipeline:
  • We have chosen SFA as the symbolic representation. The framework can be tuned to other symbolic representations, but we have chosen SFA because it has been proven superior to others (e.g., SAX) [15,24].
  • Regarding clustering algorithms, we evaluated the use of K-Means, Mini-Batch K-Means, BIRCH, DBSCAN, and K-Medoids.
  • Regarding classifiers, we tested the framework with SGD, LinearSVC, AdaBoost, Multi-Layer Perceptron, Naive Bayes, QDA, RBF-VSM, Decision Trees (DT), and Random Forest (RF).
  • Regarding the distance measure for comparing time series in the clustering approach, we use the cosine similarity of their term-frequency vectors.
It is not our purpose to describe here all the alternative clustering and classification methods that have been used; subsequently, we provide details on the most effective ones, focusing on those that provide a condensed representation of time series in the form of time series class representatives.
Considering the clustering algorithms, all of them exploit representatives, except DBSCAN [25,26], which was used as a classic density-based time series clustering method. Among the classifiers, we consider that DT and RF provide an elaborate representation of time series along model (decision tree) paths, which can be considered to model training set variations, much like time series class representatives do.
Close to tree representations, BIRCH [27] is an incremental clustering algorithm that uses Clustering Feature trees. BIRCH uses the concept of a Clustering Feature (CF): a triple containing the number of data points in the cluster, the linear sum of the points, and the squared sum of the points. A CF tree is a height-balanced tree with two parameters: the branching factor B and the threshold T. Each non-leaf node contains at most B entries of the form (CF_i, child_i), and each leaf node consists of L CFs. In addition, each leaf node has two pointers, “prev” and “next”, used to chain all leaf nodes together for efficient scans. All entries in a leaf node must satisfy a threshold requirement with respect to a threshold value, and thus a leaf node represents a cluster made up of all the subclusters represented by its entries.
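For reference, the scikit-learn implementation used in this work exposes B and T directly as branching_factor and threshold; a minimal usage sketch on stand-in term-frequency vectors (parameter values are illustrative) could look as follows:

```python
import numpy as np
from sklearn.cluster import Birch

# Stand-in term-frequency vectors; in the pipeline these come from the BOSS transform.
X = np.random.RandomState(0).rand(100, 64)

# branching_factor corresponds to B and threshold to T in the CF-tree description.
birch = Birch(branching_factor=50, threshold=0.5, n_clusters=5)
cluster_labels = birch.fit_predict(X)
```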
While the K-BOSS-VS method [15] applies K-means to each class label of the training set to obtain the K representatives per class, we have evaluated the use of clustering variants that increase the computational efficiency of the overall method, such as Mini-Batch K-Means [28], resulting in the MB-K-BOSS-VS configuration. Mini-Batch K-Means is an online variant of the K-Means clustering algorithm that converges using only a subset of the dataset and yields results very close to those of the full K-Means clustering algorithm.
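A minimal sketch of this per-class clustering step, assuming the term-frequency vectors have already been computed (function and variable names are illustrative, not the released implementation), is shown below; a 1-NN comparison against the returned representatives, e.g., via cosine similarity as in the centroid sketch shown earlier, then classifies a test series:

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

def per_class_representatives(tf_train, y_train, k=3, seed=0):
    """Return K Mini-Batch K-Means centers per class, plus their class labels."""
    y_train = np.asarray(y_train)
    reps, rep_labels = [], []
    for c in np.unique(y_train):
        X_c = tf_train[y_train == c]
        km = MiniBatchKMeans(n_clusters=min(k, len(X_c)), random_state=seed, n_init=3)
        km.fit(X_c)
        reps.append(km.cluster_centers_)
        rep_labels.extend([c] * km.cluster_centers_.shape[0])
    return np.vstack(reps), np.array(rep_labels)
```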
In addition to these clustering algorithms, we also evaluated the use of the K-Medoids algorithm [29]. This is a fast and simple algorithm that calculates the distance matrix once and then finds the cluster medoids in each iteration step.
To classify symbolic representations of time series using decision trees, instead of computing similarities between the representatives and the test set, we feed the normalized term-frequency vectors of the training set into a decision tree (DT) or random forest (RF) classifier. The trained model is then used to infer the class label of each time series in the test set. RF is essentially an ensemble variant of DT, allowing us to vary the number of trees, much like varying the number of class representatives in K-BOSS-VS.
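Assuming the pyts BOSS transformer is used to produce the term-frequency vectors (the parameter values and dataset below are illustrative, not the tuned configuration), this classifier branch of the pipeline can be sketched as:

```python
from pyts.datasets import load_gunpoint
from pyts.transformation import BOSS
from sklearn.ensemble import RandomForestClassifier

X_train, X_test, y_train, y_test = load_gunpoint(return_X_y=True)

# BOSS: SFA words per sliding window, aggregated into term-frequency vectors.
boss = BOSS(word_size=4, n_bins=4, window_size=24, sparse=False)
tf_train = boss.fit_transform(X_train)
tf_test = boss.transform(X_test)

rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(tf_train, y_train)
print("accuracy:", rf.score(tf_test, y_test))
```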
In the evaluation reported in [9], we showed that, among the classifiers, Random Forest and MLP provided the best results. Among the clustering algorithms, Mini-Batch K-Means and K-Means provided the best accuracy. In any case, although computationally efficient, the accuracy achieved by these methods is lower than that of state-of-the-art methods.
SCALE-BOSS-MR refines the SCALE-BOSS framework by fusing multiple representations into a single term-frequency vector, which is in turn fed into the regular SCALE-BOSS pipeline. This procedure is mostly inspired by WEASEL [18]. Multiple representations of a time series result from using its first-order differences in combination with multiple temporal windows and multiple dilation filters. Each dilated window is processed to obtain a term-frequency vector using unigrams or bigrams, and the term-frequency vectors from all dilated windows are stacked column-wise to provide the representation of the time series.
More specifically, as we can see in Figure 2, the SCALE-BOSS-MR framework works, stage by stage, as follows:
  • Computes the dilated windows for the training set, given a configuration of W window sizes in a set of window_configs and D dilation filters in a set of dilation_filter_configs. This results in DW = W × D dilated window configurations;
  • Computes the D W first-order differences’ dilated windows, i.e., dilated windows for the time series created by the first-order differences of the original time series;
  • Computes the term-frequency vectors of dilated windows for the original time series and of the dilated windows for the first-order differences’ time series, then stacks the term-frequency vectors column-wise;
  • Fits the model using the time series representations of the training set;
  • In the next three stages, computes the symbolic representation for the dilated windows of a time series from the test set, as well as the corresponding first-order differences’ time series, then computes and stacks the term-frequency vectors.
Algorithm 1 provides a succinct and comprehensive description of SCALE-BOSS-MR.
More specifically, lines 1–9 describe the main SBMR function: in lines 2 and 3, the algorithm loops over the window configurations and the dilation size configurations. For each window configuration, it computes the dilated time series according to the dilation parameter and produces the term-frequency vector. Line 6 maintains the resulting term-frequency vector by concatenating column-wise the term-frequency vectors produced by each window and dilation configuration.
Lines 11 and 12 of the algorithm initialize an empty term-frequency vector for the train and test set, respectively. In Line 13, the algorithm computes the term-frequency vector for the training set.
Algorithm 1: SCALE-BOSS-MR algorithm
Incorporating first-order differences into the process is optional, and one can configure the method using the doTrend variable. In Lines 14 to 16, the algorithm calls the SBMR function on the first-order differences of the training set time series, maintaining the resulting term-frequency vector by concatenating column-wise the term-frequency vectors produced by each window and dilation configuration. In Line 17, the classifier is trained on the final term-frequency vectors of the training set. Similarly, in Line 18, the algorithm calls the SBMR function on the first-order differences of the test set time series and proceeds in Lines 19 to 21 to compute the term-frequency vector for the test set. Finally, in Line 22, the classifier makes its predictions, taking as input the final concatenated term-frequency vector of the test set.
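A simplified Python sketch of the workflow described above is given below; sfa_term_frequencies and the configuration lists are placeholders (not the released implementation), and, for brevity, dilation is applied to the whole series rather than per window:

```python
import numpy as np

def sbmr_term_frequencies(X, window_configs, dilation_configs, sfa_term_frequencies):
    """Concatenate term-frequency vectors over all (window, dilation) configurations."""
    blocks = []
    for window in window_configs:
        for d in dilation_configs:
            X_dilated = X[:, ::d]  # keep every d-th value of each series
            blocks.append(sfa_term_frequencies(X_dilated, window))
    return np.hstack(blocks)       # stack the term-frequency vectors column-wise

def scale_boss_mr(X_train, y_train, X_test, clf, window_configs, dilation_configs,
                  sfa_term_frequencies, do_trend=True):
    tf_train = sbmr_term_frequencies(X_train, window_configs, dilation_configs,
                                     sfa_term_frequencies)
    tf_test = sbmr_term_frequencies(X_test, window_configs, dilation_configs,
                                    sfa_term_frequencies)
    if do_trend:  # optionally add representations of the first-order differences
        tf_train = np.hstack([tf_train, sbmr_term_frequencies(
            np.diff(X_train, axis=1), window_configs, dilation_configs,
            sfa_term_frequencies)])
        tf_test = np.hstack([tf_test, sbmr_term_frequencies(
            np.diff(X_test, axis=1), window_configs, dilation_configs,
            sfa_term_frequencies)])
    clf.fit(tf_train, y_train)     # train on the concatenated representation
    return clf.predict(tf_test)    # predict labels for the test set
```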
The framework has been implemented on top of the pyts [30] Time Series Classification library. The source code of the SCALE-BOSS-MR implementation is provided at https://github.com/aglenis/scale_boss_mr (accessed on 8 January 2024).
We have chosen the word length to be equal to four in all the configurations we explore, given that it is the “default” for the BOSS-VS implementation of pyts.
Additionally, when computing the symbolic representation, we use either unigrams alone or unigrams and bigrams: using bigrams together with unigrams helps retain sequence information that is inherently lost in the bag-of-words model.
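As a toy illustration of why bigrams retain some ordering information, consider counting both single SFA words and pairs of consecutive words (the words below are hypothetical; the real vocabulary comes from the SFA transform):

```python
from collections import Counter

words = ["ab", "ba", "ab", "cc", "ab"]   # SFA words of consecutive windows

unigrams = Counter(words)
bigrams = Counter(f"{w1}_{w2}" for w1, w2 in zip(words, words[1:]))

print(unigrams)  # Counter({'ab': 3, 'ba': 1, 'cc': 1})
print(bigrams)   # Counter({'ab_ba': 1, 'ba_ab': 1, 'ab_cc': 1, 'cc_ab': 1})
```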
As a final note, it is clear from the specified algorithm that trend information is split into parts by the windowing process applied to the first-order differences’ time series, according to the windowing and dilation configurations.
For the different clustering algorithms and classifiers, we have used the implementations provided by the scikit-learn [31] Python library.

4. Evaluation

Aiming to balance prediction accuracy with efficiency in training and testing, we evaluate instantiations of the SCALE-BOSS-MR framework in comparison to state-of-the-art methods, in terms of mean total time and mean accuracy.
The mean total time is the total execution time (both train and test time) for all datasets divided by the total number of datasets. The mean accuracy is the average accuracy across all datasets. We also provide the standard deviation (std) of the total execution time and of accuracy.
Accuracy and execution time are the measures commonly used to report performance in the literature, e.g., [11]. This choice allows us to be directly comparable to other algorithms. To comprehensively present the strengths and limitations of SCALE-BOSS-MR, we directly compare SCALE-BOSS-MR configurations with state-of-the-art algorithms such as WEASEL 2.0, ROCKET, MiniRocket, and MR-SQM.
While Appendix A reports on accuracy and total execution time for individual time series datasets, we report on mean accuracy and mean total execution time in the main part of the paper, comprehensively providing the strengths and limitations of the proposed framework.
Table 1 shows the characteristics of the UCR datasets used in the evaluation. The UCR time series datasets are the de facto standard for time series classification, and we have chosen to use the eight datasets from the UCR time series repository with the largest training sets. Train_size denotes the number of time series in the training set, test_size denotes the number of time series in the test set, n_classes denotes the number of classes in the dataset, and n_timestamps denotes the length of each time series in the dataset. The datasets are split into training and test time series by the dataset providers.
Table 2 specifies the configurations of window sizes and window steps. When a value is a float, it represents a fraction of the time series length; when it is an integer, it denotes the number of subsequent time series points included in the window. The integer window sizes are those used by WEASEL in [18], but keeping sufficiently large differences (i.e., four) between subsequent window sizes. Similarly, the float values are those provided by pyts as initial values, to which we add further configurations for exploration.
Table 3 shows dilation configurations in terms of the dilation factors. These dilation configurations are those mostly suggested in [11]. We subsequently perform a thorough evaluation to determine the best dilation configuration parameters in order to balance between accuracy and execution time.
To present the results, we use the following SCALE-BOSS-MR configurations’ naming convention:
NAME-CLS_METHOD-WINDOW_CONFIG-USE_OF_TREND-DILATION_CONFIG-NGRAM,
where
  • NAME is the name of the method used; for example, SBMR corresponds to SCALE-BOSS-MR;
  • CLS METHOD refers to the classifier used, e.g., Random Forest (RF), MLP, or MB-K-BOSS-VS. The MB-K-BOSS-VS classifier exploits the concatenated term-frequency vector resulting from the SCALE-BOSS-MR workflow, etc.;
  • WINDOW CONFIG refers to the window configuration as described in Table 2;
  • USE OF TREND denotes whether or not we use first-order differences to encode trend information. When the configuration uses first-order differences, this is denoted by “trend”; otherwise, we denote it as “noTrend”;
  • DILATION CONFIG denotes the dilation filter values as described in Table 3;
  • NGRAM denotes whether we use unigrams or unigrams together with bigrams. Unigrams are denoted as UG whereas unigrams and bigrams are denoted as BG.
Table 4 shows the results from different SBMR configurations, including also BOSS-RF, BOSS-MLP, and MB-K-BOSS-VS, all with trend information and no multiple window configurations, but with alternative dilation configurations and n-grams.
Additionally, Table 5 shows results from different SBMR configurations, with and without trend information and no dilation, but with different classifiers, window configurations, and n-grams. Table 4 and Table 5 show, among other things, that RF is the most competitive classifier.
Table 6 shows different configurations with RF classifiers, but also with different combinations of window and dilation configurations.
Subsequently, we describe the major findings from the results reported in these tables and provide further results that allow us to show the potential of the proposed approach.

4.1. Evaluation of SBMR with D0 and W0 Configurations

From Table 4 and Figure 3, we can see the following.
  • BOSS-RF achieves an accuracy of approximately 0.8 with 9.7 s of mean total execution time;
  • SBMR-RF-W0-trend-D0-UG achieves a 0.827 accuracy with 19.7 s of mean total execution time. This means that adding trend information gives a boost in accuracy, but at the expense of doubling the mean total execution time;
  • SBMR-RF-W0-trend-D6-BG achieves a 0.849 accuracy with 52 s of mean total execution time. This means that even moderate dilation helps to boost accuracy at the expense of further increasing the total execution time;
  • SBMR-RF-W0-trend-D3-BG achieves an accuracy of 0.857 with 70 s of mean total execution time.
From the above, we can conclude the following:
  • Bigrams with dilation help to improve the accuracy of the method;
  • D6 (described in Table 3) performs better than D4 (with little difference) and D5 (with slightly bigger difference) configurations;
  • However, D3 (as described in Table 3) performs best, but with slightly worse total execution time compared to the D6 and D4 configurations. This is expected, as D3 produces more representations than D4 and D6, while the receptive field introduced by D4 and D6 appears insufficient for the algorithms to discriminate effectively between time series classes. It must be noted that in these configurations, dilation is applied once to the time series, as they use the W0 window configuration. This is further supported by the results of D1 and D2, which incorporate the representations produced by D3 and D6, and thus achieve nearly the same accuracy, but with considerably increased total execution time;
  • An additional finding is that representations with bigrams (BG) are in general more effective compared to those using unigrams (UG), in these configurations of the framework.
Table 5 provides evidence on window configurations with no dilation. Contrary to what is reported in Table 4, representations with unigrams seem to be more effective here than those with bigrams, while it is clear that adding representations with trend increases the mean classification accuracy. The W8 configuration reports the best mean accuracy, although with increased total execution time; this is because W8 incorporates many window configurations, all with step 1. In contrast, W10 is an indicative case with slightly fewer window configurations and step 2: it achieves worse accuracy, but with significantly reduced mean total execution time.

4.2. Evaluation of SBMR-RF with Multiple Windows and Dilation Configuration

Table 6 and Figure 4 report on configurations combining window and dilation configurations, with bigram representations and trend information. The results show the following:
  • SBMR-RF-W8-D6-trend-BG achieves an accuracy of 0.875 with 489 s of mean total execution time compared to the undilated 0.863 with 216 s of mean total execution time reported in Table 5;
  • SBMR-RF-W14-D6-trend-BG achieves a slightly lower accuracy of 0.863 with a significantly reduced mean total execution time of 149 s, versus 0.850 with 54 s for the undilated case reported in Table 5;
  • SBMR-RF-W11-D6-trend-BG achieves an accuracy of 0.857 with 116 s compared to 0.846 with 36 s for the undilated case reported in Table 5;
  • SBMR-RF-W10-D6-trend-BG achieves an accuracy of 0.865 with 217 s compared to 0.862 with 72 s for the undilated case reported in Table 5.
From the above, we can conclude the following:
  • Dilation combined with window configurations has a considerable impact on total execution time, as expected, but significantly increases the accuracy of predictions for any of the windowing configurations;
  • Dilation configuration D3 is slightly better in terms of accuracy than configuration D6, which is significantly faster than D3. This is because D3 increases the number of time series representations when combined with window configurations, resulting in worse total execution time than D6. This leads us to believe that D6 achieves a good balance between total execution time and accuracy when combined with suitable window configurations.

4.3. Evaluation of SBMR with the Ridge Regression with Cross-Validation Classifier

As part of the evaluation, we used a Ridge Regression with Cross-Validation classifier, denoted by RidgeCV. We chose to run experiments with RidgeCV because it is the classifier used in ROCKET, MiniRocket, and WEASEL 2.0.
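Assuming scikit-learn’s RidgeClassifierCV (the same classifier family used by ROCKET), swapping it into the pipeline is a small change; the alpha grid and the stand-in data below are illustrative:

```python
import numpy as np
from sklearn.linear_model import RidgeClassifierCV

rng = np.random.RandomState(0)
# Stand-ins for the concatenated SCALE-BOSS-MR term-frequency vectors.
tf_train, y_train = rng.rand(60, 128), rng.randint(0, 3, 60)
tf_test, y_test = rng.rand(20, 128), rng.randint(0, 3, 20)

# RidgeClassifierCV selects the regularization strength by cross-validation.
ridge = RidgeClassifierCV(alphas=np.logspace(-3, 3, 10))
ridge.fit(tf_train, y_train)
print("accuracy:", ridge.score(tf_test, y_test))
```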
The results with RidgeCV and the W0 configuration can be found in Table 7 and are, to a great extent, consistent with the results reported for RF.
From Table 8 and Figure 5, we can see the following:
  • The best configuration in terms of accuracy is SBMR-RidgeCV-W8-trend-D6-BG, which achieves an accuracy of 0.892 with 517.5 s of mean total execution time.
  • Other window configurations can improve the execution time, but at the expense of lower accuracy:
    • SBMR-RidgeCV-W14-trend-D6-BG achieves an accuracy of 0.882 with 160.4 s of mean total execution time;
    • SBMR-RidgeCV-W11-trend-D6-BG achieves an accuracy of 0.881 with 129 s of mean total execution time;
    • SBMR-RidgeCV-W0-trend-D6-BG achieves an accuracy of 0.844 with 70.2 s of mean total execution time.
  • Opting for no dilation but with different window configurations can further improve the total execution time, but at the expense of even lower accuracy:
    • SBMR-RidgeCV-W11-trend-D0-BG achieves an accuracy of 0.844 with 63.5 s of mean total execution time;
    • SBMR-RidgeCV-W15-trend-D0-BG achieves an accuracy of 0.831 with 40.6 s of mean total execution time;
    • SBMR-RidgeCV-W11-trend-D0-UG achieves an accuracy of 0.827 with 43.7 s of mean total execution time.
From the above, we can see that, with RidgeCV, SCALE-BOSS-MR gives better results in terms of accuracy than with RF when using both multiple window representations and dilation, without a significant increase in total execution time.

4.4. Comparison with the State-of-the-Art Algorithms

Among the state-of-the-art algorithms, WEASEL 2.0 and MR-SQM exploit symbolic representations and are thus closest to our method. Table 9 and Figure 6 report on the state-of-the-art algorithms:
  • WEASEL 2.0 achieves the best accuracy of 0.904 with 339 s of mean total execution time, compared to 309 s for ROCKET and 18.7 s for MiniRocket;
  • MR-SQM with five SFA representations achieves an accuracy of 0.889 with 418 s of mean total execution time. With a single SFA representation and a single SAX representation, it achieves 0.878 with 176 s of mean total execution time. Using only a single SFA representation, it achieves an accuracy of 0.861 with 117 s of mean total execution time.
Given the above results, we can conclude the following:
  • SBMR-RidgeCV-W8-trend-D6-BG achieves almost state-of-the-art accuracy, but with a significant increase in total execution time compared with the state-of-the-art methods. SBMR-RidgeCV-W8-trend-D6-BG is only 1% less accurate than WEASEL 2.0, ROCKET, and MiniRocket. This suggests that the added representations help achieve state-of-the-art accuracy;
  • SBMR-RidgeCV-W11-trend-D6-BG achieves an accuracy of 0.881, very close to the accuracy achieved by state-of-the-art algorithms, and with 129 s of mean total execution time, which is significantly faster than state-of-the-art algorithms exploiting symbolic representations;
  • SBMR-RidgeCV-W15-trend-D6-BG achieves an accuracy of 0.871, which is still close to the state of the art with 79 s of mean total execution time, being even faster than SBMR-RidgeCV-W11-trend-D6-BG.
These results show that SCALE-BOSS-MR can be configured to achieve high accuracy with low total execution time, compared to state-of-the-art methods exploiting symbolic representations.

5. Conclusions

In this paper, we extend the SCALE-BOSS framework by incorporating into the process multiple time series representations, using multiple window sizes combined with multiple dilation parameters, applied to the original time series as well as to the first-order differences’ time series, which encode trend information. Specifically, SCALE-BOSS-MR is an improved version of SCALE-BOSS [9] that incorporates multiple symbolic time series representations, drawing inspiration from many diverse state-of-the-art TSC algorithms, with the objective of achieving state-of-the-art accuracy without increasing the computational cost of TSC methods. In so doing, SCALE-BOSS-MR incorporates (a) first-order differences to encode trend information, similarly to MultiRocket [10]; (b) different window sizes over the time series, similarly to WEASEL [18]; and (c) multiple dilation parameters, similarly to ROCKET [7] and WEASEL 2.0 [11].
The major findings are as follows: Adding trend information improved TSC accuracy, while maintaining scalability in terms of execution time. Using both multiple window sizes and trend information, the produced algorithms reach state-of-the-art accuracy.
Specifically:
  • Trend encoding helps the algorithms both in the single-window case and when multiple window sizes are used;
  • Adding dilation configurations to the SCALE-BOSS-MR workflow helps significantly, but it increases the execution time considerably when combined with multiple window configurations;
  • Adding multiple window sizes to the SCALE-BOSS-MR workflow also helps in significantly increasing accuracy;
  • Using multiple window sizes, trend information, and dilation, we can achieve state-of-the-art accuracy. In addition, the resulting method can be tuned to be as efficient as other state-of-the-art methods exploiting symbolic time series representations;
  • Adding a few representations resulting from different dilation configurations improves accuracy while retaining scalability. This is validated by the fact that D6 is only marginally worse than D3 in terms of accuracy but is significantly faster, especially when combined with suitable window configurations;
  • Ridge regression with cross-validation performed better than the Random Forest classifier in terms of accuracy;
  • SCALE-BOSS-MR is only 1% less accurate compared to state-of-the-art algorithms when tuned for accuracy, while it can be significantly faster than state-of-the-art algorithms with minimal loss in accuracy (2–3%) when tuned for scalability.
As future work, we plan to port SCALE-BOSS-MR to a parallel processing framework such as Apache Spark [32], similarly to [15]. This will allow for (a) incorporating additional time series representations to further increase accuracy in more efficient ways; (b) investigating the use of multiple instantiations of SCALE-BOSS-MR in an ensemble without compromising efficiency; and (c) providing experimental results on real-world datasets with big data characteristics and application/domain-related characteristics (e.g., delay).
Furthermore, while SCALE-BOSS-MR shows that it can be tuned to trade accuracy against computational efficiency, we need effective ways of determining the appropriate configurations to address real-world requirements: this is an important part of future work as well.
Finally, we plan to investigate the use of convolutional neural networks on top of the symbolic representations to further advance accuracy while significantly reducing the total execution time, as well as to adapt the SCALE-BOSS-MR algorithm for multivariate time series classification.

Author Contributions

Conceptualization, A.G. and G.A.V.; methodology, A.G. and G.A.V.; software, A.G.; validation, A.G. and G.A.V.; formal analysis, A.G. and G.A.V.; investigation, A.G.; resources, not applicable; data curation, A.G.; writing—original draft preparation, A.G. and G.A.V.; writing—review and editing, A.G. and G.A.V.; visualization, A.G. and G.A.V.; supervision, G.A.V.; project administration, G.A.V. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Publicly available datasets were analyzed in this study. The UCR datasets used in this study can be found at: http://www.timeseriesclassification.com/dataset.php (accessed on 8 January 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

The appendix shows detailed accuracy and total execution time results for all the methods and datasets. The detailed tables help compare the proposed methods with other methods proposed in the literature, as well as compare the proposed methods against each other on a per-dataset basis.
Table A1. Accuracy of the classifiers without multiple window configurations.
Algorithm | Crop | FordA | FordB | HandOutlines | NonInvasiveFetalECGThorax1 | NonInvasiveFetalECGThorax2 | PhalangesOutlinesCorrect | TwoPatterns
SBMR-RF-W0-trend-D1-BG | 0.705 | 0.911 | 0.728 | 0.914 | 0.900 | 0.930 | 0.818 | 0.952
SBMR-RF-W0-trend-D1-UG | 0.710 | 0.914 | 0.740 | 0.835 | 0.894 | 0.922 | 0.814 | 0.952
SBMR-RF-W0-trend-D3-BG | 0.709 | 0.917 | 0.738 | 0.911 | 0.893 | 0.925 | 0.817 | 0.945
SBMR-RF-W0-trend-D4-BG | 0.709 | 0.910 | 0.741 | 0.895 | 0.884 | 0.915 | 0.805 | 0.916
SBMR-RF-W0-trend-D5-BG | 0.708 | 0.919 | 0.742 | 0.792 | 0.898 | 0.924 | 0.812 | 0.947
SBMR-RF-W0-trend-D6-BG | 0.703 | 0.914 | 0.741 | 0.892 | 0.894 | 0.919 | 0.805 | 0.924
SBMR-RF-W0-trend-D6-UG | 0.707 | 0.912 | 0.759 | 0.784 | 0.907 | 0.920 | 0.793 | 0.929
BOSS-MLP | 0.576 | 0.872 | 0.695 | 0.776 | 0.795 | 0.798 | 0.774 | 0.941
SBMR-MLP-W0-trend-D0-UG | 0.641 | 0.892 | 0.794 | 0.749 | 0.859 | 0.866 | 0.779 | 0.950
BOSS-RF | 0.634 | 0.883 | 0.717 | 0.838 | 0.780 | 0.823 | 0.765 | 0.928
SBMR-RF-W0-trend-D0-UG | 0.683 | 0.924 | 0.768 | 0.811 | 0.845 | 0.878 | 0.774 | 0.936
MB-K-BOSS-VS | 0.550 | 0.833 | 0.668 | 0.681 | 0.702 | 0.771 | 0.712 | 0.902
SBMR-MB-K-BOSS-VS-W0-trend-D0-UG | 0.608 | 0.883 | 0.720 | 0.686 | 0.773 | 0.828 | 0.717 | 0.918
Table A2. Total execution time without multiple window configurations.
Algorithm | Crop | FordA | FordB | HandOutlines | NonInvasiveFetalECGThorax1 | NonInvasiveFetalECGThorax2 | PhalangesOutlinesCorrect | TwoPatterns
SBMR-RF-W0-trend-D1-BG | 87.343 | 135.100 | 122.949 | 194.351 | 152.298 | 151.977 | 11.589 | 38.516
SBMR-RF-W0-trend-D1-UG | 58.626 | 122.240 | 127.212 | 184.951 | 140.944 | 139.074 | 10.026 | 30.200
SBMR-RF-W0-trend-D2-BG | 79.353 | 108.209 | 104.967 | 158.977 | 125.302 | 121.167 | 9.591 | 29.938
SBMR-RF-W0-trend-D3-BG | 55.140 | 86.998 | 77.606 | 122.502 | 94.733 | 93.438 | 7.314 | 23.052
SBMR-RF-W0-trend-D4-BG | 37.555 | 55.917 | 52.452 | 79.548 | 60.430 | 62.730 | 4.875 | 14.792
SBMR-RF-W0-trend-D5-BG | 40.507 | 62.404 | 56.891 | 87.589 | 69.401 | 71.217 | 5.389 | 16.574
SBMR-RF-W0-trend-D6-BG | 40.689 | 63.691 | 57.149 | 90.239 | 72.025 | 70.745 | 5.749 | 19.403
SBMR-RF-W0-trend-D6-UG | 25.173 | 48.654 | 43.773 | 70.978 | 53.165 | 53.914 | 3.921 | 11.617
BOSS-MLP | 14.537 | 12.924 | 13.113 | 18.980 | 16.656 | 16.805 | 2.836 | 4.653
SBMR-MLP-W0-trend-D0-UG | 19.615 | 24.757 | 25.050 | 36.260 | 30.015 | 29.316 | 2.973 | 7.908
BOSS-RF | 5.879 | 11.837 | 10.547 | 18.049 | 13.826 | 13.815 | 0.931 | 2.741
SBMR-RF-W0-trend-D0-UG | 8.869 | 23.781 | 21.487 | 36.326 | 28.164 | 27.698 | 1.695 | 5.371
MB-K-BOSS-VS | 5.991 | 13.669 | 12.064 | 21.532 | 17.655 | 17.501 | 0.979 | 3.261
SBMR-MB-K-BOSS-VS-W0-trend-D0-UG | 8.150 | 23.272 | 21.998 | 35.862 | 27.673 | 29.041 | 1.482 | 5.154
Table A3. Accuracy using different windows’ configurations.
Algorithm | Crop | FordA | FordB | HandOutlines | NonInvasiveFetalECGThorax1 | NonInvasiveFetalECGThorax2 | PhalangesOutlinesCorrect | TwoPatterns
SBMR-MB-K-BOSS-VS-W1-D0-BG | 0.673 | 0.730 | 0.616 | 0.886 | 0.737 | 0.785 | 0.739 | 0.969
SBMR-MB-K-BOSS-VS-W1-trend-D0-BG | 0.688 | 0.791 | 0.644 | 0.911 | 0.792 | 0.833 | 0.738 | 0.953
SBMR-MB-K-BOSS-VS-W1-trend-D0-UG | 0.686 | 0.777 | 0.609 | 0.892 | 0.804 | 0.827 | 0.723 | 0.988
SBMR-MB-K-BOSS-VS-W1-noTrend-D0-UG | 0.664 | 0.719 | 0.588 | 0.868 | 0.752 | 0.786 | 0.730 | 0.979
SBMR-RF-W1-noTrend-D0-BG | 0.713 | 0.811 | 0.626 | 0.922 | 0.810 | 0.850 | 0.790 | 0.988
SBMR-RF-W1-trend-D0-BG | 0.723 | 0.856 | 0.638 | 0.914 | 0.851 | 0.882 | 0.807 | 0.987
SBMR-RF-W1-trend-D0-UG | 0.729 | 0.828 | 0.644 | 0.914 | 0.853 | 0.880 | 0.801 | 0.992
SBMR-RF-W1-noTrend-D0-UG | 0.717 | 0.789 | 0.632 | 0.916 | 0.818 | 0.851 | 0.788 | 0.994
SBMR-RF-W10-trend-D0-UG | 0.713 | 0.933 | 0.815 | 0.835 | 0.901 | 0.912 | 0.812 | 0.971
SBMR-RF-W11-trend-D0-UG | 0.708 | 0.926 | 0.795 | 0.816 | 0.887 | 0.889 | 0.802 | 0.945
SBMR-RF-W12-trend-D0-UG | 0.716 | 0.924 | 0.788 | 0.830 | 0.883 | 0.888 | 0.802 | 0.931
SBMR-RF-W13-trend-D0-UG | 0.724 | 0.930 | 0.817 | 0.835 | 0.903 | 0.908 | 0.803 | 0.963
SBMR-RF-W14-trend-D0-UG | 0.710 | 0.929 | 0.804 | 0.838 | 0.884 | 0.880 | 0.803 | 0.951
SBMR-RF-W2-noTrend-D0-UG | 0.718 | 0.793 | 0.615 | 0.924 | 0.822 | 0.861 | 0.790 | 0.992
SBMR-RF-W3-noTrend-D0-UG | 0.719 | 0.810 | 0.628 | 0.916 | 0.844 | 0.873 | 0.810 | 0.996
SBMR-RF-W4-trend-D0-UG | 0.732 | 0.848 | 0.672 | 0.916 | 0.867 | 0.893 | 0.803 | 0.996
SBMR-RF-W4-noTrend-D0-UG | 0.721 | 0.814 | 0.662 | 0.911 | 0.853 | 0.883 | 0.796 | 0.997
SBMR-RF-W5-noTrend-D0-UG | 0.717 | 0.805 | 0.670 | 0.919 | 0.838 | 0.863 | 0.795 | 0.996
SBMR-RF-W6-trend-D0-UG | 0.715 | 0.908 | 0.737 | 0.827 | 0.905 | 0.896 | 0.796 | 0.950
SBMR-RF-W7-noTrend-D0-UG | 0.714 | 0.912 | 0.774 | 0.851 | 0.860 | 0.883 | 0.800 | 0.949
SBMR-RF-W8-noTrend-D0-BG | 0.712 | 0.930 | 0.801 | 0.859 | 0.894 | 0.900 | 0.809 | 0.984
SBMR-RF-W8-trend-D0-BG | 0.724 | 0.930 | 0.804 | 0.830 | 0.917 | 0.923 | 0.800 | 0.980
SBMR-RF-W8-trend-D0-UG | 0.729 | 0.934 | 0.804 | 0.859 | 0.921 | 0.914 | 0.808 | 0.982
SBMR-RF-W8-noTrend-D0-UG | 0.718 | 0.922 | 0.805 | 0.849 | 0.891 | 0.898 | 0.803 | 0.985
SBMR-RF-W9-trend-D0-UG | 0.720 | 0.927 | 0.811 | 0.824 | 0.914 | 0.916 | 0.803 | 0.981
Table A4. Total execution time using different windows’ configurations.
Algorithm | Crop | FordA | FordB | HandOutlines | NonInvasiveFetalECGThorax1 | NonInvasiveFetalECGThorax2 | PhalangesOutlinesCorrect | TwoPatterns
SBMR-MB-K-BOSS-VS-W1-D0-BG | 86.356 | 20.894 | 18.920 | 59.325 | 25.624 | 26.825 | 6.214 | 16.277
SBMR-MB-K-BOSS-VS-W1-trend-D0-BG | 200.633 | 41.447 | 37.514 | 111.834 | 51.802 | 50.080 | 12.117 | 35.905
SBMR-MB-K-BOSS-VS-W1-trend-D0-UG | 63.766 | 34.822 | 32.983 | 105.891 | 40.918 | 41.350 | 9.537 | 19.486
SBMR-MB-K-BOSS-VS-W1-noTrend-D0-UG | 32.922 | 17.800 | 16.825 | 58.226 | 22.518 | 22.307 | 5.061 | 9.755
SBMR-RF-W1-noTrend-D0-BG | 53.764 | 26.274 | 21.370 | 64.255 | 23.518 | 25.175 | 8.189 | 16.317
SBMR-RF-W1-trend-D0-BG | 123.621 | 48.227 | 44.306 | 113.608 | 44.698 | 47.865 | 12.850 | 31.346
SBMR-RF-W1-trend-D0-UG | 67.551 | 41.563 | 39.191 | 123.410 | 46.112 | 44.985 | 11.648 | 22.346
SBMR-RF-W1-D0-UG | 38.886 | 22.346 | 19.385 | 64.721 | 23.268 | 21.385 | 6.192 | 10.634
SBMR-RF-W10-trend-D0-UG | 28.821 | 90.495 | 82.290 | 141.950 | 105.976 | 106.154 | 6.003 | 20.042
SBMR-RF-W11-trend-D0-UG | 19.471 | 45.179 | 41.175 | 69.448 | 51.510 | 50.943 | 3.553 | 10.818
SBMR-RF-W12-trend-D0-UG | 25.852 | 55.476 | 49.940 | 85.222 | 62.440 | 62.680 | 4.545 | 13.631
SBMR-RF-W13-trend-D0-UG | 40.024 | 113.978 | 104.673 | 176.298 | 133.941 | 131.057 | 8.028 | 26.638
SBMR-RF-W14-trend-D0-UG | 24.444 | 69.885 | 61.702 | 106.377 | 77.706 | 78.166 | 4.887 | 15.989
SBMR-RF-W2-noTrend-D0-UG | 66.429 | 34.471 | 32.837 | 86.851 | 45.062 | 45.409 | 9.464 | 18.323
SBMR-RF-W3-noTrend-D0-UG | 54.580 | 64.919 | 57.721 | 190.971 | 95.157 | 95.112 | 9.929 | 37.914
SBMR-RF-W4-trend-D0-UG | 115.171 | 272.809 | 248.441 | 712.521 | 338.855 | 337.439 | 19.800 | 74.526
SBMR-RF-W4-noTrend-D0-UG | 55.183 | 135.333 | 123.038 | 369.460 | 164.462 | 163.198 | 10.083 | 36.445
SBMR-RF-W5-noTrend-D0-UG | 34.064 | 91.950 | 82.449 | 239.261 | 83.306 | 85.705 | 5.608 | 18.888
SBMR-RF-W6-trend-D0-UG | 25.724 | 66.809 | 50.068 | 103.188 | 78.103 | 77.710 | 5.087 | 15.794
SBMR-RF-W7-noTrend-D0-UG | 27.869 | 76.878 | 68.603 | 116.077 | 86.834 | 87.152 | 5.637 | 17.672
SBMR-RF-W8-noTrend-D0-BG | 58.735 | 129.312 | 117.890 | 196.222 | 148.158 | 146.636 | 9.260 | 31.182
SBMR-RF-W8-trend-D0-BG | 130.044 | 264.569 | 241.091 | 404.723 | 305.697 | 300.872 | 20.342 | 67.718
SBMR-RF-W8-trend-D0-UG | 70.762 | 251.439 | 226.320 | 408.547 | 302.159 | 297.023 | 17.237 | 58.612
SBMR-RF-W8-D0-UG | 34.509 | 117.547 | 106.560 | 183.777 | 137.370 | 136.876 | 8.005 | 26.503
SBMR-RF-W9-trend-D0-UG | 55.596 | 197.256 | 200.075 | 294.347 | 220.346 | 219.082 | 11.669 | 40.845
Table A5. Accuracy results with different windows and dilation configurations.
Algorithm | Crop | FordA | FordB | HandOutlines | NonInvasiveFetalECGThorax1 | NonInvasiveFetalECGThorax2 | PhalangesOutlinesCorrect | TwoPatterns
SBMR-RF-W10-trend-D6-BG | 0.722 | 0.930 | 0.793 | 0.914 | 0.899 | 0.924 | 0.814 | 0.924
SBMR-RF-W11-trend-D3-BG | 0.724 | 0.926 | 0.774 | 0.916 | 0.888 | 0.915 | 0.810 | 0.916
SBMR-RF-W11-trend-D6-BG | 0.723 | 0.926 | 0.775 | 0.924 | 0.894 | 0.917 | 0.803 | 0.891
SBMR-RF-W14-trend-D3-BG | 0.719 | 0.932 | 0.783 | 0.911 | 0.890 | 0.928 | 0.818 | 0.941
SBMR-RF-W14-trend-D6-BG | 0.725 | 0.920 | 0.780 | 0.911 | 0.900 | 0.924 | 0.816 | 0.929
SBMR-RF-W8-trend-D6-BG | 0.730 | 0.928 | 0.788 | 0.911 | 0.922 | 0.942 | 0.819 | 0.959
Table A6. Total execution time results with different windows and dilation configurations.
Algorithm | Crop | FordA | FordB | HandOutlines | NonInvasiveFetalECGThorax1 | NonInvasiveFetalECGThorax2 | PhalangesOutlinesCorrect | TwoPatterns
SBMR-RF-W10-D6-trend-bigram | 279.795 | 250.885 | 230.631 | 333.031 | 270.118 | 274.105 | 22.243 | 83.030
SBMR-RF-W11-D3-trend-bigram | 402.083 | 206.374 | 187.194 | 255.903 | 219.389 | 212.974 | 20.059 | 104.887
SBMR-RF-W11-D6-trend-bigram | 196.900 | 128.388 | 117.495 | 158.459 | 132.424 | 129.837 | 12.071 | 55.198
SBMR-RF-W14-D3-trend-bigram | 328.484 | 257.580 | 235.637 | 343.221 | 281.788 | 279.821 | 24.589 | 95.934
SBMR-RF-W14-D6-trend-bigram | 179.336 | 176.598 | 156.272 | 233.769 | 186.544 | 186.198 | 16.465 | 58.888
SBMR-RF-W8-D6-trend-bigram | 548.205 | 564.149 | 514.789 | 815.115 | 619.852 | 621.876 | 50.155 | 178.285
Table A7. Accuracy using ridge regression classifier without multiple window configurations.
Algorithm | Crop | FordA | FordB | HandOutlines | NonInvasiveFetalECGThorax1 | NonInvasiveFetalECGThorax2 | PhalangesOutlinesCorrect | TwoPatterns
SBMR-RidgeCV-W0-noTrend-D0-UG | 0.579 | 0.885 | 0.665 | 0.668 | 0.743 | 0.796 | 0.733 | 0.877
SBMR-RidgeCV-W0-trend-D0-UG | 0.622 | 0.914 | 0.743 | 0.649 | 0.830 | 0.843 | 0.752 | 0.891
SBMR-RidgeCV-W0-trend-D1-BG | 0.717 | 0.936 | 0.721 | 0.886 | 0.921 | 0.940 | 0.787 | 0.960
SBMR-RidgeCV-W0-trend-D2-BG | 0.711 | 0.923 | 0.715 | 0.886 | 0.917 | 0.939 | 0.791 | 0.958
SBMR-RidgeCV-W0-trend-D3-BG | 0.701 | 0.923 | 0.749 | 0.857 | 0.910 | 0.933 | 0.781 | 0.960
SBMR-RidgeCV-W0-trend-D4-BG | 0.696 | 0.915 | 0.728 | 0.838 | 0.896 | 0.912 | 0.779 | 0.953
SBMR-RidgeCV-W0-trend-D5-BG | 0.701 | 0.934 | 0.748 | 0.770 | 0.909 | 0.929 | 0.779 | 0.959
SBMR-RidgeCV-W0-trend-D6-BG | 0.699 | 0.924 | 0.741 | 0.832 | 0.907 | 0.926 | 0.788 | 0.936
SBMR-RidgeCV-W0-trend-D6-UG | 0.648 | 0.921 | 0.737 | 0.746 | 0.892 | 0.913 | 0.755 | 0.929
Table A8. Total execution time using ridge regression classifier without multiple window configurations.
Algorithm | Crop | FordA | FordB | HandOutlines | NonInvasiveFetalECGThorax1 | NonInvasiveFetalECGThorax2 | PhalangesOutlinesCorrect | TwoPatterns
SBMR-RidgeCV-W0-noTrend-D0-UG | 8.016 | 13.956 | 12.343 | 19.992 | 15.528 | 15.329 | 1.144 | 3.432
SBMR-RidgeCV-W0-trend-D0-UG | 15.590 | 27.439 | 24.766 | 37.587 | 28.384 | 29.850 | 2.043 | 6.809
SBMR-RidgeCV-W0-trend-D1-BG | 151.967 | 158.446 | 145.818 | 209.656 | 182.595 | 186.436 | 13.105 | 39.796
SBMR-RidgeCV-W0-trend-D2-BG | 121.808 | 117.256 | 105.820 | 159.941 | 128.159 | 128.951 | 9.795 | 29.982
SBMR-RidgeCV-W0-trend-D3-BG | 99.919 | 90.811 | 82.360 | 124.798 | 97.516 | 99.381 | 8.204 | 22.887
SBMR-RidgeCV-W0-trend-D4-BG | 132.663 | 63.136 | 60.237 | 79.282 | 63.763 | 61.836 | 6.260 | 14.825
SBMR-RidgeCV-W0-trend-D5-BG | 121.146 | 67.777 | 61.386 | 81.091 | 63.431 | 62.849 | 5.928 | 15.386
SBMR-RidgeCV-W0-trend-D6-BG | 135.884 | 60.460 | 54.292 | 80.885 | 62.658 | 62.774 | 5.674 | 14.681
SBMR-RidgeCV-W0-trend-D6-UG | 23.037 | 47.292 | 42.238 | 72.090 | 54.201 | 54.692 | 3.789 | 12.554
Table A9. Accuracy using ridge regression classifier with different windows’ configurations.
Algorithm | Crop | FordA | FordB | HandOutlines | NonInvasiveFetalECGThorax1 | NonInvasiveFetalECGThorax2 | PhalangesOutlinesCorrect | TwoPatterns
SBMR-RidgeCV-W11-trend-D6-BG | 0.750 | 0.944 | 0.821 | 0.941 | 0.922 | 0.944 | 0.790 | 0.941
SBMR-RidgeCV-W11-trend-D0-BG | 0.729 | 0.938 | 0.817 | 0.743 | 0.885 | 0.903 | 0.812 | 0.924
SBMR-RidgeCV-W11-trend-D0-UG | 0.704 | 0.932 | 0.810 | 0.714 | 0.868 | 0.884 | 0.779 | 0.930
SBMR-RidgeCV-W14-trend-D6-BG | 0.729 | 0.942 | 0.820 | 0.927 | 0.920 | 0.940 | 0.819 | 0.956
SBMR-RidgeCV-W15-trend-D6-BG | 0.741 | 0.938 | 0.794 | 0.935 | 0.912 | 0.929 | 0.817 | 0.906
SBMR-RidgeCV-W15-trend-D0-BG | 0.721 | 0.931 | 0.807 | 0.754 | 0.869 | 0.897 | 0.775 | 0.895
SBMR-RidgeCV-W8-trend-D6-BG | 0.747 | 0.939 | 0.810 | 0.938 | 0.936 | 0.955 | 0.830 | 0.984
SBMR-RidgeCV-W0-trend-D6-BG | 0.699 | 0.924 | 0.741 | 0.832 | 0.907 | 0.926 | 0.788 | 0.936
Table A10. Total execution time using ridge regression classifier with different windows’ configurations.
Algorithm | Crop | FordA | FordB | HandOutlines | NonInvasiveFetalECGThorax1 | NonInvasiveFetalECGThorax2 | PhalangesOutlinesCorrect | TwoPatterns
SBMR-RidgeCV-W11-trend-D6-BG | 259.304 | 139.684 | 128.264 | 164.312 | 137.236 | 135.372 | 12.777 | 55.314
SBMR-RidgeCV-W11-trend-D0-BG | 134.422 | 68.572 | 62.969 | 82.938 | 64.841 | 64.785 | 6.752 | 22.797
SBMR-RidgeCV-W11-trend-D0-UG | 35.636 | 52.451 | 48.510 | 76.236 | 59.065 | 58.964 | 5.114 | 13.579
SBMR-RidgeCV-W14-trend-D6-BG | 247.075 | 179.955 | 164.707 | 237.269 | 190.337 | 189.244 | 16.639 | 58.201
SBMR-RidgeCV-W15-trend-D6-BG | 200.779 | 80.733 | 75.727 | 80.190 | 72.676 | 72.613 | 7.627 | 41.114
SBMR-RidgeCV-W15-trend-D0-BG | 112.878 | 41.061 | 38.141 | 41.757 | 34.921 | 35.401 | 4.238 | 16.034
SBMR-RidgeCV-W8-trend-D6-BG | 577.872 | 604.649 | 545.924 | 845.690 | 664.549 | 665.130 | 52.951 | 183.486
SBMR-RidgeCV-W0-trend-D6-BG | 158.734 | 71.131 | 64.672 | 91.027 | 76.762 | 75.718 | 6.973 | 16.592
Table A11. Accuracy of state-of-the-art algorithms.
Algorithm | Crop | FordA | FordB | HandOutlines | NonInvasiveFetalECGThorax1 | NonInvasiveFetalECGThorax2 | PhalangesOutlinesCorrect | TwoPatterns
MRSQM_nsax0_nsfa1 | 0.652 | 0.944 | 0.815 | 0.927 | 0.903 | 0.929 | 0.733 | 0.989
MRSQM_nsax0_nsfa5 | 0.703 | 0.948 | 0.816 | 0.943 | 0.935 | 0.950 | 0.816 | 0.999
MRSQM_nsax1_nsfa0 | 0.623 | 0.874 | 0.736 | 0.911 | 0.901 | 0.919 | 0.759 | 0.987
MRSQM_nsax1_nsfa1 | 0.679 | 0.942 | 0.812 | 0.954 | 0.916 | 0.933 | 0.786 | 0.998
MiniRocket | 0.748 | 0.948 | 0.819 | 0.943 | 0.949 | 0.965 | 0.834 | 0.998
Rocket | 0.753 | 0.943 | 0.805 | 0.949 | 0.951 | 0.968 | 0.838 | 1.000
WEASEL_V2 | 0.760 | 0.961 | 0.840 | 0.959 | 0.927 | 0.955 | 0.832 | 0.997
Table A12. Total execution time of state-of-the-art algorithms.
Algorithm | Crop | FordA | FordB | HandOutlines | NonInvasiveFetalECGThorax1 | NonInvasiveFetalECGThorax2 | PhalangesOutlinesCorrect | TwoPatterns
MRSQM_nsax0_nsfa1 | 264.978 | 104.805 | 90.142 | 138.350 | 153.971 | 153.768 | 9.630 | 22.638
MRSQM_nsax0_nsfa5 | 689.978 | 380.430 | 360.333 | 669.216 | 584.349 | 562.895 | 38.789 | 65.097
MRSQM_nsax1_nsfa0 | 211.168 | 104.785 | 100.683 | 137.459 | 121.358 | 120.781 | 11.651 | 25.750
MRSQM_nsax1_nsfa1 | 306.893 | 158.834 | 149.386 | 277.161 | 235.267 | 234.790 | 15.556 | 32.008
MiniRocket | 70.201 | 18.755 | 18.277 | 11.860 | 11.730 | 12.059 | 3.624 | 3.249
Rocket | 221.754 | 368.653 | 347.407 | 555.628 | 444.713 | 411.419 | 35.938 | 89.637
WEASEL_V2 | 251.385 | 420.989 | 389.773 | 651.314 | 445.368 | 433.097 | 35.586 | 90.074

References

1. Chaovalitwongse, W.A.; Prokopyev, O.A.; Pardalos, P.M. Electroencephalogram (EEG) time series classification: Applications in epilepsy. Ann. Oper. Res. 2006, 148, 227–250.
2. Arul, M.; Kareem, A. Applications of shapelet transform to time series classification of earthquake, wind and wave data. Eng. Struct. 2021, 228, 111564.
3. Potamitis, I. Classifying insects on the fly. Ecol. Inform. 2014, 21, 40–49.
4. Susto, G.A.; Cenedese, A.; Terzi, M. Chapter 9—Time-Series Classification Methods: Review and Applications to Power Systems Data. In Big Data Application in Power Systems; Arghandeh, R., Zhou, Y., Eds.; Elsevier: Amsterdam, The Netherlands, 2018; pp. 179–220.
5. Schäfer, P. Scalable time series classification. Data Min. Knowl. Discov. 2016, 30, 1273–1298.
6. Shifaz, A.; Pelletier, C.; Petitjean, F.; Webb, G.I. TS-CHIEF: A scalable and accurate forest algorithm for time series classification. Data Min. Knowl. Discov. 2020, 34, 742–775.
7. Dempster, A.; Petitjean, F.; Webb, G.I. ROCKET: Exceptionally fast and accurate time series classification using random convolutional kernels. Data Min. Knowl. Discov. 2020, 34, 1454–1495.
8. Dempster, A.; Schmidt, D.F.; Webb, G.I. MiniRocket: A very fast (almost) deterministic transform for time series classification. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, Singapore, 14–18 August 2021; pp. 248–257.
9. Glenis, A.; Vouros, G.A. SCALE-BOSS: A framework for scalable time-series classification using symbolic representations. In Proceedings of the 12th Hellenic Conference on Artificial Intelligence, Corfu, Greece, 7–9 September 2022; pp. 1–9.
10. Tan, C.W.; Dempster, A.; Bergmeir, C.; Webb, G.I. MultiRocket: Multiple pooling operators and transformations for fast and effective time series classification. Data Min. Knowl. Discov. 2022, 36, 1623–1646.
11. Schäfer, P.; Leser, U. WEASEL 2.0—A Random Dilated Dictionary Transform for Fast, Accurate and Memory Constrained Time Series Classification. arXiv 2023, arXiv:2301.10194.
12. Senin, P.; Malinchik, S. SAX-VSM: Interpretable time series classification using SAX and vector space model. In Proceedings of the 2013 IEEE 13th International Conference on Data Mining, Dallas, TX, USA, 7–10 December 2013; pp. 1175–1180.
13. Schäfer, P. The BOSS is concerned with time series classification in the presence of noise. Data Min. Knowl. Discov. 2015, 29, 1505–1530.
14. Schäfer, P.; Högqvist, M. SFA: A symbolic Fourier approximation and index for similarity search in high dimensional datasets. In Proceedings of the 15th International Conference on Extending Database Technology, Berlin, Germany, 27–30 March 2012; pp. 516–527.
15. Glenis, A.; Vouros, G.A. Balancing between scalability and accuracy in time-series classification for stream and batch settings. In Proceedings of the International Conference on Discovery Science, Thessaloniki, Greece, 19–21 October 2020; pp. 265–279.
16. Middlehurst, M.; Vickers, W.; Bagnall, A. Scalable dictionary classifiers for time series classification. In Proceedings of the International Conference on Intelligent Data Engineering and Automated Learning, Manchester, UK, 14–16 November 2019; pp. 11–19.
17. Large, J.; Bagnall, A.; Malinowski, S.; Tavenard, R. On time series classification with dictionary-based classifiers. Intell. Data Anal. 2019, 23, 1073–1089.
18. Schäfer, P.; Leser, U. Fast and accurate time series classification with WEASEL. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, Singapore, 6–10 November 2017; pp. 637–646.
19. Nguyen, T.L.; Ifrim, G. MrSQM: Fast time series classification with symbolic representations. arXiv 2021, arXiv:2109.01036.
20. Nguyen, T.L.; Ifrim, G. Fast time series classification with random symbolic subsequences. In Proceedings of the International Workshop on Advanced Analytics and Learning on Temporal Data, Grenoble, France, 19–23 September 2022; pp. 50–65.
21. Middlehurst, M.; Large, J.; Cawley, G.; Bagnall, A. The temporal dictionary ensemble (TDE) classifier for time series classification. In Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Ghent, Belgium, 14–18 September 2020; pp. 660–676.
22. Bostrom, A.; Bagnall, A. Binary shapelet transform for multiclass time series classification. In Transactions on Large-Scale Data- and Knowledge-Centered Systems XXXII; Springer: Berlin/Heidelberg, Germany, 2017; pp. 24–46.
23. Lucas, B.; Shifaz, A.; Pelletier, C.; O’Neill, L.; Zaidi, N.; Goethals, B.; Petitjean, F.; Webb, G.I. Proximity forest: An effective and scalable distance-based classifier for time series. Data Min. Knowl. Discov. 2019, 33, 607–635.
24. Mahato, V.; O’Reilly, M.; Cunningham, P. A comparison of k-NN methods for time series classification and regression. In Proceedings of the AICS, Dublin, Ireland, 6–7 December 2018; pp. 102–113.
25. Ester, M.; Kriegel, H.P.; Sander, J.; Xu, X. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the KDD, Portland, OR, USA, 2–4 August 1996; Volume 96, pp. 226–231.
26. Schubert, E.; Sander, J.; Ester, M.; Kriegel, H.P.; Xu, X. DBSCAN revisited, revisited: Why and how you should (still) use DBSCAN. ACM Trans. Database Syst. 2017, 42, 1–21.
27. Zhang, T.; Ramakrishnan, R.; Livny, M. BIRCH: An efficient data clustering method for very large databases. ACM SIGMOD Rec. 1996, 25, 103–114.
28. Sculley, D. Web-scale k-means clustering. In Proceedings of the 19th International Conference on World Wide Web, Raleigh, NC, USA, 26–30 April 2010; pp. 1177–1178.
29. Park, H.S.; Jun, C.H. A simple and fast algorithm for K-medoids clustering. Expert Syst. Appl. 2009, 36, 3336–3341.
30. Faouzi, J.; Janati, H. pyts: A Python package for time series classification. J. Mach. Learn. Res. 2020, 21, 1–6.
31. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
32. Zaharia, M.; Xin, R.S.; Wendell, P.; Das, T.; Armbrust, M.; Dave, A.; Meng, X.; Rosen, J.; Venkataraman, S.; Franklin, M.J.; et al. Apache Spark: A unified engine for big data processing. Commun. ACM 2016, 59, 56–65.
Figure 1. SCALE-BOSS workflow.
Figure 2. SCALE-BOSS-MR workflow.
Figure 3. Total execution time (in seconds) and accuracy for single window representation.
Figure 4. Average total execution time (in seconds) and accuracy for multiple window configurations.
Figure 5. Average total execution time (in seconds) and accuracy using ridge regression.
Figure 6. Average total execution time (in seconds) and accuracy for the state-of-the-art representations.
Table 1. Characteristics of the datasets.

Name | Train_Size | Test_Size | n_Classes | n_Timestamps
Crop | 7200 | 16,800 | 24 | 46
FordB | 3636 | 810 | 2 | 500
FordA | 3601 | 1320 | 2 | 500
NonInvasiveFetalECGThorax2 | 1800 | 1965 | 42 | 750
NonInvasiveFetalECGThorax1 | 1800 | 1965 | 42 | 750
PhalangesOutlinesCorrect | 1800 | 858 | 2 | 80
HandOutlines | 1000 | 370 | 2 | 2709
TwoPatterns | 1000 | 4000 | 4 | 128
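To make Table 1 easy to reproduce, the following Python sketch loads one of the UCR datasets and prints the corresponding characteristics; it assumes the fetch_ucr_dataset loader of the pyts library [30] rather than the exact data-preparation code used in our experiments.

import numpy as np
from pyts.datasets import fetch_ucr_dataset

# Load the train/test split of a UCR archive dataset (downloaded on first use).
X_train, X_test, y_train, y_test = fetch_ucr_dataset("Crop", return_X_y=True)

print("Train_Size:", X_train.shape[0])          # 7200 for Crop (Table 1)
print("Test_Size:", X_test.shape[0])            # 16,800 for Crop
print("n_Classes:", len(np.unique(y_train)))    # 24 for Crop
print("n_Timestamps:", X_train.shape[1])        # 46 for Crop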
Table 2. Window configurations.

Name | Window_Configs (Window Sizes) | Window Step
W0 | 24 | 1
W1 | 0.1, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 | 0.0125
W2 | 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9 | 0.0125
W3 | 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9 | 0.00625
W4 | 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9 | 0.003125
W5 | 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 | 0.003125
W6 | 4, 12, 28 | 1
W7 | 4, 8, 12, 16, 20, 24, 28 | 1
W8 | 4, 8, 12, 16, 20, 24, 28, 32, 36, 40 | 1
W9 | 12, 16, 20, 24, 28, 32, 36, 40 | 1
W10 | 12, 16, 20, 24, 28, 32, 36, 40 | 2
W11 | 12, 16, 20, 24, 28, 32, 36, 40 | 4
W12 | 4, 8, 12, 16, 20, 24, 28, 32, 36, 40 | 4
W13 | 4, 8, 12, 16, 20, 24, 28, 32, 36, 40 | 2
W14 | 12, 16, 20, 24, 28, 32 | 2
W15 | 12, 16, 20, 24, 28, 32, 36, 40, 32 | 8
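Configurations W1–W5 specify window sizes as fractions, whereas W0 and W6–W15 use absolute lengths in samples. The helper below sketches how such a configuration can be resolved to concrete window lengths for a series of a given length; treating fractional entries as fractions of the series length is our assumption, and resolve_windows is an illustrative helper rather than part of the released code.

def resolve_windows(sizes, n_timestamps):
    """Resolve a window-size configuration (Table 2) to absolute lengths.

    Entries smaller than 1 are read as fractions of the series length
    (assumption); larger entries are taken as absolute lengths in samples.
    """
    resolved = []
    for s in sizes:
        w = int(round(s * n_timestamps)) if s < 1 else int(s)
        resolved.append(max(w, 2))        # guard against degenerate windows
    return sorted(set(resolved))          # drop duplicates produced by rounding

# Example: the W5 configuration applied to TwoPatterns (128 timestamps, Table 1).
w5 = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
print(resolve_windows(w5, 128))           # [13, 26, 38, 51, 64, 77, 90, 102, 115]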
Table 3. Dilation configurations.

Name | Dilation_Filter_Configs
D0 | 1 (no dilation)
D1 | 1, 5, 7, 9, 11
D2 | 1, 7, 9, 11
D3 | 1, 7, 11
D4 | 1, 11
D5 | 1, 7
D6 | 1, 9
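We read a dilation value d as sampling every d-th point inside a sliding window before discretization, in the spirit of the dilated windows of WEASEL 2.0 [11]. The snippet below is a minimal sketch of subsequence extraction under that reading, not the exact transform used by SCALE-BOSS-MR.

import numpy as np

def dilated_subsequences(x, window_size, dilation):
    """All subsequences of `window_size` points taken every `dilation` samples."""
    span = (window_size - 1) * dilation + 1   # samples covered by one dilated window
    if span > len(x):
        return np.empty((0, window_size))
    return np.array([x[i:i + span:dilation] for i in range(len(x) - span + 1)])

x = np.arange(46, dtype=float)                            # a series of Crop length (Table 1)
print(dilated_subsequences(x, window_size=4, dilation=9).shape)   # (19, 4); D6 uses dilations 1 and 9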
Table 4. Accuracy and total execution time of the classifiers using the W0 window configuration.

Algorithm | Accuracy_Mean | Accuracy_std | Total_Time_Mean (s) | Total_Time_std (s)
SBMR-RF-W0-trend-D1-BG | 0.857 | 0.089 | 111.7 | 57.8
SBMR-RF-W0-trend-D3-BG | 0.857 | 0.085 | 70.0 | 36.4
SBMR-RF-W0-trend-D2-BG | 0.856 | 0.085 | 92.1 | 47.0
SBMR-RF-W0-trend-D6-BG | 0.849 | 0.082 | 52.4 | 26.7
SBMR-RF-W0-trend-D1-UG | 0.848 | 0.083 | 101.6 | 57.3
SBMR-RF-W0-trend-D4-BG | 0.847 | 0.078 | 46.0 | 23.6
SBMR-RF-W0-trend-D5-BG | 0.843 | 0.085 | 51.2 | 26.5
SBMR-RF-W0-trend-D6-UG | 0.839 | 0.082 | 38.8 | 21.6
SBMR-RF-W0-trend-D0-UG | 0.827 | 0.080 | 19.1 | 11.6
SBMR-MLP-W0-trend-D0-UG | 0.816 | 0.090 | 21.9 | 10.6
BOSS-RF | 0.796 | 0.088 | 9.7 | 5.5
BOSS-MLP | 0.778 | 0.102 | 12.5 | 5.4
SBMR-MB-K-BOSS-VS-trend-D0-UG | 0.766 | 0.098 | 19.0 | 11.7
MB-K-BOSS-VS | 0.727 | 0.101 | 11.5 | 6.9
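For orientation, the sketch below assembles a single-window pipeline in the spirit of the SBMR-RF variants of Table 4: a BOSS transform from pyts [30] on the raw series and on its first-order differences (the trend channel), with the concatenated bags fed to a scikit-learn [31] random forest. It is a simplified illustration under these assumptions, not our exact implementation, and the parameter values are placeholders.

import numpy as np
from scipy.sparse import hstack
from pyts.transformation import BOSS
from sklearn.ensemble import RandomForestClassifier

def boss_bags(X_train, X_test, window_size=24, word_size=4, n_bins=4):
    """Bag-of-SFA-words features for a single window size (illustrative values)."""
    boss = BOSS(word_size=word_size, n_bins=n_bins,
                window_size=window_size, window_step=1)
    return boss.fit_transform(X_train), boss.transform(X_test)

def sbmr_rf_like_predict(X_train, y_train, X_test):
    # Bags for the raw series and for the first-order differences (trend channel).
    raw_tr, raw_te = boss_bags(X_train, X_test)
    diff_tr, diff_te = boss_bags(np.diff(X_train, axis=1), np.diff(X_test, axis=1))
    clf = RandomForestClassifier(n_estimators=100, n_jobs=-1)
    clf.fit(hstack([raw_tr, diff_tr]), y_train)
    return clf.predict(hstack([raw_te, diff_te]))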
Table 5. Accuracy and execution time of configurations with multiple window sizes.

Algorithm | Accuracy_Mean | Accuracy_std | Total_Time_Mean (s) | Total_Time_std (s)
SBMR-RF-W8-trend-D0-UG | 0.869 | 0.079 | 204.0 | 130.6
SBMR-RF-W8-trend-D0-BG | 0.863 | 0.081 | 216.8 | 123.2
SBMR-RF-W9-trend-D0-UG | 0.862 | 0.080 | 154.9 | 96.8
SBMR-RF-W10-trend-D0-UG | 0.862 | 0.078 | 72.7 | 45.5
SBMR-RF-W8-noTrend-D0-BG | 0.861 | 0.080 | 104.6 | 60.6
SBMR-RF-W13-trend-D0-UG | 0.861 | 0.074 | 91.8 | 55.9
SBMR-RF-W8-noTrend-D0-UG | 0.859 | 0.078 | 93.8 | 59.1
SBMR-RF-W14-trend-D0-UG | 0.850 | 0.073 | 54.8 | 33.4
SBMR-RF-W11-trend-D0-UG | 0.846 | 0.074 | 36.5 | 21.3
SBMR-RF-W12-trend-D0-UG | 0.845 | 0.070 | 44.9 | 25.8
SBMR-RF-W7-noTrend-D0-UG | 0.843 | 0.072 | 60.8 | 36.6
SBMR-RF-W6-trend-D0-UG | 0.842 | 0.081 | 52.8 | 32.3
SBMR-RF-W4-trend-D0-UG | 0.841 | 0.097 | 264.9 | 203.1
SBMR-RF-W1-trend-D0-BG | 0.832 | 0.103 | 58.3 | 36.5
SBMR-RF-W1-trend-D0-UG | 0.830 | 0.101 | 49.6 | 31.9
SBMR-RF-W4-noTrend-D0-UG | 0.830 | 0.100 | 132.1 | 105.1
SBMR-RF-W5-noTrend-D0-UG | 0.825 | 0.098 | 80.1 | 67.9
SBMR-RF-W3-noTrend-D0-UG | 0.824 | 0.106 | 75.7 | 50.8
SBMR-RF-W2-noTrend-D0-UG | 0.814 | 0.110 | 42.3 | 23.4
SBMR-RF-W1-noTrend-D0-BG | 0.814 | 0.106 | 29.8 | 17.8
SBMR-RF-W1-noTrend-D0-UG | 0.813 | 0.105 | 25.8 | 17.2
SBMR-MB-K-BOSS-VS-W1-trend-D0-BG | 0.794 | 0.098 | 67.6 | 56.8
SBMR-MB-K-BOSS-VS-W1-trend-D0-UG | 0.788 | 0.112 | 43.5 | 27.9
SBMR-MB-K-BOSS-VS-W1-noTrend-D0-BG | 0.767 | 0.106 | 32.5 | 24.9
SBMR-MB-K-BOSS-VS-W1-noTrend-D0-UG | 0.760 | 0.113 | 23.1 | 15.4
Table 6. Accuracy and execution time for combinations of window and dilation configurations.

Algorithm | Accuracy_Mean | Accuracy_std | Total_Time_Mean (s) | Total_Time_std (s)
SBMR-RF-W8-trend-D6-BG | 0.875 | 0.079 | 489.0 | 234.5
SBMR-RF-W14-trend-D3-BG | 0.865 | 0.077 | 230.8 | 105.2
SBMR-RF-W10-trend-D6-BG | 0.865 | 0.073 | 217.9 | 100.4
SBMR-RF-W14-trend-D6-BG | 0.863 | 0.074 | 149.2 | 68.3
SBMR-RF-W11-trend-D3-BG | 0.859 | 0.073 | 201.1 | 103.7
SBMR-RF-W11-trend-D6-BG | 0.857 | 0.073 | 116.3 | 54.1
Table 7. Accuracy and execution time of the RidgeCV classifier without multiple windows.

Algorithm | Accuracy_Mean | Accuracy_std | Total_Time_Mean (s) | Total_Time_std (s)
SBMR-RidgeCV-W0-trend-D1-BG | 0.858 | 0.095 | 135.9 | 66.4
SBMR-RidgeCV-W0-trend-D2-BG | 0.855 | 0.094 | 100.2 | 48.8
SBMR-RidgeCV-W0-trend-D3-BG | 0.852 | 0.090 | 78.2 | 38.0
SBMR-RidgeCV-W0-trend-D6-BG | 0.844 | 0.087 | 59.6 | 37.4
SBMR-RidgeCV-W0-trend-D5-BG | 0.841 | 0.095 | 59.8 | 33.8
SBMR-RidgeCV-W0-trend-D4-BG | 0.840 | 0.089 | 60.2 | 36.4
SBMR-RidgeCV-W0-trend-D6-UG | 0.818 | 0.101 | 38.7 | 21.9
SBMR-RidgeCV-W0-trend-D0-UG | 0.780 | 0.101 | 21.5 | 11.4
SBMR-RidgeCV-W0-noTrend-D0-UG | 0.743 | 0.100 | 11.2 | 6.0
Table 8. Accuracy and execution time of the RidgeCV classifier with multiple windows and dilation configurations.

Algorithm | Accuracy_Mean | Accuracy_std | Total_Time_Mean (s) | Total_Time_std (s)
SBMR-RidgeCV-W8-trend-D6-BG | 0.892 | 0.079 | 517.5 | 247.6
SBMR-RidgeCV-W14-trend-D6-BG | 0.882 | 0.077 | 160.4 | 76.3
SBMR-RidgeCV-W11-trend-D6-BG | 0.881 | 0.076 | 129.0 | 68.3
SBMR-RidgeCV-W15-trend-D6-BG | 0.871 | 0.071 | 78.9 | 51.7
SBMR-RidgeCV-W0-trend-D6-BG | 0.844 | 0.087 | 70.2 | 43.6
SBMR-RidgeCV-W11-trend-D0-BG | 0.844 | 0.075 | 63.5 | 35.9
SBMR-RidgeCV-W15-trend-D0-BG | 0.831 | 0.072 | 40.5 | 30.0
SBMR-RidgeCV-W11-trend-D0-UG | 0.827 | 0.085 | 43.6 | 22.6
Table 9. Accuracy and execution time of state-of-the-art algorithms.

Algorithm | Accuracy_Mean | Accuracy_Pop_std | Total_Time_Mean (s) | Total_Time_Pop_std (s)
WEASEL_V2 | 0.904 | 0.078 | 339.6 | 189.9
Rocket | 0.901 | 0.084 | 309.3 | 167.7
MiniRocket | 0.900 | 0.082 | 18.7 | 20.1
MRSQM_nsax0_nsfa5 | 0.889 | 0.093 | 418.8 | 239.4
MRSQM_nsax1_nsfa1 | 0.878 | 0.101 | 176.2 | 101.1
MRSQM_nsax0_nsfa1 | 0.861 | 0.109 | 117.2 | 76.1
MRSQM_nsax1_nsfa0 | 0.839 | 0.113 | 104.2 | 59.0
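The aggregate values in Table 9 follow directly from the per-dataset results of Tables A11 and A12: each entry is the mean and the population standard deviation (ddof = 0) over the eight datasets. For example, the WEASEL_V2 accuracy row can be reproduced with numpy as sketched below.

import numpy as np

# Per-dataset accuracies of WEASEL_V2 (Table A11).
acc = np.array([0.760, 0.961, 0.840, 0.959, 0.927, 0.955, 0.832, 0.997])

print(round(acc.mean(), 3))        # 0.904 -> Accuracy_Mean in Table 9
print(round(acc.std(ddof=0), 3))   # 0.078 -> Accuracy_Pop_std in Table 9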