Article

A Co-Optimization Algorithm Utilizing Particle Swarm Optimization for Linguistic Time Series

1 Faculty of Natural Sciences and Technology, Tay Bac University, Sonla 360000, Vietnam
2 Faculty of Information Technology, East Asia University of Technology, Bacninh 220000, Vietnam
3 Graduate University of Sciences and Technology, Vietnam Academy of Science and Technology, Hanoi 100000, Vietnam
4 Faculty of Information Technology, University of Transport and Communications, Hanoi 100000, Vietnam
* Author to whom correspondence should be addressed.
Mathematics 2023, 11(7), 1597; https://doi.org/10.3390/math11071597
Submission received: 24 February 2023 / Revised: 21 March 2023 / Accepted: 23 March 2023 / Published: 25 March 2023
(This article belongs to the Special Issue Nonparametric Regression Models: Theory and Applications)

Abstract

The linguistic time-series forecasting model (LTS-FM), which has been recently proposed, uses linguistic words of linguistic variable domains generated by hedge algebras (HAs) to describe historical numeric time-series data. Then, the LTS-FM was established by utilizing real numeric semantics of words induced by the fuzziness parameter values (FPVs) of HAs. In the existing LTS-FMs, just the FPVs of HAs are optimized, while the used word set is still chosen by human experts. This paper proposes a co-optimization method of selecting the optimal used word set that best describes numeric time-series data in parallel with choosing the best FPVs of HAs to improve the accuracy of LTS-FMs by utilizing particle swarm optimization (PSO). In this co-optimization method, the outer loop optimizes the FPVs of HAs, while the inner loop optimizes the used word set. The experimental results on three datasets, i.e., the “enrollments of the University of Alabama” (EUA), the “killed in car road accidents in Belgium” (CAB), and the “spot gold in Turkey” (SGT), showed that our proposed forecasting model outperformed the existing forecasting models in terms of forecast accuracy.

1. Introduction

The human ability to predict future events and phenomena has attracted the interest of the scientific community for many years, with many forecasting methodologies proposed based on observed historical data. In particular, the forecasting method based on time-series analysis has been investigated by numerous researchers using various models, such as ARMA, ARIMA, and so on. In time-series forecasting models, future values can be forecasted based on only past data. The characteristics of time-series data, such as trends, seasonality, stability, outliers, etc., have been considered to establish forecasting models for future values.
The fuzzy time series proposed by Q. Song and B. S. Chissom [1,2,3] is a forecasting method of time-series analysis that combines the principles of the fuzzy set theory proposed by L. A. Zadeh [4] with traditional time-series techniques. The fuzzy time series is particularly useful in handling the uncertainty, vagueness, and imprecision of data, which is often encountered in real-world applications. The approach allows for modeling the uncertainty and vagueness of the data by representing the relationships between linguistic variables as fuzzy sets rather than crisp sets. In a fuzzy time series, the forecasted values are not simply single-point estimates, but rather a range of values that reflect the uncertainty in the data. This provides a more realistic and robust representation of future trends in the time series, especially when dealing with non-linear and complex data. Fuzzy time series have been applied in a variety of domains, including finance, economics, environmental science, and engineering, where traditional time-series methods may not be adequate. Recently, many proposed models use optimization algorithms, such as genetic algorithm [5,6], particle swarm optimization (PSO) [7,8], etc., to optimize forecasting model parameter values to improve the forecasting accuracy.
People often use natural language as a tool to communicate effectively with each other. They also store and process recorded information through linguistic variables with their values. Linguistic values are also used by humans to forecast events occurring in nature and society so that they can better prepare for future planning. In addition to the fuzzy set theory for handling linguistic variables, hedge algebras (HAs), proposed by N. C. Ho and W. Wechler [9,10], exploit the order-based semantics structure of the word domains of linguistic variables, which provides a mathematical formalism for generating the computational semantics of words that can be applied to solve real-life application problems in various domains, such as image processing [11], fuzzy control [12], data mining [13,14,15,16,17], etc. HAs are qualitative models; therefore, they need to be quantified by measurable quantities based on qualitative semantics. The words of linguistic variables convey their qualitative semantics in relation to the other words in the whole variable domain. Because the linguistic words of a variable are interdependent, they only disclose their full qualitative semantics in the comparative context of the whole variable domain. The fuzziness measure of linguistic words [18] is the crucial concept of fuzzy information and plays an important role in quantification by hedge algebras. The quantitative or numerical semantics of a word induced by its fuzziness measure is the semantically quantifying mapping (SQM) value that is the basis for generating computational semantics for solving real-world application problems. Recently, hedge algebras have been applied to solve time-series forecasting problems in such a way that historical numeric time-series data are transformed into linguistic data by using the real numeric semantics of words defined based on their corresponding SQM values. Then, the LTS-FMs were established [19,20,21,22].
In LTS-FMs, instead of partitioning historical data into intervals, a datum of the historical data is assigned a linguistic word based on the nearest real numeric semantic. By doing so for all historical data, the numeric time series is transformed into an LTS. The linguistic words used to describe historical data are used to define the logical relationship between the data of the current year and those of the next year. Then, the linguistic logical relationships (LLRs) and linguistic logical relationship groups (LLRGs) are established. The fuzzy forecasted values are induced and, finally, based on them, the corresponding crisp forecasted values are computed. It can be seen that the SQM values of words determined by the FPVs of HAs are crucial in establishing LTS-FMs. Obtaining an optimal set of FPVs is not a trivial task, so an optimization algorithm should be applied to obtain it automatically. Among the optimization algorithms, PSO is efficient and is applied to solving many real-world problems [7,8,14,16,21]. It needs few algorithm parameters, so it is easy to implement. In [21], the FPVs of HAs were automatically optimized for LTS-FMs by applying PSO. However, the word set used to describe numeric time-series data was chosen by human experts, so the selected words may not have reflected the nature of the historical data. To overcome this drawback, this paper presents a method of selecting an optimal word set that best describes numeric time-series data in parallel with optimizing the FPVs of HAs by applying a co-optimization algorithm of particle swarm optimization (Co-PSO). In Co-PSO, the outer loop optimizes the FPVs of HAs, while the inner loop optimizes the used word set. The experimental results for three datasets, i.e., the “enrollments of the University of Alabama” (EUA), the “killed in car road accidents in Belgium” (CAB), and the “spot gold in Turkey” (SGT), showed that our proposed forecasting model had better forecasting accuracy than the existing models.
The remainder of this paper is organized as follows. Section 2 briefly restates the theory of hedge algebras and some concepts related to LTS-FM and PSO. The proposed forecasting model, called COLTS, is introduced in Section 3. In Section 4, some experiments with three datasets and discussions about the forecasted results are addressed. Finally, the summary of this work and some suggestions for future work are provided in Section 5.

2. Background

2.1. A Brief Introduction to Hedge Algebras

Each linguistic domain of a variable 𝒳, denoted by Dom(𝒳), consists of a word set that can be generated from two generator words, e.g., “cold” and “hot”, by the action of linguistic hedges on them. For example, with the two hedges “very” and “rather”, the words generated by the action of those two hedges on the two generator words “cold” and “hot” can be “very cold”, “rather cold”, “very hot”, “very very hot”, and so on. It can be observed that they are linearly ordered and comparable, e.g., “very cold” ≤ “cold” ≤ “rather cold” ≤ “hot” ≤ “very hot” ≤ “very very hot”.
From the above intuitive observation, Ho et al. introduced hedge algebras (HAs) in 1990 [9,10], a mathematical structure that can directly manipulate the word domain of 𝒳. An HA of 𝒳, denoted by AX, is an order-based structure AX = (X, G, C, H, ≤), where X ⊆ Dom(𝒳) is a word set of 𝒳; G = {c−, c+} is a set of two generators, where c− ≤ c+; C = {0, W, 1} is a set of three constants satisfying the order relationship 0 ≤ c− ≤ W ≤ c+ ≤ 1, where 0, 1, and W are the lowest, highest, and neutral constants, respectively; H = H− ∪ H+ is a set of hedges, where H− and H+ are the negative and positive hedges, respectively; and ≤ is the order relation induced by the inherent word semantics of 𝒳.
String representation can be used to represent the words in X, so a word x ∈ X is either c or ωc, where c ∈ {c−, c+} and ω = hn…h1, hi ∈ H, i = 1, …, n. H(x) denotes the set of all words generated from x, i.e., H(x) = {ωx | ω is a string of hedges in H}. If all hedges in H are linearly ordered, all words in X are also linearly ordered; in this case, a linear HA is obtained. Hereafter, some main properties of linear HAs are presented:
  • The signs of the negative generator c− and the positive generator c+ are sign(c−) = −1 and sign(c+) = +1, respectively;
  • Every h ∈ H+ increases the semantics of c+ and has sign(h) = +1, whereas every h ∈ H− decreases the semantics of c+ and has sign(h) = −1;
  • If hedge h strengthens the semantics of hedge k, the relative sign between h and k is sign(h, k) = +1. On the other hand, if hedge h weakens the semantics of hedge k, sign(h, k) = −1. Thus, the sign of a word x = hnhn−1…h2h1c is specified as follows:
sign(x) = sign(hn, hn−1) × … × sign(h2, h1) × sign(h1) × sign(c).
Based on the word sign, if sign(hx) = +1, then x ≤ hx, and if sign(hx) = −1, then hx ≤ x.
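Since the sign computation is purely mechanical, a small sketch may help. The following Python fragment (a minimal illustration, not the authors' code) applies the product formula above; the hedge signs and relative signs in the two lookup tables are illustrative assumptions for the hedges “very” and “little” over the generators “cold” and “hot”.

```python
# Minimal sketch of sign(x) = sign(h_n, h_{n-1}) x ... x sign(h_2, h_1) x sign(h_1) x sign(c).
# The values in SIGN and REL_SIGN are illustrative assumptions, not taken from the paper.

SIGN = {"very": +1, "little": -1, "cold": -1, "hot": +1}           # sign(h), sign(c)
REL_SIGN = {("very", "very"): +1, ("very", "little"): +1,          # sign(h, k)
            ("little", "very"): -1, ("little", "little"): -1}

def word_sign(word):
    """word = [h_n, ..., h_1, c]; returns sign(x) in {-1, +1}."""
    *hedges, generator = word
    s = SIGN[generator]
    if hedges:
        s *= SIGN[hedges[-1]]                 # sign(h_1)
        for h, k in zip(hedges, hedges[1:]):  # sign(h_n, h_{n-1}), ..., sign(h_2, h_1)
            s *= REL_SIGN[(h, k)]
    return s

# sign("very little hot") = sign(very, little) * sign(little) * sign(hot) = (+1)(-1)(+1) = -1
print(word_sign(["very", "little", "hot"]))   # -1
```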
Based on the syntactical semantics of words generated by HAs, the words in H(x), xX, induced from x, have had their semantics changed by the hedges in H, but they still convey the original semantics of x. Therefore, H(x) can be considered as the fuzziness of x and the diameter of H(x) is considered as the fuzziness measure of x, denoted by fm(x).
Assume that AX is a linear HA. The function fm: X → [0, 1] is the fuzziness measure of the words in X, provided that the following properties are satisfied [13]:
(F1): $fm(c^-) + fm(c^+) = 1$ and $\sum_{h \in H} fm(hu) = fm(u)$ for all $u \in X$;
(F2): $fm(x) = 0$ for every $x$ with $H(x) = \{x\}$; in particular, $fm(0) = fm(W) = fm(1) = 0$;
(F3): for all $x, y \in X$ and $h \in H$, the proportion $\frac{fm(hx)}{fm(x)} = \frac{fm(hy)}{fm(y)}$ does not depend on the particular word; it is called the fuzziness measure of the hedge $h$ and is denoted by $\mu(h)$.
From properties (F1) and (F3), $fm(x)$, where $x = h_n \dots h_1 c$ and $c \in \{c^-, c^+\}$, is computed as $fm(x) = \mu(h_n) \cdots \mu(h_1)\, fm(c)$, where $\sum_{h \in H} \mu(h) = 1$. For a given word in $X$, its fuzziness measure can be computed once the values of $fm(c^-)$ and $\mu(h)$, $h \in H$, are specified.
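As a concrete illustration, the product formula for fm(x) takes only a few lines. The sketch below (not the authors' code) continues the “cold”/“hot” example; the FPVs fm(c−) = 0.5 and μ(little) = μ(very) = 0.5 are example values, not the optimized ones used later in the paper.

```python
# Minimal sketch of fm(x) = mu(h_n) ... mu(h_1) * fm(c), under the constraints
# fm(c-) + fm(c+) = 1 and sum_h mu(h) = 1. All parameter values are examples.

FM_GEN = {"cold": 0.5, "hot": 0.5}    # fm(c-), fm(c+); they must sum to 1
MU = {"little": 0.5, "very": 0.5}     # mu(h) for each hedge; they must sum to 1

def fuzziness_measure(word):
    """word = [h_n, ..., h_1, c]; returns fm(x)."""
    *hedges, generator = word
    fm = FM_GEN[generator]
    for h in hedges:
        fm *= MU[h]
    return fm

print(fuzziness_measure(["very", "little", "hot"]))  # 0.5 * 0.5 * 0.5 = 0.125
```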
The semantically quantifying mapping (SQM) [13] of AX is a mapping $v: X \to [0, 1]$ satisfying the following conditions:
(SQM1): it preserves the order-based structure of $X$, i.e., $x \le y \Rightarrow v(x) \le v(y)$ for all $x, y \in X$;
(SQM2): it is a one-to-one mapping and $v(X)$ is dense in $[0, 1]$.
Let $fm$ be a fuzziness measure on $X$, $\sum_{i=-q}^{-1} \mu(h_i) = \alpha$, $\sum_{i=1}^{p} \mu(h_i) = \beta$, with $\alpha, \beta > 0$ and $\alpha + \beta = 1$. Then $v(x)$ is computed recursively based on $fm$ as follows:
(i) $v(W) = \theta = fm(c^-)$, $v(c^-) = \theta - \alpha\, fm(c^-) = \beta\, fm(c^-)$, and $v(c^+) = \theta + \alpha\, fm(c^+)$;
(ii) $v(h_j x) = v(x) + sign(h_j x) \left\{ \sum_{i=sign(j)}^{j} fm(h_i x) - \omega(h_j x)\, fm(h_j x) \right\}$, where $j \in \{j : -q \le j \le p,\ j \ne 0\}$ and
$\omega(h_j x) = \frac{1}{2}\left[1 + sign(h_j x)\, sign(h_p h_j x)(\beta - \alpha)\right] \in \{\alpha, \beta\}$.
The SQM values of words are the basis of computing the real numerical semantics of words, and then a time series is transformed into LTS.
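To make the recursion concrete, the sketch below specializes it to the common two-hedge case H− = {little} and H+ = {very} (so q = p = 1, α = μ(little), β = μ(very)), reusing the word_sign and fuzziness_measure sketches above. It is an illustration under those assumptions, not the authors' implementation.

```python
# Minimal sketch of the SQM recursion for two hedges (little, very).
# With q = p = 1, the inner sum in step (ii) reduces to the single term fm(h_j x).

ALPHA, BETA = MU["little"], MU["very"]        # alpha + beta = 1

def sqm(word):
    """word = [h_n, ..., h_1, c]; returns v(x) in [0, 1]."""
    theta = FM_GEN["cold"]                    # v(W) = theta = fm(c-)
    *hedges, generator = word
    if not hedges:                            # step (i): the two generators
        if generator == "cold":
            return theta - ALPHA * FM_GEN["cold"]
        return theta + ALPHA * FM_GEN["hot"]
    x = word[1:]                              # x without its outermost hedge h_j
    s = word_sign(word)                       # sign(h_j x)
    sp = word_sign(["very"] + word)           # sign(h_p h_j x), with h_p = very
    omega = 0.5 * (1 + s * sp * (BETA - ALPHA))
    return sqm(x) + s * (1 - omega) * fuzziness_measure(word)

print(sqm(["hot"]), sqm(["very", "hot"]))     # 0.75 0.875 with the example FPVs
```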

2.2. Linguistic Time-Series Forecasting Model

Based on the theory of hedge algebras, Hieu et al. [19] introduced the concept of LTS and its application to the enrollment forecasting problem. The quantitative (numerical) semantics, known as SQM values, of linguistic words were directly used to establish the LTS-FM. Specifically, the numerical semantics of words were linearly transformed to the real numerical semantic domain of the universe of discourse (UD) of the linguistic variable. Thus, each datum of a time series, whether recorded in numerical or linguistic values, could be naturally associated with the corresponding real numerical semantics of the used linguistic words of the LTS. These are also the distinguishing and outstanding properties of LTS compared with fuzzy time series.
In order to have a theoretical basis for proposing a forecasting model, in this subsection, some concepts of LTS proposed by Hieu et al. [19] are presented.
Definition 1
([19]). (LTS) Let 𝕏 be a set of natural linguistic words of variable 𝒳 defined on the UD to describe its numeric values. Then, any chronological series L(t), t = 0, 1, 2, …, where L(t) is a finite subset of 𝕏, is called a linguistic time series.
Definition 2
([19]). (LLR) Suppose Xi and Xj are the linguistic words representing the data at time t and t + 1, respectively. Then, there exists a relationship between Xi and Xj called a linguistic logical relationship (LLR), denoted by
Xi → Xj
Definition 3
([19]). (LLRG) Assume that there are LLRs such as
Xi → Xj1,
Xi → Xj2,
…,
Xi → Xjn.
Then, they can be grouped into a linguistic logical relationship group (LLRG), denoted by
Xi → Xj1, Xj2, …, Xjn.
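Operationally, building LLRGs amounts to grouping LLRs by their left-hand side. A minimal sketch (not the authors' code) using a dictionary:

```python
# Minimal sketch of grouping first-order LLRs X_i -> X_j into LLRGs.
from collections import defaultdict

def build_llrgs(lts):
    """lts: the word sequence [L(0), L(1), ...]; returns {X_i: [X_j1, ..., X_jn]}."""
    llrgs = defaultdict(list)
    for left, right in zip(lts, lts[1:]):  # one LLR per consecutive pair
        if right not in llrgs[left]:       # keep right-hand sides unique
            llrgs[left].append(right)
    return llrgs

print(dict(build_llrgs(["low", "low", "high", "low", "high"])))
# {'low': ['low', 'high'], 'high': ['low']}
```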
In [19], Hieu et al. also presented a forecasting model called LTS-FM. The procedure of an LTS-FM is depicted in Figure 1.
In Figure 1, the LTS-FM procedure includes six steps, which can be briefly described as follows (for more detail, please see [19,22]):
Step 1. Determine the UD of the linguistic variable, the syntactical semantics, and the FPVs of the associated HAs, and choose the linguistic words used to describe the designated time series;
Step 2. Calculate the quantitative semantics of the used words;
Step 3. Map the quantitative semantics of the used words to the real domain of the UD to obtain the real numerical semantics;
Step 4. Transform the designated time series into an LTS: each datum is assigned the used word whose real numerical semantics is nearest to it (see the sketch after this list);
Step 5. Establish the LLRs of words, then group them into the LLRGs;
Step 6. Forecast based on LLRGs, then compute the crisp forecasted values.
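As referenced in Step 4, the nearest-semantics assignment can be sketched as follows; the word set and its real numerical semantics below are illustrative assumptions, standing in for the output of Steps 1–3.

```python
# Minimal sketch of Step 4: assign to each numeric datum the word whose real
# numerical semantics is nearest. real_semantics maps word -> value in the UD,
# e.g., obtained in Step 3 by rescaling v(x) from [0, 1] to U.

def to_lts(series, real_semantics):
    """series: numeric data; real_semantics: dict word -> numeric semantics."""
    return [min(real_semantics, key=lambda w: abs(real_semantics[w] - d))
            for d in series]

words = {"low": 14000.0, "medium": 16500.0, "high": 19000.0}  # illustrative
print(to_lts([13563, 16807, 19328], words))  # ['low', 'medium', 'high']
```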
The evaluation measures of the forecasting models are the mean square error (MSE), the root mean square error (RMSE), and the mean absolute percentage error (MAPE), which are calculated as follows:

$MSE = \frac{1}{n} \sum_{i=1}^{n} (F_i - R_i)^2$,   (1)

$RMSE = \sqrt{MSE}$,   (2)

$MAPE = \frac{1}{n} \sum_{i=1}^{n} \frac{|F_i - R_i|}{F_i} \times 100\%$,   (3)

where n denotes the number of forecasted data, and $F_i$ and $R_i$ are the forecasted and real data at time i, respectively.
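These three measures translate directly into code. The sketch below follows Equations (1)–(3) as printed, including the division by F_i in the MAPE.

```python
# Minimal sketch of Equations (1)-(3). F is the list of forecasted values,
# R the list of real values; both must have the same length n.
import math

def mse(F, R):
    return sum((f - r) ** 2 for f, r in zip(F, R)) / len(F)

def rmse(F, R):
    return math.sqrt(mse(F, R))

def mape(F, R):
    # note: divides by the forecasted value F_i, as in Equation (3) above
    return sum(abs(f - r) / f for f, r in zip(F, R)) / len(F) * 100
```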

2.3. Standard PSO

In 1995, Kennedy and Eberhart introduced an optimization method called particle swarm optimization (PSO) [22,23]. This swarm intelligence-based optimization method has been applied to solve many real-world problems. It mimics the way a flock of birds flies to find food sources: the birds in the swarm follow the leader, i.e., the bird nearest to the food source. Suppose that a swarm S = {x1, x2, …, xN} consists of N particles; the position $X_i^{t+1}$ of particle $x_i$ at generation t + 1 is computed as follows:

$X_i^{t+1} = X_i^t + V_i^{t+1}$,   (4)

where $V_i^{t+1}$ is $x_i$'s velocity at generation t + 1, computed as follows:

$V_i^{t+1} = \omega \times V_i^t + c_1 \times r_1 \times (P_i^t - X_i^t) + c_2 \times r_2 \times (P_g^t - X_i^t)$,   (5)

where $P_g^t$ and $P_i^t$ are the best global and local solutions found so far, respectively; $r_1$ and $r_2$ are two random numbers uniformly distributed in [0, 1]; and $c_1$, $c_2$, and $\omega$ are the self-cognitive coefficient, the social cognitive coefficient, and the inertia weight, respectively. The standard PSO is described in brief as follows:
Step 1: Initialize a swarm S with two random vectors, the position vector X and the velocity vector V. Initialize the maximum number of cycles and the cycle variable t;
Step 2: Calculate each particle's objective value $f(X_i^t)$;
Step 3: Check each particle's objective: if the current position $X_i^t$ is better than the personal best $P_i^t$, then update $P_i^t$ with $X_i^t$;
Step 4: Check whether the best particle position in the current cycle has an objective value better than that of the global best $P_g^t$; if so, update $P_g^t$ with that position;
Step 5: Update each particle's velocity $V_i^t$ by Equation (5) and move it to its new position $X_i^t$ by Equation (4);
Step 6: Terminate if variable t reaches the maximum number of cycles; otherwise, let t = t + 1 and go to Step 2.
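A minimal sketch of this loop for minimizing an objective over [0, 1]^dim is given below (it is not the authors' C# implementation). The parameter defaults follow the settings reported in Section 4, and clamping positions to [0, 1] is an added assumption for illustration.

```python
# Minimal sketch of the standard PSO loop of Section 2.3 (Equations (4)-(5)),
# minimizing an objective f over [0, 1]^dim.
import random

def pso(f, dim, n_particles=20, n_cycles=30, w=0.4, c1=2.0, c2=2.0):
    X = [[random.random() for _ in range(dim)] for _ in range(n_particles)]
    V = [[0.0] * dim for _ in range(n_particles)]
    P = [x[:] for x in X]                       # personal bests P_i
    g = min(P, key=f)[:]                        # global best P_g
    for _ in range(n_cycles):
        for i in range(n_particles):
            for d in range(dim):                # Equation (5), then Equation (4)
                r1, r2 = random.random(), random.random()
                V[i][d] = (w * V[i][d] + c1 * r1 * (P[i][d] - X[i][d])
                           + c2 * r2 * (g[d] - X[i][d]))
                X[i][d] = min(1.0, max(0.0, X[i][d] + V[i][d]))  # clamp to [0, 1]
            if f(X[i]) < f(P[i]):               # update personal best
                P[i] = X[i][:]
                if f(P[i]) < f(g):              # update global best
                    g = P[i][:]
    return g

# e.g., minimize the sphere function over [0, 1]^2:
best = pso(lambda x: sum(v * v for v in x), dim=2)
```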

3. Improving the LTS-FM by Co-Optimization with PSO

In this section, a hybrid LTS-FM integrated with particle swarm co-optimization (COLTS) is presented. In LTS-FMs, the word set in the domain of a linguistic variable is generated automatically by the hedge algebras associated with that variable. Therefore, the cardinality of the generated word set is unlimited. However, in practice, a given number of words is selected to describe the numeric time series. The number of automatically generated words depends on the maximum specificity level k of the words specified by human experts, and the maximum word length in the word domain is equal to k.
In the optimization model of LTS presented in [21], the word set used to describe the historical numeric time-series data (the so-called used word set) is selected intuitively by human experts before the fuzziness parameter value optimization process. Hence, the used word set depends on the human experts' intuition and may not be optimal. To select from the linguistic variable domain the used word set that best describes the historical numeric time-series data, an optimization process should be executed. The FPVs should be optimized in parallel with the word set selection so that the best numerical semantics of the used words are obtained. Therefore, a co-optimization process for concurrently selecting the optimal used word set and FPVs is applied by utilizing PSO. Specifically, the co-optimization process includes inner and outer loops: the inner loop (inner PSO) optimizes the used word set, while the outer loop (outer PSO) optimizes the FPVs. Real encoding is used for both loops. As in other applications of hedge algebras [11,12,13,14,15,16,17], only two linguistic hedges (Little and Very) are used in our experiments to generate the word set of the linguistic variable domain, so the number of optimized FPVs is only 2: the fuzziness measure of one of the two generator words (e.g., fm(c−)) and the fuzziness measure of one of the two hedges (e.g., μ(Little)). From the constraints fm(c−) + fm(c+) = 1 and μ(Little) + μ(Very) = 1, it can be inferred that fm(c+) = 1 − fm(c−) and μ(Very) = 1 − μ(Little). Each particle $X_i^t$ = {fm(c−), μ(Little)}, i = 1, …, N, of the outer loop represents those two FPVs. For the inner loop, each particle corresponds to a solution represented as an array of real numbers $p_i = (w_1, \dots, w_{d_w})$, $w_j \in [0, 1]$. Each word $x_l$ of the used word set $X_i$ is selected from the word set Wset of XL by a zero-based index, as follows:
$X_i^t = \{x_l \in W_{set} \mid l = \lfloor w_j \times |W_{set}| \rfloor,\ 0 \le l < |W_{set}|\}$,
where $\lfloor \cdot \rfloor$ denotes the integer part of a real number.
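A minimal sketch (not the authors' code) of this decoding is given below; the word list and particle values are illustrative, and collapsing duplicate indices is one possible way, assumed here, to handle collisions.

```python
# Minimal sketch of decoding an inner-loop particle p_i = (w_1, ..., w_dw),
# w_j in [0, 1], into a used word set: each w_j picks the zero-based index
# floor(w_j * |Wset|) into the HA-generated word list Wset.

def decode_word_set(p, wset):
    """p: list of reals in [0, 1]; wset: ordered word list generated by the HA."""
    n = len(wset)
    idx = {min(int(w * n), n - 1) for w in p}   # clamp w = 1.0 to the last index
    return [wset[l] for l in sorted(idx)]       # sorted, duplicates collapsed

wset = ["very low", "low", "little low", "W", "little high", "high", "very high"]
print(decode_word_set([0.05, 0.5, 0.97, 0.51], wset))
# ['very low', 'W', 'very high']  (0.5 and 0.51 both select index 3)
```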
The details of all steps of the proposed co-optimization algorithm are as follows (Algorithms 1 and 2):
Algorithm 1. UWO(Gwmax, Wset, dw, δ) // used word set optimization procedure
Input:
   Parameters: Gwmax, Wset, dw, δ;
   // Gwmax is the number of generations, Wset is the word set of XL, dw is the cardinality of the used word set selected from Wset, and δ is a set of FPVs.
Output: The optimal word set S_g^(Gwmax−1) and the associated best MSE value;
Begin
  Randomly initialize a swarm W0 = {Y_i^0; V_i^0 | i = 1, …, M, |Y_i^0| = dw};
  // Y_i^t is a subset of index values in the interval [0, |Wset|), where |Wset| is the cardinality of Wset and |Y_i^t| is the number of elements of Y_i^t.
  Sort the elements of each Y_i^0;
  For each particle xi in the swarm do begin
    Implement the LTS forecasting procedure from Step 2 to Step 6 based on δ and Y_i^0;
    Evaluate the MSE value of xi by Equation (1);
    Assign the personal best position S_i^0 of xi to the current position;
  End;
  Assign the global best position S_g^0 to the best position in the current swarm;
  t = 1;
  Repeat
    For each particle xi in the swarm do begin
      Compute the new velocity V_i^t of xi by Equation (5);
      Compute the new position Y_i^t of xi by Equation (4);
      Sort the elements of Y_i^t;
      Implement the LTS forecasting procedure from Step 2 to Step 6 based on δ and Y_i^t;
      Evaluate the MSE value of xi by Equation (1);
      If Y_i^t is better than Y_i^(t−1) then
        Update S_i^t of xi based on the value of the MSE;
      End;
    End;
    Update S_g^t based on the values of the MSE;
    t = t + 1;
  Until t = Gwmax;
  Return the best position S_g^(Gwmax−1) and its best MSE value;
End.
Algorithm 2. PSCO_FPVO // fuzziness parameter value optimization
Input: The designated time-series dataset D;
   Parameters: N, Gmax, Gwmax, the syntactical semantics of the HAs, kmax, dw;
   // Gmax and Gwmax are the numbers of generations of the outer and inner PSO, respectively; kmax is the maximum word length; dw is the used word set's cardinality.
Output: The optimal FPVs P_g^(Gmax−1) and the best used word set W*set;
Begin
  Generate the word set Wset of XL with the maximum word length kmax utilizing the HAs;
  Randomly initialize a swarm S0 = {X_i^0; V_i^0 | i = 1, …, N}, where X_i^0 = {fm(c−), μ(Little)};
  // X_i^t is a set of FPVs
  // W*set is the best used word set and MSE* is the best MSE value
  (W*set, MSE*) = (∅, +∞);
  For each particle xi do begin
    (Wtmp_set, MSE) = UWO(Gwmax, Wset, dw, δ = X_i^0); // call the inner PSO
    F_i^0 = MSE; // fitness value associated with particle xi
    If MSE is better than MSE* then (W*set, MSE*) = (Wtmp_set, MSE);
    Assign the personal best position P_i^0 of xi to the current position;
  End;
  Assign the global best position P_g^0 to the best position in the current swarm;
  t = 1;
  Repeat
    For each particle xi do begin
      Compute the new velocity V_i^t of xi by Equation (5);
      Compute the new position X_i^t of xi by Equation (4);
      (Wtmp_set, MSE) = UWO(Gwmax, Wset, dw, δ = X_i^t); // call the inner PSO
      F_i^t = MSE; // fitness value associated with particle xi
      If F_i^t is better than F_i^(t−1) then begin
        Update the personal best P_i^t of xi based on F_i^t's value;
        If MSE is better than MSE* then begin
          (W*set, MSE*) = (Wtmp_set, MSE);
          Update the global best position P_g^t;
        End;
      End;
    End;
    t = t + 1;
  Until t = Gmax;
  Return the best position P_g^(Gmax−1) and W*set;
End.
As shown in the UWO and PSCO_FPVO algorithms, the PSCO_FPVO algorithm optimizes the FPVs and runs as the outer iteration of COLTS. Each particle of PSCO_FPVO represents a given FPV set X_i^t = {fm(c−), μ(Little)}, which is the input of UWO. In turn, UWO finds the optimal used word set from the word set Wset of XL based on the given FPV set. Each particle of UWO represents a candidate used word set. The output of UWO includes the local best used word set contained in S_g^(Gwmax−1) corresponding to the input FPV set and the associated local best MSE value. The output of PSCO_FPVO includes the best FPV set contained in P_g^(Gmax−1) and the global best used word set W*set, which is the best solution found so far.
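Structurally, the nesting can be summarized in a few lines. The skeleton below (not the authors' C# code) reuses the pso and decode_word_set sketches above; forecast_mse is a hypothetical stand-in for Steps 2–6 of the LTS forecasting procedure, which evaluates a candidate FPV set and used word set by Equation (1).

```python
# Skeleton of COLTS' two nested PSO loops: the outer loop searches the FPVs
# delta = (fm_c, mu_little); its fitness is the best MSE found by the inner
# loop (UWO), which searches the used word set for that delta.

def uwo(delta, wset, dw, forecast_mse, n_cycles=100, n_particles=30):
    objective = lambda p: forecast_mse(delta, decode_word_set(p, wset))
    best_p = pso(objective, dim=dw, n_particles=n_particles, n_cycles=n_cycles)
    return decode_word_set(best_p, wset), objective(best_p)

def colts(wset, dw, forecast_mse, n_cycles=30, n_particles=20):
    objective = lambda delta: uwo(delta, wset, dw, forecast_mse)[1]
    best_delta = pso(objective, dim=2, n_particles=n_particles, n_cycles=n_cycles)
    best_words, best_mse = uwo(best_delta, wset, dw, forecast_mse)
    return best_delta, best_words, best_mse
```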

4. Experimental Studies and Discussion

The experiments aim to show the necessity of optimizing the linguistic word set used to describe a numeric time series in parallel with optimizing the FPVs, and that our proposed forecasting models outperform the models used for comparison. To show the efficiency of our proposed models, their experimental results on the three forecasting problems of the EUA, CAB, and SGT were evaluated and compared with those proposed by Uslu et al. [6], Chen et al. [8], and Phong et al. [21]. The objective function of the optimization problem is the MSE. The values of MSE, RMSE, and MAPE were used to evaluate the forecasting models used for comparison.
In the experiments, our proposed forecasting models were implemented using the C# programming language running on Microsoft Windows 10 64 bit with the hardware configuration of an Intel Core i5-8250U 1.6 GHz CPU and 8 GB of RAM. Each experiment was run five times, and the smallest obtained MSE value was chosen as the forecasting accuracy.
To limit the running time, the parameter values of the optimization algorithm were set as in the existing forecasting models: for the inner iteration, the number of cycles was 100 and the number of particles was 30; for the outer iteration, the number of cycles was 30 and the number of particles was 20. By trial and error and our experience in applying PSO in previous studies, for both the inner and outer iterations, the inertia weight ω was 0.4, and both the self-cognitive factor c1 and the social cognitive factor c2 were 2.0.

4.1. Forecast the “Enrollments of the University of Alabama”

To show the efficiency of the proposed LTS-FM described above, it was applied to forecast the EUA, as shown in Table 1. As in its counterparts, the UD of the EUA was U = [13,000, 20,000] and the number of linguistic words used to describe the numeric time series was 16. Instead of partitioning U into intervals, as in the existing methods, our proposed forecasting method transformed the numeric time series into linguistic words to form an LTS. The linguistic word set was generated by the associated hedge algebras, so its cardinality was not limited. In practice, however, the number of words used to describe the numeric time series is limited. Therefore, PSO was applied to select the best used word set to improve the accuracy of the forecasting model.
To facilitate a meaningful comparison with the existing methods, seven, fourteen, and sixteen linguistic words were, in turn, used to describe U, and, to show the significance of the word length, the maximum word length was, in turn, limited to 3, 4, and 5. The best MSE and MAPE values, corresponding to the best word set selected by the inner iteration of the linguistic word optimization process and the best FPVs optimized by the outer iteration, are shown in Table 2. It can be seen that, with the same number of used words, the higher the maximum word length was, the better the MSE value we obtained. However, this did not always hold when slightly increasing the number of used words. In the case of a maximum word length of 3, the LTS-FM with 14 used words was better than that with 16 used words when compared by both the MSE and MAPE values. The cause of this situation will be analyzed more deeply in a future study; it may be attributed to the limited cardinality of the word domain of the linguistic variable. According to the theory of HAs [9,10], when the maximum word length is limited to 3 and there are 2 linguistic hedges, the number of words in the word domain generated by the associated HAs is only 17. To choose 16 used words out of 17, PSO does not have much choice, and, among them, at least one word has semantics that are unsuitable for the natural distribution of the time-series data, leading to worse forecasted results than those obtained in the case of 14 used words. In the cases of maximum word lengths of 4 and 5, the higher the number of used words was, the better the MSE value we obtained. Therefore, the best MSE was reached in the case of 16 used words and a maximum word length of 5.
To show the efficiency of our proposed LTS-FMs with co-optimization PSO in comparison with other forecasting models, the best experimental results of the proposed LTS-FMs with maximum word lengths of 3, 4, and 5, denoted by COLTS3, COLTS4, and COLTS5, respectively, were compared with the state-of-the-art LTS-FM of Phong et al. [21] and the existing FTS-FMs: the CCO6 model of Chen and Chung [5], which applies a genetic algorithm; HPSO [7], which applies a PSO algorithm; the model of Uslu et al. [6], which considers the number of recurrences of a fuzzy logical relationship; and the model of Chen et al. [8], which applies PSO and a new defuzzification technique. The experimental results and their comparative presentation are shown in Table 1 and visualized in Figure 2. By analyzing the experimental results shown in Table 1, it can be recognized that our proposed LTS-FM with a word length of 3 (COLTS3) had the same MSE and MAPE values as the LTS-FM proposed by Phong et al. [21] and had better MSE and MAPE values than those of the FTS-FMs CCO6 [5], HPSO [7], Uslu et al. [6], and Chen et al. [8]. Meanwhile, both COLTS4 and COLTS5 had better MSE and MAPE values than those of all the compared forecasting models. Recall that the word set used to describe the numeric time-series data in the LTS-FM proposed in [21] was chosen by human experts; therefore, it depended on their cognitive recognition, and, to achieve a better result, they could also perform some trial and error to obtain the best word set. Moreover, the number of cycles and the number of particles of the PSO in [21] were much larger, at 1000 and 300, respectively. Based on the comparison results described above, it can be stated that the proposed LTS-FMs were better than the FTS-FMs, and COLTS5 was the best.

4.2. Forecast the “Killed in Car Road Accidents in Belgium”

The proposed LTS-FMs were also applied to the CAB problem to evaluate them further. The minimum and maximum values of the historical data of the CAB observed from 1974 to 2004 were 953 and 1644, respectively. Thus, the UD was defined as U = [900, 1700]. The number of used words was set to 17, which was equal to the number of intervals in [6,8].
It is easy to see from the comparison results in Table 3 that the MSE values of COLTS3, COLTS4, and COLTS5 were, in turn, 794, 444, and 421, which were much better than those of the compared forecasting methods of Uslu et al. [6] and Chen et al. [8], which were 1731 and 1024, respectively. When comparing by the MAPE values, we can also see that all three of our LTS-FMs, COLTS3, COLTS4, and COLTS5, had better MAPE values than those of Uslu et al. [6] and Chen et al. [8]. Therefore, our proposed LTS-FMs outperformed the FTS-FMs presented in [6,8] in forecasting the CAB problem. In addition, among our three LTS-FMs, COLTS5 was better than COLTS4 and, in turn, COLTS4 was better than COLTS3. Therefore, the statement “with the same number of used words, the longer the maximum word length is, the better the MSE value we obtain” presented above also holds for the CAB problem.

4.3. Forecast the “Spot Gold in Turkey”

The proposed LTS-FMs were applied once again to the more complex forecasting problem of the SGT. The minimum and maximum of the historical data of the SGT observed from December 2007 to November 2010 were 30,503 and 62,450, respectively, so the UD was set to [30,000, 63,000]. The number of used words was 16, which was equal to the number of intervals in [6,8].
The experimental results of our proposed LTS-FMs shown in Table 4 were compared with those of Uslu et al. [6] and Chen et al. [8]. It can be seen in Table 4 that, when compared by both the MSE and MAPE values, all three of our proposed LTS-FMs, COLTS3, COLTS4, and COLTS5, outperformed the FTS-FMs of Uslu et al. [6] and Chen et al. [8] in forecasting the SGT problem. When compared by the MSE or RMSE values, COLTS5 was better than COLTS4, and COLTS3 was worse than COLTS4. However, when compared by the MAPE value, COLTS5 was slightly worse than COLTS4. In general, once again, we can state that, with the same number of used words, the longer the maximum word length, the better the MSE value obtained.

5. Conclusions

In this study, we proposed a hybrid LTS-FM with a co-optimization procedure using PSO to determine applicable fuzziness parameter values and the best word set describing the observed data. When compared with the former studies, our proposed model achieved better forecasting accuracy. Based on this research, we can continue studying how to improve the forecasting accuracy of the model and apply it to different forecasting application problems with various datasets. In addition, research on better data representation methods using LTS to improve the accuracy and performance of the forecasting model is a promising direction for future work.

Author Contributions

Conceptualization, N.D.H. and P.D.P.; methodology, N.D.H. and P.D.P.; software, P.D.P. and M.V.L.; validation, M.V.L.; data curation, M.V.L. and N.D.H.; writing, P.D.P. and N.D.H.; visualization, M.V.L.; supervision, P.D.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research is funded by the Vietnam Ministry of Education and Training under grant number B2022-TTB-01.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Song, Q.; Chissom, B.S. Fuzzy Time Series and its Model. Fuzzy Sets Syst. 1993, 54, 269–277. [Google Scholar] [CrossRef]
  2. Song, Q.; Chissom, B.S. Forecasting Enrollments with Fuzzy Time Series–Part I. Fuzzy Sets Syst. 1993, 54, 1–9. [Google Scholar] [CrossRef]
  3. Song, Q.; Chissom, B.S. Forecasting Enrollments with Fuzzy Time Series–Part II. Fuzzy Sets Syst. 1994, 62, 1–8. [Google Scholar] [CrossRef]
  4. Zadeh, L.A. Fuzzy Sets. Inf. Control 1965, 8, 338–353. [Google Scholar] [CrossRef] [Green Version]
  5. Chen, S.M.; Chung, N.Y. Forecasting Enrollments of Students by Using Fuzzy Time Series and Genetic Algorithms. Inf. Manag. Sci. 2006, 17, 1–17. [Google Scholar]
  6. Uslu, V.R.; Bas, E.; Yolcu, U.; Egrioglu, E. A fuzzy time series approach based on weights determined by the number of recurrences of fuzzy relations. Swarm Evol. Comput. 2014, 15, 19–26. [Google Scholar] [CrossRef]
  7. Kuo, I.H.; Horng, S.J.; Kao, T.W.; Lin, T.L.; Lee, C.L.; Pan, Y. An improved method for forecasting enrollments based on fuzzy time series and particle swarm optimization. Expert Syst. Appl. 2009, 36, 6108–6117. [Google Scholar] [CrossRef]
  8. Chen, S.M.; Zou, X.Y.; Gunawan, G.C. Fuzzy time series forecasting based on proportions of intervals and particle swarm optimization techniques. Inf. Sci. 2019, 500, 127–139. [Google Scholar] [CrossRef]
  9. Ho, N.C.; Wechler, W. Hedge algebras: An algebraic approach to structures of sets of linguistic domains of linguistic truth values. Fuzzy Sets Syst. 1990, 35, 281–293. [Google Scholar]
  10. Ho, N.C.; Wechler, W. Extended hedge algebras and their application to fuzzy logic. Fuzzy Sets Syst. 1992, 52, 259–281. [Google Scholar] [CrossRef]
  11. Huy, N.H.; Ho, N.C.; Quyen, N.V. Multichannel image contrast enhancement based on linguistic rule-based intensificators. Appl. Soft Comput. J. 2019, 76, 744–762. [Google Scholar]
  12. Le, B.H.; Anh, L.T.; Binh, B.V. Explicit formula of hedge-algebras-based fuzzy controller and applications in structural vibration control. Appl. Soft Comput. 2017, 60, 150–166. [Google Scholar]
  13. Ho, N.C.; Nam, H.V.; Khang, T.D.; Chau, N.H. Hedge algebras, linguistic-valued logic and their application to fuzzy reasoning. Int. J. Uncertain. Fuzziness Knowl.-Based Syst. 1999, 7, 347–361. [Google Scholar]
  14. Ho, N.C.; Son, T.T.; Phong, P.D. Modeling of a semantics core of linguistic terms based on an extension of hedge algebra semantics and its application. Knowl.-Based Syst. 2014, 67, 244–262. [Google Scholar]
  15. Ho, N.C.; Thong, H.V.; Long, N.V. A discussion on interpretability of linguistic rule based systems and its application to solve regression problems. Knowl.-Based Syst. 2015, 88, 107–133. [Google Scholar]
  16. Phong, P.D.; Du, N.D.; Thuy, N.T.; Thong, H.V. A hedge algebras based classification reasoning method with multi-granularity fuzzy partitioning. J. Comput. Sci. Cybern. 2019, 35, 319–336. [Google Scholar] [CrossRef] [Green Version]
  17. Hoang, V.T.; Nguyen, C.H.; Nguyen, D.D.; Pham, D.P.; Nguyen, V.L. The interpretability and scalability of linguistic-rule-based systems for solving regression problems. Int. J. Approx. Reason. 2022, 149, 131–160. [Google Scholar] [CrossRef]
  18. Ho, N.C.; Long, N.V. Fuzziness measure on complete hedge algebras and quantifying semantics of terms in linear hedge algebras. Fuzzy Sets Syst. 2007, 158, 452–471. [Google Scholar] [CrossRef]
  19. Hieu, N.D.; Ho, N.C.; Lan, V.N. Enrollment forecasting based on linguistic time series. J. Comput. Sci. Cybern. 2020, 36, 119–137. [Google Scholar] [CrossRef]
  20. Hieu, N.D.; Ho, N.C.; Lan, V.N. An efficient fuzzy time series forecasting model based on quantifying semantics of words. In Proceedings of the 2020 RIVF International Conference on Computing and Communication Technologies (RIVF), Ho Chi Minh City, Vietnam, 14–15 October 2020; pp. 1–6. [Google Scholar]
  21. Phong, P.D.; Hieu, N.D.; Linh, M.V. A Hybrid Linguistic Time Series Forecasting Model combined with Particle Swarm Optimization. In Proceedings of the International Conference on Electrical, Computer and Energy Technologies (ICECET 2022), Prague, Czech Republic, 20–22 July 2022. [Google Scholar]
  22. Kennedy, J.; Eberhart, R.C. Particle Swarm Optimization. In Proceedings of the IEEE International Conference on Neural Networks, Piscataway, NJ, USA, 27 November–1 December 1995; IEEE Service Center: Piscataway, NJ, USA, 1995; pp. 1942–1948. [Google Scholar]
  23. Eberhart, R.C.; Kennedy, J. A new optimizer using particle swarm theory. In Proceedings of the Sixth International Symposium on Micro Machine and Human Science, Nagoya, Japan, 4–6 October 1995; IEEE Service Center: Piscataway, NJ, USA, 1995; pp. 39–43. [Google Scholar]
Figure 1. The procedure of an LTS-FM.
Figure 2. Comparison chart of the forecasted values of the EUA of the compared forecasting models.
Table 1. Comparative simulation of the enrollments forecast in the case of 16 used words and a maximum word length of 5.

| Year | Enrollments | CCO6 [5] | HPSO [7] | Uslu et al. [6] | Chen et al. [8] | Phong et al. [21] | COLTS3 | COLTS4 | COLTS5 |
|---|---|---|---|---|---|---|---|---|---|
| 1971 | 13,055 | | | | | | | | |
| 1972 | 13,563 | 13,714 | 13,555 | 13,650 | 13,469 | 13,515 | 13,515 | 13,562 | 13,598 |
| 1973 | 13,867 | 13,714 | 13,994 | 13,650 | 13,952 | 14,001 | 14,001 | 13,759 | 13,900 |
| 1974 | 14,696 | 14,880 | 14,711 | 14,836 | 14,596 | 14,800 | 14,800 | 14,722 | 14,817 |
| 1975 | 15,460 | 15,467 | 15,344 | 15,332 | 15,439 | 15,509 | 15,509 | 15,412 | 15,445 |
| 1976 | 15,311 | 15,172 | 15,411 | 15,447 | 15,241 | 15,509 | 15,509 | 15,464 | 15,487 |
| 1977 | 15,603 | 15,467 | 15,411 | 15,447 | 15,925 | 15,509 | 15,509 | 15,464 | 15,487 |
| 1978 | 15,861 | 15,861 | 15,411 | 15,447 | 15,880 | 15,752 | 15,752 | 15,798 | 15,877 |
| 1979 | 16,807 | 15,831 | 16,816 | 16,746 | 16,801 | 16,693 | 16,693 | 16,799 | 16,805 |
| 1980 | 16,919 | 17,106 | 17,140 | 17,075 | 17,009 | 16,949 | 16,949 | 16,975 | 16,995 |
| 1981 | 16,388 | 16,380 | 16,464 | 16,380 | 16,260 | 16,779 | 16,779 | 16,431 | 16,323 |
| 1982 | 15,433 | 15,464 | 15,457 | 15,504 | 15,435 | 15,553 | 15,553 | 15,412 | 15,445 |
| 1983 | 15,497 | 15,172 | 15,447 | 15,431 | 15,212 | 15,509 | 15,509 | 15,464 | 15,487 |
| 1984 | 15,145 | 15,172 | 15,447 | 15,077 | 15,282 | 15,132 | 15,132 | 15,286 | 15,221 |
| 1985 | 15,163 | 15,467 | 15,332 | 15,297 | 15,344 | 15,132 | 15,132 | 15,312 | 15,241 |
| 1986 | 15,984 | 15,467 | 16,027 | 15,848 | 15,714 | 15,752 | 15,752 | 15,824 | 15,898 |
| 1987 | 16,859 | 16,831 | 16,746 | 16,835 | 16,833 | 16,693 | 16,693 | 16,833 | 16,825 |
| 1988 | 18,150 | 18,055 | 18,211 | 18,145 | 18,016 | 17,888 | 17,888 | 18,193 | 18,205 |
| 1989 | 18,970 | 18,998 | 19,059 | 18,880 | 18,937 | 18,911 | 18,911 | 18,833 | 18,845 |
| 1990 | 19,328 | 19,300 | 19,059 | 19,418 | 19,345 | 19,439 | 19,439 | 19,246 | 19,389 |
| 1991 | 19,337 | 19,149 | 19,059 | 19,260 | 19,147 | 19,307 | 19,307 | 19,143 | 19,253 |
| 1992 | 18,876 | 19,149 | 19,059 | 19,031 | 19,152 | 19,043 | 19,043 | 18,936 | 18,981 |
| MSE | | 35,324 | 31,722 | 31,684 | 23,710 | 22,403 | 22,403 | 9755 | 6332 |
| RMSE | | 187.95 | 178.11 | 178.00 | 153.98 | 149.68 | 149.68 | 98.77 | 79.57 |
| MAPE | | 1.13% | 0.84% | 0.90% | 0.73% | 0.72% | 0.72% | 0.49% | 0.40% |
Table 2. The MSE values of the first-order LTS-FMs with different numbers of words and different maximum word lengths.

| Maximum Word Length | Evaluation Method | 7 Used Words | 14 Used Words | 16 Used Words |
|---|---|---|---|---|
| 3 | MSE | 24,111 | 19,989 | 22,415 |
| 3 | MAPE | 0.80% | 0.63% | 0.72% |
| 4 | MSE | 21,284 | 10,853 | 9758 |
| 4 | MAPE | 0.74% | 0.50% | 0.49% |
| 5 | MSE | 19,795 | 9639 | 6332 |
| 5 | MAPE | 0.72% | 0.44% | 0.40% |
Table 3. Comparative simulation of the CAB forecast in the case of 16 used words and a maximum word length of 5.

| Year | Actual Data | Uslu et al. [6] | Chen et al. [8] | COLTS3 | COLTS4 | COLTS5 |
|---|---|---|---|---|---|---|
| 1974 | 1574 | | | | | |
| 1975 | 1460 | 1506 | 1451 | 1498 | 1495 | 1464 |
| 1976 | 1536 | 1453 | 1490 | 1500 | 1502 | 1515 |
| 1977 | 1597 | 1598 | 1622 | 1555 | 1592 | 1610 |
| 1978 | 1644 | 1584 | 1575 | 1593 | 1638 | 1623 |
| 1979 | 1572 | 1584 | 1593 | 1579 | 1568 | 1568 |
| 1980 | 1616 | 1506 | 1585 | 1593 | 1582 | 1610 |
| 1981 | 1564 | 1584 | 1582 | 1557 | 1560 | 1583 |
| 1982 | 1464 | 1506 | 1513 | 1485 | 1464 | 1462 |
| 1983 | 1479 | 1453 | 1494 | 1474 | 1471 | 1474 |
| 1984 | 1369 | 1375 | 1393 | 1423 | 1367 | 1381 |
| 1985 | 1308 | 1383 | 1336 | 1352 | 1333 | 1315 |
| 1986 | 1456 | 1454 | 1419 | 1450 | 1440 | 1462 |
| 1987 | 1390 | 1453 | 1485 | 1411 | 1417 | 1428 |
| 1988 | 1432 | 1383 | 1384 | 1419 | 1399 | 1391 |
| 1989 | 1488 | 1509 | 1459 | 1474 | 1479 | 1483 |
| 1990 | 1574 | 1598 | 1585 | 1534 | 1572 | 1580 |
| 1991 | 1471 | 1506 | 1451 | 1498 | 1500 | 1468 |
| 1992 | 1380 | 1375 | 1369 | 1411 | 1367 | 1381 |
| 1993 | 1346 | 1383 | 1361 | 1377 | 1333 | 1323 |
| 1994 | 1415 | 1383 | 1437 | 1400 | 1392 | 1381 |
| 1995 | 1228 | 1231 | 1217 | 1213 | 1276 | 1250 |
| 1996 | 1122 | 1135 | 1152 | 1136 | 1139 | 1156 |
| 1997 | 1150 | 1180 | 1172 | 1134 | 1122 | 1131 |
| 1998 | 1224 | 1245 | 1211 | 1225 | 1225 | 1187 |
| 1999 | 1173 | 1135 | 1147 | 1160 | 1158 | 1187 |
| 2000 | 1253 | 1245 | 1245 | 1249 | 1244 | 1246 |
| 2001 | 1288 | 1284 | 1280 | 1227 | 1281 | 1253 |
| 2002 | 1145 | 1143 | 1148 | 1158 | 1130 | 1163 |
| 2003 | 1035 | 970 | 1028 | 1058 | 1032 | 1032 |
| 2004 | 953 | 970 | 953 | 964 | 987 | 982 |
| MSE | | 1731 | 1024 | 794 | 444 | 421 |
| RMSE | | 41.61 | 32.00 | 28.18 | 21.07 | 20.52 |
| MAPE | | 2.29% | 1.77% | 1.68% | 1.253% | 1.250% |
Table 4. The forecasted values of the SGT of our proposed LTS-FMs compared with those of existing forecasting models.

| Date | Actual Spot Gold | Uslu et al. [6] | Chen et al. [8] | COLTS3 | COLTS4 | COLTS5 |
|---|---|---|---|---|---|---|
| Dec 2007 | 30,503 | | | | | |
| Jan 2008 | 33,132 | 32,740.18 | 32,341.38 | 32,023.45 | 33,145.28 | 33,317.23 |
| Feb 2008 | 35,201 | 34,882.78 | 34,479.36 | 34,437.4 | 34,368.77 | 34,748.84 |
| Mar 2008 | 38,529 | 37,409.66 | 38,605.47 | 38,926.53 | 37,751.7 | 38,047.16 |
| Apr 2008 | 38,300 | 39,894.23 | 38,203.34 | 39,174.45 | 38,551.01 | 38,697.57 |
| May 2008 | 36,142 | 37,023.88 | 37,406.67 | 36,597.27 | 36,770.38 | 37,115.76 |
| Jun 2008 | 35,837 | 37,409.66 | 36,749.36 | 35,157.36 | 35,971.07 | 36,465.34 |
| Jul 2008 | 37,074 | 37,409.66 | 36,452.85 | 37,719.55 | 37,151.04 | 37,608.49 |
| Aug 2008 | 32,955 | 32,740.18 | 31,805.51 | 33,950.38 | 32,537.23 | 32,693.81 |
| Sep 2008 | 33,277 | 34,882.78 | 34,335.42 | 34,437.4 | 33,745.95 | 33,755.9 |
| Oct 2008 | 38,295 | 37,409.66 | 38,120.71 | 37,719.55 | 38,545.47 | 38,720.65 |
| Nov 2008 | 38,677 | 37,023.88 | 37,402.31 | 37,804.25 | 38,551.01 | 38,697.57 |
| Dec 2008 | 40,724 | 39,894.23 | 40,726.33 | 40,381.42 | 40,353.8 | 40,187.07 |
| Jan 2009 | 43,985 | 43,666.21 | 44,515.67 | 43,539.13 | 44,427.03 | 43,640.45 |
| Feb 2009 | 49,931 | 49,662.4 | 49,800.77 | 47,966.48 | 50,125.47 | 49,223.14 |
| Mar 2009 | 50,823 | 51,971.99 | 50,962.66 | 51,420.1 | 50,985.28 | 51,307.51 |
| Apr 2009 | 46,167 | 45,938.07 | 45,869.8 | 46,046.6 | 46,786.96 | 46,273.1 |
| May 2009 | 46,716 | 46,435.4 | 46,548.24 | 46,526.57 | 47,031.66 | 46,568.11 |
| Jun 2009 | 47,337 | 46,435.4 | 47,067.02 | 46,526.57 | 47,031.66 | 46,568.11 |
| Jul 2009 | 46,088 | 46,435.4 | 47,653.83 | 46,526.57 | 47,031.66 | 46,568.11 |
| Aug 2009 | 45,839 | 46,435.4 | 46,473.59 | 46,526.57 | 47,031.66 | 46,568.11 |
| Sep 2009 | 48,053 | 46,435.4 | 46,238.3 | 47,966.48 | 48,255.15 | 48,043.13 |
| Oct 2009 | 49,592 | 49,662.4 | 48,330.41 | 50,213.13 | 50,501.27 | 50,458.94 |
| Nov 2009 | 53,693 | 51,971.99 | 54,338.06 | 52,866.27 | 53,035.95 | 53,317.76 |
| Dec 2009 | 54,553 | 54,188.41 | 54,509.96 | 54,270.66 | 54,472.2 | 54,596.41 |
| Jan 2010 | 53,022 | 54,188.41 | 53,663.01 | 53,483.95 | 54,472.2 | 53,323.23 |
| Feb 2010 | 53,613 | 54,188.41 | 54,183.79 | 54,232.71 | 54,472.2 | 53,031.02 |
| Mar 2010 | 55,031 | 54,188.41 | 54,471.07 | 54,270.66 | 54,472.2 | 54,304.20 |
| Apr 2010 | 55,181 | 54,188.41 | 55,887.68 | 55,209.21 | 55,316.72 | 54,836.46 |
| May 2010 | 60,300 | 60,069.32 | 60,030.78 | 60,575.56 | 61,184.06 | 60,631.89 |
| Jun 2010 | 62,100 | 60,069.32 | 59,888.46 | 61,171.38 | 62,123.95 | 60,829.09 |
| Jul 2010 | 60,500 | 59,849.5 | 61,610.89 | 61,181.67 | 61,184.06 | 60,914.75 |
| Aug 2010 | 59,200 | 60,069.32 | 60,079.84 | 59,725.21 | 59,495.78 | 59,526.34 |
| Sep 2010 | 61,250 | 60,069.32 | 61,520.74 | 61,894.46 | 61,184.06 | 61,480.46 |
| Oct 2010 | 62,450 | 62,437.15 | 60,797.52 | 62,393.89 | 62,123.95 | 61,339.03 |
| Nov 2010 | 61,600 | 59,849.5 | 61,945.8 | 61,894.46 | 61,184.06 | 61,339.03 |
| MSE | | 1,030,692 | 805,291 | 504,909 | 332,503 | 302,749 |
| RMSE | | 1015.23 | 897.38 | 710.57 | 576.63 | 550.23 |
| MAPE | | 1.80% | 1.55% | 1.38% | 0.98% | 1.00% |
