Next Article in Journal
A Multi-Criteria Decision Support and Application to the Evaluation of the Fourth Wave of COVID-19 Pandemic
Previous Article in Journal
Control Meets Inference: Using Network Control to Uncover the Behaviour of Opponents
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

An Opposition-Based Learning CRO Algorithm for Solving the Shortest Common Supersequence Problem

1
School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
2
Department of Computer Science and Information Technologies, Universidad del Bío-Bío, Chillán 3780000, Chile
*
Authors to whom correspondence should be addressed.
Entropy 2022, 24(5), 641; https://doi.org/10.3390/e24050641
Submission received: 17 March 2022 / Revised: 24 April 2022 / Accepted: 29 April 2022 / Published: 3 May 2022

Abstract

:
As a non-deterministic polynomial hard (NP-hard) problem, the shortest common supersequence (SCS) problem is normally solved by heuristic or metaheuristic algorithms. One type of metaheuristic algorithms that has relatively good performance for solving SCS problems is the chemical reaction optimization (CRO) algorithm. Several CRO-based proposals exist; however, they face such problems as unstable molecular population quality, uneven distribution, and local optimum (premature) solutions. To overcome these problems, we propose a new approach for the search mechanism of CRO-based algorithms. It combines the opposition-based learning (OBL) mechanism with the previously studied improved chemical reaction optimization (IMCRO) algorithm. This upgraded version is dubbed OBLIMCRO. In its initialization phase, the opposite population is constructed from a random population based on OBL; then, the initial population is generated by selecting molecules with the lowest potential energy from the random and opposite populations. In the iterative phase, reaction operators create new molecules, where the final population update is performed. Experiments show that the average running time of OBLIMCRO is more than 50% less than the average running time of CRO_SCS and its baseline algorithm, IMCRO, for the desoxyribonucleic acid (DNA) and protein datasets.

1. Introduction

As a well-known non-deterministic polynomial hard (NP-hard) problem [1], the shortest common supersequence (SCS) problem has been widely applied in real life and in many bioinformatics and computer science fields, including pattern finding [2], multiple sequence alignment problems [3], deoxyribonucleic acid (DNA) sequencing [4], data compression [5], and artificial intelligence (AI) planning [6]. To find the optimal solution for the SCS problem many different studies have been proposed, including artificial bee colonies (ABC) [7], ant colony optimization (ACO) [8], enhanced beam search (IBS) [9], deposition and reduction (DR) [10], and the CRO_SCS (chemical reaction optimization (CRO) for the SCS problem) algorithm [11].
Compared to other heuristic algorithms, CRO algorithms have achieved better results in solving the SCS problem. The CRO algorithm is inspired by the process of obtaining low-potential energy molecules from chemical reactions in nature. The algorithm calls the solution of the problem a molecule, which is obtained from an iterative process in which four reaction operators are used to generate new molecules performing local and global search. To solve the SCS problem, Saifullah et al. proposed the CRO_SCS algorithm, a CRO-based proposal with the inclusion of a check and repair function to repair molecules after iterations [11]. The operator checks each character of the generated superstring and records the number of illegal characters. If the number of violations exceeds a certain standard, the string is discarded. If the number of violations is less than a predefined value, it is repaired. Compared to AOC, ABC, IBS and DR, CRO_SCS is able to shorten the average length of SCS and reduce the average running time.
In order to further improve the performance of the CRO algorithm in solving the SCS problem, a new algorithm based on CRO_SCS, the IMCRO algorithm, was proposed in [12]. It introduced the circular shift operator and the two-step crossover operator in the decomposition reaction and the intermolecular collision reaction. Compared to CRO_SCS, the average length of the shortest common superstring calculated by IMCRO is shorter by 1.02 characters on average.
Even though these CRO-based algorithms have shown important performance gains, they can only perform iterative searches based on the current solution. On the one hand, they are affected by the randomness of the initial population, while on the other they face the problems of slow convergence and premature convergence [13]. The opposition-based learning (OBL) mechanism has arisen as a powerful technique for overcoming these problems. In OBL, the current solution and the reverse solution are examined at the same time during the search process, which improves the diversity of the population and at the same time expands the search range of the algorithm. By doing this the algorithm has more opportunities to escape from the current search area, improving its global search ability. This can effectively solve the premature optimization problem of CRO-based algorithms and achieve convergence in fewer iterative steps, which accelerates the convergence speed [14].
This article introduces the OBL into IMCRO, proposing an opposition-based learning CRO algorithm dubbed OBLIMCRO. The main contribution of this paper is as follows:
  • Proposing an OBL function to solve the SCS problem by generating an opposite solution corresponding to the original one. When a new solution is produced, its reverse solution will be generated by the OBL function.
  • Introducing the OBL mechanism into the construction of the population by mixing random molecules and opposite molecules, which can be used to optimize the quality of the population and enrich the diversity of the population.
  • Proposing an improved CRO-based algorithm with the OBL mechanism in the initialization stage and the iteration stage, resulting in speeding up the convergence and solving the premature trap.
The rest of the paper is organized as follows: related works are summarized in Section 2; the design details of OBLIMCRO are presented in Section 3, where the OBL mechanism for population generation and updating is depicted; to illustrate the performance of OBLIMCRO, an evaluation strategy is defined and the resulting experiments and analysis are shown in Section 3.2.3; finally, the conclusions of the paper are drawn in Section 5.

2. Related Work

David Maier first defined the SCS problem in 1976 [15], which was proved to be NP-complete for sequences over an alphabet of size five. It has been widely used in bioinformatics and computer science. Different algorithms have been proposed to solve the SCS problem, including heuristic algorithms such as [8], ABC [7], IBS [9], DR [10], and CRO_SCS [11].
The CRO algorithm was originally proposed by Lam and Li in 2010 [16], and was inspired by the process of chemical reactions. In short, it is based on the fact that a chemical reaction undergoes sub-reactions where certain intermediate states correspond to a reaction. The molecule becomes more stable when the molecule’s energy is lower than it was in the prior state.
Following the CRO principle, the CRO_SCS algorithm effectively solves the SCS problem. Compared to other heuristic algorithms, the average length of the shortest common superstring obtained by CRO_SCS is smaller under the same conditions. However, as a heuristic algorithm it faces the problems of slower convergence speed and premature maturity. The IMCRO algorithm [12] is an improved algorithm based on CRO_SCS. It introduces the two-step crossover and circular shift operators for the intermolecular collision and the decomposition reactions, respectively. The use of the reaction operator expands the degree of molecular structure change, improving overall algorithm performance. The premature problem is addressed by the algorithm’s local search and global search capabilities. Compared to CRO_SCS, the average length of the shortest common superstring output by IMCRO is reduced by 1.02 characters.
By combining the CRO with other heuristic algorithms, many novel CRO-based algorithms have been proposed, including random molecule-based chemical reaction optimization (RMCRO) [17], particle swarm and CRO (HP-CRO) [18], employed bee operator-based CRO (EBCRO) [19] and bat mutation-based CRO (BMCRO) [20].
OBL is an optimization strategy used in machine learning that was proposed by Tizhoosh in 2005. The main idea of the algorithm is to simultaneously evaluate both the current solution and its opposite solution during the search process. The algorithm then selects the solution closer to the optimal solution from among both the current and opposite solutions. It then performs subsequent searches, reaching the optimal solution in fewer iterations and speeding up the search process.
Meta-heuristic algorithms use OBL to obtain better results in terms of optimizing the population quality and accelerating the speed of convergence. The work presented in [21] combines the harmony search algorithm (HS) and the OBL mechanism to construct the opposition-based learning global harmony search algorithm (OLGHS). The OBL mechanism participates in the establishment of the initial harmony library, and is considered in the initialization process. For forward and reverse solutions, it chooses the proper solution to establish the initial harmony library. This is in order to ensure that the quality of the initial harmony library is appropriate as well as to solve the premature problem by optimizing the quality of the harmony library.
Several novel CRO algorithms combined with OBL have been proposed. Examples include the quasi-opposition learning chemical reaction optimization algorithm (QOCRO) [13], used to solve the reaction energy scheduling problem, and the elite opposition-based learning chemical reaction difference (EOLCRDE) optimization algorithm [22], used to solve the power dispatch problem.
To the best of our knowledge, the OBL mechanism has not previously been used to solve SCS problems. This work introduces OBL in IMCRO (our previous work) to solve the aforementioned problem.

3. OBLIMCRO

3.1. Opposition-Based Learning

The OBL strategy included in IMCRO is intended to generate “high-quality” solutions to accelerate the iteration speed. In each iteration of the heuristic algorithm, when a new solution (e.g., the forward solution) is produced its reverse solution is provided by OBL as well. The forward solution and the reverse solution are compared and the better one joins the population. This selection process is performed before starting the next iteration, and is equivalent to expanding the search range.
Notice that by applying the OBL mechanism the quality of the population is optimized and the diversity of the population is enriched. This results in improving the algorithm’s global search ability, allowing it to avoid falling into the local optimum trap.

3.2. Oblimcro Framework

The same as IMCRO, OBLIMCRO has three stages, initialization, iteration, and finalization, as shown in Figure 1. The general procedure is described in Algorithm 1, while the stages of the framework are detailed in the following subsections.
Algorithm 1 Framework of OBLIMCRO
Input: Sets of populations.
  1:
Initialization of parameters, such as buffer, Initial KE, PopSize, KELossRate, MoleColl, β and α .
  2:
Create PopSize number of random molecules
  3:
Calculate opposite molecules of random molecules
  4:
Mix random molecules and opposite molecules
  5:
Select PopSize molecules with lowest potential energy from the mix set
  6:
while the stopping criteria is not met do
  7:
  Generate t [ 0 , 1 ]
  8:
  if  ( N u m H i t M i n H i t ) > α  then
  9:
    Randomly select one molecule m
10:
    if ( N u m H i t M i n H i t ) > α  then
11:
      Trigger Decomposition
12:
    else
13:
      Trigger On-wall Ineffective Collision
14:
    end if
15:
  else
16:
    Randomly select two molecules m 1 and m 2
17:
    if  K E β  then
18:
      Trigger Synthesis
19:
    else
20:
      Trigger Inter-molecular ineffective collision
21:
    end if
22:
  end if
23:
  Check for new better solution
24:
  Calculate opposite molecule
25:
  Choose the solution with lower PE from the solution and its opposite molecule
26:
  Update population with the chosen solution
27:
end while
Output: Best solution.

3.2.1. Initialization Stage

Initialization is the first stage of OBLIMCRO, where the molecules are initialized. Initial values of molecules include PopSize, KELossRate, MoleColl, buffer, Initial KE, α , and β . Therein, Popsize is the number of all feasible solutions, KELossRate is the maximum percentage of KE reduction, MoleColl is used to determine whether the chemical reaction is inter-molecule or uni-molecule, KE is the initial kinetic energy of the molecule, and α and β are the thresholds for intensification and diversification.
Molecules in the initial population group are generated with OBL. First, the initial molecule group, RX = x 1 , x 2 , , x n , is randomly generated; then, its opposite group, OX = o x 1 , o x 2 , , o x n o x i = obl x i , is generated with the function obl based on the OBL mechanism. Afterwards, the random and opposite group are mixed as MX, where MX = RX OX . The individuals in MX are sorted according to their potential energy and the PopSize molecules with the lowest potential energy are selected to form the initial population, denoted as IX. Figure 2 shows an example of the initialization process, where the number of molecules in RX, OX, MX, and IX is PopSize.
The construction of RX, OX, MX, and IX takes place in the initialization stage. RX is constructed with a random inserting operation, as depicted in Figure 3. Therein, the initial superstring C is randomly generated as an array the elements of which are composed of letters in the string alphabet. The element of S i is randomly inserted into C, where S i is one string of the input string set S and 1 i PopSize . This random inserting operation is performed on all strings in S to obtain the initial population with PopSize molecules. For further processing, each supersequence is encoded as a set of integer values. For instance, Σ is an alphabet set and Σ = { a , c , g , t } . It is encoded as I = { 0 , 1 , 2 , 3 } , where each element of Σ is encoded as an integer in I in the corresponding position of the set. Then, the set { 0 , 1 , 2 , 3 , 2 , 1 , 3 , 0 } can represent the supersequence E = { a , c , t , g , t , c , g , a } , as shown in Figure 4.
OX is obtained by the OBL mechanism after the initialization of RX. For each molecule X with n elements in RX, where X = x 1 , x 2 , , x n , its opposite molecule OX is obtained using the function obl ( ) , as defined in Formula (1). Therein, the function o ( x ) obtains the inverse number of the integer x. For a real number x the feasible region of which is [a,b], its inverse number ox is defined in Formula (2). Taking a molecule X = { 0 , 1 , 2 , 3 , 2 , 1 , 3 , 0 } with a feasible region [0,3] as an example, according to Formula (1), the opposite molecule is OX = { 3 , 2 , 1 , 0 , 1 , 2 , 0 , 3 } .
OX = obl ( X ) = obl ( { x 1 , x 2 , . . . , x n } ) = { o ( x 1 ) , o ( x 2 ) , . . . , o ( x n ) }
o x = o ( x ) = a + b x
After obtaining RX and OX, MX is obtained by performing the collective union operation on RX and OX, where MX = RX OX = x 1 , x 2 , , x n , o x 1 , o x 2 , , o x n with the scale as 2*PopSize. The molecules are sorted according to the molecular potential energy of the molecules in MX. Half of the molecules in MX with the lowest potential energy are selected to construct IX, which is the initial molecule population.

3.2.2. Iteration Stage

The iteration stage consists of two subtasks, reaction and repair. There are four main operators used in the action subtask: on-wall ineffective collision, decomposition, inter-molecular ineffective collision, and synthesis. On-wall ineffective collision is realized by the collision of a single molecule. Then, one element of the molecule is randomly selected and changed to form a new molecule. The operator is used for local search. Decomposition refers to a molecule colliding with the wall and splitting into two or more molecules, and is usually used for global search. Inter-molecular ineffective collision is similar to on-wall ineffective collision. Synthesis is the combination of two molecules into a new molecule, which is equivalent to the inverse reaction of decomposition.
The four operators fall into two types, uni-molecule reaction and inter-molecule reaction. Therein, uni-molecule reactions include on-wall ineffective collision and decomposition, while inter-molecule reactions include inter-molecular ineffective collision and synthesis. The four operators in this stage are the same as those defined in [12].
At the start of each iteration step, a parameter t is randomly generated. This parameter determines whether uni-molecule reactions or inter-molecule reactions are triggered. In particular, uni-molecule reactions are triggered when t > MoleColl ; otherwise, inter-molecule reactions are triggered. The specific operator is further determined according to values of α and β . In uni-molecule reactions, decomposition is triggered if ( NumHit MinHit ) > α ; otherwise, on-wall ineffective collision is triggered. In inter-molecule reactions, synthesis is triggered if K E β ; otherwise, inter-molecular ineffective collision is triggered.
Through the processing of the chemical operators a new molecule, X i , is created based on the molecule X i and an updating process is triggered, as shown in Algorithm 2. First, the opposite molecule, O X i , is created according to Formula (1). The potential energy of X i and O X i are compared and the molecule with lower potential energy is chosen to be returned to the initial population for updating.
Algorithm 2 Update based on OBL.
Input: new molecule X i .
O X i = obl( X i );
  2:
PE( X i ) = f( X i );
 
 PE( O X i ) = f( O X i );
  4:
if PE( X i ) < P E ( O X i ) then;
 
    U X i = X i ;
  6:
else
 
    U X i = O X i ;
  8:
end if
 
if U X i matches stopping criteria then
10:
  output U X i ;
 
else
12:
  put U X i into IX to update the initial group;
 
end if
Similar to that in IMCRO, a repair function to check this solution is triggered when a solution is obtained. If the requirement of the problem is satisfied, the obtained solution is repaired by a repair algorithm until the termination condition is met. Then, it enters the final stage, where the best solution is output and returned when the stopping criteria are met; otherwise, iteration is triggered again and the reactions are repeated.

3.2.3. Finalization

At the end of each iteration, termination conditions are checked and a new molecule is obtained through the OBL-based updating process (Algorithm 2). If the conditions are met, either a new molecule is output as the result or the iteration continues. There are two termination conditions. One is that the potential energy of the new molecule is less than the specified threshold, and the other is that the number of iterations reaches a certain upper limit. If either of the two conditions is met, it is deemed to have met the termination condition.

4. Experiments and Evaluation

In order to evaluate the performance and efficiency of OBLIMCRO for solving the SCS problem, we carried out a set of experiments where OBLIMCRO was compared to IMCRO and CRO_SCS, as well as to heuristic algorithms such as ACO, IBS, and DR.

4.1. Configuration of Experiments

The execution environment for the compared algorithms was a Windows-based personal computer with an i5-4210U CPU and 4.0GB RAM. We adopted the same parameters for the experiments as those in IMCRO [12]. In particular, MoleColl was 0.2, KELossRate was 0.6, and PopSize was 20.
The datasets were the same as those used in IMCRO [12], including a random DRM dataset with fifteen instances and a real DRL dataset with eleven instances. There were fifteen DNA sequences in the random DRM dataset, with each instance containing four different types of characters ( Σ = 4 ) . There were six DNA sequences and five protein sequences in the real DRL dataset. There were twenty different types of characters ( Σ = 20 ) in each protein sequence.
In the experiment, each compared algorithm was run twenty times with the different datasets. For each instance we obtained 200 different results, including the length of SCS and the execution time. The average length and the average execution time for the SCS were obtained from these 200 results. After repeating the process described above, results such as the average SCS length, represented as L, and the average execution time, represented as T, were obtained. The value of L of the specialized algorithm alg for the instance ins is denoted as L a l g ( i n s ) . Meanwhile, the value of T of the specialized algorithm alg for the instance ins is denoted as T a l g ( i n s ) ; a l g is one of the compared algorithms, e.g., alg D R , I B S , A C O , C R O S C S , I M C R O , O B L I M C R O , and i n s is one type of the datasets, e.g., ins D R M D R L .
The maximum number of iterations for the CRO operations and the current potential energy exceeding the threshold were the two stopping criteria for the execution of the compared algorithms. Specifically, the maximum number of iterations was 500 and the threshold of the potential energy was specialized according to the structure of the particular instance. As depicted in Algorithm 1, if one of the two criteria were satisfied, the algorithm was stopped and the final solution was output.

4.2. Results and Analysis

The objective of the compared algorithms was to obtain the shortest average SCS length and the minimum average execution time. Table 1, Table 2, Table 3 and Table 4 show the results of the algorithms with these two types of datasets. Therein, n is the instance’s string number and k is the string length. The emboldened values in each table are the best results.
Table 1 shows that L O B L I M C R O ( i n s ) < L a l g ( i n s ) , where ins DRM and alg D R , I B S , A C O , C R O S C S , I M C R O . Table 2 shows that T O B L I M C R O ( i n s ) T a l g ( i n s ) , where i n s D R M and alg D R , I B S , A C O , C R O S C S , I M C R O . The results show that when compared with DR, IBS, ACO, CRO_SCS and IMCRO, OBLIMCRO obtained the minimum average SCS length and minimum average execution time.
Table 3 shows that L O B L I M C R O ( i n s ) < L a l g ( i n s ) , where ins DRL except i n s { P R O T 3 , P R O T 4 , P R O T 5 } and alg D R , I B S , A C O , C R O S C S , I M C R O . Meanwhile, L O B L I M C R O ( P R O T 3 ) , L O B L I M C R O ( P R O T 4 ) , and L O B L I M C R O ( P R O T 5 ) are closest to L I M C R O ( P R O T 3 ) , L I M C R O ( P R O T 4 ) , and L C R O _ S C S ( P R O T 5 ) , respectively, which are the smallest values. Table 4 shows that T O B L I M C R O ( i n s ) T a l g ( i n s ) , where i n s D R L and a l g { D R , I B S , A C O , C R O _ S C S , I M C R O } .
The results shown in Table 1 and Table 3 indicate that OBLIMCRO obtains the minimum value in both average SCS length and average execution time for DNA instances in real datasets. However, OBLIMCRO does not considerably overwhelm IMCRO and CRO_SCS for protein instances, while OBLIMCRO, IMCRO and CRO_SCS achieve smaller values of the average SCS length in comparison with the other algorithms. In particular, in comparison with IMCRO OBLIMCRO can reduce L for DNA instances, as shown in Formula (3), where RC is the average reduced value. In DRM, RC with OBLCRO is 2.11, where num = 15. On the other hand, RC with OBLCRO is 2.24 in DRL, where num = 6.
Except for the reduction in average SCS length, the reduction in average running time is better. In the random set, the average running time of the OBLIMCRO algorithm is the lowest among the three CRO algorithms. Compared with the CRO_SCS algorithm, after using the OBLIMCRO algorithm the average running time is shortened by 52.8%. Compared with IMCRO, after using the OBLIMCRO algorithm the average running time is shortened by 50.9%. In the real set, a substantial improvement in the average running time was found. Compared with CRO_SCS, the average running time of OBLIMCRO is shortened by 54.3%, and compared with IMCRO, the running time of OBLIMCRO is shortened by 50.3%. It can be seen that whether using experimental data in a random set or using a real set, after using OBLIMCRO the running time can be shortened by more than half when compared with CRO_SCS and IMCRO. The average reduction in average running time, Δ T , for T can be obtained using Formula (4).
R C = i = 1 n u m ( L I M C R O / C R O _ S C S ( D N A i ) L O B L I M C R O ( D N A i ) ) n u m
T = T I M C R O / C R O _ S C S ( i ) T O B L I M C R O ( i ) T L M C R O / C R O _ S C S ( i )
To prove the credibility of the results above, we applied a statistical significance test to L O B L I M C R O on L C R O _ S C S and L I M C R O using t-tests [23]. The p-value of the t-test was 2.83 × 10 5 and 0.0013 for the random datasets (Table 1), respectively, and 0.04 and 0.0164 for the real datasets (Table 3), respectively. All of these p-values are smaller than 0.05, which means that the difference in the average SCS length between OBLIMCRO and IMCRO or CRO_SCS is statistically significant. The results of the t-tests statistically verify the credibility of the results, showing that OBLIMCRO can efficiently reduce the average SCS length for most of the datasets in comparison with IMCRO and CRO_SCS.
Next, we applied a statistical significance test to T O B L I M C R O on T C R O _ S C S and T I M C R O using t-tests, as well. For the random dataset (Table 1), the p-value of the t-test was 0.12 and 0.13 respectively, while for the real datasets (Table 3) it was 0.021 and 0.024 respectively. As all of these p-values are smaller than 0.05, the difference in the average running time between OBLIMCRO and IMCRO or CRO_SCS is statistically significant. The results of the t-tests statistically verifies the credibility of te results, showing that in comparison with IMCRO and CRO_SCS, OBLIMCRO can efficiently reduce the running time required to resolve the SCS problem.
The number of iterations required to reach the best solution with the CRO algorithms is shown in Table 5 and Table 6, and is related to the PE threshold of each instance. For the three CRO algorithms, the number of CRO operations per iteration is less than 500, which is configured as the maximum iterations. This indicates that the three CRO algorithms converge invariantly.

5. Conclusions

We proposed a novel algorithm by introducing an OBL mechanism into IMCRO to solve the SCS problem. The proposed algorithm, named OBLIMCRO, aims to tackle common problems found in state-of-the-art algorithms for SCS. The OBL mechanism participates in the process of constructing and updating the population, which optimizes the quality of the population that is used to obtain the final solution. Experimental results show that for DNA proteins in both random and real datasets, the average running time is reduced and the average length required to solve the SCS is shortened. The results indicate that with the help of OBL, OBLIMCRO can achieve more optimal solutions with a faster convergence speed.

Author Contributions

Conceptualization, Y.L.; Methodology, W.D.; Supervision, F.L.; Writing—original draft, C.C.; Writing—review editing, J.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the project on the Shanghai Action Plan of Technological Innovation (20DZ1201400, 22ZR1416500) and sponsored by Shanghai Sailing Program (20YF1410900).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

Yongjun Luo and Yong Li are thanked for expert advice and inspiring discussions.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Räihä, K.; Ukkonen, E. The Shortest Common Supersequence Problem over Binary Alphabet is NP-Complete. Theor. Comput. Sci. 1981, 16, 187–198. [Google Scholar] [CrossRef] [Green Version]
  2. Ning, K.; Ng, H.K.; Leong, H.W. Finding Patterns in Biological Sequences by Longest Common Subsequencesand Shortest Common Supersequences. In Proceedings of the Sixth IEEE Symposium on BioInformatics and BioEngineering (BIBE’06), Arlington, VA, USA, 16–18 October 2006; pp. 53–60. [Google Scholar]
  3. Garg, A.; Garg, D. Progressive alignment using Shortest Common Supersequence. In Proceedings of the 2014 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Delhi, India, 24–27 September 2014; pp. 1113–1117. [Google Scholar]
  4. Mangal, K.; Kumar, R. A Recursive Algorithm for Generalized Constraint SCS Problem. Natl. Acad. Sci. Lett. 2016, 39, 273–276. [Google Scholar] [CrossRef]
  5. Timkovskii, V.G. Complexity of common subsequence and supersequence problems and related problems. Cybernetics 1989, 25, 565–580. [Google Scholar] [CrossRef]
  6. Foulser, D.E.; Li, M.; Yang, Q. Theory and Algorithms for Plan Merging. Artif. Intell. 1992, 57, 143–181. [Google Scholar] [CrossRef]
  7. Jaradat, A.S.; Noaman, M.M. Solving Shortest Common Supersequence Problem Using Artificial Bee Colony Algorithm. Int. J. ACM Jordan 2011, 2, 180–185. [Google Scholar]
  8. Rajendran, S.; Rajendran, C.; Ziegler, H. An Ant-Colony Algorithm to Transform Jobshops into Flowshops: A Case of Shortest-Common-Supersequence Stringology Problem. In Bio-Inspired Models of Network, Information, and Computing Systems, Proceedings of the 5th International ICST Conference, BIONETICS 2010, Boston, MA, USA, 1–3 December 2010; Revised Selected Papers; Springer: Berlin/Heidelberg, Germany, 2010; pp. 413–424. [Google Scholar]
  9. Mousavi, S.R.; Bahri, F.; Tabataba, F. An enhanced beam search algorithm for the Shortest Common Supersequence Problem. Eng. Appl. AI 2012, 25, 457–467. [Google Scholar] [CrossRef] [PubMed]
  10. Ning, K.; Leong, H.W. Towards a better solution to the shortest common supersequence problem: The deposition and reduction algorithm. BMC Bioinform. 2006, 7, S12–S22. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  11. Saifullah, C.M.K.; Islam, M.R. Chemical reaction optimization for solving shortest common supersequence problem. Comput. Biol. Chem. 2016, 64, 82–93. [Google Scholar] [CrossRef] [PubMed]
  12. Luo, F.; Chen, C.; Fuentes, J. An improved chemical reaction optimization algorithm for solving the shortest common supersequence problem. Comput. Biol. Chem. 2020, 88, 107327. [Google Scholar] [CrossRef] [PubMed]
  13. Dutta, S.; Paul, S.; Roy, P.K. Optimal allocation of SVC and TCSC using quasi-oppositional chemical reaction optimization for solving multi-objective ORPD problem. J. Electr. Syst. Inf. Technol. 2018, 5, 83–98. [Google Scholar] [CrossRef]
  14. Tizhoosh, H. Opposition-Based Learning: A New Scheme for Machine Intelligence. In Proceedings of the International Conference on Computational Intelligence for Modelling, Control and Automation and International Conference on Intelligent Agents, Web Technologies and Internet Commerce (CIMCA-IAWTIC’06), Vienna, Austria, 28–30 November 2005; Volume 1, pp. 695–701. [Google Scholar]
  15. Maier, D. The Complexity of Some Problems on Subsequences and Supersequences. J. ACM 1978, 25, 322–336. [Google Scholar] [CrossRef]
  16. Lam, A.Y.S.; Li, V.O.K. Chemical Reaction Optimization: A tutorial—(Invited paper). Memetic Comput. 2012, 4, 3–17. [Google Scholar] [CrossRef] [Green Version]
  17. Yang, Q.; Yang, Z.; Hu, G.; Du, W. A New Fusion Chemical Reaction Optimization Algorithm Based on Random Molecules for Multi-Rotor UAV Path Planning in Transmission Line Inspection. J. Shanghai Jiaotong Univ. Sci. 2018, 23, 671–677. [Google Scholar] [CrossRef]
  18. Li, Z.; Nguyen, T.T.; Chen, S.; Truong, T.K. A hybrid algorithm based on particle swarm and chemical reaction optimization for multi-object problems. Appl. Soft Comput. 2015, 35, 525–540. [Google Scholar] [CrossRef]
  19. Li, Z.; Yuan, T.; Yang, B.; Jiang, S.; Xie, Y. EBCRO: Hybrid chemical reaction with employed bee operator. In Proceedings of the 2017 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD), Guilin, China, 29–31 July 2017; pp. 192–200. [Google Scholar]
  20. Ngam, R.; Li, Z. Bat-Mutation Chemical Reaction Optimization Algorithm for Conflict Optimization Problem: Case of Bandwidth Utilization. J. Comput. Theor. Nanosci. 2017, 14, 5118–5127. [Google Scholar]
  21. Zhai, J.; Qin, Y. Opposition-based learning in global harmony search algorithm. Control. Decis. 2019, 34, 1449–1455. [Google Scholar]
  22. Fei, H. The Research of Dynamic Economic Dispatch Integrated with Wind Power System Based on Chemical Reaction Optimization Algorithm; Hunan University: Changsha, China, 2017. [Google Scholar]
  23. Papapetrou, M.; Kugiumtzis, D. Investigating long range correlation in DNA sequences using significance tests of conditional mutual information. Comput. Biol. Chem. 2014, 53, 32–42. [Google Scholar] [PubMed]
Figure 1. Framework of OBLIMCRO.
Figure 1. Framework of OBLIMCRO.
Entropy 24 00641 g001
Figure 2. Initialization based on OBL.
Figure 2. Initialization based on OBL.
Entropy 24 00641 g002
Figure 3. Population generation.
Figure 3. Population generation.
Entropy 24 00641 g003
Figure 4. Solution representation.
Figure 4. Solution representation.
Entropy 24 00641 g004
Table 1. Average SCS Length in random datasets.
Table 1. Average SCS Length in random datasets.
nkL
ACODRIBSCRO_SCSIMCROOBLIMCRO
51022.521.219.920.219.719.6
101026.725.125.225.324.924.8
501031.531.130.029.328.628.6
1001033.032.532.032.131.530.5
5100207.4198.2184.0181.6180.5179.8
10100233.7226.2210.0209.4208.3207.6
50100263.7262.0252.0244.4243.8241.4
100100270.1269.2261.1252.1251.0247.3
500100277.2277.8273.6267.6266.7266.5
1000100281.7278.5276.8270.1269.7269.7
5000100282.9282.9281.5271.9270.6269.9
10010002535.62531.62466.72443.22442.12440.1
50010002565.62578.82540.22532.12530.52530.2
100010002570.82581.42555.52535.62533.92531.4
500010002590.62586.92571.62562.92561.82558.7
Table 2. Average execution time in random datasets.
Table 2. Average execution time in random datasets.
nkT/s
ACODRIBSCRO_SCSIMCROOBLIMCRO
5100.80.0180.030.0080.0080.005
10101.000.0330.030.030.030.02
50102.30.10.070.080.0650.034
100103.50.150.120.080.0550.021
51005.90.60.140.020.0130.005
101008.61.180.220.140.120.05
5010016.34.070.460.370.260.10
10010023.57.280.910.650.520.30
50010065.527.33.061.690.920.28
1000100127.969.26.452.661.950.56
5000100706.6339.441.655.015.012.48
1001000207.7420.66.335.755.52.1
5001000651.11205.337.9315.8514.126.43
100010001296.52116.861.6739.922.019.56
500010003101.63761.4487.16480.02480.01238.31
Table 3. Average SCS Length in real datasets.
Table 3. Average SCS Length in real datasets.
NamenkL
ACODRIBSCRO_SCSIMCROOBLIMCRO
DNA-11005001346.91332.61280.71271.41271.01270.0
DNA-25005001520.01404.61352.71351.81350.81349.7
DNA-310010002712.22670.12542.92442.52440.82438.9
DNA-450010003092.12782.72664.42532.42530.32530.2
DNA-5100100297.8285.4272.3252.1251.5250.6
DNA-6500100405.2291.5288.3267.2266.3266.2
PROT-11005006908.24851.44349.74312.44311.64310.5
PROT-25005008910.45545.25229.35041.15040.85028.2
PROT-31000500110865748.75395.65301.75301.95302.1
PROT-41001001303.51005.6913.3920.9921.2921.8
PROT-55001001776.21205.11107.61126.01125.91126.7
Table 4. Average execution time in real datasets.
Table 4. Average execution time in real datasets.
NamenkT/s
ACODRIBSCRO_SCSIMCROOBLIMCRO
DNA-1100500151.3349.43.001.911.910.33
DNA-2500500613.1540.617.148.357.831.91
DNA-31001000334.4483.610.315.855.481.43
DNA-450010001514.11156.542.415.715.466.36
DNA-510010041.977.870.920.560.490.23
DNA-650010092.037.552.950.810.730.34
PROT-1100500560.31125.392.431.9821.188.90
PROT-25005001450.21905.4307.852.2950.5024.01
PROT-310005003205.44002.71905.6116.12110.7662.46
PROT-410010016.431.213.51.661.540.94
PROT-5500100123.595.765.34.634.532.69
Table 5. Iterations in random datasets.
Table 5. Iterations in random datasets.
nkThreshold CRO _ SCS IMCROOBLIMCRO
51010131110
101025272625
50103544
100101322
510040414140
1010040414141
5010010111111
10010071088
5001001222
10001001222
50001001321
100100055575656
500100010111111
100010004655
500010001211
Table 6. Iterations in real datasets.
Table 6. Iterations in real datasets.
NamenkThreshold CRO _ SCS IMCROOBLIMCRO
DNA-110050020205201150
DNA-25005006616150
DNA-3100100030301301240
DNA-4500100016161161106
DNA-51001003643
DNA-650010041454141
PROT-110050040402401205
PROT-25005008828171
PROT-310005004434133
PROT-41001005525135
PROT-55001002212110
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Luo, F.; Chen, C.; Fuentes, J.; Li, Y.; Ding, W. An Opposition-Based Learning CRO Algorithm for Solving the Shortest Common Supersequence Problem. Entropy 2022, 24, 641. https://doi.org/10.3390/e24050641

AMA Style

Luo F, Chen C, Fuentes J, Li Y, Ding W. An Opposition-Based Learning CRO Algorithm for Solving the Shortest Common Supersequence Problem. Entropy. 2022; 24(5):641. https://doi.org/10.3390/e24050641

Chicago/Turabian Style

Luo, Fei, Cheng Chen, Joel Fuentes, Yong Li, and Weichao Ding. 2022. "An Opposition-Based Learning CRO Algorithm for Solving the Shortest Common Supersequence Problem" Entropy 24, no. 5: 641. https://doi.org/10.3390/e24050641

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop