An Optimized Ensemble Deep Learning Model for Predicting Plant miRNA–IncRNA Based on Artificial Gorilla Troops Algorithm

Hamdy, Walid; Ismail, Amr; Awad, Wael A.; Ibrahim, Ali H.; Hassanien, Aboul Ella

doi:10.3390/s23042219

Open AccessArticle

An Optimized Ensemble Deep Learning Model for Predicting Plant miRNA–IncRNA Based on Artificial Gorilla Troops Algorithm

by

Walid Hamdy

^1,†,

Amr Ismail

¹,

Wael A. Awad

²,

Ali H. Ibrahim

¹ and

Aboul Ella Hassanien

^3,*,†

¹

Faculty of Science, Port Said University, Port Said 42511, Egypt

²

Faculty of Computers and Artificial Intelligence, Damietta University, El-Gadeeda 34519, Egypt

³

Faculty of Computers and Artificial Intelligence, Cairo University, Giza 12613, Egypt

^*

Author to whom correspondence should be addressed.

^†

Scientific Research Group in Egypt (SRGE), Giza 13062, Egypt.

Sensors 2023, 23(4), 2219; https://doi.org/10.3390/s23042219

Submission received: 21 January 2023 / Revised: 11 February 2023 / Accepted: 14 February 2023 / Published: 16 February 2023

(This article belongs to the Special Issue Precision Agriculture with Deep and Machine Learning)

Download

Browse Figures

Versions Notes

Abstract

:

MicroRNAs (miRNA) are small, non-coding regulatory molecules whose effective alteration might result in abnormal gene manifestation in the downstream pathway of their target. miRNA gene variants can impact miRNA transcription, maturation, or target selectivity, impairing their usefulness in plant growth and stress responses. Simple Sequence Repeat (SSR) based on miRNA is a newly introduced functional marker that has recently been used in plant breeding. MicroRNA and long non-coding RNA (lncRNA) are two examples of non-coding RNA (ncRNA) that play a vital role in controlling the biological processes of animals and plants. According to recent studies, the major objective for decoding their functional activities is predicting the relationship between lncRNA and miRNA. Traditional feature-based classification systems’ prediction accuracy and reliability are frequently harmed because of the small data size, human factors’ limits, and huge quantity of noise. This paper proposes an optimized deep learning model built with Independently Recurrent Neural Networks (IndRNNs) and Convolutional Neural Networks (CNNs) to predict the interaction in plants between lncRNA and miRNA. The deep learning ensemble model automatically investigates the function characteristics of genetic sequences. The proposed model’s main advantage is the enhanced accuracy in plant miRNA–IncRNA prediction due to optimal hyperparameter tuning, which is performed by the artificial Gorilla Troops Algorithm and the proposed intelligent preying algorithm. IndRNN is adapted to derive the representation of learned sequence dependencies and sequence features by overcoming the inaccuracies of natural factors in traditional feature architecture. Working with large-scale data, the suggested model outperforms the current deep learning model and shallow machine learning, notably for extended sequences, according to the findings of the experiments, where we obtained an accuracy of 97.7% in the proposed method.

Keywords:

CNN model; deep learning; ensemble learning; lncRNA; IndRNN; microRNA; plant

1. Introduction

Eukaryotic genomes produce non-coding RNAs (ncRNAs) with broad transcription features, and researchers discovered that only around 1–2% of transcripts are engaged in coding the protein [1]. Non-coding RNAs (ncRNAs) are most transcripts that are not implicated in protein coding [2]. In the past decade, the number and ncRNA function of species have become important research topics in biology. ncRNA is categorized as short non-coding RNA (sncRNA) or long non-coding RNA (lncRNA) depending on if the length of the transcript is greater than 200 nt [3]. The most common kinds of this sncRNA are lncRNA and miRNA [4]. The increasing knowledge of function ncRNA technicality has the attention of a growing number of researchers.

Detection and function investigation of sncRNA has become a popular topic. Additionally, the relationship between lncRNA and miRNA, according to researchers, is significant in the control of gene expression and is linked to species development, material metabolism, embryonic development, and the incidence of numerous disorders [5]. As a result, it is critical to understand how RNA molecules interact and how they operate. A thorough investigation of the interplay between lncRNA and miRNA will transform our present comprehension of cell structures and organization and provide significant scientific and medicinal benefits.

In plants, there are two kinds of interactions between lncRNA and miRNA. (1) In addition to performing a direct role as a miRNA precursor, lncRNA can be spliced into shorter miRNA. For example, MiR869a and miR160c may be spliced into lncRNAs npc521 and npc83 [6]. (2) lncRNA can be cleaved by miRNA as a target [7]. By functioning as a target for miRNA, lncRNA controls the phosphate balance in plants in vivo by reducing miRNA’s inhibitory impact on genes. The “sponge effect” [8] describes how lncRNA might operate as a miRNA decoy, competing with mRNA for miRNA binding and thereby regulating the expression of miRNA target genes. In tomato yellow mosaic virus (TYMV), two lncRNAs, slylnc1077 and slylnc0195, were discovered to behave as decoys for miRNAs [9]. Slylnc0195 expression is up-regulated in tomatoes that are infected with TYMV; however, miR166a expression is downregulated.

According to studies, the relationship between lncRNA and miRNA regulates plant disease impedance, cell differentiation, vernalization, fruiting and flowering, drought resistance, cold tolerance, and other biotic and abiotic stress resistance mechanisms [10]. There are fewer investigations on the relationship between lncRNA and miRNA in plants than in human and animal studies. Furthermore, only a few lncRNA and miRNA mechanisms of action have been verified, resulting in a lack of experimental data on lncRNA and miRNA in the area of plants, making bioinformatics criteria for in-depth analysis of the interactions between lncRNA and miRNA difficult to satisfy. Thus, obtaining a vast amount of data about the interactions of lncRNA and miRNA in relation to plant growth and development is critical for in-depth research of the functional mechanism of lncRNA and miRNA interaction in plants.

To anticipate the interactions between lncRNA and miRNA, at the moment, shallow machine learning is used in many articles to structure systems for prediction through features selected in the prediction of the interactions between lncRNA and miRNA. However, there are numerous issues such as a lack of training data, human factors, and large noise, which together result in reduced predictability of the results. In recent years, the development of computational approaches for finding connections in different biological datasets has received much attention [11]. This approach employs a two-layer Convolutional Neural Network (CNN) [12] to understand sequence characteristics and discover functional domains in nucleotide sequences. After that, a two-layer IndRNN is used to understand long-term functional domain dependences to categorize data [13].

The relationship between miRNA and lncRNA has been shown in studies to have a key regulatory function in plant disease resistance, vernalization, cell differentiation, flowering and fruiting, cold tolerance, drought resistance, and other biotic and abiotic stressors. In comparison to people and animals, there have been few studies on the interplay between miRNA and lncRNA in plants. Furthermore, only a few miRNA and lncRNA modes of action have been verified, resulting in a lack of experimental data on miRNA and lncRNA in plants, making it difficult to meet bioinformatics requirements for in-depth investigation of the interaction between miRNA and lncRNA. As a result, a vast amount of data on the interaction between miRNA and lncRNA related to plant growth and development is critical for in-depth research of the functional mechanism of the interaction between miRNA and lncRNA in plants. Predicting the link between lncRNA and miRNA is the primary goal of deciphering their functional activities. The prediction accuracy and reliability of traditional feature-based classification systems are commonly injured due to small data size, human factor limitations, and a large amount of noise. According to recent studies, it was found that when any error, even a slight one, occurs in the predicted relationship between lncRNA and miRNA, this leads to significant problems in the plant classification. This inspired our research, seeking to improve the performance of the prediction relationship between lncRNA and miRNA, which play vital roles in controlling the biological processes of animals and plants. The study’s contributions are summarized as follows:

We propose CNN-RNN, a deep learning model to predict the interaction in plants between lncRNA and miRNA.
The GTO algorithm is used to learn the parameters of the CNN-RNN. As a result of the GTO’s convergence performance, the prediction success of CNN-RNN with GTO has been increased.
We lay the groundwork for further investigation into the interaction mechanism of miRNA and lncRNA in plants.

2. Related Work

Deep learning has recently been popular in a variety of miRNA datasets, as it outperforms traditional methods in miRNA prediction. For prediction, researchers used either pre-trained models or ordinary deep learning [14].

LncRNAs and miRNAs have been demonstrated to control each other. Zhang et al. [15] suggested “M6A-HPCS”, which enhanced the accuracy of m6A prediction by introducing a new methodology, the nucleotide physical–chemical property selection (HPCS) algorithm. Zhou et al. [16] developed SRAMP, which predicted m6A locations in humans and other animals such as mice using sequence features and the random forest (RF) approach. This approach generated three RF classifiers using KNN encoding, binary encoding, and spectrum encoding. Finally, a weighted sum was used to combine the three categorization models into a single model. However, the accuracy of m6A location predicted on the benchmark yeast datasets was not as good as that of the benchmark mammalian datasets using this strategy. Chen et al. [17] presented the “MethyRNA” technique in 2017, which used the SVM algorithm to classify RNA sequences and sequence-derived feature coding. Qiang et al. [18] suggested the “M6AMRFS” framework, which coupled the Sequential Forward Search (SFS) algorithm and the F1-score to enhance the ability of feature representation and used the XGBoost algorithm to develop a new method of feature representation that uses two features described for RNA code sequences. Authors in [19] suggested “M6Apred-EL” using ensemble learning in the same year. Three SVM classifiers were trained using this technique, which looked at ring function hydrogen chemical properties, physical–chemical information, physical–chemical characteristics, and position-specific data. For the first time, Ref. [20] presented the “DeepM6ASeq” approach, which combined CNN and BLSTM networks with one-hot encoding. Zou et al. [21], in 2019, developed the deep learning model “Gene2vec”. This technique considered the influence of RNA sequence prediction window length on experimental accuracy. It addressed four potential sequence encoding methods and, for the first time, used the word2vec model for RNA encoding sequences. Ref. [22] introduced the “WHISTLE” model in this year, which merged RNA sequence characteristics and genomic features and categorized the sequences using the SVM method; this model increased the m6A site performance of prediction (average accuracy: 94.8% and 88.0% under mature mRNA models and the whole transcript, respectively). Liu et al. [23] created a systematic approach that can concurrently identify m6A, m5C, m1A, and Ato-I adjustments in several species. The majority of the approaches listed above are based on standard methods of machine learning, which lack deep learning exploration and primarily concentrate on feature extraction, resulting in poor model generalizability and prediction accuracy.

A novel meta-heuristic Artificial Gorilla Troops Optimizer (GTO) technique was suggested recently [24]. GTO displays gorilla social behavior. It effectively solves mathematical problems, producing optimal solutions in a fraction of the time required by other leading techniques. GTO has recently been used to tackle optimization problems in a variety of fields. GTO was employed by Ahmed Ginidi et al. [25] for photovoltaic model parameter extraction, while Abdel-Basset et al. [26] presented memory-based improved GTO (MIGTO) for the same problem. To handle feature selection in biological data, Piri et al. [27] introduced discrete artificial GTO (DAGTO). Liang et al. [28] updated GTO with opposition-based learning and parallel methods, offering OPGTO to decrease errors in the wireless sensor network’s node location. For the problem of global optimization, Xiao et al. [29] presented enhanced GTO based on two strategies, namely lens opposition-based learning and adaptive β-hill climbing.

Essam et al. [30] discusses how the Gorilla Troops Algorithm is used to minimize the energy consumption of large-scale wireless sensor networks. As for the energy usage scores, optimum sink node placements and localization faults are addressed to carefully examine the efficacy of the considered MO approaches and determine the optimal positions and the smallest number of sink nodes that will satisfy the entire network. Authors show that the MOGTO model allocated the smallest sink nodes across all network sizes, demonstrating its efficacy in reducing energy consumption averages and increasing the network’s lifetime. Wu, Tingyao et al. [31] proposed a new Modified Gorilla Troops Optimizer (MGTO). There are three aspects to the improvement strategies: Quasi-Reflection-Based Learning (QRBL), Beetle Antennae Search Based on Quadratic Interpolation (QIBAS), and Teaching–Learning-Based Optimization (TLBO). They compared the MGTO with traditional GTO, SSA, GWO, ROLGWO, PSO, WOA, AOA, HSCAHS, and DSA algorithms to prove that MGTO has efficiency and promising potential in real-world optimization challenges. El-Dabah et al. [32] investigated how well a power system stabilizer (PSS) unit may be modified by using the GTO algorithm. They used the Integral Time Square Error (ITSE) as a fitness function that should ideally be minimized. They also used four alternative controllers to study a single machine scheme model as a model for the infinite bus. They determined that the GTO algorithm has faster convergence over the other compared optimization techniques. Bhadoria et al. [33] proposed a Chaotic Gorilla Troops Optimizer (CGTO) to present a novel solution to the power generation scheduling problem. The gorilla update technique first collects a binary string of generators in order to find the global best solution (s), and then performs a chaotic operation. Finally, they reported that the CGTO has a strong performance compared with other various techniques for solving the power generation scheduling problem.

3. CNN’s Model Setting and Phases

Initially, shallow machine learning approaches depending on feature engineering were used to classify and predict issues. However, due to their numerous disadvantages, researchers have begun to focus on deep learning methods [34]. Deep learning has recently become popularly employed for sequence classification [35], biological information [36], image processing [37], computer vision [38], natural language processing [39], and other sectors, with positive outcomes.

3.1. Structure of IndRNN and CNN

CNNs and recurrent neural networks (RNN) are the most common deep learning models [40]. Their variations make up the majority of existing deep-learning models. CNN-RNN is the combination of IndRNN and CNN. The model employs a two-layer CNN to extract significant features from the best filter. The CNN convolution layers extract data feature information on multiple levels [41] and process the feature during the pooling layers to obtain the best classification features. The collected feature information is then passed to the layers of IndRNN, which uses it to learn more about the feature dependencies. A dropout layer is used in the model to avoid overfitting. Simultaneously, the ReLU function was chosen as the function activation since it outperforms the sigmoid function in terms of promoting sparse and successfully decreasing the gradient likelihood value [42]. Figure 1 depicts the specific construction of the ensemble CNN.

Long-term reliance between sequences can be learned by IndRNN. The model employs a two-layer IndRNN structure to learn sequence dependencies better. IndRNN is a simple structure that may be readily expanded to many network designs, unlike classic RNN. Because neurons in the same layer are autonomous, each neuron’s behavior can be studied without considering the impact of another neuron. It can tackle the gradient explosion and gradient disappearance problems in standard RNNs by deepening the network level without sacrificing the ability to deliver the training loop or rely on gateway parameters [43] while still preserving long-term memory. As a result, the gradients can be successfully transmitted in various time steps. Many IndRNNs can be stacked together to build a bigger network, allowing the network to be more in-depth and persistent, investigating information across channels and understanding data dependency. Equation (1) can describe the update status as follows:

h_{t} = σ (W_{x_{i}} + U ⊙ h_{t - 1} + b)

(1)

where x_t and h_t denote the inputs and hidden states, respectively, at timestep t. U and W are recurring inputs and the weights of the currents, respectively, while b is the neuron’s bias. W₁, W₂, and ReLU and Recurrent represent the loop processing and input weights for each step, with ReLU as the activation function, and BN signifies standardized batch processing. As shown in Figure 2, further deepening of the IndRNN model can be achieved by layering this structure.

3.2. Ensemble CNN (CNN and IndRNN)

We propose a new model, CNN-RNN, that depends on a deep learning ensemble basis of traditional IndRNN and CNN. The model is divided into two halves. The classic CNN is a feed-forward neural network that uses convolution to extract features and then pool layers to understand local input data characteristics. Another component is IndRNN, which is an RNN expansion. Internal feedback connection, internal memory, and feed-forward adjustment between processing components are properties of RNN. As a result, it positively impacts the processing of sequence information.

On the other hand, CNN ignores the connection between non-continuous sequences and only looks at the connection between continuous sequences when dealing with sequence data. Although RNN is well suited to processing sequence data, it is challenging to address long-term information dependency. There are also issues with gradient disappearance and gradient explosion. CNN-RNN combines the strengths of CNN and IndRNN. This allows for the comprehensive extraction of feature information and the consideration of long-term sequence dependence. Figure 3 depicts the general architecture of the ensemble model.

4. Artificial Gorilla Troops Optimizer (GTO)

The Artificial Gorilla Troops Optimizer (GTO) was recently described as a metaheuristic optimization method based on gorilla behavior [24]. In solving numerous engineering issues, the GTO optimizer demonstrated remarkable accuracy and efficiency [25]. Furthermore, by raising the number of search capabilities, the GTO method has an extraordinary capacity to produce desirable outcomes and acceptable performance for various system dimensions. It also outperforms other optimizers in all similar dimensions because other optimizers’ efficiency degrades as the number of dimensions increases. Another benefit of the GTO is that it excels at balancing exploration and exploitation skills in the face of large-scale problems [24].

During this phase, each gorilla is set up as a competitor for the best solution in each iteration, with the best solution being known as the silverback gorilla. Three different mechanisms are summarized in Equation (2).

G X (t - 1) = {\begin{array}{l} (U_{r} - L_{r}) r_{1} + L_{r} & i f r a n d < 0 \\ (r_{2} - D) X_{r} (t) + S \times V & i f r a n d \geq 0.5 \\ X (i) - S (S (X (t) - G X_{r} (t)) + r_{3} (X (t) - G X_{r} (t))) & i f r a n d < 0.5 \end{array}

(2)

where O is the parameter for migration to an unknown site. GX(t − 1) represents the candidate of the gorilla vector position in the next iteration, rand is a random variable, X(t) represents the gorilla position of the current vector, and X(i) represents the candidate gorilla member number [44]. Furthermore, Lr and Ur denote the lower and upper limits of the problem variables, respectively. GX_r(t) denotes the position of one of the randomly picked gorilla candidates, and X_r(t) represents the position of this random gorilla. Furthermore, the random variables r₁, r₂, and r₃ fall in the range [0, 1]. The following equations are used to compute the parameters D, V, and S.

D = N \times (1 - t ∕ M a x I t)

(3)

S = D * R

(4)

V = T / X (i)

(5)

where MaxIt is the maximum number of iterations, R is the random number in the range [−1, 1], and T is the random value in the range [−D, D]. The male gorillas in the group are accustomed to following the silverback to find food. A “silverback” is the optimal solution found during this step. This behavior can be expressed quantitatively as follows:

G_{x} (t + 1) = S * {(| 1 \ N \sum_{J - 1}^{N} G_{j} (t) |)}^{1 \ 2^{S}}

(6)

while the choice of adult females can be mathematically stated as follows:

G_{x} (t + 1) = Y_{silverback} - (Y_{silverback} - Y (t)) * (2 * r a n d - 1) * A

(7)

where

Y_{silverback}

is the best candidate solution’s position (silverback), and A is a constant parameter.

Finally, at the end of the exploitation phase, the fitness function solution is updated with the best solution. The following are the main steps of the GTO Algorithm 1:

Create a population with random positions.
Set the parameters D, S, L, U, and MaxIt.
Determine the gorilla’s position using Equation (2).
Evaluate each gorilla’s fitness function.
Set the best solution as the silverback’s location.
Update the gorilla position based on D and W values using Equation (7).
Display the best gorilla posture and keep the fitness function updated until the maximum number of iterations is reached.

For evaluating the fitness function score for each solution, as previously stated, each solution is composed of random floating point values between 0 and 1. A fitness/objective function is required to evaluate the individual solution in feature selection (FS). The primary goal of feature selection is to enhance forecast accuracy while lowering the number of features. More specifically, in this variation of the proposed work, the objective function (OF), which combines both criteria, is described as

O F (O) = α * c l a s s i f i c a t i o n_{e r r o r} + (1 - α) * \frac{L S}{L}

(8)

where classification_error is the learning algorithm’s error rate, LS is the length of the feature substring, L is the original dimension, and α (here, 0.99) is a control parameter for the effect of classification performance and feature size.

Algorithm 1. contains the detailed algorithm for the proposed GTO
Algorithm for the proposed GTO
1	for all Gorilla Oi do
2	Alter the gorilla location by Equation (1)
3	Use the sigmoid function to turn the gorilla location into a probability value
4	Using Equation (6), for compute the candidate gorilla position in the discrete domain
5	end for
6	for I = 1to N do
7	Compute OF of each candidate gorilla (Gi)
8	If Gi is fitter than Oi, replace it,
9	end for
10	Set the best location as the Silverback
11	for all Gorilla Oi do
12	if D ≥ W then
13	Using Equation (7) for changing the Gorilla’s location
14	Else
15	Using Equation (6) for changing the Gorilla’s location
16	end if
17	end for
18	for I = 1to N do
19	Compute OF of each candidate gorilla (Gi)
20	Replace Oi if Gi is more appropriate, where G is the candidate Gorilla location
21	end for
22	Set best location as the Silverback
23	end for

5. Experiment

First, data preparation was performed before training the model. A, T, C, and G were represented by the numbers 1, 2, 3, and 4, respectively. The data were subsequently transformed into a matrix by the embedding layer, then delivered to CNN. The convolution operations were used to extract the feature information via the convolution layer, while the maximum pooling operation was used to filter out the significant local feature information. After the ReLU function was activated, the vector matrix was turned into a features map, used as one of IndRNN’s input layers. IndRNN was used to understand the relationship between features thoroughly. Lastly, the predicted results were obtained by translating the vector features for the output of IndRNN to a concrete numbering using the dense layers and mapping the number to [0, 1] using the sigmoid function. The BP method [45] was used for calculating the loss layer by layer to update the parameter based on the difference between the real and forecasted values. To avoid overfitting, we used a dropout layer with a value of 0.5. The model’s learning rate was set to 0.01, the batch size was 128, and the model was optimized using the Artificial Gorilla Troops Optimizer (GTO), as shown in Table 1.

6. Evaluation Phase

In this stage, five evaluation indicators are often used in classification issues and were used here to assess the predictive strength of the proposed approach: specificity, accuracy, precision, F1-score, and recall. Specificity is defined as the conditional probability of actual negatives having a secondary class, equating to the likelihood of a negative mark being true, and it’s calculated using Equation (9). Accuracy is the percentage of validation predictions made for all forecasts, usually expressed as percentages and determined using Equation (10). Precision is a metric that assesses a model’s ability to correctly forecast the value for specific categories and is calculated using Equation (11). Recall measures the proportion of correctly identified positive patterns, determined using Equation (12). The F1-score is the average weight of recall and precision calculated with Equation (13).

Specificity = \frac{T N}{T N + F P}

(9)

Accuracy = \frac{T P + T N}{T P + T N + F P + F N}

(10)

Precision = \frac{T P}{T P + F P}

(11)

Recall = \frac{T P}{T P + F N}

(12)

F 1 - score = 2 * \frac{Precision * Recall}{Precision + Recall}

(13)

where TP denotes the positive number of positively predicted classes, TN denotes the negative number of negatively predicted classes, FN denotes the positive number of negatively predicted classes, and FP denotes the negative number of classes that are positively predicted.

7. Results and Discussion

7.1. Dataset Description

The experiment used the common wheat (Triticum aestivum) dataset, as shown in Figure 4. We downloaded 384 mature wheat miRNA sequences with high credibility from PNRD [46] and 19,011 wheat lncRNA sequences from GreeNC [47] because there is no public database of miRNA and lncRNA interaction pairs. As seen in Table 2, the identical sequences were eliminated, leaving 298 miRNAs and 18,468 lncRNAs.

7.2. Data Preprocessing

In this study, the miRNA–lncRNA interaction prediction tool used was psRNATarget. The sequences of target genes that can interact with miRNA were found by examining the degree of matching between miRNA and target sequences in plants. The filtered lncRNAs and miRNAs were then entered into the psRNATarget program for prediction, yielding a positive dataset of 18,468 miRNA–lncRNA interaction pairings. The construction of a negative dataset with substantial interference capabilities is required to validate the model’s performance further. A small percentage of miRNAs participate in interaction pairings due to their low and short sequence lengths; a small percentage of miRNAs participate in interaction pairings due to their low and short sequence lengths; consequently, the experiment mostly analyzed lncRNA sequences. To begin, whole lncRNAs were sorted into two groups: those that participated in this interaction and those that did not. Then, using the Needleman Wunsch algorithm [48], a similarity comparison between the two groups for lncRNA was performed, and samples of lncRNAs with similarities of more than 80% were eliminated [49].

After similarity elimination, lncRNAs that were not engaged in the lncRNA–miRNA interaction were randomly paired with all miRNAs to produce the negative sample datasets. A random sampling approach was applied to obtain the same numbers for negative samples as positive samples to guarantee the balance of negative and positive samples. The positive and negative datasets were jumbled randomly to create the 39,593 data points needed for the experiment. We employed the SMOTE method [50] to enhance the sample size by producing characteristic data that resemble the samples to address data insufficiency and small sample size issues. We randomly selected an eigenvalue from a positive sample, calculated the eigenvalue of the closest positive sample, and then created new positive samples between the two using positive samples as an example. We iterated the previous steps until the sample data were large enough. Because the dataset’s maximum sequence length exceeds 8000 nt, the training phase takes a long time. At the same time, there were just 315 sequences longer than 4000 nucleotides. As a result, we discarded sequences that were longer than 4000 nt. The findings show that after deleting data with sequence lengths of more than 4000 nt, CNN-RNN accuracy did not improve much, but the training time was considerably reduced. The original dataset is dataset 1, and dataset 2 is updated after deleting the data with sequence lengths of more than 4000 nt. Three experiments were carried out, which are presented in Table 2. Although the accuracy of CNN-RNN changed somewhat, the time of training for each batch was reduced by more than half.

7.3. k-mer Features of miRNA Sequence

Triticum aestivum features were extracted using a hybrid CNN-RNN model. The experiments used tenfold cross-validation to ensure the accuracy and dependability of the data. The experimental dataset was divided into ten groups: nine for training and one for verification. The medium values of 10 experiments are used as the final results after experimenting 10 times alternatively. The main extracted feature and secondary extracted feature of the sequence are the key features retrieved in this experiment. The most prevalent extracted feature is k-mer. Each k-mer contains nucleotides K that can be A, T, C, or G. The experiments extract sequence characteristics from 3-mer (64 dimensions), 2-mer (16 dimensions), and 1-mer (4 dimensions). To match the above k-mer, a sliding window with a length of k and a sliding step size of one is employed. The experiment also retrieved the sequence’s gap features, such as the initial gap feature (A*A, 64 dimensions) and the second gap feature (A**A, 256 dimensions).

Secondary structural features decided the primary functions of RNA molecules. According to studies, the more stable an RNA sequence’s structure leads to more free energy is produced during folding to build secondary structures; the more stable the secondary structure is the additional complimentary basis pairing it creates, with higher G and C values. This experiment extracted the sequences’ basis complimentary pairing rate (E₁) and C and G values (E₂), and normalized minimum free energies (DM). The ViennaRNA [51] toolbox was used to identify the point bracket form for the secondary sequence structures, as well as the least free energy created through the production of this secondary structure, which is characterized as follows:

D M = M F E / L

(14)

E_{1} = n_p a i r s ∕ (L ∕ 2)

(15)

E_{2} = (n_G + n_C) ∕ L

(16)

where

n_p a i r s

is the maximum number of base pairs that may be paired at the sequences, L is the length of the sequence,

n_C

and

n_G

are the frequency occurrences for C and G, and MFE is the minimal free energy for the sequence.

There were 485 dimensions derived, covering both fundamental and secondary structural elements. The 485-dimensional feature vectors were created by fusing these features. Every feature vector was concatenated at vector sets for model testing and training. Table 3 shows the complete feature information.

The experiments also used tenfold cross-validation, with 90% of the data being used for training and 10% for testing. On the Triticum aestivum dataset, CNN-RNN is first compared to shallow machine learning approaches, including traditional machine learning algorithms such as random forest, k-nearest neighbor (k-NN), and support vector machine (SVM). Although deep learning harvests information automatically, the important features may be lost in the process, resulting in a generic and not optimum condition. As a result, deep learning approaches may not perform as well as shallow machine learning models.

The proposed model was compared to shallow machine learning models and another deep learning model to verify its performance. Table 4 and Figure 5 demonstrate the experimental results of our suggested model and the shallow machine learning models. Table 4 shows that our suggested model achieves greater than 96% for all four assessment factors; clearly, this is higher than other models, demonstrating that our proposed model outperforms shallow machine learning approaches. Experimental data suggest that our proposed model outperforms shallow machine learning in the categorization of miRNA–lncRNA interactions.

The proposed model was compared to various deep learning models such as LSTM, IndRNN, CNN, and CNN+LSTM, and shallow machine learning methods. Each model was trained and tested using six sets of data and tenfold cross-validation; accuracy was utilized as the assessment criterion. The Triticum aestivum dataset is divided into six groups, with maximum sequence lengths of 3000 nt, 2500 nt, 2000 nt, 1500 nt, 1000 nt, and 500 nt for each group. Figure 6 depicts the data distribution. Table 5 shows the categorization findings.

Table 5 shows that the LSTM accuracy dramatically reduces as sequence length increases, whereas the CNN+LSTM accuracy marginally decreases. Only the accuracy of the proposed model and CNN remained unchanged, but our proposed method’s accuracy is substantially greater than CNN’s. The findings suggest that our proposed method outperforms previous deep learning models regarding miRNA–lncRNA interaction accuracy, particularly when the length of the sequences is rather large. We examined the loss convergences rates for the models when the length of the sequences is 3000 nt to test our model’s performance further. The loss convergence rate in 20 iterations is compared in Figure 7. In terms of both convergence rate and degree of convergence, our suggested strategy outperforms existing deep learning models.

In recent years, much work has been devoted to creating computer approaches for finding connections in diverse biological datasets. Many researchers have used shallow machine learning methods to construct a prediction model through feature selection in the prediction of the interaction between miRNA and lncRNA, but there are many problems such as fewer training data, large noise, and more human factors, resulting in low reliability of the prediction results. The comparative analysis of the proposed model with state-of-the-art models showed that the proposed model has better performance, with accuracy of 97.7%, greater than the models described in [5,10,17,19,21,22], as shown in Table 6. Additionally, we compared our model with the XGBoost model. We applied the same dataset in this model after comparison, and our model was slightly better than the XGBoost model, as shown in Table 7.

As shown in Table 7, the proposed model was compared with another advanced model, XGBoost model, to prove its effectiveness. We applied the same dataset used in this work to the XGBoost model for the comparison. The proposed model was slightly better than the XGBoost model in terms of accuracy, F1-score, recall, specificity, and precision. The results indicate that the proposed model is slightly better than the XGBoost method.

8. Conclusions

Based on the RNA sequence properties for plants, we propose a technique to predict the interaction between miRNA and lncRNA. The model efficiently solved the difficulties of gradient disappearance and explosion during the gradient propagation process and ensured classification accuracy. Furthermore, the model has a basic structure, is simple to use, and can be extended. Our suggested technique effectively classifies plant interactions between miRNA and lncRNA. The model has apparent benefits over shallow machine learning and other deep learning models, and it may be extended to other plants with acceptable results. At the same time, the model has a high level of performance and generalization, making it helpful in classifying plant miRNA–lncRNA interactions. This study has laid the groundwork for further investigation into the interaction mechanism of miRNA and lncRNA in plants. Model classification accuracy could be further enhanced in the future by modifying the level of model structures and increasing datasets.

In future work, we will apply our proposed model to other datasets in various fields, especially agriculture. Furthermore, it would be interesting to study the influence of combining additional deep learning models or using different optimization models.

Author Contributions

Writing—original draft preparation, W.H.; writing—review and editing, A.E.H.; methodology, A.E.H. and W.H.; software, W.H.; validation resources, W.A.A. and A.H.I.; data curation, W.H. and A.I. and supervision, A.E.H.; All authors have read and agreed to the published version of the manuscript.

Funding

The authors received no specific funding for this study.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

We would like to acknowledge the Scientific Research Group in Egypt (SRGE) for efforts made in this research.

Conflicts of Interest

The authors declare that they have no conflicts of interest to report regarding the present study.

Abbreviation

SVM	Support Vector Machines
BLSTM	Bidirectional Long Short-Term Memory
KNN	K-Nearest Neighbors
CNN	Convolutional Neural Network
RNN	Recurrent Neural Network
IndRNN	Independently Recurrent Neural Network
SMOTE	Synthetic Minority Oversampling Technique
LSTM	Long Short-Term Memory Networks

References

Bolin, W.; Su, Z.; Wan, L.; He, T. Relationship between long non-coding RNA polymorphism and the risk of coronary artery disease: A protocol for systematic review and meta-analysis. Medicine 2021, 100, e25146. [Google Scholar]
Fonouni-Farde, C.; Ariel, F.; Crespi, M. Plant Long Noncoding RNAs: New Players in the Field of Post-Transcriptional Regulations. Non-Coding RNA 2021, 7, 12. [Google Scholar] [CrossRef] [PubMed]
Shan, G.; Tang, T.; Xia, Y.; Qian, H.-J. Long non-coding RNA NEAT1 promotes bladder progression through regulating miR-410 mediated HMGB1. Biomed. Pharmacother. 2019, 121, 109248. [Google Scholar] [CrossRef] [PubMed]
Sakshi, S.; Jayasuriya, R.; Ganesan, K.; Xu, B.; Ramkumar, K.M. Role of circR-NA-miRNA-mRNA interaction network in diabetes and its associated complications. Mol. Ther. Nucleic Acids 2021, 26, 1291–1302. [Google Scholar] [CrossRef]
Du, X.-X.; Liu, Y.; Wang, B.; Zhang, J.-F. lncRNA–disease association prediction method based on the nearest neighbor matrix completion model. Sci. Rep. 2022, 12, 21653. [Google Scholar] [CrossRef]
Chao, H.; Hu, Y.; Zhao, L.; Xin, S.; Ni, Q.; Zhang, P.; Chen, M. Biogenesis, Functions, Interactions, and Resources of Non-Coding RNAs in Plants. Int. J. Mol. Sci. 2022, 23, 3695. [Google Scholar] [CrossRef]
Cai, X.; Lymbery, A.J.; Armstrong, N.J.; Gao, C.; Ma, L.; Li, C. Systematic identification and charac-terization of lncRNAs and lncRNA-miRNA-mRNA networks in the liver of turbot (Scophthalmus maximus L.) induced with Vibrio anguillarum. Fish Shellfish. Immunol. 2022, 131, 21–29. [Google Scholar] [CrossRef]
Pepe, G.; Guarracino, A.; Ballesio, F.; Parca, L.; Ausiello, G.; Helmer-Citterich, M. Evaluation of potential miRNA sponge effects of SARS genomes in human. Non-Coding RNA Res. 2022, 7, 48–53. [Google Scholar] [CrossRef]
Sun, L.; Yang, Y.; Pan, H.; Zhu, J.; Zhu, M.; Xu, T.; Li, Z.; Dong, T. Molecular Characteriza-tion and Target Prediction of Candidate miRNAs Related to Abiotic Stress Responses and/or Storage Root Development in Sweet Potato. Genes 2022, 13, 110. [Google Scholar] [CrossRef]
Lin, C.; Sun, Z.-L. PmliHFM: Predicting Plant miRNA-lncRNA Interactions with Hybrid Feature Mining Net-work. Interdiscip. Sci. Comput. Life Sci. 2022, 14, 1–11. [Google Scholar]
Chen, J.; Peng, H.; Han, G.; Cai, H.; Cai, J. HOGMMNC: A higher order graph matching with multiple network constraints model for gene–drug regulatory modules identification. Bioinformatics 2018, 35, 602–610. [Google Scholar] [CrossRef] [PubMed]
Akrem, S.; Tabbone, S. Deep neural networks-based relevant latent representation learning for hyper-spectral image classification. Pattern Recognit. 2022, 121, 108224. [Google Scholar]
Chen, S.; Zheng, L. Complementary ensemble empirical mode decomposition and independent recurrent neural network model for predicting air quality index. Appl. Soft Comput. 2022, 131, 109757. [Google Scholar] [CrossRef]
Qiang, K.; Meng, J.; Cui, J.; Luan, Y.; Chen, M. PmliPred: A method based on hybrid model and fuzzy deci-sion for plant miRNA–lncRNA interaction prediction. Bioinformatics 2020, 36, 2986–2992. [Google Scholar]
Zhang, M.; Sun, J.-W.; Liu, Z.; Ren, M.-W.; Shen, H.-B.; Yu, D.-J. Improving N6-methyladenosine site prediction with heuristic selection of nucleotide physical–chemical properties. Anal. Biochem. 2016, 508, 104–113. [Google Scholar] [CrossRef] [PubMed]
Yuan, Z.; Zeng, P.; Li, Y.-H.; Zhang, Z.; Cui, Q. SRAMP: Prediction of mammalian N6-methyladenosine (m6A) sites based on sequence-derived features. Nucleic Acids Res. 2016, 44, e91. [Google Scholar]
Chen, W.; Tang, H.; Lin, H. MethyRNA: A web server for identification of N6-methyladenosine sites. J. Biomol. Struct. Dyn. 2016, 35, 683–687. [Google Scholar] [CrossRef]
Qiang, X.; Chen, H.; Ye, X.; Su, R.; Wei, L. M6AMRFS: Robust Prediction of N6-Methyladenosine Sites With Sequence-Based Features in Multiple Species. Front. Genet. 2018, 9, 495. [Google Scholar] [CrossRef] [Green Version]
Wei, L.; Chen, H.; Su, R. M6APred-EL: A Sequence-Based Predictor for Identifying N6-methyladenosine Sites Using Ensemble Learning. Mol. Ther. Nucleic Acids 2018, 12, 635–644. [Google Scholar] [CrossRef] [Green Version]
Zhang, Y.; Hamada, M. DeepM6ASeq: Prediction and characterization of m6A-containing sequences using deep learning. BMC Bioinform. 2018, 19, 524. [Google Scholar] [CrossRef]
Zou, Q.; Xing, P.; Wei, L.; Liu, B. Gene2vec: Gene subsequence embedding for prediction of mammalian N6-methyladenosine sites from mRNA. Rna 2018, 25, 205–218. [Google Scholar] [CrossRef] [Green Version]
Kunqi, C.; Wei, Z.; Zhang, Q.; Wu, X.; Rong, R.; Lu, Z.; Su, J.; de Magalhaes, J.P.; Rigden, D.J.; Meng, J. WHISTLE: A high-accuracy map of the human N 6-methyladenosine (m6A) epitranscriptome pre-dicted using a machine learning approach. Nucleic Acids Res. 2019, 47, e41. [Google Scholar]
Kewei, L.; Chen, W. iMRM: A platform for simultaneously identifying multiple kinds of RNA modifica-tions. Bioinformatics 2020, 36, 3336–3342. [Google Scholar]
Abdollahzadeh, B.; Gharehchopogh, F.S.; Mirjalili, S. Artificial gorilla troops optimizer: A new nature-inspired metaheuristic algorithm for global optimization problems. Int. J. Intell. Syst. 2021, 36, 5887–5958. [Google Scholar] [CrossRef]
Ginidi, A.; Ghoneim, S.M.; Elsayed, A.; El-Sehiemy, R.; Shaheen, A.; El-Fergany, A. Gorilla Troops Optimizer for Electrically Based Single and Double-Diode Models of Solar Photovoltaic Systems. Sustainability 2021, 13, 9459. [Google Scholar] [CrossRef]
Mohamed, A.-B.; El-Shahat, D.; Sallam, K.M.; Munasinghe, K. Parameter extraction of photovol-taic models using a memory-based improved gorilla troops optimizer. Energy Convers. Manag. 2022, 252, 115134. [Google Scholar]
Piri, J.; Mohapatra, P.; Acharya, B.; Gharehchopogh, F.S.; Gerogiannis, V.C.; Kanavos, A.; Manika, S. Feature Selection Using Artificial Gorilla Troop Optimization for Biomedical Data: A Case Analysis with COVID-19 Data. Mathematics 2022, 10, 2742. [Google Scholar] [CrossRef]
Qingwei, L.; Chu, S.-C.; Yang, Q.; Liang, A.; Pan, J.-S. Multi-Group Gorilla Troops Opti-mizer with Multi-Strategies for 3D Node Localization of Wireless Sensor Networks. Sensors 2022, 22, 4275. [Google Scholar]
Yaning, X.; Sun, X.; Guo, Y.; Li, S.; Zhang, Y.; Wang, Y. An Improved Gorilla Troops Opti-mizer Based on Lens Opposition-Based Learning and Adaptive beta-Hill Climbing for Global Optimization. Cmes-Comput. Model. Eng. Sci. 2022, 131, 815–850. [Google Scholar]
Houssein, E.H.; Saad, M.R.; Ali, A.A.; Shaban, H. An efficient multi-objective gorilla troops optimizer for minimizing energy consumption of large-scale wireless sensor networks. Expert Syst. Appl. 2023, 212. [Google Scholar] [CrossRef]
Wu, T.; Wu, D.; Jia, H.; Zhang, N.; Almotairi, K.H.; Liu, Q.; Abualigah, L. A Modified Gorilla Troops Optimizer for Global Optimization Problem. Appl. Sci. 2022, 12, 10144. [Google Scholar] [CrossRef]
El-Dabah, M.A.; Kamel, S.; Khamies, M.; Shahinzadeh, H.; Gharehpetian, G.B. Artificial gorilla troops optimizer for optimum tuning of TID based power system stabilizer. In Proceedings of the 2022 9th Iranian Joint Congress on Fuzzy and Intelligent Systems (CFIS), Bam, Iran, 2–4 March 2022; IEEE: Bam, Iran, 2022; pp. 1–5. [Google Scholar]
Ashutosh, B.; Marwaha, S. Economic energy scheduling through chaotic gorilla troops optimiz-er. Int. J. Energy Environ. Eng. 2022, 3, 1–25. [Google Scholar] [CrossRef]
Pan, Z.; Zhang, H.; Lei, J.; Fang, Y.; Shao, X.; Ling, N.; Kwong, S. DACNN: Blind Image Quality Assessment via a Distortion-Aware Convolutional Neural Network. IEEE Trans. Circuits Syst. Video Technol. 2022, 32, 7518–7531. [Google Scholar] [CrossRef]
Elaraby, A.; Hamdy, W.; Alanazi, S. Classification of Citrus Diseases Using Optimization Deep Learning Approach. Comput. Intell. Neurosci. 2022, 2022, 9153207. [Google Scholar] [CrossRef]
Davis, G.; Ongie, G.; Willett, R. Deep equilibrium architectures for inverse problems in imaging. IEEE Trans. Comput. Imaging 2021, 7, 1123–1133. [Google Scholar]
Ali, S.; Miah, S.; Haque, J.; Rahman, M.; Islam, K. An enhanced technique of skin cancer classification using deep convolutional neural network with transfer learning models. Mach. Learn. Appl. 2021, 5, 100036. [Google Scholar] [CrossRef]
Ouyang, S.; Dong, D.; Xu, Y.; Xiao, L. Communication optimization strategies for distributed deep neural network training: A survey. J. Parallel Distrib. Comput. 2020, 149, 52–65. [Google Scholar] [CrossRef]
Feifei, C.; Li, S.; Zhang, Z.; Sui, M.; Cao, C.; Hesham, A.E.-L.; Zou, Q. DeepMC-iNABP: Deep learning for multiclass identification and classification of nucleic acid-binding proteins. Comput. Struct. Biotechnol. J. 2022, 20, 2020–2028. [Google Scholar]
Elaraby, A.; Hamdy, W.; Alruwaili, M. Optimization of Deep Learning Model for Plant Disease Detection Using Particle Swarm Optimizer. Comput. Mater. Contin. 2022, 71, 4019–4031. [Google Scholar] [CrossRef]
Liu, Z.; Yang, X.; Zhang, S.; Liu, Y.; Zhao, Y.; Zheng, W. Automatic Generation of Test Cases Based on Genetic Algorithm and RBF Neural Network. Mob. Inf. Syst. 2022, 2022, 1489063. [Google Scholar] [CrossRef]
Ingo, G.; Kutyniok, G.; Petersen, P. Error bounds for approximations with deep ReLU neural networks in W s, p norms. Anal. Appl. 2020, 18, 803–859. [Google Scholar]
Dahou, A.; Al-Qaness, M.A.; Elaziz, M.A.; Helmi, A. Human activity recognition in IoHT applications using Arithmetic Optimization Algorithm and deep learning. Measurement 2022, 199. [Google Scholar] [CrossRef]
Ali, M.; Kotb, H.; Aboras, K.M.; Abbasy, N.H. Design of cascaded PI-fractional order PID con-troller for improving the frequency response of hybrid microgrid system using gorilla troops optimizer. IEEE Access 2021, 9, 150715–150732. [Google Scholar] [CrossRef]
Tianhua, L.; Yin, S. An improved particle swarm optimization algorithm used for BP neural network and mul-timedia course-ware evaluation. Multimed. Tools Appl. 2017, 76, 11961–11974. [Google Scholar]
Yi, X.; Zhang, Z.; Ling, Y.; Xu, W.; Su, Z. PNRD: A plant non-coding RNA database. Nucleic Acids Res. 2014, 43, D982–D989. [Google Scholar] [CrossRef] [Green Version]
Paytuví-Gallart, A.; Pulido, A.H.; De Lagrán, I.A.M.; Sanseverino, W.; Cigliano, R.A. GREENC: A Wiki-based database of plant lncRNAs. Nucleic Acids Res. 2015, 44, D1161–D1166. [Google Scholar] [CrossRef]
Cheng, C.; Zhang, Q.; Yu, B.; Yu, Z.; Lawrence, P.J.; Ma, Q.; Zhang, Y. Improving pro-tein-protein interactions prediction accuracy using XGBoost feature selection and stacked ensemble classifier. Comput. Biol. Med. 2020, 123, 103899. [Google Scholar] [CrossRef]
Zhihan, L.; Qiao, L. Deep belief network and linear perceptron based cognitive computing for collaborative ro-bots. Appl. Soft Comput. 2020, 92, 106300. [Google Scholar]
Douzas, G.; Bacao, F. Geometric SMOTE a geometrically enhanced drop-in replacement for SMOTE. Inf. Sci. 2019, 501, 118–135. [Google Scholar] [CrossRef]
Lorenz, R.; Bernhart, S.H.; Siederdissen, C.H.; Tafer, H.; Flamm, C.; Stadler, P.F.; Hofacker, I.L. ViennaRNA package 2.0. Algorithms Mol. Biol. 2013, 6, 26. [Google Scholar] [CrossRef]

Figure 1. A Convolutional Neural Network (CNN) frame structure.

Figure 2. Independently Recurrent Neural Network (IndRNN) frame structure.

Figure 3. The general architecture of the proposed model.

Figure 4. Sample of the dataset used in the experiments.

Figure 5. The performance metrics for shallow machine learning compared with our proposed method.

Figure 6. The distribution of our dataset.

Figure 7. Loss ratio comparison of our proposed method with other different deep learning models.

Table 1. Suggested model parameter settings.

Hyperparameters	Value
Optimization algorithm	GTO
Initial learning rate	0.1
Dropout layer	0.5
Epochs	50
Batch size	128

Table 2. Wheat (Triticum aestivum) dataset information.

Datasets	Original Data	Filtered Data	Databases
lncRAN	19,011	18,468	GreeNC [36]
miRNA	384	298	PNRD [35]

Table 3. Information on structural features.

Feature Category	Feature Name	Number
Primary structural features	1-mer	4
	2-mer	16
	3-mer	64
	First gap features	64
	Second gap features	256
Secondary structural features	DM	1
	$E_{1}$	1
	$E_{2}$	1

Table 4. The performance metrics of shallow machine learning models compared with our proposed method.

Performance Measures	Accuracy	Precision	F1-Score	Specificity	Recall
random forest	93.9	93.5	93.7	93.4	93.6
K-NN	92.4	92	91.9	91.7	92.1
SVM	89.5	89.3	90.4	89.6	89.3
proposed method	97.7	97.4	97.6	97.39	97.21

Table 5. Comparison analysis of the proposed models with different lengths of sequences and different deep learning models.

Models	Length of Sequences
Models	3000 nt	2500 nt	2000 nt	1500 nt	1000 nt	500 nt
IndRNN	76.49	76.37	76.53	76.58	76.97	76.76
LSTM	69.98	71.56	73.86	76.34	79.64	83.74
CNN	95.29	94.32	95.51	94.91	94.67	95.25
CNN+LSTM	95.46	95.76	95.25	95.71	95.78	96.41
Proposed method	96.45	96.60	97.21	97.85	97.33	97.7

Table 6. Comparison analysis of the proposed model.

Year	Existing Work	Accuracy %
2022	X Du et al. [5]	92
2022	Chen, Lin et al. [10]	91.9
2019	Chen, Kunqi et al. [22]	80
2023	Proposed model	97.7

Table 7. Comparison of the suggested model with the XGBoost model.

Performance Measures	Accuracy	Precision	F1-Score	Specificity	Recall
XGBoost	96.5	94.3	96.4	92.6	93.3
Proposed method	97.7	97.4	97.6	97.39	97.21

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Hamdy, W.; Ismail, A.; Awad, W.A.; Ibrahim, A.H.; Hassanien, A.E. An Optimized Ensemble Deep Learning Model for Predicting Plant miRNA–IncRNA Based on Artificial Gorilla Troops Algorithm. Sensors 2023, 23, 2219. https://doi.org/10.3390/s23042219

AMA Style

Hamdy W, Ismail A, Awad WA, Ibrahim AH, Hassanien AE. An Optimized Ensemble Deep Learning Model for Predicting Plant miRNA–IncRNA Based on Artificial Gorilla Troops Algorithm. Sensors. 2023; 23(4):2219. https://doi.org/10.3390/s23042219

Chicago/Turabian Style

Hamdy, Walid, Amr Ismail, Wael A. Awad, Ali H. Ibrahim, and Aboul Ella Hassanien. 2023. "An Optimized Ensemble Deep Learning Model for Predicting Plant miRNA–IncRNA Based on Artificial Gorilla Troops Algorithm" Sensors 23, no. 4: 2219. https://doi.org/10.3390/s23042219

APA Style

Hamdy, W., Ismail, A., Awad, W. A., Ibrahim, A. H., & Hassanien, A. E. (2023). An Optimized Ensemble Deep Learning Model for Predicting Plant miRNA–IncRNA Based on Artificial Gorilla Troops Algorithm. Sensors, 23(4), 2219. https://doi.org/10.3390/s23042219

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

An Optimized Ensemble Deep Learning Model for Predicting Plant miRNA–IncRNA Based on Artificial Gorilla Troops Algorithm

Abstract

1. Introduction

2. Related Work

3. CNN’s Model Setting and Phases

3.1. Structure of IndRNN and CNN

3.2. Ensemble CNN (CNN and IndRNN)

4. Artificial Gorilla Troops Optimizer (GTO)

5. Experiment

6. Evaluation Phase

7. Results and Discussion

7.1. Dataset Description

7.2. Data Preprocessing

7.3. k-mer Features of miRNA Sequence

8. Conclusions

Author Contributions

Funding

Data Availability Statement

Acknowledgments

Conflicts of Interest

Abbreviation

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI