Sustainability
  • Article
  • Open Access

26 April 2023

Hybrid Multichannel-Based Deep Models Using Deep Features for Feature-Oriented Sentiment Analysis

1 Department of Computer Science, COMSATS University Islamabad, Wah Campus, Wah 47040, Pakistan
2 Department of Computer Science, HITEC University, Taxila 47080, Pakistan
3 Management Information System Department, College of Business Administration, Prince Sattam Bin Abdulaziz University, Al-Kharj 16278, Saudi Arabia
4 Department of Computer Science, Hanyang University, Seoul 04763, Republic of Korea

Abstract

With the rapid growth of user-generated content on social media, several new research domains have emerged, and sentiment analysis (SA) is among the most active due to its significance. In feature-oriented sentiment analysis, both the convolutional neural network (CNN) and the gated recurrent unit (GRU) perform well: the former is widely used for local feature extraction, whereas the latter is suited to capturing global contextual information and long-term dependencies. Existing studies have focused on combining them in a single framework; however, these approaches fail to distribute the input features fairly, such as word embeddings, part-of-speech (PoS) tags, dependency relations, and contextual position information. To solve this issue, we propose a technique that combines the two algorithms in parallel and treats them equally in order to extract the informative features, usually known as aspects, and then perform sentiment classification. The proposed methodology combines a multichannel convolutional neural network (MC-CNN) with a multichannel bidirectional gated recurrent unit (MC-Bi-GRU) and provides both with the same input parameters. In addition, the parallel models share hidden-layer information, so each benefits from the other's learning. These properties distinguish the approach from existing methodologies. An extensive empirical analysis on several standard datasets confirms that the proposed technique outperforms the latest existing models.

1. Introduction

In the age of social media, diverse industries and advisory firms rely heavily on user-generated opinions to forecast their future earnings. These opinionated, unstructured data appear as reviews, blog discussions, graphics, audio, video, and other media that follow no fixed structure. This makes the field challenging, owing to the ambiguities of natural language, the exponential growth of social media content, and the indirect sentiments expressed in user-generated text [1]. In this situation, data analysts widely adopt aspect-based sentiment analysis (ABSA) to understand users' or consumers' requirements, filter out irrelevant data, and obtain the relevant suggestions that make organizational and industrial decisions appropriate. Generally, two types of online attitudes, reviews, or opinions are observed: product reviews and accounts of experience with these products or services. The first type discusses the features of a particular entity, such as a product or service, whereas the second compares features across entities to identify their pros and cons [2].
Extracting the accurate features of a targeted entity has become a critical issue in NLP due to the complex nature of contextual information. The context surrounding these features is highly important because it provides valuable clues for their accurate identification and extraction [3,4]. Moreover, the precise identification and extraction of these features still demand attention from the research community. Traditionally, feature extraction and identification have been accomplished through various methodologies, such as machine learning [5,6,7]; topic modeling [8,9,10]; and lexicon-based [11,12,13], rule-based, and syntactic relation-based [14,15,16,17,18] methods. Syntactic pattern techniques perform well at extracting features and classifying their sentiments. However, these methodologies are discouraged because they are time-consuming and require specialists to create rules and lexicons, which restricts them to a specific domain and language [19,20]. Additionally, supervised machine learning methodologies rely heavily on large volumes of labeled data, which is a bottleneck for these methods [21]. Semi-supervised methodologies demand less labeled training data, but the complexity of their feature selection hinders such approaches, while unsupervised methods rely heavily on manual feature engineering. The quality of the extracted features then depends on that manual process, which limits the scalability and adaptability of these approaches to real-world applications [22,23].
One of the most widely recommended approaches in the machine learning field is deep learning (DL). It addresses diverse NLP challenges such as machine translation, named entity recognition, and sentiment analysis (SA) [24]. Recent advances in NLP rely heavily on DL architectures to extract such valuable features and classify their sentiments. Recurrent and convolutional neural networks are the two leading DL architectures. CNN owes its position to its convolution kernels, which make it distinctive at extracting targeted features. Recurrent neural networks (RNN) and their variants, such as long short-term memory (LSTM) and the GRU, are well suited to extracting contextual information and remain versatile under varying circumstances [25].
An RNN analyzes a sentence word by word to capture its semantic information in the form of hidden layers. It also captures long-range semantic dependencies of long contexts, but in a biased manner: each upcoming word dominates the previous word. However, worthwhile terms can occur at any position in a sentence, which substantially reduces the model's effectiveness. Generally, RNN-based models capture sequential patterns through temporal features and long-term semantic dependencies between word pairs during learning. In addition, these methods pay equal attention to every word of the targeted sentence and therefore do not distinguish ordinary words from prominent ones, which are more dominant and influential in the contextual knowledge than common words. This degrades the performance of RNN-based approaches [25,26,27].
CNN, on the other hand, presents itself as an unbiased model. It comprises kernels and max-pooling layers that extract the prominent features of a targeted entity within a sentence. As a result, CNN captures sentence semantics more effectively than RNN. However, CNN faces the issue of determining the optimal kernel size: a small kernel may lose critical information, whereas a larger one may absorb irrelevant terms and cause erroneous training of the model. Its filters effectively capture local features, which proves beneficial when extracting semantically influential terms. Moreover, such a model never requires order-sensitive long-term semantic dependencies of the sentence during training; it is trained on local feature information alone [25,26,28].
Recently, researchers have advanced this line of work by involving the attention mechanism, which improves sentiment classification through a credible exposition of opinion targets [29]. We can thus conclude that neither model (CNN nor RNN) alone delivers state-of-the-art performance at extracting feature terms and classifying their sentiments, whereas their combination can improve sentiment classification through more accurate feature extraction. Consequently, existing approaches to feature-term identification and extraction have adopted such combinations, either in a sequential/serial or in a parallel manner. In the serial method, one model receives the actual text while the other receives the first model's output, which causes information loss. In the parallel technique, the two models are never treated equally in terms of inputs: one model uses the whole set of inputs while the other receives only a subset. This underutilizes one model relative to the other and negates the benefits of combining the algorithms in parallel. This scenario calls for techniques in which both models interact directly with the textual context through an equal number of input parameters and then combine their learnings to exploit both models fully.
Both of the above algorithms have distinctive pros and cons. Motivated by this, the attention-based joint model (Att-JM) utilizes them jointly and applies attention mechanisms to accurately identify informative features and classify the sentiments they express. While integrating these algorithms in parallel, the Att-JM shares hidden-layer information between them to benefit from their combined learnings. Additionally, it distributes an equal number of inputs between them to accomplish the main tasks of aspect-based sentiment analysis (ABSA). The main contributions of this paper are as follows:
  • The proposed approach fuses the multichannel convolutional neural network (MC-CNN) and the multichannel gated recurrent unit (MC-GRU) in parallel, with various deep features, to accomplish the main tasks of ABSA.
  • It explores the effect of the collective utilization and uniform distribution of word2vec embeddings and contextual position information on the performance of the merged deep learning algorithms during aspect identification and sentiment classification.
  • It shares hidden-layer information between the merged models, so they gain the advantages of their combined abilities and learnings while predicting aspects and classifying their sentiments.
  • The proposed approach outperforms the baselines when assessed with standard evaluation metrics (precision, recall, and F1 measure) on standardized datasets comprising SemEval and Twitter: it achieves an F1 measure of 95% in the aspect term extraction (ATE) task and 92% in the sentiment classification (SC) task.
The rest of the paper is organized as follows: Existing work related to aspect extraction and sentiment classification is presented as “Related Work” in Section 2. Section 3 expresses the overall methodology and detail of the proposed approach as the “Proposed Research Methodology”. The experimental environment considered for the development of this model is described in Section 4 as “Experimental Arrangements”. The performance comparison of this model is presented in Section 5 as “Results and Discussion”. The last section, Section 6, concludes the whole approach with future work as “Conclusion”.

3. Proposed Research Methodology

The proposed methodology aims to identify aspects accurately and classify their corresponding sentiments. It therefore depends on knowledge of the contextual position, syntactic, and semantic facts of textual reviews, which provide valuable clues for determining appropriate polarities. In the ABSA literature, existing approaches have combined CNN and RNN in sequential and parallel arrangements to obtain their joint benefits. However, the uneven distribution of input features remains the leading disadvantage of these methods. In the serial approach, the first algorithm receives the actual input while the second receives the output of the first, which causes information loss. In the parallel approach, the merged algorithms are never treated equally: one algorithm's input comprises various features while the other receives only a subset of them. One algorithm thereby acquires better contextual knowledge than the other, which reduces the benefits of a parallel or simultaneous learning-based approach. The proposed method therefore distributes the features fairly among the combined algorithms to obtain the maximum advantages of parallel learning.
Contextual position information has proven to be a prominent factor in enhancing the performance of various NLP tasks, which motivates the proposed method to include position information vectors during the learning phase. Accordingly, the Att-JM uses contextual positional embeddings as position information vectors alongside word embeddings, employing the 300-dimensional word2vec embeddings pre-trained on the Google News dataset. In addition, instead of merging similar algorithms, the proposed methodology combines two different algorithms (Bi-GRU and MC-CNN) that share their hidden layers. The Att-JM uses an attention layer to determine the highlighted and emphasized terms that attract the focus of the whole context and then passes them to the pooling layer. The finalized features from the textual reviews are delivered to a softmax layer that categorizes the targeted terms as aspect or non-aspect. The Att-JM then uses the identified aspects to determine their related sentiments: the context of the targeted sentence and the identified aspect terms are combined and given to the softmax layer to predict the relevant polarity as positive, negative, or neutral. The Att-JM is evaluated on a standard benchmark of datasets comprising SemEval and Twitter. The detailed framework of the proposed methodology is shown in Figure 1.
Figure 1. The framework of attention-based JM (MC-BiGRU and MC-CNN) for ATE and SC.

3.1. Proposed Model Description

The proposed method's objective is to determine aspects and their corresponding sentiments from textual reviews. To this end, the Att-JM combines MC-CNN and MC-Bi-GRU in a single framework in which both algorithms have two input channels: the first receives the word2vec embeddings, and the second receives the contextual position information vectors. The inclusion of an attention mechanism assists the proposed method in accurately identifying and extracting these informative features and their relevant sentiments; the attention mechanism uses the precise location information from the contextual position information during its recognition phase. The Att-JM's identification of potential terms is therefore more precise and reliable than that of existing techniques.
The semantic information about word occurrence within a sentence is significant for preserving dependency knowledge. To conserve this kind of information accurately, the Att-JM uses word2vec embeddings, which are well known to achieve this objective; specifically, the 300-dimensional word2vec embeddings pre-trained on the Google News dataset are used. These representations are delivered to the integrated algorithms, which share their hidden-layer information to benefit from their combined learnings. The proposed methodology thus acquires the advantages of two different algorithms (MC-BiGRU and MC-CNN) and of a parallel learning-based approach. To the best of our knowledge, the Att-JM is among the first models that perform the ATE and SC tasks by merging different algorithms in parallel while treating them equally in terms of input parameters. Table 1 lists the symbols used to describe the procedure.
Table 1. List of abbreviations.

3.1.1. Multichannel CNN

The MC-CNN takes contextual position information vectors and word2vec embeddings as input. The contextual position information represents, as a vector, the effect of each targeted term within the context of a sentence. These representations are formulated with the mathematical notation of Equation (1) [57]:
$$CPI_i = \bigcup_{j=1}^{n}\left[\theta_j + (1-\theta_j)\,\frac{p_j}{n}\right],\quad 0 \le \theta_j \le 1$$
In Equation (1), $CPI_i$ is the contextual position information vector of the $i$-th sentence of the dataset, $\theta_j \in [0,1]$ is the locality-based presence ratio of the $j$-th word, expressing the influence of its position of occurrence within the context, $p_j$ is the position of the word in the targeted sentence, and $n$ is the sentence length. If the $j$-th word lies in the vocabulary, Equation (1) computes its position-oriented influence; otherwise, zero is added for the unavailable word. The contextual position information vectors are then delivered to the first channel of the MC-CNN with their weight matrix $W_{CPI}$, and features are extracted using Equation (2) [58]:
$$C_{CPI} = f\left(W_{CPI} \cdot x_{CPI} + b\right)$$
In the above equation, $f$ is the activation function (Leaky ReLU is used for this purpose), $W_{CPI}$ is the weight of the targeted term $x_{CPI}$, and $b$ is a bias term. The last term, $C_{CPI}$, denotes the features identified by the convolutional layer from the contextual position information. As described previously, the word2vec embedding is used to preserve the syntactic and semantic details of the context. These embeddings are supplied to the second channel of the MC-CNN along with the weight matrix $W_{WE}$, and features are extracted using Equation (3):
$$C_{WE} = f\left(W_{WE} \cdot x_{WE} + b\right)$$
Equation (3) comprises the same parameters as Equation (2), except that its parameters relate to the word2vec embedding. Afterward, the $C_{CPI}$-based and $C_{WE}$-based features are combined using the notation of Equation (4):
$$C = \left[C_{CPI}\,;\ C_{WE}\right]$$
After these convolution layers, a max-pooling layer is deployed over the feature maps, and their maximum values are used to identify the corresponding features, since the highest value indicates the most prominent terms of interest.
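As an illustration, the two-channel convolution pipeline of this subsection (Equations (1)-(4)) can be sketched in NumPy. This is a minimal sketch, not the paper's implementation: the sentence, vocabulary, θ values, and kernel weights are illustrative stand-ins, and Equation (1) is read as a per-word influence of θ_j + (1 − θ_j)·p_j/n for in-vocabulary words and zero otherwise.

```python
import numpy as np

def cpi_vector(sentence, vocab_theta, dim):
    """Contextual position information per word: theta_j + (1 - theta_j) * p_j / n
    for in-vocabulary words, zero otherwise. Each scalar influence is broadcast
    to a `dim`-sized row so the channel shape matches the word2vec channel."""
    n = len(sentence)
    rows = []
    for p, word in enumerate(sentence, start=1):
        theta = vocab_theta.get(word)
        infl = 0.0 if theta is None else theta + (1.0 - theta) * p / n
        rows.append(np.full(dim, infl))
    return np.stack(rows)

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)

def conv_channel(x, w, b):
    """One convolution channel (Equations (2)/(3)): slide a kernel covering
    w.shape[0] words over x (shape: words x dim) and apply leaky ReLU."""
    k = w.shape[0]
    return leaky_relu(np.array(
        [np.sum(w * x[i:i + k]) + b for i in range(len(x) - k + 1)]))

rng = np.random.default_rng(0)
sentence = ["the", "battery", "lasts", "long"]
x_cpi = cpi_vector(sentence, {"battery": 0.8, "long": 0.5}, dim=4)
x_we = rng.normal(size=(4, 4))            # stand-in word2vec channel input

w, b = rng.normal(size=(2, 4)), 0.1       # illustrative 2-word kernel
c_cpi = conv_channel(x_cpi, w, b)         # Equation (2)
c_we = conv_channel(x_we, w, b)           # Equation (3)
c = np.concatenate([c_cpi, c_we])         # Equation (4): [C_CPI ; C_WE]
pooled = c.max()                          # max pooling keeps the strongest feature
print(c.shape, pooled)
```

Note how "battery" (position 2 of 4, θ = 0.8) receives an influence of 0.8 + 0.2 · 2/4 = 0.9, while out-of-vocabulary words contribute zero rows.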

3.1.2. Multichannel BiGRU

The first channel of the second algorithm, the MC-BiGRU, receives each targeted sentence's contextual position information vector obtained from Equation (1), together with its weight matrix $W_{CPI}$. Features are then extracted using Equations (5)-(8) for the forward direction and Equations (9)-(12) for the backward direction [59,60]:
$$\overrightarrow{Z}_{CPI} = f\left(W_{CPI}\,x_t + W_{z1}\,\overrightarrow{h}_{CPI}^{\,t-1} + b_z\right)$$
$$\overrightarrow{R}_{CPI} = f\left(W_{CPI}\,x_t + W_{r1}\,\overrightarrow{h}_{CPI}^{\,t-1} + b_r\right)$$
$$\overrightarrow{\hat{h}}_{CPI} = f\left(W_{CPI}\,x_t + W_{CPI1}\left(\overrightarrow{R}_{CPI} \odot \overrightarrow{h}_{CPI}^{\,t-1}\right) + b_h\right)$$
$$\overrightarrow{h}_{CPI} = \left(1 - \overrightarrow{Z}_{CPI}\right) \odot \overrightarrow{h}_{CPI}^{\,t-1} + \overrightarrow{Z}_{CPI} \odot \overrightarrow{\hat{h}}_{CPI}$$
$$\overleftarrow{Z}_{CPI} = f\left(W_{CPI}\,x_t + W_{z1}\,\overleftarrow{h}_{CPI}^{\,t-1} + b_z\right)$$
$$\overleftarrow{R}_{CPI} = f\left(W_{CPI}\,x_t + W_{r1}\,\overleftarrow{h}_{CPI}^{\,t-1} + b_r\right)$$
$$\overleftarrow{\hat{h}}_{CPI} = f\left(W_{CPI}\,x_t + W_{CPI1}\left(\overleftarrow{R}_{CPI} \odot \overleftarrow{h}_{CPI}^{\,t-1}\right) + b_h\right)$$
$$\overleftarrow{h}_{CPI} = \left(1 - \overleftarrow{Z}_{CPI}\right) \odot \overleftarrow{h}_{CPI}^{\,t-1} + \overleftarrow{Z}_{CPI} \odot \overleftarrow{\hat{h}}_{CPI}$$
In Equations (5)-(8), $\overrightarrow{h}_{CPI}$ represents the forward-direction output vector for the contextual position information, $\overrightarrow{Z}_{CPI}$ denotes the update vector, $\overrightarrow{h}_{CPI}^{\,t-1}$ is the previous state vector, and $\overrightarrow{\hat{h}}_{CPI}$ is the candidate activation vector. The symbol $\odot$ denotes the Hadamard product, and $f$ denotes the activation function: Leaky ReLU is applied in Equations (9) and (10), while Equation (11) uses tanh. Equations (9)-(12) comprise the same elements as Equations (5)-(8) but describe the backward-direction state. The two directional states are then combined into a single state $H_{CPI}$, as shown in Equation (13):
$$H_{CPI} = \left[\overrightarrow{h}_{CPI}\,;\ \overleftarrow{h}_{CPI}\right]$$
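A single GRU step (Equations (5)-(8)) can be sketched in NumPy as follows. This is an illustrative sketch only: shapes and weights are toy values, and a standard sigmoid is used for the gates in place of the leaky ReLU stated in the paper, so that the gate outputs stay in (0, 1).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, W, U, b):
    """One GRU step. W, U, b hold the input, recurrent, and bias parameters
    for the update gate 'z', reset gate 'r', and candidate state 'h'."""
    z = sigmoid(W["z"] @ x_t + U["z"] @ h_prev + b["z"])   # update gate, Eq. (5)
    r = sigmoid(W["r"] @ x_t + U["r"] @ h_prev + b["r"])   # reset gate, Eq. (6)
    h_hat = np.tanh(W["h"] @ x_t + U["h"] @ (r * h_prev) + b["h"])  # Eq. (7)
    return (1.0 - z) * h_prev + z * h_hat                  # Eq. (8)

rng = np.random.default_rng(1)
d_in, d_h = 4, 3                                  # toy input/hidden sizes
W = {k: rng.normal(size=(d_h, d_in)) for k in "zrh"}
U = {k: rng.normal(size=(d_h, d_h)) for k in "zrh"}
b = {k: np.zeros(d_h) for k in "zrh"}

h = np.zeros(d_h)
for x_t in rng.normal(size=(5, d_in)):            # run over a toy 5-word sentence
    h = gru_step(x_t, h, W, U, b)
print(h.shape)
```

Because the new state is a convex combination of the previous state and a tanh candidate, every hidden value stays strictly inside (−1, 1).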
The second channel of the MC-BiGRU receives the sentence's syntactic and semantic details as word2vec embeddings, along with the weight matrix $W_{WE}$, as input. To analyze these details in both the forward and backward directions, Equations (14)-(17) and (18)-(21) are applied; they are equivalent to Equations (5)-(8) and (9)-(12) in terms of their elements and functionality:
$$\overrightarrow{Z}_{WE} = f\left(W_{WE}\,x_t + W_{z1}\,\overrightarrow{h}_{WE}^{\,t-1} + b_z\right)$$
$$\overrightarrow{R}_{WE} = f\left(W_{WE}\,x_t + W_{r1}\,\overrightarrow{h}_{WE}^{\,t-1} + b_r\right)$$
$$\overrightarrow{\hat{h}}_{WE} = f\left(W_{WE}\,x_t + W_{WE1}\left(\overrightarrow{R}_{WE} \odot \overrightarrow{h}_{WE}^{\,t-1}\right) + b_h\right)$$
$$\overrightarrow{h}_{WE} = \left(1 - \overrightarrow{Z}_{WE}\right) \odot \overrightarrow{h}_{WE}^{\,t-1} + \overrightarrow{Z}_{WE} \odot \overrightarrow{\hat{h}}_{WE}$$
$$\overleftarrow{Z}_{WE} = f\left(W_{WE}\,x_t + W_{z1}\,\overleftarrow{h}_{WE}^{\,t-1} + b_z\right)$$
$$\overleftarrow{R}_{WE} = f\left(W_{WE}\,x_t + W_{r1}\,\overleftarrow{h}_{WE}^{\,t-1} + b_r\right)$$
$$\overleftarrow{\hat{h}}_{WE} = f\left(W_{WE}\,x_t + W_{WE1}\left(\overleftarrow{R}_{WE} \odot \overleftarrow{h}_{WE}^{\,t-1}\right) + b_h\right)$$
$$\overleftarrow{h}_{WE} = \left(1 - \overleftarrow{Z}_{WE}\right) \odot \overleftarrow{h}_{WE}^{\,t-1} + \overleftarrow{Z}_{WE} \odot \overleftarrow{\hat{h}}_{WE}$$
Equations (14)-(21) identify the potential terms from the word2vec embeddings in both directions, in light of the syntactic and semantic information of the targeted context. The candidate terms identified from the two directional states are combined into a single state $H_{WE}$, as shown in Equation (22):
$$H_{WE} = \left[\overrightarrow{h}_{WE}\,;\ \overleftarrow{h}_{WE}\right]$$
Eventually, the features indicated by both BiGRU layers, $H_{CPI}$ (based on contextual position information) and $H_{WE}$ (based on word2vec embedding), are merged as $H$, producing a combined feature representation, as depicted in Equation (23):
$$H = \left[H_{CPI}\,;\ H_{WE}\right]$$
After the BiGRU layers produce the merged feature representation $H$, a max-pooling layer is deployed over the feature maps, because the maximum values indicate the most prominent terms among the identified features. These pooling layers carry the information of the hidden layers; therefore, the pooling layers of the BiGRU and the CNN are interchanged between the two models so that each benefits from their combined learning abilities.
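The bidirectional pass and the concatenation of its two final states (Equations (13) and (22)) can be sketched as follows. A simple tanh recurrence stands in for the full GRU of Equations (5)-(12); all sizes and weights are illustrative.

```python
import numpy as np

def run_direction(xs, step, h0):
    """Run a recurrent step function over the sequence xs and return
    the final hidden state."""
    h = h0
    for x in xs:
        h = step(x, h)
    return h

# A toy recurrent step standing in for the GRU cell.
rng = np.random.default_rng(2)
W_in, W_rec = rng.normal(size=(3, 4)), rng.normal(size=(3, 3))
step = lambda x, h: np.tanh(W_in @ x + W_rec @ h)

xs = list(rng.normal(size=(5, 4)))                    # a toy 5-word sentence
h_fwd = run_direction(xs, step, np.zeros(3))          # forward pass
h_bwd = run_direction(xs[::-1], step, np.zeros(3))    # backward pass
H = np.concatenate([h_fwd, h_bwd])                    # [h_fwd ; h_bwd]
print(H.shape)
```

The concatenated state carries both left-to-right and right-to-left context, which is exactly what the single state of Equation (13) provides to the later layers.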

3.1.3. Attention Layer

According to the literature, different words have different influences within the context of a sentence; thus, the Att-JM uses an attention mechanism to identify the most effective terms of the targeted sentence, reinforced by the contextual position information to determine the precise position of those potential terms. Generally, the attention mechanism consists of two modules, an encoder and a decoder. The encoder transforms the targeted sentence into real-valued, equal-length vectors that carry the semantic information of each term, while the decoder produces the output from the encoded vectors. The core equations of the attention mechanism are given in Equations (24)-(26) [61]:
$$u_p = \tanh\left(W_p H + b_p\right)$$
$$\alpha_p = \frac{\exp\left(u_p^{\top} u_w\right)}{\sum_p \exp\left(u_p^{\top} u_w\right)}$$
$$H_T = \sum_p \alpha_p H$$
In the above equations, $u_w$ is the trainable context vector, $u_p$ is the result of transforming the hidden-layer vector $H$, and $b_p$ and $W_p$ are the bias and weight matrix of the attention mechanism, respectively. The last term, $\alpha_p$, is the attention score of a single word of the sentence. The proposed methodology applies the attention mechanism separately to both algorithms (MC-BiGRU and MC-CNN): the models learn the potential terms in light of the word embedding and contextual position information, and the attention mechanism then filters out the most prominent features from each sentence, as in Equations (27) and (28):
$$A_{MC\text{-}BiGRU} = \mathrm{Attention}\left(H_T\right)$$
$$A_{MC\text{-}CNN} = \mathrm{Attention}\left(C_T\right)$$
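A minimal NumPy sketch of the word-level attention of Equations (24)-(26); the hidden states and parameters are random stand-ins, not learned values.

```python
import numpy as np

def attention(H, W_p, b_p, u_w):
    """Word-level attention over hidden states H (shape: n_words x d):
    score each word against the context vector u_w and return the
    attention weights and the weighted sum of the states."""
    u = np.tanh(H @ W_p.T + b_p)                     # Equation (24)
    scores = u @ u_w                                 # u_p^T u_w per word
    alpha = np.exp(scores) / np.exp(scores).sum()    # Equation (25): softmax
    return alpha, alpha @ H                          # Equation (26)

rng = np.random.default_rng(3)
H = rng.normal(size=(5, 4))                          # 5 words, 4-dim states
W_p, b_p = rng.normal(size=(4, 4)), np.zeros(4)
u_w = rng.normal(size=4)                             # trainable context vector

alpha, h_att = attention(H, W_p, b_p, u_w)
print(alpha.round(3), h_att.shape)
```

The weights `alpha` form a probability distribution over the words, so the most influential terms dominate the pooled representation.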

3.1.4. Output Layer

The last step concatenates the resultant output layers as $Y$, which comprises information about the potential terms along with their contextual position and the syntactic and semantic information of the targeted context. These outcomes are forwarded to the softmax activation function for prediction and classification, so that all terms found in the targeted sentence are categorized as aspect or non-aspect terms. The detailed architecture of the proposed method is depicted in Figure 2.
$$Y = \left[A_{MC\text{-}BiGRU}\,;\ A_{MC\text{-}CNN}\right]$$
$$\mathrm{Predicted\ aspects} = \mathrm{Softmax}\left(W \cdot Y + b\right)$$
Figure 2. Attention-based (MC-BiGRU and MC-CNN) for aspect extraction and sentiment classification.
Equation (30) yields the predicted aspects from the targeted sentences, categorizing each term as aspect or non-aspect. The identified aspect terms are retained and the non-aspect terms are discarded. Afterward, the Att-JM determines the polarity of the retained terms as positive, negative, or neutral: the aspects are combined with their context, as depicted in Equation (31), and the softmax activation function takes these filtered predictions, integrated with the outcome of Equation (29), and classifies their sentiment as positive, negative, or neutral, as expressed in Equation (32).
$$S = \left[\mathrm{Predicted\ aspects}\,;\ Y\right]$$
$$\mathrm{Predicted\ sentiments} = \mathrm{Softmax}\left(S\right)$$
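The two classification heads (Equations (29)-(32)) reduce to concatenation followed by softmax layers, which can be sketched as follows. All dimensions and weights are illustrative; in the actual model they are learned.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())       # subtract max for numerical stability
    return e / e.sum()

rng = np.random.default_rng(4)
a_bigru, a_cnn = rng.normal(size=8), rng.normal(size=8)
Y = np.concatenate([a_bigru, a_cnn])             # Equation (29)

# Equation (30): aspect vs. non-aspect (2 classes)
W_a, b_a = rng.normal(size=(2, 16)), np.zeros(2)
p_aspect = softmax(W_a @ Y + b_a)

# Equations (31)-(32): combine with the context, then 3-way sentiment
S = np.concatenate([p_aspect, Y])
W_s, b_s = rng.normal(size=(3, 18)), np.zeros(3)
p_sentiment = softmax(W_s @ S + b_s)
print(p_aspect.round(3), p_sentiment.round(3))
```

Each head outputs a proper probability distribution: two values for aspect/non-aspect and three for positive/negative/neutral.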

4. Experimental Arrangements

This section first describes the environment in which the experiments were conducted, then the datasets used to evaluate the approach, followed by the preprocessing steps used to filter these datasets, and finally the baselines used for comparison. This study uses precision, recall, and the F1 measure to evaluate the approach, as these are the most widely used metrics in the ABSA literature.

4.1. Environment for Experiments

The experiments on the Att-JM were conducted on the Windows 7 operating system. The hardware comprised an Intel Xeon W3530 2.8 GHz CPU, a GeForce GTX 1060 GPU, and 16 GB of DDR3 RAM. The software stack comprised the GPU builds of TensorFlow 2.0 and Keras 2.1.0 for Python 3.7, with the PyCharm IDE.

4.2. Dataset Selection

The performance of the proposed model was extensively evaluated on seven standard datasets. These comprise the test and training datasets made publicly available by the SemEval organizers. The first two datasets belong to SemEval-2014 [35,45,62] and contain reviews from the restaurant and laptop domains, respectively. The next two datasets also cover the restaurant and laptop domains and belong to SemEval-2015 [35,45,62].
The fifth and sixth datasets also belong to the restaurant and laptop domains and relate to SemEval-2016 [35,45,62]. The last one is the Airline Twitter Sentiment dataset, which concerns leading U.S. airlines and comprises positive, negative, and neutral tweets [63,64]. The dataset was developed by CrowdFlower, has been publicly available since 2020, and was updated in 2021 and 2022. This study uses it to analyze the performance of the proposed method on Twitter, a leading social media platform that has emerged as a primary channel for expressing public opinion. Table 2 shows the statistics of these datasets.
Table 2. Statistics of the datasets.

4.3. Preprocessing

The proposed method preprocesses the textual reviews to obtain a structured and clean textual dataset. The preprocessing steps are as follows:
  • All English words are converted to lowercase.
  • The paragraphs of the textual reviews are split into sentences at full stops.
  • All sentences are split into tokens at white spaces.
  • All punctuation marks are removed from the sentences.
  • All words comprising impure alphabetical (alphanumeric) character sequences are removed.
  • All stop words are removed.
  • All words containing special characters are removed.
  • All words whose length is one character or less are removed.
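The steps above can be sketched as a small standard-library pipeline; the stop-word list here is a tiny illustrative subset, not the one used in the paper.

```python
import re

STOP_WORDS = {"the", "a", "an", "is", "it", "are", "of", "to", "and"}  # illustrative

def preprocess(paragraph):
    """Apply the preprocessing steps above to one review paragraph."""
    sentences = paragraph.lower().split(".")    # lowercase; split at full stops
    cleaned = []
    for sentence in sentences:
        tokens = sentence.split()               # tokenize at white spaces
        kept = []
        for tok in tokens:
            tok = re.sub(r"[^\w]", "", tok)     # strip punctuation/special chars
            if (tok.isalpha()                   # drop alphanumeric impurities
                    and tok not in STOP_WORDS   # drop stop words
                    and len(tok) > 1):          # drop one-character tokens
                kept.append(tok)
        if kept:
            cleaned.append(kept)
    return cleaned

print(preprocess("The battery is great2go! It lasts a LONG time."))
# -> [['battery', 'lasts', 'long', 'time']]
```

The token "great2go" survives the punctuation strip but is removed by the alphanumeric check, while "LONG" is kept after lowercasing.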

4.4. Baselines

The proposed methodology's performance is compared with that of the following baseline models, whose details are summarized in Table 3:
Table 3. Summary of the baseline methods considered for the proposed approach.

5. Results and Discussion

During the empirical analysis, we noticed that accurate extraction and identification of aspects and their related sentiments rely heavily on the word embedding, and that the use of contextual position information reinforces such gains. Pre-training is another well-known factor in the performance of word embeddings, and the best pre-training is achieved when the embedding is trained on a large volume of data. For this reason, the Att-JM uses the word2vec embeddings pre-trained on the Google News dataset for the ABSA tasks. The Att-JM's performance is evaluated on standard English datasets comprising SemEval 2014, 2015, and 2016 and a Twitter dataset, covering the laptop, restaurant, and US-Airlines domains. During the training phase, the proposed framework determines the length of the longest sentence in the textual reviews, which is taken as the maximum attainable length of a single sentence within these reviews; all shorter sentences are padded with zeros. The Att-JM follows these procedures to accurately identify aspects and their sentiments. The detailed parameter settings of the proposed model are given in Table 4.
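The zero-padding step described above can be sketched as follows, assuming the sentences have already been integer-encoded; the encoded values are illustrative.

```python
def pad_sentences(tokenized, pad_value=0):
    """Zero-pad every integer-encoded sentence to the length of the
    longest sentence in the corpus."""
    max_len = max(len(s) for s in tokenized)
    return [s + [pad_value] * (max_len - len(s)) for s in tokenized]

encoded = [[4, 9, 2], [7, 1], [3, 8, 5, 6]]    # toy integer-encoded sentences
print(pad_sentences(encoded))
# -> [[4, 9, 2, 0], [7, 1, 0, 0], [3, 8, 5, 6]]
```

After padding, every row has the same length, which is what the fixed-shape input layers of the two models require.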
Table 4. Proposed approach parametric setting.
We deeply analyzed the experimental arrangements of all the baseline studies to determine which parameter settings increase or decrease performance. Based on this parametric analysis and a parameter optimization procedure, the best parameter settings were selected; they are described in Table 4. Table 5 then compares the Att-JM with the baseline approaches on the F1 measure for ATE, while Table 6 reports the SC-based performance. These tables show the Att-JM's remarkable results on the standardized datasets against the baselines, signifying its quantitative improvement in all domains of interest.
Table 5. F1-score comparison of the proposed approach with baseline approaches for ATE.
Table 6. F1-score comparison of the proposed approach with baseline approaches for SC.
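The parameter-selection procedure mentioned above can be illustrated with a simple exhaustive grid search; the grid values and the scoring function below are illustrative assumptions only, not the settings from Table 4:

```python
from itertools import product

def grid_search(param_grid, score_fn):
    """Return the parameter combination that maximizes score_fn."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(params)  # e.g., validation F1 of a trained model
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical grid; toy_score stands in for training + validation F1
grid = {"dropout": [0.2, 0.5], "filters": [64, 128], "lr": [1e-3, 1e-4]}
toy_score = lambda p: p["filters"] / 128 - p["dropout"]  # placeholder scoring
best, _ = grid_search(grid, toy_score)
```

In practice each candidate setting would require a full training run, so the grid is kept small and informed by the baselines' reported configurations.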

5.1. Aspect Term Extraction Performance Comparison

Att-JM achieves a significant F1-score improvement on the SemEval-14L domain dataset, outperforming all baseline approaches by a clear margin: 14.37% over MCNN + WV2 + POS, 15.16% over MCNN + WV + POS, 13.41% over DE-CNN, 21.38% over SRNN-GRU, 26.19% over BiGRU-WE-POS, and 13.53% over POS-AttWD-BLSTM-CRF. These gains are shown graphically in Figure 3.
Figure 3. Model comparison with baseline approaches on the SemEval-14L domain.
Att-JM was not evaluated on a single domain alone; SemEval-14R was also used to assess its capabilities. The collaborative use of different algorithms and the sharing of their learned information are the distinguishing core of this approach and yield a remarkable F1-score improvement over existing techniques even when the domain changes: 4.11% over MCNN + WV2 + POS, 6.31% over MCNN + WV + POS, 15.13% over SRNN-GRU, 17.38% over DE-CNN, 18.2% over BiGRU-WE-POS, and 3.14% over POS-AttWD-BLSTM-CRF. These improvements are shown graphically in Figure 4.
Figure 4. Model comparison with baseline approaches on the SemEval-14R domain.
The SemEval-16R domain was also used in evaluating Att-JM, and a boost in the F1 score is again observed for the proposed approach relative to the baseline models: an improvement of 18.29% over MCNN + WV2 + POS, 21.38% over MCNN + WV + POS, 19.63% over DE-CNN, 22.3% over SRNN-GRU, 17.85% over BiGRU-WE-POS, and 20.96% over POS-AttWD-BLSTM-CRF. This refinement is shown graphically in Figure 5.
Figure 5. Model comparison with baseline approaches on the SemEval-16R domain.

5.2. Sentiment Classification Performance Comparison

Att-JM also achieves a significant improvement when classifying the identified aspects on the SemEval-14L domain dataset, reaching an F1 score of 89%. It surpasses all baseline approaches by clear margins: 17.5%, 18.13%, 19.89%, 20.06%, 24.72%, 14.37%, and 16.09% over CPA-SAA, CPA-SAF, HPNet-M, HPNet-S, JTSG, GCNs, and FAPN, respectively. These improvements are depicted graphically in Figure 6.
Figure 6. Model comparison with baseline approaches on the SemEval-14L domain.
Moreover, when classifying the identified aspects on the SemEval-14R domain dataset, Att-JM reaches 88%, improving by 14.62%, 15.19%, 9.05%, 9.13%, 13.24%, 10.65%, and 14.93% over CPA-SAA, CPA-SAF, HPNet-M, HPNet-S, JTSG, GCNs, and FAPN, respectively. Figure 7 depicts these enhancements against the baseline approaches.
Figure 7. Model comparison with baseline approaches on the SemEval-14R domain.
In addition, on the SemEval-15R domain dataset, Att-JM reaches 90%, with gains of 29.85%, 29.74%, 11.06%, 11.14%, 15.34%, 23.61%, and 26.93% over the baselines CPA-SAA, CPA-SAF, HPNet-M, HPNet-S, JTSG, GCNs, and FAPN, respectively. The proposed approach performs better than all considered baselines, as shown graphically in Figure 8.
Figure 8. Model comparison with baseline approaches on the SemEval-15R domain.
Furthermore, on the SemEval-16R domain dataset, Att-JM reaches 92%, improving by 19.57%, 20.53%, 13.07%, 13.15%, 17.44%, 16.57%, and 18.97% over CPA-SAA, CPA-SAF, HPNet-M, HPNet-S, JTSG, GCNs, and FAPN, respectively. Jointly considering contextual position information together with syntactic and semantic information sharpens the attention mechanism when identifying potential terms, and sharing the hidden layers between the combined algorithms improves the predictive ability of the proposed approach. This enhancement is depicted in Figure 9.
Figure 9. Model comparison with baseline approaches on the SemEval-16R domain.

5.3. Proposed Approach Performance and Other Metrics

Att-JM's performance enhancement was also estimated with further metrics across the different dataset domains. Measured by precision on the previously discussed datasets, the model shows a significant improvement: for the ATE task, the maximum precision is observed on two domains, SemEval-14R and SemEval-15L, at 97% in both, while the minimum is 89%, on the SemEval-15R domain. For the SC task, the maximum precision is 91%, obtained on the Twitter US-Airline domain, and the minimum is 82%, on the SemEval-15L and SemEval-16L domains. Att-JM's precision-based performance on all other datasets is shown in Figure 10.
Figure 10. Precision-based performance of the model on the various datasets.
In addition, recall-based analysis of Att-JM on the same datasets likewise shows a significant improvement. For the ATE task, the maximum recall, 98%, is obtained on the SemEval-15R domain, and the minimum, 82%, on the SemEval-15L domain. For the SC task, the maximum recall is 96%, on the SemEval-16R domain, and the minimum, 90%, on the SemEval-15L and SemEval-16L domains. Recall results for all other datasets are shown in Figure 11.
Figure 11. Recall-based performance of the model on the various datasets.
Similarly, Att-JM's improvement was analyzed across the datasets using the F1 measure. The ATE task reaches its maximum F1 score of 95% on the SemEval-14L domain and its minimum of 89% on the SemEval-15L domain. For the SC task, the maximum F1 score is 92%, on the SemEval-16R domain, whereas the minimum is 86%, on the SemEval-15L and SemEval-16L domains. F1 scores observed for all other datasets are depicted in Figure 12.
Figure 12. F1-measure performance of the model on the various datasets.
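The precision, recall, and F1 values reported above follow the standard definitions; as a minimal reference implementation (illustrative counts, not the paper's experimental data):

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Illustrative counts: 90 true positives, 10 false positives, 20 false negatives
p, r, f = precision_recall_f1(tp=90, fp=10, fn=20)
# p = 0.90, r ≈ 0.818, f ≈ 0.857
```

F1 is the harmonic mean of precision and recall, which is why it shows the "mixed trend" noted later: it sits between a task's stronger and weaker metric.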
The precision–recall (PR) curve is a frequently used visual tool for evaluating a model's ability to discriminate between classes. We therefore use the PR curve to assess Att-JM's ability to discriminate among the different categories; the area under the curve demonstrates its strong performance on the multi-class classification task, as shown in Figure 13.
Figure 13. Model’s precision–recall curve.
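For one class treated one-vs-rest, the points of a PR curve are obtained by sweeping a decision threshold over the predicted scores. A small stdlib-only sketch with toy scores (not the model's outputs):

```python
def pr_curve(scores, labels):
    """Return (precision, recall) pairs obtained by lowering a decision
    threshold through the scores, sorted from most to least confident.
    labels are binary: 1 = positive class, 0 = rest."""
    total_pos = sum(labels)
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    points, tp, fp = [], 0, 0
    for i in order:
        tp += labels[i]       # predicted positive and actually positive
        fp += 1 - labels[i]   # predicted positive but actually negative
        points.append((tp / (tp + fp), tp / total_pos))
    return points

# Toy confidence scores for the positive class with ground-truth labels
scores = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0]
curve = pr_curve(scores, labels)
# First point: precision 1.0 at recall 1/3; last point: precision 0.6 at recall 1.0
```

For the multi-class setting in Figure 13, one such curve is computed per sentiment class and the per-class areas under the curves are compared.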

5.4. Discussions

Att-JM follows a distinctive methodology: it identifies aspects rather than receiving them explicitly as input, and then predicts the sentiment of each identified aspect from its context. Figure 14 summarizes the overall performance of Att-JM on aspect extraction and SC across the different standard dataset domains in terms of precision, recall, and F1 measure. The framework significantly outperforms the baseline approaches on all datasets. The figure shows that the proposed approach attains better precision on the ATE task, whereas the SC task performs better on recall; the F1 measure shows a mixed trend. The main reason for this performance is the combination of two different algorithms and the sharing of their hidden layers, while the contextual position vectors and word2vec embeddings assist the attention mechanism in accurately filtering the targeted aspects and their sentiments. These quantitative results across varied domains demonstrate the value of deep features and of the algorithms' combined utilization.
Figure 14. Model performance on the various datasets.

6. Conclusions

Att-JM, which combines MC-Bi-GRU and MC-CNN, extracts the targeted aspects from textual reviews and classifies their sentiments as positive, negative, or neutral. Both algorithms receive the same input parameters, namely word2vec embeddings and contextual position information, which is a prominent cause of the proposed model's distinctiveness and strong performance. The results endorse the underlying hypothesis that fusing the algorithms in parallel, with their inputs provided equally and simultaneously, improves the accuracy of the targeted aspects and their predicted sentiments. Information sharing between the merged algorithms enables the model to capture accurate contextual information and the prominent terms that influence each phrase within the context. In the future, we aim to extend this work to determine the implicit polarities of the targeted aspects.

Author Contributions

Conceptualization, T.I.; Methodology, H.U.K., U.T. and J.-h.C.; Software, U.T.; Validation, M.A.K.; Formal analysis, W.A. and J.-h.C.; Investigation, H.U.K.; Resources, T.I. and J.-h.C.; Data curation, H.U.K.; Writing—original draft, W.A.; Writing—review & editing, W.A. and T.I.; Supervision, H.U.K.; Project administration, T.I. and M.A.K.; Funding acquisition, M.A.K., U.T. and J.-h.C. All authors contributed equally. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the “Human Resources Program in Energy Technology” of the Korea Institute of Energy Technology Evaluation and Planning (KETEP), granted financial resources from the Ministry of Trade, Industry & Energy, Republic of Korea. (No. 20204010600090).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The datasets used in this work are publicly available.

Acknowledgments

The authors extend their appreciation to the “Human Resources Program in Energy Technology” of the Korea Institute of Energy Technology Evaluation and Planning (KETEP), granted financial resources from the Ministry of Trade, Industry & Energy, Republic of Korea. (No. 20204010600090).

Conflicts of Interest

The authors declare no conflict of interest.
