Article

A Feature-Based Approach for Sentiment Quantification Using Machine Learning

Kashif Ayyub, Saqib Iqbal, Muhammad Wasif Nisar, Ehsan Ullah Munir, Fawaz Khaled Alarfaj and Naif Almusallam
1 Department of Computer Science, Wah Campus, COMSATS University Islamabad, Wah Cantt 45550, Islamabad Capital Territory, Pakistan
2 College of Engineering, Al Ain University, Al Ain 64141, United Arab Emirates
3 Department of Computer Science and Information, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11564, Saudi Arabia
* Author to whom correspondence should be addressed.
Electronics 2022, 11(6), 846; https://doi.org/10.3390/electronics11060846
Submission received: 13 January 2022 / Revised: 25 February 2022 / Accepted: 3 March 2022 / Published: 8 March 2022

Abstract

Sentiment analysis has been one of the most active research areas of the past decade due to its vast applications. Sentiment quantification, a new research problem in this field, extends sentiment analysis from individual documents to aggregated collections of documents. While sentiment analysis has been widely researched, sentiment quantification has drawn less attention despite offering greater potential to enhance current business intelligence systems. In this research, a feature-engineering-based framework is proposed for sentiment quantification that exploits diverse feature sets, including sentiment, content, and part-of-speech features, as well as deep features such as word2vec and GloVe. Several machine learning algorithms, spanning conventional classifiers, ensemble learners, and deep learning approaches, have been investigated on the standard SemEval2016, SemEval2017, STS-Gold, and Sanders datasets. The empirical results reveal the effectiveness of the proposed feature sets for sentiment quantification when applied with machine learning algorithms. The results also show that the ensemble-based algorithm AdaBoost outperforms the other conventional machine learning algorithms when using a combination of the proposed feature sets, while the deep learning algorithm RNN achieves the best results with word embedding-based features. This research has the potential to support diverse applications of sentiment quantification, including polling, trend analysis, automatic summarization, and rumor or fake news detection.

1. Introduction

The social web has changed the way people communicate. The emergence of social media channels has resulted in the rapid creation of textual content. People create and post content using social interaction platforms such as the web, discussion forums, Facebook, Twitter, etc. This rapidly growing content carries sentiment information, which offers researchers the opportunity to obtain people’s opinions through social media about entities including businesses, academia, products, marketing, etc. Sentiment analysis has emerged as a prominent field for extracting such meaningful information from raw data [1,2].
Sentiment analysis is an active research area that classifies opinions in text as negative, positive, or neutral. It also determines the grade of polarity (high, moderate, or mild). Sentiment analysis is carried out at three levels: document level, sentence level, and phrase level. Document-level sentiment analysis is the most popular and is followed by numerous opinion mining techniques. It treats each document as a whole and classifies the target documents into the required set of classes: for binary classification, the target documents are classified as positive or negative, while for ternary classification the required classes include positive, negative, and neutral. Document-level sentiment analysis does not consider diverse factors for analysis. Sentence-level analysis, in contrast, considers each sentence individually and assigns it a single opinion based on the subjectivity of the sentence. Neither document-level nor sentence-level sentiment analysis gives a clear understanding of the polarity of the text. Sentiment analysis spans various research areas, including subjectivity analysis, sentiment polarity detection [3], and sentiment quantification [4].
Sentiment quantification, in contrast, deals with estimating the distribution of class labels across a collection of documents rather than predicting the label of each individual document. For sentiment quantification, methods including Classify and Count, Adjusted Count, and Instance-based Quantification Trees [5] are commonly used in different studies. However, an analysis of previous classification algorithms shows that these standard algorithms are not an optimal solution for quantification. In this regard, research suggests that quantification should be treated as a problem distinct from classification and addressed using dedicated approaches [6]. Hence, it opens new research opportunities to explore different approaches and develop new methods in this domain.
Consequently, it raises the need to devise sentiment quantification-based methods that deliver high accuracy. To address the issue of accuracy, this research contributes to the field of sentiment quantification as follows:
  • Novel feature sets are proposed, including POS, tweet-specific, content, and sentiment features, with the ranking of features carried out using feature selection approaches.
  • Deep features, including word2vec and GloVe, that are widely used for sentiment analysis are considered here for sentiment quantification.
  • Machine learning approaches have been investigated, including: (1) traditional techniques—Support Vector Machine (SVM), Naïve Bayes (NB), and Decision Tree (DT); (2) ensemble learners—Random Forest (RF) and AdaBoost; (3) deep learning-based—Deep Belief Network (DBN), Convolutional Neural Network (CNN), and Recurrent Neural Network (RNN).
  • The results for sentiment quantification are computed on SemEval2016, SemEval2017, STS-Gold, and Sanders. Standard performance evaluation measures, including Kullback–Leibler divergence (KLD), relative absolute error (RAE), and absolute error (AE), are applied for the evaluation of classifiers.
The remainder of this paper is organized as follows: Section 2 provides a review of existing research studies in the relevant literature. Section 3 provides details of the proposed research methodology. Section 4 provides a comprehensive discussion of the empirical-based results. Section 5 concludes the paper.

2. Related Work

Here, we discuss the existing research on quantification based on sentiment analysis, which divides into three main classes: aggregated methods, non-aggregated methods, and ensemble-based methods.

2.1. Sentiment Quantification

Sentiment quantification is the process of estimating how the items of a collection are distributed across classes; it is also known as prevalence estimation [7]. Quantification is used in different fields to deal with aggregated data. Sentiment quantification has various research applications, some of which are discussed here: it has been used to detect communities [8], for cross-lingual quantification [9], for public health monitoring [10], and for tweet classification [11].

2.1.1. Aggregated Methods

Quantification approaches are preferred for predicting the class prior probabilities. Classify and Count (CC) is a famous technique used for quantification; however, CC falls short in estimating the class distribution. Newer approaches based on Sample Means Matching (SMM) have been presented to overcome this limitation of CC. SMM is very effective, quantifying a large amount of data per second; in experiments on twenty-five datasets, the technique outperformed the existing quantification methods [12]. Further, a tree-based model for ordinal quantification (OQT) has been proposed, whose purpose is to accurately count the frequency of each class of unlabeled items in text. In ordinal quantification, an order is defined over the classes. The same approach is utilized to find the highest star ratings in product reviews by analyzing their class prevalence over time. Evaluated on the SemEval2016 dataset, the approach outperforms state-of-the-art methods [13].
Learning from data in deep layers is a complex task. Various techniques such as neural and statistical machine translation are used for this purpose but fall short in encoding and decoding when learning data from deep layers. To address these issues, Expectation Maximization (EM) has been used as a quantification technique for the automatic detection of errors in Arabic text, overcoming the shortcomings of neural and statistical translation methods. EM dynamically combines information across layers. Moreover, during training, Kullback–Leibler divergence (KLD) is used to improve the model’s performance. The approach was evaluated on two standard datasets, namely QALB-2014 and QALB-2015, and the experiments showed that it outperformed previous techniques in terms of F1 score [14].
EM has also been applied in the field of rumor detection for Arabic tweets. The method extracts user-based and content-based features from tweets, and both feature sets are tested to check their significance. The feature sets are trained through a semi-supervised EM method with a small base of labelled data. Compared with a Gaussian baseline, the method outperformed it with 78.6% accuracy [15].
Estimating class proportions by counting classification errors has also been used for quantification. A new method adjusts for the classification errors by building confidence intervals; the model was introduced for the quantification of social media and improves on previous approaches through its accurate estimation intervals [16].
The CC method has given rise to many derived methods. “QuaNet”, one such method, uses a Recurrent Neural Network (RNN) to learn “quantification embeddings”, which are first learned by the model and then elaborated by CC. The approach was tested on the Kindle, IMDb, and HP datasets, and the results showed its effectiveness over existing quantification techniques [17].

2.1.2. Non-Aggregated Methods

González-Castro et al. [18] developed a model to quantify data based on a divergence measure: the Hellinger distance is used to compare data distributions and to find the mismatch between the test and validation sets, and prior probabilities are estimated by minimizing this divergence. HDx and HDy are two variants of the approach, where HDy requires the outputs of a classifier and HDx works directly on the features without one. Hopkins and King introduced a non-parametric technique to quantify data [19]. Their approach quantifies data without any need for classification. American presidential blogs were selected as the dataset, with the proposed method reducing estimation bias. A software application was developed to quantify thousands of opinions about the US presidency.

2.1.3. Ensemble-Based Methods

Ensemble learners combine several weak learners. Some aggregated methods have been combined to address the data distribution issues in sentiment analysis: Adjusted Classify and Count (AC) and HDy are combined into an ensemble model, with CC, AC, PCC, PAC, and HDy applied as base quantifiers for learning the proposed ensemble. Two schemes are presented for learning and prediction: all learners produce predictions, and then four sets of measures are applied to select the best model [20].
Ensemble methods achieve optimal results by building various training sets, with each model trained using data distribution techniques for quantification. The proposed methods categorize the errors of data distribution to enhance the performance of ensemble learners, and the model explicitly addresses the binary quantification problem by focusing on the change in the expected distribution for each class. The results showed that the ensemble-based methods outperformed prior techniques [21].
The ensemble method has also been explored in the field of soundscape ecology. A new approach combines quantification and classification to train a CNN to classify bird species, with experiments showing that quantification performed better than classification for this task [22].
Machine learning techniques have reported promising results for obtaining optimal accuracy in sentiment quantification. However, due to the sensitive nature of sentiments in the opinion-seeking process, there is a need for still more accurate results. Non-lexicon-based approaches have not been widely applied to sentiment quantification, and the role of diverse features in improving its classification accuracy remains to be exploited alongside such approaches. Some of these studies are summarized in Table 1.

2.2. Problem Statement and Formulation

Accuracy is an important parameter in the field of sentiment analysis. In the literature, various feature sets have been exploited using machine learning techniques to improve results. However, these feature sets still need to be investigated for the emerging domain of sentiment quantification, where further accuracy improvements matter because of the sensitive nature of sentiment in the opinion-seeking process. There is thus a need to inquire into the impact of feature sets on sentiment quantification. In addition, as existing research studies focus only on machine learning, deep learning approaches also need to be explored.
Formally, the research problem is to estimate the distribution of a set $D = \{d_1, d_2, \ldots, d_q\}$ of unlabeled documents across a set $C = \{c_1, c_2, \ldots, c_p\}$ of classes. Following the relevant literature, our research deals with $|C| = 3$: the three classes are positive, neutral, and negative. As our focus is the single-label multi-class (SLMC) quantification task, we consider the evaluation measures that have been proposed for it. The notation includes the true distribution $p$ of the documents in $D$ across the classes in $C$, the estimated distribution $\hat{p}$, and the quantification loss, denoted $\mathcal{L}(\hat{p}, p, D, C)$, which measures the estimation error.

3. Proposed Research Methodology

This section describes the approach used to quantify tweets based on sentiment analysis. A framework is proposed to give insights into the steps followed for sentiment quantification. A detailed discussion follows on the feature engineering, the algorithms applied, the datasets considered, and the performance evaluation measures used in this research.

3.1. Framework for Sentiment Quantification

The proposed model demonstrates the procedure carried out for sentiment quantification, as shown in Figure 1. In the first step, cleansing of the standard datasets (SemEval2016, SemEval2017, STS-Gold, and Sanders) is performed using data preprocessing techniques, which include removal of extra spaces, tokenization, stop-word removal, case conversion, removal of words shorter than three letters, and lemmatization for content feature extraction. In the second step, content, POS (part-of-speech), tweet-specific, and sentiment features are extracted using Python libraries. Parameter settings for optimizing all classifiers are shown in Table 12. Next, the traditional machine learning and deep learning approaches, namely NB, AdaBoost, DT, RF, SVM, RNN, CNN-LSTM, and DBN, are applied for sentiment quantification. Afterward, the Classify and Count (CC) method is applied to classify the instances and count the per-class totals, as sketched below. Finally, performance evaluation measures are applied to evaluate the machine learning classifiers for sentiment quantification.
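A minimal sketch of the CC step is shown below, assuming the feature matrices and labels have already been built from the extracted feature sets; the toy data and the AdaBoost base classifier are illustrative stand-ins, not the exact experimental configuration:

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

def classify_and_count(clf, X, classes):
    """CC: label every document, then count the per-class fractions."""
    pred = clf.predict(X)
    counts = np.array([(pred == c).sum() for c in classes], dtype=float)
    return counts / counts.sum()  # estimated prevalence p_hat per class

# Toy stand-ins for the real feature matrices (22 features as in Table 2;
# classes 0 = negative, 1 = neutral, 2 = positive).
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(300, 22)), rng.integers(0, 3, 300)
X_test = rng.normal(size=(100, 22))

clf = AdaBoostClassifier(n_estimators=10).fit(X_train, y_train)
print(classify_and_count(clf, X_test, classes=[0, 1, 2]))
```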

3.2. Feature Engineering

Feature engineering consists of feature extraction and selection to achieve optimal accuracy. The selection of features has a major impact on achieving the desired results. Here, the discussion is divided into subparts covering the proposed features, baseline features, and deep features to elaborate on each feature set’s impact on quantification accuracy, as well as the selection and ranking of the features.

3.2.1. Proposed Feature Sets

To perform sentiment quantification, sentiment-based features are extracted through a sentiment lexicon, VADER. VADER is well known for the computation of sentiment features and has been used in various research studies [23,24].
POS tagging is applied to capture the grammatical nature of the content. The verb feature is used to capture the actions of an entity, while the adjective count is considered because adjectives convey the negative and positive characteristics of a tweet. The content-based features exploit diverse characteristics of the text in tweets: question marks, exclamation marks, and special characters are counted to check whether a person is asking a question or trying to attract attention. The retweet feature is also important for checking whether a tweet contains facts; if a tweet is frequently retweeted, it likely relates to a sensitive topic carrying more sentiment. The mention feature is considered to check whether another person has been added to the discussion. Moreover, the URL feature counts the number of URLs shared by users to support their point of view, and since hashtags carry the topic of the content, they are considered as well. A list of the proposed features is shown in Table 2.
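As an illustration, the following sketch computes a handful of the proposed features for a single tweet; it assumes the vaderSentiment and NLTK packages (with their standard tokenizer and tagger models), since the paper does not name the exact Python libraries used:

```python
import re
import nltk
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

# One-time NLTK model downloads (tokenizer and POS tagger).
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

def extract_features(tweet):
    """Compute a few of the proposed sentiment, POS, content, and
    tweet-specific features (Table 2 symbols noted in comments)."""
    scores = SentimentIntensityAnalyzer().polarity_scores(tweet)
    tags = [tag for _, tag in nltk.pos_tag(nltk.word_tokenize(tweet))]
    return {
        "sentiment_score": scores["compound"],                # S_sent
        "verbs": sum(t.startswith("VB") for t in tags),       # P_V
        "adjectives": sum(t.startswith("JJ") for t in tags),  # P_A
        "question_marks": tweet.count("?"),                   # C_QM
        "exclamation_marks": tweet.count("!"),                # C_EM
        "mentions": len(re.findall(r"@\w+", tweet)),          # T_M
        "urls": len(re.findall(r"https?://\S+", tweet)),      # T_URL
        "hashtags": len(re.findall(r"#\w+", tweet)),          # T_HT
        "is_retweet": int(tweet.startswith("RT")),            # C_RT
    }

print(extract_features("RT @user: Loving this! Why so great? #happy https://t.co/x"))
```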
Baseline features such as n-grams, TF-IDF, and Bag-of-Words (BoW), as well as the deep features word2vec and GloVe, are also used for sentiment quantification. An n-gram is a contiguous sequence of words, while BoW, widely used in natural language processing (NLP), takes the frequency of each word to train a classifier. Term frequency-inverse document frequency (TF-IDF) weights the frequency of a particular word in a text against its rarity across documents. Word2vec and GloVe represent words as vectors: word2vec captures the syntactic and semantic similarity between words, while GloVe groups words into clusters so that similar and dissimilar words can be identified.
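A brief sketch of constructing these features follows, using scikit-learn’s TfidfVectorizer for the TF-IDF and n-gram features and gensim’s Word2Vec for word vectors; these specific libraries are assumptions for illustration, as the paper does not name them:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from gensim.models import Word2Vec

tweets = ["great phone love it", "terrible battery very bad", "ok screen nice camera"]

# TF-IDF over unigrams and bigrams (covers the TF-IDF and n-gram baselines).
tfidf_matrix = TfidfVectorizer(ngram_range=(1, 2)).fit_transform(tweets)

# word2vec: train small vectors, then average each tweet's word vectors
# into a single fixed-length feature vector per tweet.
tokens = [t.split() for t in tweets]
w2v = Word2Vec(tokens, vector_size=50, min_count=1, epochs=20)
tweet_vectors = np.array([np.mean([w2v.wv[w] for w in sent], axis=0)
                          for sent in tokens])
print(tfidf_matrix.shape, tweet_vectors.shape)
```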

3.2.2. Feature Selection and Ranking

Optimal features increase the performance of classifiers. To find the optimal feature sets, three widely used feature selection methods are applied: Information Gain (IG), Gain Ratio (GR), and Relief-F. IG measures, through its mutual information formula, how much a feature reduces uncertainty about the class and is suitable for biased data. GR normalizes IG by the intrinsic information of an attribute, compensating for attributes with many distinct values. Relief-F scores each attribute by comparing the closest neighbors of sampled instances.
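As an illustration of IG-based ranking, the sketch below uses scikit-learn’s mutual_info_classif as an Information Gain estimator (an assumed substitute; Gain Ratio and Relief-F would require other implementations, e.g. the skrebate package for ReliefF):

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))                                # toy feature matrix
y = (X[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(int)   # only feature 0 matters

scores = mutual_info_classif(X, y, random_state=0)
ranking = np.argsort(scores)[::-1]                           # feature indices, best first
print(ranking, np.round(scores[ranking], 3))
```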
Through these selection techniques, the optimal features are retained, while features with low importance or a negative influence on the target class are omitted. The features ranked by the feature selection algorithms according to their importance are shown in Table 3.
Table 3 shows that the sentiment features have a greater impact than the other features. Negative sentiment and negative emoticons have a greater impact and are ranked higher than positive sentiment, which is consistent with existing research studies [25]. Among the POS features, adjectives score highly, as they describe the attributes of an entity, and verbs have a greater impact than nouns, as they capture the actions of an object. Overall, the POS features have a strong impact on predicting the target class.
Content features such as the WH-word, quoted-word, and repetitive-word counts have high scores due to their subjective and opinionative nature [26]. Special characters, followed by exclamation marks, are ranked next, as they signal discussion within the content. Another content-based feature, the URL count, has a high score, as shared URLs point to the subject of the opinion. Hashtags are ranked low: they carry the topic of the content, which matters in both objective and opinionative text and may appear in retweets containing no sentiment.
Among the baseline features, TF-IDF and n-grams score higher than BoW. Among the deep features, GloVe scores higher than word2vec owing to its faster training. In addition, GloVe combines the benefits of the word2vec skip-gram model in word analogy tasks such as sentiment analysis and stance classification.

3.3. Classification Algorithms Applied

This subsection discusses the machine learning techniques applied for sentiment quantification, divided into three categories: traditional algorithms, ensemble learners, and deep learners.

3.3.1. Machine Learning Techniques

The machine learning approaches applied to tweets for sentiment quantification are discussed below.

Support Vector Machine (SVM)

SVM is a maximum-margin classifier: it finds the hyperplane that separates the classes of high-dimensional data with the widest margin between negative and positive instances. The SVM dual optimization problem is given in Equations (1) and (2).

$\max f(i_1, i_2, \ldots, i_n) = \sum_{x=1}^{n} i_x - \frac{1}{2} \sum_{x=1}^{n} \sum_{y=1}^{n} q_x q_y \, i_x i_y (c_x \cdot c_y)$  (1)

$\sum_{x=1}^{n} q_x i_x = 0, \quad 0 \le i_x \le C$  (2)

where $n$ is the number of training examples, the $i_x$ are the coefficients of the linear combination of training inputs, $q_x$ is the training output, $C$ is the cost parameter bounding the coefficients, and the dot product $c_x \cdot c_y$ measures the similarity of the training inputs $c_x$ and $c_y$. SVM is not well suited to noisy and large datasets because its training process requires considerable execution time.
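A minimal usage sketch with scikit-learn’s SVC follows, using the RBF kernel from the Table 12 settings; the synthetic data merely stands in for the real feature matrices:

```python
# Hedged sketch: train an RBF-kernel SVM on synthetic 3-class data shaped
# like the 22 proposed features (Table 2); not the exact experimental setup.
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=22, n_classes=3,
                           n_informative=6, random_state=0)
svm = SVC(C=1.0, kernel="rbf").fit(X, y)
print(svm.predict(X[:5]))
```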

Decision Tree (DT)

DT derives decision rules from the data and works on the principles of entropy and Information Gain (IG). DT helps to reduce the preprocessing time required for missing attributes. The entropy is calculated using the formula in Equation (4).

$En(X) = -\sum_{h=1}^{m} P_h \log_2 P_h$  (4)

where $P_h$ is the probability that an attribute belongs to class $h$ of the $m$ classes. The $\log_2$ function expresses the information in bits, and $En(X)$ is the resulting entropy of the class label.
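The formula translates directly into code; as a small check, three classes with probabilities 0.5, 0.25, and 0.25 give an entropy of 1.5 bits:

```python
import math

def entropy(probs):
    """En(X) = -sum(P_h * log2(P_h)), skipping zero-probability classes."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.25, 0.25]))  # 1.5 bits
```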

3.3.2. Ensemble Learning Techniques

AdaBoost

AdaBoost is an ensemble method that aggregates weak learners into a strong one. This technique helps to give more accurate decisions when predicting the target class and pays particular attention to instances that are misclassified during prediction. During training, each element is assigned a weight; the initial weight assignment is shown in Equation (5).

$\mathrm{weight}(e_i) = \frac{1}{y}$  (5)

where $e_i$ is the $i$-th training element and $y$ is the number of elements to be trained. The error over the misclassified instances is computed as shown in Equation (6).

$\mathrm{Err} = \frac{\mathrm{Corr} - X}{X}$  (6)

where Corr is the number of correctly predicted instances and X is the total number of instances.
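A small numeric illustration of these two equations follows (the labels are toy values; the magnitude of Err equals the fraction of misclassified instances):

```python
import numpy as np

y_true = np.array([1, 0, 1, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0])  # one of five instances misclassified

n = len(y_true)
weights = np.full(n, 1.0 / n)       # Equation (5): weight(e_i) = 1/y
corr = int((y_true == y_pred).sum())
err = (corr - n) / n                # Equation (6): Err = (Corr - X) / X -> -0.2
print(weights, err)
```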

Random Forest (RF)

RF is an ensemble of decision trees that also supports regression models. RF is suitable for high-variance data, as it averages the outputs of its trees to compute a prediction, and it uses a voting strategy over the trees’ responses to the data attributes. The approach applies the bagging method k times. For trees q = 1, ..., m, RF combines its regression trees using the formula in Equation (7).

$\hat{f}(R) = \frac{1}{m} \sum_{q=1}^{m} f_q(R)$  (7)

where $f_q$ is the $q$-th regression tree and $R$ is the input.

3.3.3. Deep Learning Techniques

Deep Belief Networks (DBN)

DBN is a deep learning technique grounded in probability and statistics. The DBN architecture contains blocks of hidden layers, with the layers within a block interconnected while the blocks are separated from each other. DBN is increasingly used in sentiment analysis for its predictive efficiency [27].

CNN-LSTM

CNN (Convolutional Neural Network) is a deep learner but is not capable of capturing long-distance dependencies in data. LSTM (Long Short-Term Memory) handles long-distance dependencies well and is therefore combined with CNN to achieve the desired results even on biased datasets. CNN combined with LSTM has been applied to sentiment analysis and sequence-based text processing [28].
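A hedged Keras sketch of such a CNN-LSTM follows, loosely mirroring the settings reported in Section 4.5 (three convolutional layers, tanh activations, pooling size 3, RMSprop optimizer); the vocabulary size, embedding dimension, and layer widths are illustrative placeholders:

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Embedding, Conv1D, MaxPooling1D, LSTM, Dense

model = Sequential([
    Embedding(input_dim=10000, output_dim=100),  # could be seeded with GloVe vectors
    Conv1D(64, 3, activation="tanh"),
    Conv1D(64, 3, activation="tanh"),
    Conv1D(64, 3, activation="tanh"),
    MaxPooling1D(pool_size=3),
    LSTM(100),                                   # captures long-distance dependencies
    Dense(3, activation="softmax"),              # positive / neutral / negative
])
model.compile(loss="categorical_crossentropy", optimizer="rmsprop",
              metrics=["accuracy"])
```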

Recurrent Neural Network (RNN)

RNN is also a deep learner and is preferred for text processing and language translation. It works on the principle of memory: the previous output is saved and fed as input to the next step, which supports sequential processing. RNN has been applied to sentiment analysis [29].

3.4. Datasets

This subsection discusses the details of datasets selected for experimentation.

3.4.1. SemEval2016

SemEval2016 is a widely used dataset for quantification. SemEval2016 includes five tasks and contains tweets on the topics Feminist Movement (949 tweets), Abortion (933 tweets), Atheism (733 tweets), Hillary Clinton (984 tweets), and Climate Change (564 tweets). The details of the dataset are shown in Table 4. This dataset has been used in earlier studies [30,31].

3.4.2. SemEval2017

SemEval2017 is a well-known multilingual dataset consisting of tweets in two languages: Arabic and English. English tweets outnumber Arabic tweets, which make up only 19% of the dataset. The dataset contains 6100 testing and 3355 training tweets in Arabic, and 12,284 testing and 50,333 training tweets in English. The details of the dataset, which has been used in earlier studies [32,33], are shown in Table 4.

3.4.3. STS-Gold

STS-Gold may present different sentiment labels because tweets and targets (entities) are annotated individually [34]. This dataset contains 1.6 million classified tweets, of which 1.28 million were used for training and 320,000 for testing (Table 4).

3.4.4. Sanders

The Sanders dataset [35,36] is manually labelled by one annotator and consists of 5512 tweets. We used 4410 tweets for training and 1102 tweets for testing.

3.5. Performance Evaluation Measures

This subsection describes the performance evaluation measures used for sentiment quantification.

3.5.1. Absolute Error (AE)

This measure corresponds to the average absolute difference between the predicted class prevalence and the true class prevalence, using Equation (8).
$AE(\hat{p}, p) = \frac{1}{|C|} \sum_{c_j \in C} |\hat{p}(c_j) - p(c_j)|$  (8)

3.5.2. Relative Absolute Error (RAE)

Relative absolute error (RAE) addresses the problem that arises in normalized absolute error by scaling each term $|\hat{p}(c_j) - p(c_j)|$ in Equation (9) by the true class prevalence.

$RAE(\hat{p}, p) = \frac{1}{|C|} \sum_{c_j \in C} \frac{|\hat{p}(c_j) - p(c_j)|}{p(c_j)}$  (9)

3.5.3. Kullback–Leibler Divergence (KLD)

Another measure that has become the standard metric of quantification is normalized cross-entropy, better known as Kullback–Leibler divergence (KLD), which is used as a quantification measure and is defined in Equation (10).
$KLD(\hat{p}, p) = \sum_{c_j \in C} p(c_j) \log \frac{p(c_j)}{\hat{p}(c_j)}$  (10)
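All three measures translate directly into code; a minimal sketch follows, with a small epsilon added (an assumption here, as is common in the quantification literature) to keep RAE and KLD finite when a prevalence is zero:

```python
import numpy as np

def ae(p_hat, p):
    """Equation (8): mean absolute difference of prevalences."""
    return np.mean(np.abs(p_hat - p))

def rae(p_hat, p, eps=1e-8):
    """Equation (9): AE terms scaled by the true prevalence."""
    return np.mean(np.abs(p_hat - p) / (p + eps))

def kld(p_hat, p, eps=1e-8):
    """Equation (10): KL divergence of estimated from true prevalences."""
    return np.sum(p * np.log((p + eps) / (p_hat + eps)))

p_true = np.array([0.50, 0.30, 0.20])  # true positive/neutral/negative prevalence
p_est = np.array([0.45, 0.35, 0.20])   # estimate from Classify and Count
print(ae(p_est, p_true), rae(p_est, p_true), kld(p_est, p_true))
```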

4. Results and Discussion

According to the literature, there is room to improve the accuracy of sentiment-based quantification, which has not yet been addressed with feature-based approaches. To address this problem, we have proposed various feature sets to reach optimal accuracy for the quantification of tweets based on sentiment analysis. To evaluate our feature-based framework, machine learning approaches of three kinds (conventional algorithms, ensemble learners, and deep learning approaches) are applied on the SemEval2016, SemEval2017, STS-Gold, and Sanders datasets, and performance evaluation metrics are applied to assess the classifiers.

4.1. Single Feature Sets

Detailed experiments are performed to evaluate the effectiveness of our proposed features for the sentiment quantification task. To this end, each proposed feature set is tested on all four datasets to obtain a detailed analysis. To evaluate their impact, the conventional algorithms NB, SVM, and DT and the ensemble learners AdaBoost and RF are applied on each feature set: POS, content, sentiment, and tweet specific. The results suggest that the POS features are the most effective single set, particularly when applied with AdaBoost, as evaluated through the performance metrics. AdaBoost dominated the other classifiers in terms of a lower error rate for SemEval2016 (KLD = 0.0213), SemEval2017 (KLD = 0.0214), STS-Gold (KLD = 0.0129), and Sanders (KLD = 0.0169), as shown in Table 5.

4.2. Combination of Feature Sets

To take the experiments a step further, the proposed feature sets are combined with each other to determine the optimal pair of feature sets. The proposed features are combined in groups such as “sentiment + content” (SC), “sentiment + tweet specific” (ST), “sentiment + POS” (SP), “POS + tweet specific” (PT), “POS + content” (PC), “content + tweet specific” (CT), “sentiment + POS + content” (SPC), “sentiment + content + tweet specific” (SCT), “sentiment + POS + tweet specific” (SPT), “POS + content + tweet specific” (PCT), and all feature sets. The results show that when all the proposed features are combined, they outperform all single feature sets. SVM outperformed the other classifiers when applied with all feature sets, “sentiment + POS + content + tweet specific” (SPCT). SVM has the most promising results with a lower error rate for all four datasets, with KLD = 0.014 for SemEval2016, 0.013 for SemEval2017, 0.0051 for STS-Gold, and 0.0092 for Sanders, as shown in Table 6, Table 7, Table 8 and Table 9, respectively.

4.3. Optimal Feature Sets

The results analysis is also presented in Figure 2 and Figure 3. The results show the impact of POS as a single feature set: POS captures the actions of an object along with other important information, so when applied with machine learning algorithms it yields promising results and a lower error rate for sentiment quantification, as shown in Figure 2 for all four datasets. When the feature sets are combined, their effectiveness increases, which shows the usefulness of these features: the combination carries meaningful information and outperformed the other approaches when applied with SVM on all four datasets, as shown in Figure 3.

4.4. Results of Deep Features

Deep features are also exploited to find out their impact on sentiment quantification. The features GloVe, BoW, word2vec, and n-gram are extracted from all four datasets, SemEval2016, SemEval2017, STS-Gold, and Sanders, and tested with the deep learning approaches DBN, RNN, and CNN-LSTM. The deep learning approaches are chosen for their scalability and efficiency; they do not require manual feature engineering and are suitable for achieving the desired results. The results suggest that RNN is the best approach: applied with GloVe, it achieves the lowest error rates for SemEval2016 (KLD = 0.009) and SemEval2017 (KLD = 0.011), and applied with word2vec, it achieves the lowest error rates for STS-Gold (KLD = 0.004) and Sanders (KLD = 0.008) among the deep learning approaches, as shown in Table 10. The deep learning approaches outperformed the conventional and ensemble-based machine learning approaches due to their high efficacy.

4.5. Comparison of Proposed Technique with Existing Techniques

The proposed framework is effective and has achieved the desired accuracy for sentiment quantification, outperforming the baseline approaches for SemEval2016, SemEval2017, STS-Gold, and Sanders, as shown in Table 11. Parameter settings for optimizing the machine learning algorithms are shown in Table 12. The settings for the deep learning algorithms DBN, CNN-LSTM, and RNN are: CNN_Layers = 3, Activation_Function = “tanh”, MaxPooling = 3, hidden_layers = (300, 300, 300), learning_rate = “adaptive”, alpha = 0.001, L2 regularization (against overfitting) = 0.01, loss = “categorical_crossentropy”, and optimizer = “RMSprop”.

5. Conclusions

This study contributes to the field of quantification based on sentiment analysis. The study exploits the diverse feature sets and explores the performance of machine learning approaches for the quantification of tweets. The proposed feature sets, such as POS, tweet specific, and sentiment- and content-based, increase the performance of classifiers. When the proposed feature sets are combined, they demonstrate efficient results in terms of quantification accuracy.
Three conventional machine learning approaches, namely Naïve Bayes (NB), Decision Tree (DT), and Support Vector Machine (SVM), are used in the proposed framework; AdaBoost and Random Forest are used as the ensemble-based approaches; and Recurrent Neural Network (RNN), Deep Belief Network (DBN), and the hybrid CNN-LSTM are exploited in the deep learning category. The ensemble approach AdaBoost dominated the other classifiers when applied with a single feature set, in terms of a lower error rate for SemEval2016 (KLD = 0.0213), SemEval2017 (KLD = 0.0214), STS-Gold (KLD = 0.0129), and Sanders (KLD = 0.0169). When the feature sets are combined, SVM has the most promising results with a lower error rate for all four datasets, with KLD = 0.014 for SemEval2016, 0.013 for SemEval2017, 0.0051 for STS-Gold, and 0.0092 for Sanders. The computed results show that RNN with GloVe performed best for SemEval2016 and SemEval2017, and RNN with word2vec performed best for STS-Gold and Sanders.
Future work directions are as follows:
  • As social web channels allow users to add multilingual content, diverse research issues arise for natural language processing and context understanding. More research is needed on multilingual content, especially where the structural diversity of languages presents issues in sentence structure, stemming, parsing, tagging, etc.
  • Each language has its own syntax and vocabulary. Text-based features of each language provide different research challenges. Therefore, applying the proposed features and algorithms on languages such as Arabic, Persian, and Urdu will be an interesting research work, as these languages are written from right to left.
  • The analysis and learning carried out using one language can be applied to another language using cross-lingual analysis. Thus, the cross-lingual sentiment quantification task can also be a potential research area, especially in languages that lack annotated datasets.

Author Contributions

Conceptualization, K.A.; Formal analysis, K.A. and S.I.; Funding acquisition, F.K.A. and N.A.; Investigation, E.U.M.; Methodology, K.A.; Resources, F.K.A. and N.A.; Software, K.A.; Supervision, M.W.N. and E.U.M.; Validation, M.W.N.; Writing—original draft, K.A.; Writing—review & editing, S.I., F.K.A. and N.A. All authors have read and agreed to the published version of the manuscript.

Funding

The authors extend their appreciation to the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh, Saudi Arabia, for funding this work through Research Group no. RG-21-51-01.

Data Availability Statement

All the data used in this research study is publicly available for download and use for any research purpose.

Conflicts of Interest

The authors declare that they have no conflict of interest.

References

  1. Zamir, A.; Khan, H.U.; Mehmood, W.; Iqbal, T.; Akram, A.U. A feature-centric spam email detection model using diverse supervised machine learning algorithms. Electron. Libr. 2020, 38, 633–657. [Google Scholar] [CrossRef]
  2. Mahmood, A.; Khan, H.U.; Ramzan, M.J. On Modelling for Bias-Aware Sentiment Analysis and Its Impact in Twitter. J. Web Eng. 2020, 1–28, 21–28. [Google Scholar]
  3. Jabreel, M.; Moreno, A. A deep learning-based approach for multi-label emotion classification in tweets. Appl. Sci. 2019, 9, 1123. [Google Scholar] [CrossRef] [Green Version]
  4. Chen, C.Y.-H.; Hafner, C.M. Sentiment-induced bubbles in the cryptocurrency market. J. Risk Financial Manag. 2019, 12, 53. [Google Scholar] [CrossRef] [Green Version]
  5. Jungherr, A.; Schoen, H.; Posegga, O.; Jürgens, P. Digital trace data in the study of public opinion: An indicator of attention toward politics rather than political support. Soc. Sci. Comput. Rev. 2017, 35, 336–356. [Google Scholar] [CrossRef] [Green Version]
  6. Rosenthal, S.; Farra, N.; Nakov, P. SemEval-2017 task 4: Sentiment analysis in Twitter. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), Vancouver, BC, Canada, 4 August 2017; pp. 502–518. [Google Scholar]
  7. Gao, W.; Sebastiani, F. From classification to quantification in tweet sentiment analysis. Soc. Net. Anal. Min. 2016, 6, 1–22. [Google Scholar] [CrossRef]
  8. Moradi-Jamei, B.; Shakeri, H.; Poggi-Corradini, P.; Higgins, M.J. A new method for quantifying network cyclic structure to improve community detection. Physica A 2021, 561, 125116. [Google Scholar] [CrossRef]
  9. Esuli, A.; Moreo, A.; Sebastiani, F. Cross-lingual sentiment quantification. IEEE Intell. Syst. 2020, 35, 106–114. [Google Scholar] [CrossRef]
  10. Faryal, M.; Iqbal, M.; Tahreem, H. Mental health diseases analysis on Twitter using machine learning. IKSP J. Comput. Sci. Eng. 2021, 1, 16–25. [Google Scholar]
  11. Samuel, J.; Ali, G.; Rahman, M.; Esawi, E.; Samuel, Y. COVID-19 public sentiment insights and machine learning for tweets classification. Information 2020, 11, 314. [Google Scholar] [CrossRef]
  12. Hassan, W.; Maletzke, A.; Batista, G. Accurately quantifying a billion instances per second. In Proceedings of the 2020 IEEE 7th International Conference on Data Science and Advanced Analytics (DSAA), Sydney, Australia, 6 October 2020; pp. 1–10. [Google Scholar]
  13. Da San Martino, G.; Gao, W.; Sebastiani, F. Ordinal text quantification. In Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, Pisa, Italy, 7 July 2016; pp. 937–940. [Google Scholar]
  14. Solyman, A.; Zhenyu, W.; Qian, T.; Elhag, A.A.M.; Rui, Z.; Mahmoud, Z. Automatic Arabic Grammatical Error Correction based on Expectation Maximization routing and target-bidirectional agreement. Know.-Based Syst. 2022, 241, 108180. [Google Scholar] [CrossRef]
  15. Alzanin, S.M.; Azmi, A.M. Rumor detection in Arabic tweets using semi-supervised and unsupervised expectation–maximization. Know.-Based Syst. 2019, 185, 104945. [Google Scholar] [CrossRef]
  16. Daughton, A.R.; Paul, M. A bootstrapping approach to social media quantification. Soc. Net. Anal. Min. 2021, 11, 1–14. [Google Scholar] [CrossRef]
  17. Esuli, A.; Moreo Fernández, A.; Sebastiani, F. A recurrent neural network for sentiment quantification. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, Torino, Italy, 17 October 2018; pp. 1775–1778. [Google Scholar]
  18. González-Castro, V.; Alaiz-Rodríguez, R.; Alegre, E. Class distribution estimation based on the Hellinger distance. Inf. Sci. 2013, 218, 146–164. [Google Scholar] [CrossRef]
  19. Hopkins, D.J.; King, G. A method of automated nonparametric content analysis for social science. Am. J. Political Sci. 2010, 54, 229–247. [Google Scholar] [CrossRef] [Green Version]
  20. Pérez-Gállego, P.; Castano, A.; Quevedo, J.R.; del Coz, J. Dynamic ensemble selection for quantification tasks. Inf. Fus. 2019, 45, 1–15. [Google Scholar] [CrossRef]
  21. Pérez-Gállego, P.; Quevedo, J.R.; del Coz, J. Using ensembles for problems with characterizable changes in data distribution: A case study on quantification. Inf. Fus. 2017, 34, 87–100. [Google Scholar] [CrossRef] [Green Version]
  22. Dias, F.F.; Ponti, M.A.; Minghim, R. A classification and quantification approach to generate features in soundscape ecology using neural networks. Neur. Comput. Appl. 2021, 34, 1–15. [Google Scholar] [CrossRef]
  23. Adarsh, R.; Patil, A.; Rayar, S.; Veena, K. Comparison of VADER and LSTM for sentiment analysis. Int. J. Recent Technol. Eng. 2019, 7, 540–543. [Google Scholar]
  24. Alabrah, A.; Alawadh, H.M.; Okon, O.D.; Meraj, T.; Rauf, H.T. Gulf Countries’ Citizens’ Acceptance of COVID-19 Vaccines—A Machine Learning Approach. Mathematics 2022, 10, 467. [Google Scholar] [CrossRef]
  25. Khan, H.U. Mixed-sentiment classification of web forum posts using lexical and non-lexical features. J. Web Eng. 2017, 16, 161–176. [Google Scholar]
  26. Khan, H.U.; Daud, A. Using Machine Learning Techniques for Subjectivity Analysis based on Lexical and Nonlexical Features. J. Web Eng. 2017, 14, 481–487. [Google Scholar]
  27. Almanaseer, W.; Alshraideh, M.; Alkadi, O. A deep belief network classification approach for automatic diacritization of arabic text. Appl. Sci. 2021, 11, 5228. [Google Scholar] [CrossRef]
  28. Elzayady, H.; Badran, K.M.; Salama, G.I. Arabic Opinion Mining Using Combined CNN-LSTM Models. Int. J. Intell. Syst. Appl. 2020, 12, 25–36. [Google Scholar] [CrossRef]
  29. Nemes, L.; Kiss, A. Social media sentiment analysis based on COVID-19. J. Inf. Syst. Telecommun. 2021, 5, 1–15. [Google Scholar] [CrossRef]
  30. Zeng, J.; Liu, T.; Jia, W.; Zhou, J. Relation construction for aspect-level sentiment classification. Inf. Sci. 2022, 586, 209–223. [Google Scholar] [CrossRef]
  31. Wu, C.; Xiong, Q.; Yi, H.; Yu, Y.; Zhu, Q.; Gao, M.; Chen, J. Multiple-element joint detection for Aspect-Based Sentiment Analysis. Knowl.-Based Syst. 2021, 223, 107073. [Google Scholar] [CrossRef]
  32. Pathak, A.R.; Pandey, M.; Rautaray, S. Topic-level sentiment analysis of social media data using deep learning. Appl. Soft Comput. 2021, 108, 107440. [Google Scholar] [CrossRef]
  33. Hamraoui, I.; Boubaker, A. Impact of Twitter sentiment on stock price returns. Soc. Net. Anal. Min. 2022, 12, 1–15. [Google Scholar] [CrossRef]
  34. Saif, H.; Fernandez, M.; He, Y.; Alani, H. Evaluation datasets for Twitter sentiment analysis: A survey and a new dataset, the STS-Gold. In Proceedings of the 1st International Workshop on Emotion and Sentiment in Social and Expressive Media: Approaches and Perspectives from AI (ESSEM 2013), Turin, Italy, 3 December 2013. [Google Scholar]
  35. Wang, D.; Al-Rubaie, A.; Hirsch, B.; Pole, G.C. National happiness index monitoring using Twitter for bilanguages. Soc. Net. Anal. Min. 2021, 11, 1–18. [Google Scholar] [CrossRef]
  36. Deitrick, W.; Hu, W. Mutually enhancing community detection and sentiment analysis on twitter networks. J. Data Anal. Inf. Proc. 2013, 1, 19–29. [Google Scholar] [CrossRef] [Green Version]
  37. Nakov, P.; Ritter, A.; Rosenthal, S.; Sebastiani, F.; Stoyanov, V. SemEval-2016 task 4: Sentiment analysis in Twitter. arXiv 2019, arXiv:1912.00741. [Google Scholar]
  38. Ayyub, K.; Iqbal, S.; Munir, E.U.; Nisar, M.W.; Abbasi, M. Exploring diverse features for sentiment quantification using machine learning algorithms. IEEE Access 2020, 8, 142819–142831. [Google Scholar] [CrossRef]
  39. Labille, K.; Gauch, S. Optimizing Statistical Distance Measures in Multivariate SVM for Sentiment Quantification. In Proceedings of the Thirteenth International Conference on Information, Process, and Knowledge Management, Nice, France, 18–22 July 2021; pp. 57–64. [Google Scholar]
Figure 1. The proposed model for sentiment quantification.
Figure 2. Comparison of single feature set performances.
Figure 3. Comparison of combined feature sets performances.
Table 1. Quantification Techniques.

Method | Approach | Year | Feature(s) | Dataset
Aggregated | Expectation Maximization (EM) [15] | 2022 | User and content-based features | Tweets
Aggregated | Expectation Maximization (EM) [14] | 2022 | Quantifying errors in text | QALB-2014, QALB-2015
Aggregated | Sample Means Matching (SMM) [12] | 2020 | Quantifying billions of data elements in seconds | 25 benchmark datasets
Aggregated | QuaNet [17] | 2018 | Quantification of data | Kindle, IMDb, HP (Harry Potter)
Aggregated | OQT [13] | 2016 | Quantifying tweets | SemEval2016
Non-Aggregated | Automated Nonparametric Content Analysis [19] | 2010 | Quantification of data | Blogs
Non-Aggregated | HDy and HDx [18] | 2013 | Quantification of data | UCI datasets
Ensemble-based | Ensembles for Quantification [21] | 2017 | Data distribution and quantification | UCI datasets, Sentiment140
Ensemble-based | Dynamic Ensembles [20] | 2019 | Quantification of data; applied techniques based on ensemble learners | UCI datasets
Table 2. Proposed feature sets for tweets.

Sr# | Category | Description | Symbol
1 | Sentiment | Sentiment score of the tweet | $S_{sent}^{T}$
2 | Sentiment | Number of positive words | $S_{PW}^{T}$
3 | Sentiment | Number of negative words | $S_{NW}^{T}$
4 | Sentiment | Count of positive emoticons | $S_{PE}^{T}$
5 | Sentiment | Count of negative emoticons | $S_{NE}^{T}$
6 | POS | Number of nouns in a tweet | $P_{N}^{T}$
7 | POS | Number of pronouns in a tweet | $P_{P}^{T}$
8 | POS | Verb frequency in a tweet | $P_{V}^{T}$
9 | POS | Adjective frequency in a tweet | $P_{A}^{T}$
10 | Content | Number of special symbols | $C_{SS}^{T}$
11 | Content | Number of WH words in a tweet | $C_{WH}^{T}$
12 | Content | Number of question marks in a tweet | $C_{QM}^{T}$
13 | Content | Number of exclamation marks | $C_{EM}^{T}$
14 | Content | Number of capitalized words | $C_{RW}^{T}$
15 | Content | Number of quoted words | $C_{QW}^{T}$
16 | Tweet Specific | Number of retweets | $T_{RT}^{T}$
17 | Tweet Specific | Number of mentions | $T_{M}^{T}$
18 | Tweet Specific | Number of URLs | $T_{URL}^{T}$
19 | Tweet Specific | Hashtag length | $T_{HL}$
20 | Tweet Specific | Is it a tweet or retweet? | $C_{RT}$
21 | Tweet Specific | Number of hashtags | $T_{HT}^{T}$
22 | Tweet Specific | Number of capitalized hashtags | $T_{CH}^{T}$
Table 3. Feature engineering for tweets.

Sentiment | Parts of Speech | Content | Tweet Specific | Baseline | Deep
Sentiment score of the tweet | Verb frequency in a tweet | Number of WH words in a tweet | Number of mentions | TF-IDF | GloVe
Number of negative words | Adjective frequency in a tweet | Number of question marks in a tweet | Number of retweets | n-gram | Word2vec
Number of positive words | Number of nouns in a tweet | Number of quoted words | Number of URLs | BoW |
Count of negative emoticons | Number of pronouns in a tweet | Number of repetitive words | Number of hashtags | |
Count of positive emoticons | | Number of special symbols | Number of capitalized hashtags | |
 | | Number of exclamation marks | Hashtag length | |
 | | | Is it a tweet or retweet? | |
Table 4. Tweet statistics of investigated datasets.

Dataset | Total Tweets | Testing Tweets | Training Tweets
SemEval2016 | 68,197 | 51,851 | 16,346
SemEval2017 | 62,617 | 12,284 | 50,333
STS-Gold | 1,600,000 | 320,000 | 1,280,000
Sanders | 5512 | 1102 | 4410
Table 5. Comparison of single feature sets using ML classifiers. (Feature sets: S = sentiment, P = POS, C = content, T = tweet specific.)

Dataset | Features | NB (AE, RAE, KLD) | DT (AE, RAE, KLD) | SVM (AE, RAE, KLD) | AdaBoost (AE, RAE, KLD) | RF (AE, RAE, KLD)
SemEval2016 | S | 0.033, 0.691, 0.025 | 0.040, 0.798, 0.030 | 0.030, 0.599, 0.023 | 0.029, 0.595, 0.022 | 0.030, 0.634, 0.023
SemEval2016 | P | 0.044, 0.898, 0.034 | 0.037, 0.750, 0.028 | 0.028, 0.575, 0.022 | 0.028, 0.571, 0.021 | 0.042, 0.853, 0.032
SemEval2016 | C | 0.039, 0.802, 0.030 | 0.035, 0.708, 0.026 | 0.029, 0.579, 0.022 | 0.036, 0.715, 0.027 | 0.037, 0.750, 0.028
SemEval2016 | T | 0.055, 1.097, 0.042 | 0.029, 0.603, 0.022 | 0.029, 0.596, 0.022 | 0.029, 0.592, 0.022 | 0.053, 1.056, 0.040
SemEval2017 | S | 0.030, 0.618, 0.022 | 0.041, 0.823, 0.031 | 0.029, 0.577, 0.022 | 0.028, 0.576, 0.022 | 0.035, 0.732, 0.026
SemEval2017 | P | 0.049, 0.986, 0.037 | 0.035, 0.710, 0.027 | 0.028, 0.572, 0.021 | 0.027, 0.547, 0.020 | 0.030, 0.607, 0.023
SemEval2017 | C | 0.043, 0.873, 0.032 | 0.032, 0.657, 0.025 | 0.028, 0.574, 0.022 | 0.033, 0.671, 0.026 | 0.032, 0.657, 0.024
SemEval2017 | T | 0.072, 1.453, 0.056 | 0.027, 0.565, 0.021 | 0.028, 0.575, 0.022 | 0.028, 0.572, 0.021 | 0.029, 0.579, 0.022
STS-Gold | S | 0.021, 0.434, 0.016 | 0.033, 0.661, 0.025 | 0.019, 0.378, 0.014 | 0.019, 0.378, 0.014 | 0.027, 0.566, 0.020
STS-Gold | P | 0.042, 0.851, 0.032 | 0.026, 0.532, 0.020 | 0.018, 0.375, 0.014 | 0.017, 0.347, 0.013 | 0.020, 0.413, 0.016
STS-Gold | C | 0.035, 0.723, 0.027 | 0.023, 0.472, 0.018 | 0.019, 0.376, 0.014 | 0.024, 0.485, 0.019 | 0.023, 0.473, 0.018
STS-Gold | T | 0.069, 1.385, 0.053 | 0.018, 0.370, 0.014 | 0.019, 0.377, 0.014 | 0.018, 0.374, 0.014 | 0.019, 0.380, 0.015
Sanders | S | 0.025, 0.532, 0.019 | 0.037, 0.748, 0.028 | 0.024, 0.484, 0.018 | 0.024, 0.483, 0.018 | 0.031, 0.655, 0.023
Sanders | P | 0.045, 0.923, 0.035 | 0.031, 0.627, 0.024 | 0.024, 0.480, 0.018 | 0.022, 0.454, 0.017 | 0.026, 0.517, 0.020
Sanders | C | 0.039, 0.803, 0.030 | 0.028, 0.571, 0.021 | 0.024, 0.482, 0.018 | 0.029, 0.584, 0.022 | 0.028, 0.571, 0.021
Sanders | T | 0.071, 1.421, 0.054 | 0.023, 0.474, 0.017 | 0.024, 0.482, 0.018 | 0.024, 0.479, 0.018 | 0.024, 0.486, 0.019
Table 6. Comparison of feature sets’ combinations using ML classifiers on SemEval2016.

Features | NB (AE, RAE, KLD) | DT (AE, RAE, KLD) | SVM (AE, RAE, KLD) | AdaBoost (AE, RAE, KLD) | RF (AE, RAE, KLD)
SP | 0.036, 0.736, 0.027 | 0.035, 0.713, 0.027 | 0.025, 0.519, 0.019 | 0.025, 0.512, 0.019 | 0.033, 0.680, 0.025
SC | 0.032, 0.671, 0.024 | 0.033, 0.679, 0.025 | 0.025, 0.506, 0.019 | 0.028, 0.575, 0.022 | 0.029, 0.613, 0.022
ST | 0.040, 0.824, 0.031 | 0.030, 0.611, 0.022 | 0.024, 0.499, 0.018 | 0.024, 0.494, 0.018 | 0.038, 0.773, 0.029
PC | 0.037, 0.758, 0.028 | 0.031, 0.630, 0.023 | 0.023, 0.467, 0.017 | 0.026, 0.537, 0.020 | 0.034, 0.705, 0.026
PT | 0.045, 0.916, 0.035 | 0.027, 0.558, 0.020 | 0.022, 0.461, 0.017 | 0.022, 0.454, 0.016 | 0.043, 0.869, 0.033
CT | 0.042, 0.850, 0.032 | 0.025, 0.523, 0.019 | 0.022, 0.448, 0.016 | 0.025, 0.520, 0.019 | 0.039, 0.801, 0.030
SPC | 0.032, 0.664, 0.024 | 0.030, 0.617, 0.023 | 0.021, 0.431, 0.016 | 0.023, 0.477, 0.017 | 0.029, 0.606, 0.022
SPT | 0.037, 0.769, 0.028 | 0.027, 0.565, 0.021 | 0.020, 0.423, 0.015 | 0.020, 0.418, 0.015 | 0.035, 0.716, 0.026
SCT | 0.035, 0.719, 0.026 | 0.026, 0.538, 0.019 | 0.020, 0.410, 0.015 | 0.022, 0.457, 0.017 | 0.032, 0.663, 0.024
PCT | 0.038, 0.788, 0.029 | 0.024, 0.505, 0.018 | 0.019, 0.389, 0.014 | 0.021, 0.436, 0.016 | 0.036, 0.736, 0.027
SPCT | 0.034, 0.706, 0.025 | 0.025, 0.524, 0.019 | 0.018, 0.379, 0.014 | 0.020, 0.412, 0.015 | 0.031, 0.649, 0.023
Table 7. Comparison of feature sets’ combinations using ML classifiers on SemEval2017.

Features | NB (AE, RAE, KLD) | DT (AE, RAE, KLD) | SVM (AE, RAE, KLD) | AdaBoost (AE, RAE, KLD) | RF (AE, RAE, KLD)
SP | 0.036, 0.742, 0.027 | 0.035, 0.705, 0.027 | 0.025, 0.505, 0.019 | 0.024, 0.489, 0.018 | 0.029, 0.606, 0.022
SC | 0.032, 0.669, 0.024 | 0.033, 0.665, 0.025 | 0.024, 0.491, 0.018 | 0.027, 0.542, 0.020 | 0.029, 0.617, 0.022
ST | 0.048, 0.971, 0.036 | 0.029, 0.604, 0.022 | 0.023, 0.476, 0.018 | 0.023, 0.473, 0.017 | 0.028, 0.575, 0.022
PC | 0.041, 0.843, 0.031 | 0.028, 0.581, 0.022 | 0.022, 0.462, 0.017 | 0.024, 0.501, 0.019 | 0.026, 0.525, 0.019
PT | 0.057, 1.157, 0.044 | 0.025, 0.517, 0.019 | 0.022, 0.448, 0.016 | 0.021, 0.431, 0.016 | 0.024, 0.475, 0.018
CT | 0.053, 1.083, 0.041 | 0.023, 0.474, 0.017 | 0.021, 0.434, 0.016 | 0.024, 0.486, 0.018 | 0.024, 0.489, 0.018
SPC | 0.033, 0.695, 0.025 | 0.029, 0.593, 0.022 | 0.020, 0.420, 0.015 | 0.022, 0.445, 0.016 | 0.025, 0.521, 0.019
SPT | 0.044, 0.903, 0.033 | 0.026, 0.546, 0.020 | 0.020, 0.406, 0.015 | 0.019, 0.395, 0.014 | 0.024, 0.488, 0.018
SCT | 0.041, 0.848, 0.031 | 0.025, 0.514, 0.019 | 0.019, 0.392, 0.014 | 0.021, 0.427, 0.016 | 0.024, 0.492, 0.018
PCT | 0.048, 0.981, 0.036 | 0.022, 0.457, 0.016 | 0.018, 0.378, 0.014 | 0.019, 0.403, 0.015 | 0.021, 0.430, 0.016
SPCT | 0.040, 0.829, 0.030 | 0.024, 0.495, 0.018 | 0.017, 0.364, 0.013 | 0.018, 0.383, 0.014 | 0.022, 0.450, 0.016
Table 8. Comparison of feature sets’ combinations using ML classifiers on STS-Gold.

Features | NB (AE, RAE, KLD) | DT (AE, RAE, KLD) | SVM (AE, RAE, KLD) | AdaBoost (AE, RAE, KLD) | RF (AE, RAE, KLD)
SP | 0.028, 0.575, 0.021 | 0.026, 0.528, 0.020 | 0.015, 0.299, 0.011 | 0.014, 0.283, 0.010 | 0.020, 0.419, 0.015
SC | 0.024, 0.494, 0.018 | 0.024, 0.483, 0.018 | 0.014, 0.283, 0.010 | 0.017, 0.340, 0.013 | 0.021, 0.434, 0.015
ST | 0.041, 0.835, 0.031 | 0.020, 0.414, 0.015 | 0.013, 0.267, 0.010 | 0.013, 0.265, 0.010 | 0.019, 0.379, 0.014
PC | 0.033, 0.691, 0.025 | 0.019, 0.387, 0.014 | 0.012, 0.252, 0.009 | 0.014, 0.294, 0.011 | 0.016, 0.323, 0.012
PT | 0.052, 1.046, 0.040 | 0.015, 0.316, 0.011 | 0.011, 0.236, 0.009 | 0.010, 0.219, 0.008 | 0.013, 0.261, 0.010
CT | 0.047, 0.963, 0.036 | 0.013, 0.270, 0.010 | 0.011, 0.220, 0.008 | 0.014, 0.278, 0.010 | 0.014, 0.279, 0.010
SPC | 0.025, 0.524, 0.019 | 0.020, 0.402, 0.015 | 0.010, 0.205, 0.007 | 0.011, 0.234, 0.009 | 0.015, 0.325, 0.012
SPT | 0.037, 0.759, 0.028 | 0.017, 0.349, 0.013 | 0.009, 0.190, 0.007 | 0.009, 0.178, 0.006 | 0.014, 0.281, 0.010
SCT | 0.034, 0.698, 0.025 | 0.015, 0.314, 0.011 | 0.008, 0.174, 0.006 | 0.010, 0.213, 0.008 | 0.014, 0.288, 0.010
PCT | 0.041, 0.847, 0.031 | 0.012, 0.251, 0.009 | 0.008, 0.159, 0.006 | 0.009, 0.188, 0.007 | 0.010, 0.214, 0.008
SPCT | 0.032, 0.676, 0.024 | 0.014, 0.294, 0.011 | 0.007, 0.144, 0.005 | 0.008, 0.165, 0.006 | 0.012, 0.241, 0.009
Table 9. Comparison of feature sets’ combinations using ML classifiers on Sanders.

Features | NB (AE, RAE, KLD) | DT (AE, RAE, KLD) | SVM (AE, RAE, KLD) | AdaBoost (AE, RAE, KLD) | RF (AE, RAE, KLD)
SP | 0.032, 0.665, 0.024 | 0.031, 0.622, 0.023 | 0.020, 0.409, 0.015 | 0.019, 0.393, 0.014 | 0.025, 0.519, 0.019
SC | 0.028, 0.588, 0.021 | 0.028, 0.580, 0.022 | 0.019, 0.394, 0.015 | 0.022, 0.447, 0.017 | 0.025, 0.532, 0.019
ST | 0.044, 0.908, 0.034 | 0.025, 0.515, 0.019 | 0.018, 0.379, 0.014 | 0.018, 0.376, 0.014 | 0.024, 0.484, 0.018
PC | 0.037, 0.772, 0.028 | 0.024, 0.491, 0.018 | 0.018, 0.364, 0.013 | 0.020, 0.404, 0.015 | 0.021, 0.431, 0.016
PT | 0.055, 1.105, 0.042 | 0.020, 0.423, 0.015 | 0.017, 0.349, 0.013 | 0.016, 0.332, 0.012 | 0.019, 0.375, 0.014
CT | 0.050, 1.027, 0.038 | 0.018, 0.379, 0.014 | 0.016, 0.334, 0.012 | 0.019, 0.389, 0.014 | 0.019, 0.391, 0.015
SPC | 0.029, 0.615, 0.022 | 0.025, 0.504, 0.019 | 0.015, 0.320, 0.012 | 0.017, 0.347, 0.013 | 0.020, 0.430, 0.015
SPT | 0.041, 0.836, 0.031 | 0.022, 0.454, 0.017 | 0.015, 0.305, 0.011 | 0.014, 0.294, 0.010 | 0.019, 0.392, 0.014
SCT | 0.037, 0.778, 0.028 | 0.020, 0.420, 0.015 | 0.014, 0.290, 0.010 | 0.016, 0.327, 0.012 | 0.019, 0.397, 0.014
PCT | 0.045, 0.918, 0.034 | 0.017, 0.361, 0.013 | 0.013, 0.276, 0.010 | 0.014, 0.303, 0.011 | 0.016, 0.329, 0.012
SPCT | 0.036, 0.758, 0.027 | 0.019, 0.401, 0.014 | 0.012, 0.262, 0.009 | 0.013, 0.281, 0.010 | 0.017, 0.352, 0.013
Table 10. Sentiment quantification based on deep features.

Algorithm | Features | SemEval2016 (AE, RAE, KLD) | SemEval2017 (AE, RAE, KLD) | STS-Gold (AE, RAE, KLD) | Sanders (AE, RAE, KLD)
DBN | GloVe | 0.019, 0.394, 0.014 | 0.021, 0.431, 0.015 | 0.011, 0.222, 0.008 | 0.016, 0.334, 0.012
DBN | Word2vec | 0.020, 0.423, 0.015 | 0.024, 0.496, 0.018 | 0.014, 0.296, 0.011 | 0.019, 0.403, 0.014
DBN | n-Gram | 0.033, 0.676, 0.025 | 0.034, 0.695, 0.026 | 0.025, 0.516, 0.019 | 0.030, 0.612, 0.023
DBN | BoW | 0.036, 0.717, 0.027 | 0.034, 0.686, 0.026 | 0.025, 0.503, 0.019 | 0.030, 0.601, 0.023
CNN-LSTM | GloVe | 0.014, 0.298, 0.011 | 0.016, 0.345, 0.012 | 0.006, 0.122, 0.004 | 0.011, 0.241, 0.009
CNN-LSTM | Word2vec | 0.016, 0.329, 0.012 | 0.019, 0.393, 0.014 | 0.037, 0.772, 0.027 | 0.037, 0.772, 0.027
CNN-LSTM | n-Gram | 0.030, 0.602, 0.023 | 0.031, 0.624, 0.023 | 0.021, 0.435, 0.016 | 0.026, 0.536, 0.020
CNN-LSTM | BoW | 0.030, 0.606, 0.023 | 0.030, 0.602, 0.023 | 0.020, 0.409, 0.015 | 0.025, 0.512, 0.019
RNN | GloVe | 0.012, 0.256, 0.009 | 0.015, 0.308, 0.011 | 0.037, 0.772, 0.027 | 0.037, 0.772, 0.027
RNN | Word2vec | 0.015, 0.306, 0.011 | 0.016, 0.338, 0.012 | 0.005, 0.114, 0.004 | 0.011, 0.234, 0.008
RNN | n-Gram | 0.029, 0.587, 0.022 | 0.029, 0.599, 0.022 | 0.046, 0.935, 0.035 | 0.046, 0.935, 0.035
RNN | BoW | 0.027, 0.565, 0.021 | 0.028, 0.568, 0.021 | 0.045, 0.928, 0.034 | 0.045, 0.928, 0.034
Table 11. Comparison of proposed framework and baseline approaches.

Sr. No | Dataset | Proposed Method | Baseline | Reference
1 | SemEval2016 | KLD = 0.013 | KLD = 0.034 | [37]
2 | SemEval2017 | KLD = 0.012 | KLD = 0.036 | [6]
3 | STS-Gold | AE = 0.007 | AE = 0.008 | [38]
4 | Sanders | KLD = 0.010 | KLD = 0.009 | [39]
Table 12. Parameter settings for applied machine learning algorithms.

Algorithm | Values
NB | priors = None, var_smoothing = 1e-09
Decision Tree | loss = “deviance”, learning_rate = 0.01, n_estimators = 100, subsample = 1.0, criterion = “friedman_mse”, min_samples_split = 2, min_samples_leaf = 1, min_weight_fraction_leaf = 0.0, max_depth = 3, min_impurity_decrease = 0.0, min_impurity_split = None, init = None, random_state = None, max_features = None, verbose = 0, max_leaf_nodes = None, warm_start = False, presort = “auto”, validation_fraction = 0.1, n_iter_no_change = None, tol = 0.0001
SVM | C = 1.0, kernel = “rbf”, degree = 3, gamma = “auto_deprecated”, coef0 = 0.0, shrinking = True, probability = False, tol = 0.001, cache_size = 200, class_weight = None, verbose = False, max_iter = 1, decision_function_shape = “ovr”, random_state = None
AdaBoost | base_estimator = None, n_estimators = 10, max_samples = 1.0, max_features = 1.0, bootstrap = True, bootstrap_features = False, oob_score = False, warm_start = False, n_jobs = None, random_state = None
RF | n_estimators = 10, criterion = “gini”, max_depth = None, min_samples_split = 2, min_samples_leaf = 1, min_weight_fraction_leaf = 0.0, max_features = “auto”, max_leaf_nodes = None, min_impurity_decrease = 0.0, min_impurity_split = None, bootstrap = True, oob_score = False, n_jobs = None, random_state = None, verbose = 0, warm_start = False, class_weight = None
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

