Improving RE-SWOT Analysis with Sentiment Classification: A Case Study of Travel Agencies

Tu, Shu-Fen; Hsu, Ching-Sheng; Lu, Yu-Tzu

doi:10.3390/fi13090226


by Shu-Fen Tu ¹, Ching-Sheng Hsu ²,* and Yu-Tzu Lu ²

¹ Department of Information Management, Chinese Culture University, Taipei 11114, Taiwan

² Department of Information Management, Ming Chuan University, Taoyuan 33300, Taiwan

* Author to whom correspondence should be addressed.

Future Internet 2021, 13(9), 226; https://doi.org/10.3390/fi13090226

Submission received: 11 July 2021 / Revised: 27 August 2021 / Accepted: 28 August 2021 / Published: 30 August 2021

(This article belongs to the Special Issue Trends of Data Science and Knowledge Discovery)


Abstract:

Nowadays, many companies collect online user reviews to determine how users evaluate their products. Dalpiaz and Parente proposed the RE-SWOT method to automatically generate a SWOT matrix based on online user reviews. The SWOT matrix is an important basis for a company to perform competitive analysis; therefore, RE-SWOT is a very helpful tool for organizations. Dalpiaz and Parente calculated feature performance scores based on user reviews and ratings to generate the SWOT matrix. However, the authors did not propose a solution for situations in which user ratings are not available. Unfortunately, it is not uncommon for forums to have user reviews but no user ratings. In this paper, sentiment analysis is used to deal with the situation where user ratings are not available. We also use KKday, a start-up online travel agency in Taiwan, as an example to demonstrate how to use the proposed method to build a SWOT matrix.

Keywords:

online travel agency; sentiment analysis; SWOT matrix; web crawler

1. Introduction

Traditionally, the distribution of tourism products is controlled by tourism product suppliers, tourism product wholesalers, tourism product retailers, and the final consumers. Due to emerging technologies, any one of the channels mentioned above can be used to reach customers directly online via social networks or review platforms. Research conducted by the Pacific Asian Travel Association (PATA) found that online travel agencies (OTAs) account for 40% of the total global travel market [1]. According to Source Research, 70% of people travel to experience the real culture of a region. However, statistics from Trafalgar’s “The Good Life” survey show that 49% of people find that “so-called” real experiences are not “real” enough [2]. Therefore, some OTAs specialize in offering local in-destination tours and activities for global travelers. These OTAs provide an online platform on which customers can browse and order tours and activities, and customers can leave public reviews after experiencing the tours and activities. Such platforms enable travelers to choose from a variety of tours conveniently and at a low cost.

There are local tour platforms for destinations worldwide, including Taiwan’s KKday, which was founded in 2014; USA’s Peek, founded in 2012; and Germany’s GetYourGuide, founded in 2009. Taiwan’s startup KKday offers fully planned, guided tours in more than 150 different cities. The CEO stated that KKday’s goal is to create a global travel platform that can be used by customers everywhere [3]. Some OTA giants, such as Booking.com, Expedia, and Trip.com, have also begun to enter this market. Therefore, competition in this market can be expected to intensify. For start-up companies, determining how to remain competitive is an important issue. Launching popular products is one way to maintain competitiveness. Some tools can help organizations to formulate competitive strategies, among which the SWOT (Strengths, Weaknesses, Opportunities, and Threats) analysis matrix is commonly used. SWOT analysis was proposed a long time ago. Some researchers consider the origin of SWOT to be obscure, but Puyt et al. [4] believe that SWOT originated in the early 1950s. Although SWOT analysis is a long-established method, it is still a widely used strategic planning tool [5,6,7,8,9].

In 2019, Dalpiaz and Parente proposed a tool, known as RE-SWOT, which utilizes user reviews to generate a SWOT matrix, which is then used for the elicitation of requirements [10]. Dalpiaz and Parente retrieved the user reviews and ratings of the target product to identify strengths and weaknesses. Opportunities and threats were determined from the user reviews and ratings of the competing products. Dalpiaz and Parente used mobile apps as the research object to illustrate how to use the proposed method to generate the SWOT matrix. Dalpiaz and Parente’s study is very useful because it provides a way to automatically generate a SWOT matrix based on users’ comments. This study intends to investigate how to utilize Dalpiaz and Parente’s method to build the SWOT matrix for KKday products. The user reviews and ratings needed to generate the SWOT matrix are available on the KKday website, but they may be less objective because the company has the authority to delete or retain the comments. User reviews and ratings are preferably obtained from third-party websites, such as the TripAdvisor forum. TripAdvisor is the largest travel forum, with users all over the world, and it has become the main source of data for many travel-related studies because its online reviews are less likely to be changed by the owners of the tourist company [11]. However, the TripAdvisor forum lacks user ratings, which are required for Dalpiaz and Parente’s method to create a SWOT matrix. If the problem of a lack of user ratings can be solved, Dalpiaz and Parente’s method could be widely applied in various scenarios.

Recently, some companies have tried to extract comments on their own products from the web and use sentiment analysis to analyze the polarity or general feeling of the comments [12]. Sentiment analysis, also known as sentiment classification or opinion mining, refers to the use of natural language processing (NLP), text analysis, and computational linguistics to systematically identify, extract, and quantify the subjectivity of the text [13]. The application of sentiment analysis to the tourism industry has previously been proposed [14,15,16,17,18,19]. Polarity means the quantified value of the sentiment, and its classification can be binary (positive or negative), ternary (positive, negative, or neutral), or finer-grained. Therefore, sentiment analysis provides a possible solution to the problem that Dalpiaz and Parente’s method will not work if user ratings are missing. We can use sentiment analysis to determine the polarity of the sentiment in user reviews and then use the polarity to represent the user rating. In summary, Dalpiaz and Parente’s method is a good way to automatically generate a SWOT matrix, which is an important tool for competitive analysis. This study applied Dalpiaz and Parente’s method to create a SWOT matrix for an OTA, using KKday as an example. The user reviews were collected from the TripAdvisor forum, and the user ratings, which are required by Dalpiaz and Parente’s method but are not available in the TripAdvisor forum, were replaced by the polarity of sentiment in the reviews. In this way, we made Dalpiaz and Parente’s method applicable to the OTA.

The remainder of this paper is organized as follows: In Section 2, the online travel agency is introduced, and Dalpiaz and Parente’s method and sentiment analysis are reviewed; in Section 3, the method proposed to integrate sentiment analysis into the RE-SWOT tool is described in detail; in Section 4, the experiment results are presented; and finally, discussion, implications, and conclusions are provided in Section 5.

2. Related Works

2.1. SWOT Analysis

SWOT is an acronym for strength, weakness, opportunity, and threat. It is mainly used to analyze the strengths and weaknesses of a company as well as to identify the opportunities and threats faced in the presence of competitors. SWOT analysis is used to conduct an in-depth and comprehensive analysis of the positioning of a company’s own competitive advantages before formulating development strategies. Depending on whether an issue involves a controllable element of the organization and whether it is beneficial to achieve a certain goal, a 4-quadrant matrix can be drawn, as shown in Figure 1. Strengths represent the unique advantages of the organization, which are factors that can be controlled within the organization. Weaknesses are those elements that weaken the strength of the organization. If organizations want to remain competitive, they must remove weaknesses or at least reduce their negative impacts. Opportunities are external factors that help a company to grow due to industrial progress or environmental changes. Threats are external factors that organizations cannot control, but organizations can take necessary precautions against losses and injuries caused by emergencies.

2.2. SWOT Development Based on User Reviews

Dalpiaz and Parente [10] proposed a method for requirements engineering based on SWOT analysis. Requirements engineering refers to the application of effective technologies to help companies understand and determine customer requirements. Dalpiaz and Parente believe that user comments are an important basis for understanding customer requirements, so they designed a process to extract product features mentioned by users and the user ratings about the products. The final output of the process is the SWOT matrix, and requirements engineering is based on this matrix. Here, we briefly describe Dalpiaz and Parente’s SWOT matrix before moving onto a closer examination of the process. Dalpiaz and Parente called the matrix built by this process the RE-SWOT matrix, as illustrated by Figure 2.

Features are strengths of a company’s own product if the performance of this product has a positive or above average evaluation compared with competing products. Similarly, weaknesses are features of a product that perform negatively or below average when compared with competing products. Threats are features of competing products that perform positively or above the market average, and opportunities come from the features of the competitor that perform negatively or below the market average.

Having described the RE-SWOT matrix, we now explain the steps taken to generate the RE-SWOT matrix:

Step 1: Identify features and transform ratings.

First, the user reviews and ratings of the target product and its competing products are retrieved. Then, those user reviews are preprocessed using natural language processing (NLP) techniques to identify the mentioned features. The original 5-point rating is transformed into a number on a scale from −2 to 2.

Step 2: Calculate the FPS per feature.

For a feature j of a product i, the feature performance score (FPS) is calculated as shown in Equation (1).

FPS_{i,j} = \frac{S_{i,j} \cdot V_{i,j}}{\sum_{i=1}^{m} \left| S_{i,j} \cdot V_{i,j} \right|}, \quad \text{and} \quad S_{i,j} = \frac{\sum_{k=1}^{V_{i,j}} r_{i,j}[k]}{2 \times V_{i,j}}.

(1)

In Equation (1), m is the number of products, V_i,j is the number of reviews mentioning feature j related to product i, and r_i,j[k] is the transformed rating of the k-th user review mentioning feature j of product i, and k = 1..V_i,j.
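As a worked illustration of Equation (1), the following minimal sketch computes the FPS of one feature across several products. The function names, the product names, and the ratings are hypothetical and chosen only for this example; a product with no review mentioning the feature is assumed to contribute 0.

```python
from typing import Dict, List


def sentiment_score(ratings: List[int]) -> float:
    """S_{i,j}: sum of the transformed ratings divided by 2 * V_{i,j}."""
    return sum(ratings) / (2 * len(ratings))


def feature_performance_scores(ratings_by_product: Dict[str, List[int]]) -> Dict[str, float]:
    """Return FPS_{i,j} for a single feature j across all products i.

    ratings_by_product maps a product name to the transformed ratings
    (-2..2) of the reviews that mention the feature.
    """
    # S_{i,j} * V_{i,j}; assumption: a product with no mentions contributes 0.
    weighted = {
        product: sentiment_score(ratings) * len(ratings) if ratings else 0.0
        for product, ratings in ratings_by_product.items()
    }
    denominator = sum(abs(value) for value in weighted.values())
    if denominator == 0:
        return {product: 0.0 for product in weighted}
    return {product: value / denominator for product, value in weighted.items()}


# Hypothetical ratings for one feature of two hypothetical products.
example = {"product_a": [2, 1, 2], "product_b": [-1, -2, 1]}
fps = feature_performance_scores(example)
print(fps)
```

Because the denominator sums the absolute values over all products, the absolute FPS values of a feature add up to 1 whenever at least one product has a non-zero score.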

Step 3: Classify each feature.

According to the FPS, each feature is classified as positive or negative and as above or below the market average. A feature j is considered a positive element of product i if FPS_{i,j} ≥ σ or negative if FPS_{i,j} ≤ −σ. The performance of feature j of product i is above the market average if FPS_{i,j} - \overline{FPS_{i,j}} \geq σ or below the market average if FPS_{i,j} - \overline{FPS_{i,j}} \leq -σ. In addition, a feature is considered unique if its FPS is 1 or −1. In the experiment, Dalpiaz and Parente set σ = 0.1.

Step 4: Generate the RE-SWOT matrix.

The positive and above-market-average features of a company’s own product can be considered strengths, and those that are negative and below the market average can be considered weaknesses. Conversely, positive and above-market-average features of competing products can be considered threats to the company’s own product, and negative and below-market-average features of competing products can be considered opportunities for the company’s own product.

In Dalpiaz and Parente’s study, the products were mobile apps, so the user reviews and ratings were retrieved from the app store. User reviews and ratings are obtained from different platforms for different products. In principle, credible platforms should be used and the product’s official website should be avoided. The app store is a third-party platform for mobile apps and does not belong to any app company, so the reviews and ratings on it can be regarded as objective. As described above, generation of the RE-SWOT matrix depends on the FPS, which is calculated from user ratings. If the platform we want to use only has user reviews and no user ratings, the RE-SWOT matrix cannot be created. The current study investigated tourism products, and, as mentioned in the previous section, the TripAdvisor forum was an appropriate platform for this. However, the TripAdvisor forum has user reviews but no user ratings. This study intended to address this issue using sentiment analysis.

2.3. Sentiment Analysis

Sentiment analysis is a method of opinion detection. The composition of opinions includes the source of the opinion (that is, the person who expresses the opinion), the target object of the opinion, the polarity of the opinion, and the text containing the opinion. The simplest sentiment analysis simply judges the polarity of the opinion; a more advanced analysis will give a quantitative value to the intensity of the opinion’s polarity. A more complex sentiment analysis will try to judge the source of the opinion and the target object mentioned and may even analyze which aspect of the target object the comment refers to as well as the emotions implicit in the opinion.

Research on sentiment analysis has a wide range of applications, covering film, catering, tourism, and other reviews. For example, Zhang [20] collected reviews from the Bulletin Board System PTT and analyzed the sentiment scores of adjectives to recommend movies. There are currently ready-made sentiment analysis tools that can read text messages and determine the implied emotional polarity and intensity of the message. Of these, VADER (Valence Aware Dictionary and sEntiment Reasoner) is one of the most widely used tools [21,22,23]. VADER is a rule-based sentiment analysis tool that has been reported to be more accurate than other sentiment analysis methods when dealing with text from social media, movie reviews, and product reviews. VADER not only identifies emotional polarity but also gives a quantitative value for the intensity of that polarity [13,24].

The advantages of VADER include the following [13]:

- It performs well with social media text covering various fields;
- It does not require any training data but uses a generalized sentiment dictionary based on the psychological valence and gold-standard sentiment lexicon;
- It has a satisfactory level of efficiency for online streaming data and does not seriously impact the balance between speed and performance.

Given a sentence, VADER gives scores in four categories: positive, negative, neutral, and compound. The compound score is derived by summing the valence scores of the words in the sentence, adjusted according to VADER’s rules, and then normalizing the sum. This score ranges from −1 (most extreme negative) to +1 (most extreme positive) [25]. Elbagir and Yang [21] proposed a method to transform the VADER score into a rating on a scale of −2 to 2. The transformation rule is as follows:

If compound score > 0.001, then
- If positive score > 0.5, the rating is set to +2; else, the rating is set to +1.
If compound score < −0.001, then
- If negative score > 0.5, the rating is set to −2; else the rating is set to −1.
If compound score is between −0.001 and 0.001, the rating is set to 0.

We can now propose a solution to the RE-SWOT problem that we posed in the previous subsection. If the user ratings are required but are lacking, we can utilize VADER to calculate the sentiment scores of user reviews and use Elbagir and Yang’s method to generate user ratings.
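The transformation rule above can be sketched as a small function. It assumes the scores arrive as a dictionary with the keys `pos`, `neg`, `neu`, and `compound`, which is the shape returned by the `polarity_scores` method of the vaderSentiment package; the function name and the score dictionaries below are hand-written, hypothetical values and not output from VADER.

```python
from typing import Dict


def vader_to_rating(scores: Dict[str, float]) -> int:
    """Map VADER scores to a rating in {-2, -1, 0, 1, 2} using the rule above."""
    compound = scores["compound"]
    if compound > 0.001:
        return 2 if scores["pos"] > 0.5 else 1
    if compound < -0.001:
        return -2 if scores["neg"] > 0.5 else -1
    return 0  # compound score between -0.001 and 0.001


# Hypothetical score dictionaries, one per rating level.
examples = [
    {"neg": 0.0, "neu": 0.4, "pos": 0.6, "compound": 0.8},
    {"neg": 0.0, "neu": 0.7, "pos": 0.3, "compound": 0.4},
    {"neg": 0.0, "neu": 1.0, "pos": 0.0, "compound": 0.0},
    {"neg": 0.3, "neu": 0.7, "pos": 0.0, "compound": -0.4},
    {"neg": 0.6, "neu": 0.4, "pos": 0.0, "compound": -0.8},
]
ratings = [vader_to_rating(s) for s in examples]
print(ratings)
```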

2.4. KKday

KKday was founded in 2014, and its headquarters are in Taipei, Taiwan. It is Taiwan’s first e-commerce platform dedicated to providing in-destination tour experiences and itineraries. KKday provides diversified local experience itineraries, allowing travelers to go deeper into the local area and plan their itineraries more freely and easily. Currently, the KKday platform is used in more than 80 countries and 500 cities around the world and provides more than 20,000 travel itineraries and experiences. The service languages include not only traditional and simplified Chinese but also English, Vietnamese, Thai, Japanese, and Korean. KKday has established branches around the world, including in Taiwan, Hong Kong, Singapore, Japan, South Korea, Malaysia, Thailand, the Philippines, Vietnam, Shanghai, and other places. It is continuing to expand across North America, Europe, New Zealand, and Australia. In 2016, KKday formed an alliance with Asia Miles.

This study intends to use the RE-SWOT tool to analyze the competitiveness of KKday’s products. At present, little research has been conducted on local tour experience platforms such as KKday, and as far as we know, there has been no research on competitiveness analysis. Lu [26] tried to determine success factors related to KKday using field observations and interviews. Hsu [27] analyzed the business model of the KKday platform. Hou [28] conducted a survey of KKday customers to verify the impacts of its innovative service characteristics on its functional value and emotional value.

3. The Proposed Method

Figure 3 shows the process used in the proposed method, which can be broken down into four phases. We discuss these in detail.

1. Data collection

At this stage, English text messages related to KKday and Klook were extracted from the TripAdvisor forum, and the web crawler was written in Python. The extracted text messages had to go through data cleaning before being used in the next stage. As the data collection and transfer process may cause data format errors, data loss, or other problems, it was necessary to preprocess the data prior to analysis [29]. Data cleaning is the process of detecting and deleting inaccurate or useless records, such as spelling errors and non-target languages, leaving only valuable and relevant data. In this study, messages with a short length or those repeatedly posted by a single user were deleted. Because repeated messages may cause statistical bias, we only kept one of the repeated messages to preserve the authenticity and accuracy of the data sample [28].
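The cleaning rule described here (drop short messages and keep only one copy of messages repeated by the same user) might look like the minimal sketch below. The function name, the `(user, text)` tuple layout, the sample messages, and the five-word minimum are assumptions made for illustration; the study does not state its length threshold.

```python
from typing import Iterable, List, Tuple

MIN_WORDS = 5  # assumption: the paper does not give its length threshold


def clean_messages(messages: Iterable[Tuple[str, str]],
                   min_words: int = MIN_WORDS) -> List[Tuple[str, str]]:
    """Drop short messages and repeated posts by the same user.

    Each message is a (user, text) pair. The first copy of a repeated
    message is kept so that it is counted once.
    """
    seen = set()
    kept = []
    for user, text in messages:
        normalized = " ".join(text.lower().split())
        if len(normalized.split()) < min_words:
            continue  # too short to carry a usable opinion
        key = (user, normalized)
        if key in seen:
            continue  # same user posted the same text again
        seen.add(key)
        kept.append((user, text))
    return kept


# Hypothetical forum messages.
raw = [
    ("user_1", "The tour guide was friendly and on time."),
    ("user_1", "The tour guide was  friendly and on time."),
    ("user_2", "Thanks!"),
    ("user_3", "The tour guide was friendly and on time."),
]
cleaned = clean_messages(raw)
print(len(cleaned))
```

Identical text from different users is kept, since only repeats by a single user are treated as duplicates.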

2. Feature extraction

Referring to Dalpiaz and Parente’s study, we identified features of the user reviews derived from the previous stage through the following procedures:

a. Tokenization

Tokenization is the act of splitting a string of text into a list of tokens, from which certain characters, such as punctuation, may be thrown away. We provide the following message as an example:

“I buy ticket in KKdays, and they will send me QR code through email. Use QR code to go in, very convenience, easy, and save time.”

The message can be broken into two sentences: “I buy ticket in KKdays, and they will send me QR code through email.” and “Use QR code to go in, very convenience, easy, and save time.” The two sentences can be further tokenized into “I”, “buy”, “ticket”, “in”, “KKdays”, “and”, “they”, “will”, “send”, “me”, “QR”, “code”, “through”, “email”, “Use”, “QR”, “code”, “to”, “go”, “in”, “very”, “convenience”, “easy”, “and”, “save”, and “time”.

b. Lowercase transformation

The purpose of converting uppercase to lowercase is to prevent words from being recognized as different words due to the difference in capitalization. Take the tokens “I”, “KKdays”, and “QR” in the previous step as examples. These tokens would be converted to lowercase: “i”, “kkdays”, and “qr”.

c. Removal of stop words

Some words have little lexical meaning or can only express grammatical meaning within sentences, such as “a”, “the”, and “he”. Other words, such as “want”, appear so frequently that they do little to help a search engine narrow the search scope and return truly relevant results. Such words are called stop words and can be ignored without losing the meaning of the sentence. In this study, we used the Python NLTK library to remove stop words. Taking the above sentences as an example, after removing stop words, we were left with “buy”, “ticket”, “kkdays”, “send”, “qr”, “code”, “email”, “qr”, “code”, “go”, “convenience”, “easy”, “save”, and “time”.
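Steps a to c can be sketched without any external library as follows. This is only an illustration: the study used NLTK, whereas the regular-expression tokenizer and the small hand-picked stop-word set below are stand-ins chosen so that the example runs on its own.

```python
import re
from typing import List

# Hand-picked stop words for this example only (an assumption); a real run
# would use a full list such as the one shipped with NLTK.
STOP_WORDS = {"i", "in", "and", "they", "will", "me", "through", "use", "to", "very"}


def tokenize(text: str) -> List[str]:
    """Split text into alphabetic tokens, discarding punctuation."""
    return re.findall(r"[A-Za-z]+", text)


def preprocess(text: str) -> List[str]:
    """Tokenize, convert to lowercase, and remove stop words."""
    lowered = [token.lower() for token in tokenize(text)]
    return [token for token in lowered if token not in STOP_WORDS]


message = (
    "I buy ticket in KKdays, and they will send me QR code through email. "
    "Use QR code to go in, very convenience, easy, and save time."
)
tokens = preprocess(message)
print(tokens)
```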

d. POS (part-of-speech) tagging

Features are usually described by nouns, verbs, and adjectives. Therefore, we used the part-of-speech tagging method in the Python NLTK library to assign a label to each token to indicate its part of speech (POS). Table 1 lists some POS tags. Taking the tokens without stop words as an example, after tagging, we were left with “buy, VB”, “ticket, NN”, “kkdays, NNS”, “send, VB”, “qr, NN”, “code, NN”, “email, NN”, “qr, NN”, “code, NN”, “go, VB”, “convenience, NN”, “easy, JJ”, “save, VB”, and “time, NN”.

e. Lemmatization

Tokens may be recognized as different keywords due to differences in the part of speech, which results in the extraction of repeated features. This study used the lemmatizer in the Python NLTK library to remove inflectional endings and return tokens to their base form. For example, “kkdays” was returned to “kkday” and then tagged as “NN”.

f. Collocations

Collocation refers to a sequence of two or more consecutive words that occurs more often than would be expected by chance. Thanopoulos et al. [30] compared several statistical metrics for finding collocations. Their experimental results showed that the likelihood ratio performs well relative to other metrics; therefore, this study adopted the likelihood ratio to identify collocations. After that, the collocations were sorted in descending order based on frequency, and the top 20 were selected as features.
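To make the collocation step concrete, the sketch below scores bigrams with Dunning's log-likelihood ratio (one common form of the likelihood ratio), keeps those above a cut-off, and ranks the survivors by frequency. The function names and the token list are hypothetical, and the cut-off of 3.84 (the 95% critical value of a chi-square distribution with one degree of freedom) is an assumption; the study does not state how it thresholded the scores.

```python
import math
from collections import Counter
from typing import List, Tuple

Bigram = Tuple[str, str]


def likelihood_ratio(k11: int, first_total: int, second_total: int, n: int) -> float:
    """Dunning's log-likelihood ratio (G^2) for one bigram (w1, w2).

    k11: count of the bigram; first_total: bigrams starting with w1;
    second_total: bigrams ending with w2; n: total number of bigrams.
    """
    # (observed count, row total, column total) for the 2x2 contingency table.
    cells = (
        (k11, first_total, second_total),
        (first_total - k11, first_total, n - second_total),
        (second_total - k11, n - first_total, second_total),
        (n - first_total - second_total + k11, n - first_total, n - second_total),
    )
    total = 0.0
    for observed, row_total, col_total in cells:
        if observed > 0:
            expected = row_total * col_total / n
            total += observed * math.log(observed / expected)
    return 2.0 * total


def top_collocations(tokens: List[str], top_n: int = 20,
                     min_score: float = 3.84) -> List[Tuple[Bigram, int]]:
    """Keep bigrams scoring at least min_score, then rank them by frequency."""
    bigrams = list(zip(tokens, tokens[1:]))
    n = len(bigrams)
    counts = Counter(bigrams)
    first_counts = Counter(w1 for w1, _ in bigrams)
    second_counts = Counter(w2 for _, w2 in bigrams)
    kept = [
        (bigram, count)
        for bigram, count in counts.items()
        if likelihood_ratio(count, first_counts[bigram[0]],
                            second_counts[bigram[1]], n) >= min_score
    ]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:top_n]


# Hypothetical preprocessed tokens.
sample = ["qr", "code", "easy", "qr", "code", "save", "time", "qr", "code",
          "email", "buy", "ticket", "qr", "code"]
collocations = top_collocations(sample)
print(collocations[0])
```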

g. Merging similar features

It was possible that some of the features selected in the above step could be similar. Therefore, we checked the similarity manually and unified similar features. For example, the phrases “high speed” and “speed rail” both describe high-speed rail, so we always used “high speed”.
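Once the similar features have been identified by hand, applying the unification can be as simple as a lookup table. The sketch below uses the “high speed” example from the text; the function name is hypothetical, and any further entries in the mapping would have to come from the manual check.

```python
from typing import Dict, List

# Manually curated mapping from a variant to its canonical feature name.
CANONICAL: Dict[str, str] = {"speed rail": "high speed"}


def merge_features(features: List[str]) -> List[str]:
    """Replace variants by their canonical name and drop resulting duplicates."""
    merged: List[str] = []
    for feature in features:
        canonical = CANONICAL.get(feature, feature)
        if canonical not in merged:
            merged.append(canonical)
    return merged


merged = merge_features(["high speed", "tour guide", "speed rail"])
print(merged)
```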

3. Sentiment analysis

As mentioned previously, the user reviews extracted from the TripAdvisor forum were only text messages without user ratings. To calculate the FPS at the next stage, we first used VADER to generate a sentiment score for each user review and transformed the scores into a rating on a scale of −2 to 2 according to Elbagir and Yang’s method [21], as described in Section 2.3.

4. SWOT analysis

To generate the SWOT matrix, we first needed to calculate the FPS for each feature. Suppose that there are m₁ and m₂ user reviews related to the products of KKday and Klook, respectively, and that n features are generated at the stage of feature extraction. Let M_k be an m_k × n binary matrix, R_k be a column vector of m_k entries, and B_k be a column vector of m_k entries that all equal the constant 2, where k ∈ {1, 2}. M_k[i, j] = 1 if feature j is mentioned in review i (and 0 otherwise), and R_k[i, 0] is the rating of user review i, where i = 0 … (m_k − 1) and j = 0 … (n − 1). The FPS of feature j is calculated using Equation (2):

FPS_{k,j} = \frac{S_{k,j} \cdot \sum_{i=0}^{m_{k}-1} M_{k}[i,j]}{\sum_{k=1}^{2} \left| S_{k,j} \cdot \sum_{i=0}^{m_{k}-1} M_{k}[i,j] \right|},

(2)

where

S_{k,j} = \frac{(R_{k}^{T} \times M_{k})[0,j]}{(B_{k}^{T} \times M_{k})[0,j]}, \quad \text{and} \quad k \in \{1, 2\}.

(3)
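Equations (2) and (3) can be traced with a minimal pure-Python sketch in which matrices are lists of rows. The function names and the small mention matrices and rating vectors are hypothetical; a feature that no review mentions is assumed to have S = 0 so that the division is always defined.

```python
from typing import List, Tuple


def weighted_column_sums(weights: List[float], matrix: List[List[int]]) -> List[float]:
    """Compute weights^T x matrix, i.e., one weighted sum per feature column."""
    n_features = len(matrix[0]) if matrix else 0
    return [sum(w * row[j] for w, row in zip(weights, matrix)) for j in range(n_features)]


def sentiment_scores(ratings: List[int], mentions: List[List[int]]) -> List[float]:
    """S_{k,j} = (R_k^T M_k)[0,j] / (B_k^T M_k)[0,j]; 0 if feature j is never mentioned."""
    numerators = weighted_column_sums(ratings, mentions)
    denominators = weighted_column_sums([2] * len(ratings), mentions)
    return [num / den if den else 0.0 for num, den in zip(numerators, denominators)]


def fps_two_products(ratings_1: List[int], mentions_1: List[List[int]],
                     ratings_2: List[int], mentions_2: List[List[int]]
                     ) -> Tuple[List[float], List[float]]:
    """Return the lists FPS_{1,j} and FPS_{2,j} following Equation (2)."""
    weighted = []
    for ratings, mentions in ((ratings_1, mentions_1), (ratings_2, mentions_2)):
        s = sentiment_scores(ratings, mentions)
        v = weighted_column_sums([1] * len(ratings), mentions)  # mention counts
        weighted.append([s_j * v_j for s_j, v_j in zip(s, v)])
    fps_1, fps_2 = [], []
    for w1, w2 in zip(*weighted):
        denominator = abs(w1) + abs(w2)
        fps_1.append(w1 / denominator if denominator else 0.0)
        fps_2.append(w2 / denominator if denominator else 0.0)
    return fps_1, fps_2


# Hypothetical data: 3 reviews for product 1, 2 reviews for product 2, 2 features.
M1, R1 = [[1, 0], [1, 1], [0, 1]], [2, 1, -1]
M2, R2 = [[1, 1], [0, 1]], [-1, 2]
fps_1, fps_2 = fps_two_products(R1, M1, R2, M2)
print(fps_1, fps_2)
```

For feature 0 in this example, product 1 has S = 3/4 over 2 mentions and product 2 has S = −1/2 over 1 mention, giving weighted values of 1.5 and −0.5 and therefore FPS values of 0.75 and −0.25.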

Based on the FPS, the four parts of the SWOT matrix can be determined as follows. Let f_s, f_w, f_o, and f_t be the sets of features belonging to strengths, weaknesses, opportunities, and threats, respectively, defined as

f_{s} = \{ f_{j} \mid FPS_{1,j} \geq σ \text{ and } FPS_{1,j} - \overline{FPS_{1,j}} \geq σ \},

(4)

f_{w} = \{ f_{j} \mid FPS_{1,j} \leq -σ \text{ and } FPS_{1,j} - \overline{FPS_{1,j}} \leq -σ \},

(5)

f_{o} = \{ f_{j} \mid FPS_{2,j} \leq -σ \text{ and } FPS_{2,j} - \overline{FPS_{2,j}} \leq -σ \}, and

(6)

f_{t} = \{ f_{j} \mid FPS_{2,j} \geq σ \text{ and } FPS_{2,j} - \overline{FPS_{2,j}} \geq σ \}.

(7)

The symbol σ is a user-defined parameter. The cardinality of the above sets is negatively related to the value of σ; therefore, the setting of σ depends on how many features the manager expects to see.
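The assignment of features to the four sets can be sketched as follows. The sketch follows the definitions in Step 3 of Section 2.2, where below the market average means that the FPS minus the average is at most −σ, and it assumes that the market average of a feature is the mean of its FPS over the two products. The function name, the feature names, and the FPS values are hypothetical.

```python
from typing import Dict, List


def classify_swot(fps_own: Dict[str, float], fps_rival: Dict[str, float],
                  sigma: float = 0.1) -> Dict[str, List[str]]:
    """Assign each feature to strengths, weaknesses, opportunities, or threats."""
    swot: Dict[str, List[str]] = {
        "strengths": [], "weaknesses": [], "opportunities": [], "threats": [],
    }
    for feature, own in fps_own.items():
        rival = fps_rival[feature]
        # Assumption: the market average is the mean over the two products.
        market_average = (own + rival) / 2
        if own >= sigma and own - market_average >= sigma:
            swot["strengths"].append(feature)
        if own <= -sigma and own - market_average <= -sigma:
            swot["weaknesses"].append(feature)
        if rival <= -sigma and rival - market_average <= -sigma:
            swot["opportunities"].append(feature)
        if rival >= sigma and rival - market_average >= sigma:
            swot["threats"].append(feature)
    return swot


# Hypothetical FPS values for the company's own product and its competitor.
own_fps = {"feature_a": 0.75, "feature_b": 0.0, "feature_c": -0.6}
rival_fps = {"feature_a": -0.25, "feature_b": 1.0, "feature_c": 0.4}
swot = classify_swot(own_fps, rival_fps)
print(swot)
```

Raising `sigma` shrinks every set, which matches the observation that the cardinality of the sets is negatively related to σ.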

4. Experimental Results

In our experiment, we used the RE-SWOT tool to analyze the competitiveness of KKday’s products. As mentioned previously, we collected user reviews of KKday’s products from the TripAdvisor forum and used Elbagir and Yang’s method to generate user ratings. One other point is important for generating the RE-SWOT matrix: the opportunities and threats are external factors that come from competitors. According to Owler [31], the main competitor of KKday is Klook. Therefore, we also collected user reviews related to Klook from the TripAdvisor forum and used the same method to generate user ratings for Klook’s products. In addition, we wanted to find out how foreigners feel about local tours in Taiwan. Therefore, this study extracted reviews on local tours in Taiwan provided by KKday and Klook from the TripAdvisor forum. Repeated and overly short messages were removed, leaving 206 messages related to KKday and 158 messages related to Klook. Next, the sentiment score of each review was generated using the VADER tool. In addition, 20 features were extracted from these reviews through the process of feature extraction. We could then calculate the FPS for each feature using Equation (2), and the results are listed in Table 2. To generate the SWOT matrix for KKday, we needed to categorize features into four groups (strengths, weaknesses, opportunities, and threats), according to Equations (4)–(7). The user-defined parameter σ was set to 0.1, which is the same value as that used by Dalpiaz and Parente [10]. The SWOT matrix for KKday is presented in Figure 4.

The features presented in Figure 4 can be explained based on the part of the SWOT matrix to which they belong and the comments from which these features originate.

1. Strengths

Features with above-average FPS scores that originate from positive comments on KKday were regarded as strengths of KKday. We checked the comments related to these features to determine what they meant, as follows: (a) tour guide—customers feel that tour guides are friendly and introduce tourist attractions by telling stories; (b) observation deck—customers mentioned that the ticket price of the observation deck was reasonable and that they enjoyed the scenery; (c) pocket wifi—customers thought that the process of renting the wifi online and claiming it at the airport was convenient; (d) credit card—customers were satisfied with the offer for credit card payments; (e) qr codes—customers thought that the qr code tickets were convenient; (f) day tour—customers were satisfied with the range of day tours; (g) souvenir shop—customers mentioned that items purchased online could be picked up at the airport; (h) rail bike—customers mentioned that the rail bike experience in South Korea was interesting; and (i) cable car—customers were satisfied with the price of cable car tickets.

2. Weaknesses

Features with below-average FPS scores, representing negative comments about KKday, were regarded as weaknesses of KKday. We checked the comments related to these features to see what they meant. Some reviews showed that users were unsatisfied with the Sun Moon Lake Tour, for example, the meeting place was not clearly marked, or the tour guide was late. Now that KKday is aware of these weaknesses, it can try to remove or minimize them.

3. Threats

Features with above-average FPS scores, represented by positive comments about Klook, were regarded as threats to KKday. We checked the comments related to these features to see what they meant: (a) high speed—customers like the discount for the high speed rail ticket booking available on the Klook platform; (b) food court—customers mentioned that it was easy to reach the food court from the tourist attractions; (c) front desk—the customers’ experience of the Klook service was better than that of front desk service; (d) hot spring—customers mentioned that Klook provided a satisfactory discount on the activity fee; (e) mrt station—customers mentioned that the pick-up points were near the MRT station; and (f) rock formation—customers were satisfied with the shuttle bus tickets to Yehliu Geopark. KKday should be aware of these threats coming from its competitor, and it could utilize its own strengths to lessen these threats.

4. Opportunities

Features with below-average FPS scores, representing negative comments about Klook, are regarded as opportunities for KKday. Among the comments collected so far, no features met these conditions. One possible reason for this is that the number of reviews collected was insufficient. Another possible reason is that most customers are satisfied with Klook.

In this study, VADER was used to analyze the sentiment polarity of user reviews. Therefore, we investigated the extent to which VADER can correctly predict the sentiments associated with user reviews. Assessing the effectiveness of VADER relies on human judgement. We asked two people, both of whom had travel experience and had no relationship with this research, to be human coders. One was responsible for annotating user reviews related to KKday, and the other was responsible for annotating those related to Klook. Because fine-grained sentiment polarity is too challenging for human coders, we asked them to annotate sentiments as positive, negative, or neutral. Manual annotations were used as the ground truth to measure the validity of VADER. Table 3 shows the annotation statistics of VADER and the human coders. The accuracy, which is the ratio of the number of correct predictions to the total number of reviews, was (200 + 15 + 21)/297 = 0.7946. The performance measured in terms of precision, recall, and F1-score is shown in Table 4. VADER performed better in the classification of positive and negative reviews than in that of neutral reviews. This is presumably due to the relatively narrow numerical interval of VADER scores defined as neutral. Another possible reason is that human coders tend to annotate a sentiment as neutral when positive or negative sentiments are less obvious in the review. It can be seen from Table 3 that 54 reviews were annotated by the human coders as neutral, which was twice as many as those annotated by VADER.

To validate the features extracted by our method, we also asked the two human coders to select up to ten features from each review. In a total of 296 reviews, the total number of features selected by our method was 2960, while the total number of features selected by the human coders was 2353. For the same review, a feature selected by our method is considered a match if it also appears in the feature set selected by the human coder. Among the 2960 features, there were 1718 matching features. Therefore, the recall, precision, and F1-score are 0.7301, 0.5804, and 0.6467, respectively. Compared with the feature set selected by computer, the feature set selected by the human coders is smaller, which leads to a high number of false positives. As a result, the precision is lower than the recall. If the set of manually selected features were expanded, the precision would improve.
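The reported figures can be re-derived from the counts given in the text. The short calculation below uses only those counts; the variable names are chosen for this illustration.

```python
# Sentiment classification: correct predictions over all annotated reviews.
correct_predictions = 200 + 15 + 21
total_reviews = 297
accuracy = correct_predictions / total_reviews

# Feature extraction: matches against the two feature sets.
matching = 1718
selected_by_method = 2960   # features selected by the proposed method
selected_by_human = 2353    # features selected by the human coders

precision = matching / selected_by_method
recall = matching / selected_by_human
f1 = 2 * precision * recall / (precision + recall)

print(round(accuracy, 4), round(precision, 4), round(recall, 4), round(f1, 4))
```

Precision divides by the larger, computer-selected set and recall by the smaller, human-selected set, which is why the precision comes out lower than the recall.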

5. Discussion, Implications, and Conclusions

The SWOT matrix is a tool that is widely used to help companies conduct competitive analyses. To generate a SWOT matrix, a company needs to collect detailed data to understand internal and external environmental factors. Extracting data from the web has become popular thanks to the advancement of information technology and the development of social media. Dalpiaz and Parente [10] proposed a method, called RE-SWOT, to automatically generate a SWOT matrix from online user reviews and ratings. They took mobile apps as an example and collected user reviews and ratings from the app store. Their method provides good direction for a company that has difficulty generating a SWOT matrix. However, if user ratings are not available, RE-SWOT becomes difficult to use in practice. This study therefore aimed to solve this problem by using sentiment classification. We used an online travel agency as an example to illustrate the feasibility of our idea.

5.1. Discussion

SWOT analysis is an important technique used by organizations to identify their strengths, weaknesses, opportunities, and threats. Strengths and weaknesses are internal elements, while opportunities and threats are external elements, usually arising from the company’s competitors. Through SWOT, organizations can understand their positions and develop competitive strategies accordingly. To date, SWOT analysis has proved useful in various fields [32]. Beyond the enterprise level, SWOT can be applied at the application level, such as in tourism destination planning or in analyzing products or services [33]. SWOT analysis can facilitate the collection of software requirements as well [34,35]. Recently, eliciting requirements from online user reviews has become a software engineering trend [36,37]. Groen et al. [38] found that automatic, NLP-based analysis of user requirements scales better than manual analysis. The Requirements Engineering Lab (RE-Lab) at Utrecht University continuously conducts research on NLP for software requirements engineering [39]. Inspired by SWOT analysis, Dalpiaz and Parente [10] proposed the RE-SWOT method to automatically generate a SWOT matrix for mobile apps from online user reviews gathered from app stores. RE-SWOT is a novel tool that can reduce the burden associated with generating a SWOT matrix. RE-SWOT requires user ratings to calculate the feature performance score (FPS), which is used as the basis to determine which features belong to which part of the SWOT matrix. However, Dalpiaz and Parente did not mention what to do if there are no user ratings. Subsequent related studies addressed other topics. van’t Hul’s Master’s thesis focused on fostering the creativity of app designers [40]. van Vliet et al. [41] used a crowdsourcing approach to improve the elicitation of requirements with existing NLP techniques. Considering the difficulties involved in conducting RE within a governmental organization, Wouters et al. [42] developed a crowd-based method, called KMar-Crowd, to encourage user engagement in requirement elicitation. Thus, the problem of missing user ratings in RE-SWOT remained unresolved.

Because the mobile industry is one of the fastest growing and most dynamic IT industries [43], understanding the essential factors associated with mobile app requirements is an important issue [44]. Building on the idea of Dalpiaz and Parente, some researchers have also conducted competitive analyses of mobile apps using online user reviews. Shah et al. [45] developed a tool, called RevSUM, that can automatically extract the most relevant features from user reviews for an app developer. Assi et al. [46] proposed a review analysis method, called FeatCompare, that can automatically identify high-level features by comparing user reviews among competing apps. Lee [47] utilized an explainable neural network to establish a classification model that extracts product features leading to differentiation from competitors. Later, Han and Lee [48] established a more suitable model to enhance the performance of Lee’s model. Because of the semantic gap between end users and developers, the priorities of requirements perceived by users usually differ from those perceived by developers. Kifetew et al. [49] extracted quantifiable properties from user feedback and used a domain ontology to narrow this gap and prioritize requirements according to user preferences. However, no study has improved RE-SWOT itself.

Because Dalpiaz and Parente studied mobile apps and the app store has rich reviews and ratings, the lack of ratings appeared to be a minor problem to them. Since RE-SWOT provides an easy way to conduct a SWOT analysis, making it practicable for any kind of product is valuable. With this in mind, this study aimed to propose a way to cope with a lack of user ratings when using RE-SWOT. Sentiment classification is the process of automatically recognizing opinions in text and determining their emotional polarity according to the emotions expressed by users. To a certain extent, user ratings express the emotional polarity of comments. Therefore, this study employed sentiment classification to generate sentiment polarity when user ratings are not available. The sentiment classification tool used in this study was VADER, because it is especially suitable for emotions expressed in social media. Given a sentence, VADER gives scores in four categories: positive, negative, neutral, and compound. To fit the original RE-SWOT method, the VADER scores were transformed into numbers on a scale of −2 to 2. Consequently, we were able to incorporate sentiment classification into the RE-SWOT method without changing the original procedures.
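
The transformation of a VADER score into a rating on the −2 to 2 scale can be sketched as a simple thresholding step. The code below is an illustration under stated assumptions, not the mapping used in this study: the function name and the `strong_band` cut-off of 0.5 are hypothetical, while the `neutral_band` of 0.05 follows the cut-off commonly recommended for VADER's compound score. In practice the compound score would come from VADER itself, for example `SentimentIntensityAnalyzer().polarity_scores(text)["compound"]` in the `vaderSentiment` package; the sketch takes the score as input so that it runs without that dependency.

```python
def compound_to_rating(compound: float,
                       neutral_band: float = 0.05,
                       strong_band: float = 0.5) -> int:
    """Map a VADER compound score in [-1, 1] to an integer rating in {-2, ..., 2}.

    neutral_band -- scores strictly between -neutral_band and +neutral_band
                    are treated as neutral (0)
    strong_band  -- hypothetical cut-off separating mild (+/-1) from
                    strong (+/-2) sentiment; an assumption of this sketch
    """
    if not -1.0 <= compound <= 1.0:
        raise ValueError("compound score must lie in [-1, 1]")
    if compound >= strong_band:
        return 2
    if compound >= neutral_band:
        return 1
    if compound > -neutral_band:
        return 0
    if compound > -strong_band:
        return -1
    return -2


# Example: five illustrative compound scores, one per rating level.
ratings = [compound_to_rating(c) for c in (0.8, 0.2, 0.0, -0.2, -0.9)]
print(ratings)
```

Because the output has the same range as a five-point user rating centred on zero, it can stand in for the rating wherever the original procedure expects one. Changing the two cut-offs changes how many reviews fall into each level, which is one of the settings discussed under limitations.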

To sum up, RE-SWOT provides an easy way to automatically generate a SWOT matrix, and the results of this research help to enable RE-SWOT to be used for any kind of product. Traditionally, SWOT analysis required interviews to be conducted with stakeholders and market research to be performed to gather necessary information to generate a SWOT matrix. With the advent of social media, more and more users are expressing their opinions online. Online user reviews provide valuable information that can be used for SWOT analysis. With sentiment classification, SWOT analysis can be conducted for any product with online user reviews by means of RE-SWOT. Using the proposed method, we conducted SWOT analysis for OTAs with data collected from the TripAdvisor forum. Even though the forum lacks user ratings, the experimental results show that the proposed method was able to generate a SWOT matrix successfully.

5.2. Implications

To retain a competitive advantage, organizations need to carry out various management activities continuously. Strategic planning is an important organization and management activity that aims to provide a clear road map for implementing strategies and achieving business goals. There are various tools to help organizations make strategic plans, and the SWOT matrix is one of them. The first step in strategic planning is to collect the information necessary to understand and determine the problems, challenges, and trends shaping the strategy. Many researchers and industry experts have realized the value of online user comments. Various efforts have been made to mine valuable information from online user reviews, but, to the best of our knowledge, Dalpiaz and Parente’s study was the first to link online user reviews with a known strategic planning tool. Their procedure is clear and easy to follow. We employed sentiment classification to make the RE-SWOT method proposed by Dalpiaz and Parente more practical. As the SWOT matrix is a well-known tool, managers can easily apply RE-SWOT results to strategic planning as usual.

With the improved RE-SWOT, SWOT analysis using online user comments is no longer limited to assessing mobile apps. SWOT analysis is a framework that facilitates data-driven strategic decisions. Therefore, the data fed into the process of building the SWOT matrix determine the value provided by the SWOT analysis. Even though RE-SWOT helps to semi-automate the process of building the SWOT matrix, organizations must keep this in mind. Depending on the field of application, organizations have to consider appropriate data sources when using the improved RE-SWOT. The quality and quantity of data are both important for RE-SWOT. A large amount of data can help organizations identify as many features as possible, thereby deriving critical features. In some fields, such as the tourism and software industries, it is easier to find relevant and well-known online forums that are big enough to be data sources. The contents of these forums are highly relevant to the field, so organizations can obtain rich, high-quality data from them. However, in some areas, the relevant online forums may not be large enough, or there may not be any relevant online forums. In such cases, it is very difficult to obtain high-quality data. Therefore, the improved RE-SWOT method can be used to aid SWOT analysis, but it cannot completely replace manual SWOT analysis.

The online travel agent market declined by 20.0% due to the COVID-19 pandemic, but it still has great potential and is expected to grow at a compound annual growth rate of 14.8% from 2021, reaching USD 902.2 billion by 2023. The online travel agent market of the Asia Pacific region was the largest in 2019 and is expected to experience the fastest growth in the future [50]. Hence, new OTAs are springing up in this growing market. Start-up OTAs need to analyze their own market positions and competitive advantages to survive in the fiercely competitive market. Some researchers have shown that online reviews have a significant impact on the tourism business [51,52,53]. Others have conducted sentiment analysis on online tourism reviews [54,55,56]. These studies usually utilized sentiment analysis to classify words or features in the reviews as positive, negative, or neutral. Some further compared the classification results of different sentiment analysis methods. This classification may help an OTA understand which features are more attractive to customers. To shape a competitive strategy, however, the OTA still needs to rely on well-established strategic planning tools. The improved RE-SWOT could be an effective strategic planning tool for OTAs.

5.3. Limitations and Future Work

The limitations of this study may be considered future research directions. First, following Dalpiaz and Parente’s method, the extracted features were fine-grained and expressed as word pairs. When there are more competitors to compare, the number of reviews becomes large, and the comparison of fine-grained features becomes challenging and time-consuming [46]. Therefore, Assi et al. suggested that the comparison should be performed on high-level features. A high-level feature represents a main function or characteristic of a product. Assi et al. used a weather forecasting app as an example: if three different word pairs extracted from user reviews are “Terrible accuracy”, “Inaccurate Forecast”, and “Excellent accuracy”, these three fine-grained features represent the same high-level feature, “Forecast accuracy”. The use of high-level features can thus reduce the comparison workload, so we may employ a high-level feature extraction model in future work. Second, some researchers have found that combining SWOT analysis with another strategic planning framework, such as the PESTEL (political, economic, sociological, technological, environmental, and legal) framework, AHP (analytic hierarchy process), or the five forces model, can provide more accurate information and allow a more powerful strategy to be formed [32,33]. In future research, we may study how to add another strategic planning framework to our procedure and propose a hybrid RE-SWOT tool. Third, some of the settings in this study were inherited from Dalpiaz and Parente’s method. For example, this study used five-point scale ratings, and the parameter σ was set at 0.1. We expect that different settings would lead to different results. It may be worth trying a different threshold to quantize the ratings or using a greater number of quantization levels. Additionally, we could use a different σ value and observe the results. Through such exploratory experiments, we may identify a rule of thumb for these settings. Fourth, besides lexicon- and rule-based sentiment analysis tools such as VADER, there are tools based on machine learning. VADER is easy to use, but it may be worth trying other sentiment analysis techniques and observing the results. Different sentiment analysis techniques perform differently in terms of accuracy, time complexity, and other metrics [24], and some of these metrics trade off against one another. In addition, the performance of a sentiment analysis technique may also be affected by the dataset. Fifth, this study focused on English user reviews, but user reviews in other languages would also provide valuable information for OTAs, because tourists come from all around the world. However, existing NLP tools do not sufficiently support non-English languages. Although there are more than 7000 languages in the world, most NLP processing targets seven major languages: English, Chinese, Urdu, Persian, Arabic, French, and Spanish [57]. The design of a non-English NLP tool is beyond the scope of this study. NLP tools are expected to support more languages in the future, which would allow us to extend our method to other languages. Sixth, this research mainly focused on the use of sentiment analysis to solve the problem of RE-SWOT and paid less attention to the research topics of sentiment analysis itself. There are some common and important issues in sentiment analysis, such as how to deal with opposite sentiment polarities in the same review. This study relied on VADER for sentiment analysis, and no additional algorithm was designed to deal with such situations. This may affect the validity of the research results, and further research on this topic should be considered in the future. Finally, a rigorous performance evaluation of the results achieved in this paper requires the help of domain experts and is left for future work.

Author Contributions

Conceptualization, formal analysis, methodology, C.-S.H. and S.-F.T.; software, validation, data curation, investigation, S.-F.T. and Y.-T.L.; writing—original draft preparation, S.-F.T.; writing—review and editing, C.-S.H.; resources, supervision, project administration, C.-S.H. and S.-F.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable; the study does not report any data.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Walker, B. Online Travel Agencies Market Share across the World. 2021. Available online: https://www.hotelmize.com/blog/online-travel-agencies-market-share-across-the-world/ (accessed on 19 June 2021).
2. Tan, M. Trafalgar CEO, Gavin Tollman, Recommends Industry Reboot for the Future of Travel. 2019. Available online: https://theladiescue.com/trafalgar-industry-reboot/ (accessed on 19 June 2021).
3. Patton, E. Big Data and Local Expertise: Travel Platforms KKday and Klook Want to Make Your Next Trip Awesome. 2017. Available online: https://meet-global.bnext.com.tw/articles/view/40181 (accessed on 25 June 2021).
4. Puyt, R.; Lie, F.B.; De Graaf, F.J.; Wilderom, C.P. Origins of SWOT Analysis. Acad. Manag. Proc. 2020, 2020, 17416.
5. Aksu, A.; Bayar, K. Development of Health Tourism in Turkey: SWOT Analysis of Antalya Province. J. Tour. Manag. Res. 2019, 6, 134–154.
6. Ivanov, S. Modeling Company Sales Based on the Use of SWOT Analysis and Ishikawa Charts. CEUR Workshop Proc. 2019, 2422, 385–394.
7. Kamran, M.; Fazal, M.R.; Mudassar, M. Towards empowerment of the renewable energy sector in Pakistan for sustainable energy evolution: SWOT analysis. Renew. Energy 2020, 146, 543–558.
8. Niranjanamurthy, M.; Nithya, B.N.; Jagannatha, S. Analysis of Blockchain technology: Pros, cons and SWOT. Clust. Comput. 2019, 22, 14743–14757.
9. Wang, J.; Wang, Z. Strengths, weaknesses, opportunities, and threats (SWOT) analysis of China’s prevention and control strategy for the COVID-19 epidemic. Int. J. Environ. Res. Public Health 2020, 17, 2235.
10. Dalpiaz, F.; Parente, M. RE-SWOT: From User Feedback to Requirements via Competitor Analysis. In Requirements Engineering: Foundation for Software Quality, REFSQ; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2019; Volume 11412.
11. Yu, H.W. A Study on Satisfaction Evaluation by Community Platforms’ Consumer Review Using Sentiment Analysis—A Case Study of TripAdvisor. Unpublished Master’s Thesis, National Chung Hsing University, Taichung, Taiwan, 2015. Available online: https://hdl.handle.net/11296/9yttgg (accessed on 30 August 2021). (In Chinese).
12. Bianchi, N. 6 Sentiment Analysis Real-World Use Cases. 2021. Available online: https://www.repustate.com/blog/sentiment-analysis-real-world-examples/ (accessed on 12 August 2021).
13. Pandey, P. Simplifying Sentiment Analysis Using VADER in Python (on Social Media Text). 2018. Available online: https://medium.com/analytics-vidhya/simplifying-social-media-sentiment-analysis-using-vader-in-python-f9e6ec6fc52f (accessed on 20 December 2019).
14. Fu, Y.; Hao, J.X.; Li, X.; Hsu, C.H. Predictive accuracy of sentiment analytics for tourism: A metalearning perspective on Chinese travel news. J. Travel Res. 2019, 58, 666–679.
15. Kirilenko, A.P.; Stepchenkova, S.O.; Kim, H.; Li, X. Automated sentiment analysis in tourism: Comparison of approaches. J. Travel Res. 2018, 57, 1012–1025.
16. Kuhamanee, T.; Talmongkol, N.; Chaisuriyakul, K.; San-Um, W.; Pongpisuttinun, N.; Pongyupinpanich, S. Sentiment Analysis of Foreign Tourists to Bangkok Using Data Mining through Online Social Network. In Proceedings of the IEEE 15th International Conference on Industrial Informatics, Emden, Germany, 24–26 July 2017.
17. Masrury, R.A.; Alamsyah, A. Analyzing Tourism Mobile Applications Perceived Quality Using Sentiment Analysis and Topic Modeling. In Proceedings of the 7th International Conference on Information and Communication Technology, Kuala Lumpur, Malaysia, 24–26 July 2019.
18. Prameswari, P.; Surjandari, I.; Laoh, E. Mining Online Reviews in Indonesia’s Priority Tourist Destinations Using Sentiment Analysis and Text Summarization Approach. In Proceedings of the IEEE 8th International Conference on Awareness Science and Technology, Taichung, Taiwan, 8–10 November 2017.
19. Ramanathan, V.; Meyyappan, T. Twitter Text Mining for Sentiment Analysis on People’s Feedback about Oman Tourism. In Proceedings of the 4th MEC International Conference on Big Data and Smart City, Muscat, Oman, 15–16 January 2019.
20. Zhang, C.H. Text Mining and Sentiment Analysis for the Application of the Product Recommendation-the Case of PTT Movie Board. Unpublished Master’s Thesis, Soochow University, Suzhou, China, 2017. Available online: https://hdl.handle.net/11296/62w2x2 (accessed on 30 August 2021). (In Chinese).
21. Elbagir, S.; Yang, J. Twitter Sentiment Analysis Using Natural Language Toolkit and VADER Sentiment. In Proceedings of the International Multi Conference of Engineers and Computer Scientists 2019, Hong Kong, China, 13–15 March 2019.
22. Hutto, C.J.; Gilbert, E. Vader: A Parsimonious Rule-Based Model for Sentiment Analysis of Social Media Text. In Proceedings of the Eighth international AAAI Conference on Weblogs and Social Media, Ann Arbor, MI, USA, 1–4 June 2014.
23. Newman, H.; Joyner, D. Sentiment Analysis of Student Evaluations of Teaching. In Proceedings of the 19th International Conference on Artificial Intelligence in Education, London, UK, 25–30 June 2018.
24. Savarimuthu, B.T.R.; Corbett, J.; Yasir, M.; Lakshmi, V. Using Machine Learning to Improve the Sustainability of the Online Review Market. In Proceedings of the 41st International Conference on Information Systems 2020, Hyderabad, India, 13–16 December 2020.
25. Swarnkar, N. VADER Sentiment Analysis in Algorithmic Trading. Available online: https://blog.quantinsti.com/vader-sentiment/ (accessed on 19 June 2020).
26. Lu, W.L. Industrial Restructuring under Experience Economy—A Case Study of Travel Ecommerce Platform, KKday. Unpublished Master’s Thesis, National Chiao Tung University, Hsinchu, Taiwan, 2016. Available online: https://hdl.handle.net/11296/rc55a8 (accessed on 30 August 2021). (In Chinese).
27. Hsu, Y.C. An Innovative Business Model of a Local Tour Platform: A Case Study of KKday.com. Master’s Thesis, National Sun Yat-sen University, Kaohsiung, Taiwan, 2016. Available online: https://hdl.handle.net/11296/vnm527 (accessed on 30 August 2021). (In Chinese).
28. Hou, Z.; Cui, F.; Meng, Y.; Lian, T.; Yu, C. Opinion mining from online travel reviews: A comparative analysis of Chinese major OTAs using semantic association analysis. Tour. Manag. 2019, 74, 276–289.
29. Malley, B.; Ramazzotti, D.; Wu, J.T. Data Pre-Processing. In Secondary Analysis of Electronic Health Records; Springer: Berlin/Heidelberg, Germany, 2016; pp. 115–141.
30. Thanopoulos, A.; Fakotakis, N.; Kokkinakis, G. Comparative Evaluation of Collocation Extraction Metrics. In Proceedings of the Third International Conference on Language Resources and Evaluation, Las Palmas, Spain, 29–31 May 2002; Available online: http://www.lrec-conf.org/proceedings/lrec2002/pdf/128.pdf (accessed on 30 August 2021).
31. Owler. Kkday’s Competitors, Revenue, Number of Employees, Funding and Acquisitions. 2019. Available online: https://www.owler.com/company/kkday (accessed on 25 December 2019).
32. Benzaghta, M.A.; Elwalda, A.; Mousa, M.M.; Erkan, I.; Rahman, M. SWOT analysis applications: An integrative literature review. J. Glob. Bus. Insights 2021, 6, 54–72.
33. Oreski, D. Strategy development by using SWOT-AHP. Tem J. 2012, 1, 283–291.
34. Knauss, M. Using SWOT Analysis for Collecting Software Requirements. 2017. Available online: https://www.linkedin.com/pulse/using-swot-analysis-collecting-software-requirements-markus-knauss (accessed on 12 August 2021).
35. Oza, H. How to Utilize SWOT Analysis for the App Development. 2021. Available online: https://www.hyperlinkinfosystem.co.uk/blog/how-to-utilize-swot-analysis-for-the-app-development (accessed on 12 August 2021).
36. Dabrowski, J.; Letier, E.; Perini, A.; Susi, A. App Review Analysis for Software Engineering: A Systematic Literature Review; Tech. Rep.; University College London: London, UK, 2020.
37. Lim, S.; Henriksson, A.; Zdravkovic, J. Data-Driven Requirements Elicitation: A Systematic Literature Review. SN Comput. Sci. 2021, 2, 1–35.
38. Groen, E.C.; Schowalter, J.; Kopczynska, S.; Polst, S.; Alvani, S. Is there Really a Need for Using NLP to Elicit Requirements? A Benchmarking Study to Assess Scalability of Manual Analysis. In Proceedings of the International Conference on Requirements Engineering: Foundation for Software Quality, REFSQ Workshops, Utrecht, The Netherlands, 19–22 March 2018.
39. Dalpiaz, F.; Brinkkemper, S. Research on NLP for RE at Utrecht University: A Report. In Proceedings of the International Conference on Requirements Engineering: Foundation for Software Quality, REFSQ Workshops, Essen, Germany, 18 March 2019.
40. van’t Hul, I.L. Fostering Creativity in the Process of Designing Mobile Applications. Master’s Thesis, Utrecht University, Utrecht, The Netherlands, 2020.
41. van Vliet, M.; Groen, E.C.; Dalpiaz, F.; Brinkkemper, S. Identifying and Classifying User Requirements in Online Feedback via Crowdsourcing. In Proceedings of the International Conference on Requirements Engineering: Foundation for Software Quality, Pisa, Italy, 24 March 2020; pp. 143–159.
42. Wouters, J.; Janssen, R.; van Hulst, B.; van Veenhuizen, J.; Dalpiaz, F.; Brinkkemper, S. CrowdRE in a Governmental Setting: Lessons from Two Case Studies. Available online: https://webspace.science.uu.nl/~dalpi001/papers/wout-2021-re.pdf (accessed on 12 August 2021).
43. Stepanova, E.; Kirikova, M. Continuous Requirements Engineering for Mobile Application Development. In Proceedings of the International Conference on Requirements Engineering: Foundation for Software Quality, REFSQ Workshops, Essen, Germany, 27 February 2017.
44. Patkar, N.; Ghafari, M.; Nierstrasz, O.; Hotomski, S. Caveats in Eliciting Mobile App Requirements. In Proceedings of the Evaluation and Assessment in Software Engineering, Trondheim, Norway, 15–17 April 2020; pp. 180–189.
45. Shah, F.A.; Sirts, K.; Pfahl, D. Using App Reviews for Competitive Analysis: Tool Support. In Proceedings of the 3rd ACM SIGSOFT International Workshop on App Market Analytics, Tallinn, Estonia, 27 August 2019; pp. 40–46.
46. Assi, M.; Hassan, S.; Tian, Y.; Zou, Y. FeatCompare: Feature comparison for competing mobile apps leveraging user reviews. Empir. Softw. Eng. 2021, 26, 1–38.
47. Lee, Y. Extraction of Competitive Factors in a Competitor Analysis Using an Explainable Neural Network. Neural Process. Lett. 2021, 1–16.
48. Han, J.; Lee, Y. Explainable Artificial Intelligence-Based Competitive Factor Identification. ACM Trans. Knowl. Discov. Data (TKDD) 2021, 16, 1–11.
49. Kifetew, F.M.; Perini, A.; Susi, A.; Siena, A.; Muñante, D.; Morales-Ramirez, I. Automating user-feedback driven requirements prioritization. Inf. Softw. Technol. 2021, 138, 106635.
50. The Business Research Company. Online Travel Agent Market—By Service Type (Vacation Packages, Travel, Accommodation), by Platform (Mobile/Tablet Based, Desktop Based), and by Region, Opportunities and Strategies—Global Forecast to 2023. 2020. Available online: https://www.thebusinessresearchcompany.com/report/online-travel-agent-market (accessed on 19 June 2021).
51. Schuckert, M.; Liu, X.; Law, C.H.R. Hospitality and Tourism Online Reviews: Recent Trends and Future Directions. J. Travel Tour. Mark. 2015, 32, 608–621.
52. Phillips, P.; Barnes, S.; Zigan, K.; Schegg, R. Understanding the impact of online reviews on hotel performance: An empirical analysis. J. Travel Res. 2017, 56, 235–249.
53. Dharel, Y. Online reviews & their impact in tourism business: Need for Strategic Handling of Consumer Reviews. 2021. Available online: https://www.theseus.fi/handle/10024/463648 (accessed on 12 August 2021).
54. Alaei, A.R.; Becken, S.; Stantic, B. Sentiment analysis in tourism: Capitalizing on big data. J. Travel Res. 2019, 58, 175–191.
55. Dina, N.Z. Tourist sentiment analysis on TripAdvisor using text mining: A case study using hotels in Ubud, Bali. Afr. J. Hosp. Tour. Leis. 2020, 9, 1–10.
56. Yu, C.; Zhu, X.; Feng, B.; Cai, L.; An, L. Sentiment analysis of Japanese tourism online reviews. J. Data Inf. Sci. 2019, 4, 89.
57. Siavoshi, M. The Importance of Natural Language Processing for Non-English Languages. 2020. Available online: https://towardsdatascience.com/the-importance-of-natural-language-processing-for-non-english-languages-ada463697b9d (accessed on 12 August 2021).

Figure 1. SWOT matrix.

Figure 2. RE-SWOT matrix.

Figure 3. Process used in the proposed method.

Figure 4. SWOT matrix for KKday.

Table 1. Some POS tags.

Tag	Part of Speech	Tag	Part of Speech
NN	Noun, singular or mass	VBN	Verb, past participle
NNS	Noun, plural	VBP	Verb, non-3rd person singular present
NNP	Proper noun, singular	VBZ	Verb, 3rd person singular present
NNPS	Proper noun, plural	JJ	Adjective
VB	Verb, base form	JJR	Adjective, comparative
VBD	Verb, past tense	JJS	Adjective, superlative
VBG	Verb, gerund, or present participle

Table 2. FPS and category of each feature (σ = 0.1).

		Reference	Competitor
		KKday	Klook
tour guide	FPS	0.82	0.18
tour guide	Situation	Pos, above average	Pos, below average
sun moon lake	FPS	−1.00	-
sun moon lake	Situation	Neg, unique	-
observation deck	FPS	0.67	0.33
observation deck	Situation	Pos, above average	Pos, below average
pocket wifi	FPS	0.82	0.18
pocket wifi	Situation	Pos, above average	Pos, below average
credit card	FPS	0.73	0.27
credit card	Situation	Pos, above average	Pos, below average
theme park	FPS	0.60	0.40
theme park	Situation	Pos, below average	Pos, below average
qr code	FPS	1.00	-
qr code	Situation	Pos, unique	-
day tour	FPS	0.73	0.27
day tour	Situation	Pos, above average	Pos, below average
bus	FPS	0.54	0.46
bus	Situation	Pos, below average	Pos
sim card	FPS	0.60	0.40
sim card	Situation	Pos, below average	Pos, below average
high speed	FPS	0.29	0.71
high speed	Situation	Pos, below average	Pos, above average
food court	FPS	0.27	0.73
food court	Situation	Pos, below average	Pos, above average
souvenir shop	FPS	0.62	0.38
souvenir shop	Situation	Pos, above average	Pos, below average
rail bike	FPS	1.00	-
rail bike	Situation	Pos, unique	-
cable car	FPS	1.00	-
cable car	Situation	Pos, unique	-
front desk	FPS	0.29	0.71
front desk	Situation	Pos, below average	Pos, above average
hot spring	FPS	-	1.00
hot spring	Situation	-	Pos, unique
sky lantern	FPS	0.43	0.57
sky lantern	Situation	Pos, below average	Pos
mrt station	FPS	0.08	0.92
mrt station	Situation	Neu, below average	Pos, above average
rock formation	FPS	0.17	0.83
rock formation	Situation	Pos, below average	Pos, above average
FPS AVG.		0.51	0.52

Table 3. Annotation results.

	Human-Coder
VADER	Positive	Neutral	Negative	Total
Positive	200	35	11	246
Neutral	2	15	8	25
Negative	1	4	20	25
Total	203	54	39	296

Table 4. The performance measurements of VADER.

	Recall	Precision	F1-Score
Positive	0.9852	0.8130	0.8909
Neutral	0.2778	0.6000	0.3797
Negative	0.5128	0.8000	0.6250
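
The values in Table 4 follow directly from the counts in Table 3. The sketch below shows the arithmetic; it is an illustration rather than the authors' evaluation script, and the names `CONFUSION`, `per_class_metrics`, and `accuracy` are hypothetical. Rows are taken to be VADER's predictions and columns the human annotations, as laid out in Table 3.

```python
LABELS = ["positive", "neutral", "negative"]

# Counts from Table 3. Rows: VADER prediction; columns: human annotation.
CONFUSION = [
    [200, 35, 11],
    [2, 15, 8],
    [1, 4, 20],
]


def per_class_metrics(matrix, labels):
    """Return {label: (recall, precision, f1)} for a square confusion matrix."""
    metrics = {}
    for i, label in enumerate(labels):
        true_positive = matrix[i][i]
        predicted = sum(matrix[i])              # row total: predicted as label
        actual = sum(row[i] for row in matrix)  # column total: annotated as label
        precision = true_positive / predicted
        recall = true_positive / actual
        f1 = 2 * precision * recall / (precision + recall)
        metrics[label] = (recall, precision, f1)
    return metrics


def accuracy(matrix):
    """Share of reviews on the diagonal, i.e. where both labels agree."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


metrics = per_class_metrics(CONFUSION, LABELS)
for label in LABELS:
    recall, precision, f1 = metrics[label]
    print(f"{label:<8} recall={recall:.4f} precision={precision:.4f} F1={f1:.4f}")
print(f"accuracy={accuracy(CONFUSION):.4f}")
```

With the counts as printed in Table 3, the diagonal sums to 235 of 296 reviews. The low neutral recall reflects the 35 reviews that the human coders marked neutral but VADER scored as positive.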

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
