1. Introduction
Traditionally, the distribution of tourism products is controlled by tourism product suppliers, wholesalers, retailers, and the final consumers. With emerging technologies, any of these channels can now be used to reach customers directly online via social networks or review platforms. Research conducted by the Pacific Asia Travel Association (PATA) found that online travel agencies (OTAs) account for 40% of the total global travel market [1]. According to Source Research, 70% of people travel to experience the real culture of a region. However, statistics from Trafalgar’s “The Good Life” survey show that 49% of people find that “so-called” real experiences are not “real” enough [2]. Therefore, some OTAs specialize in offering local in-destination tours and activities for global travelers. These OTAs provide an online platform on which customers can browse and order tours and activities and leave public reviews afterwards. Such platforms enable travelers to choose from a variety of tours conveniently and at a low cost.
There are local tour platforms for destinations worldwide, including Taiwan’s KKday, founded in 2014; the USA’s Peek, founded in 2012; and Germany’s GetYourGuide, founded in 2009. The Taiwanese startup KKday offers fully planned, guided tours in more than 150 different cities. Its CEO stated that KKday’s goal is to create a global travel platform that can be used by customers everywhere [3]. Some OTA giants, such as Booking.com, Expedia, and Trip.com, have also begun to enter this market, so competition can be expected to become increasingly fierce. For start-up companies, determining how to remain competitive is an important issue, and launching popular products is one way to do so. Several tools can help organizations formulate competitive strategies, among which the SWOT (Strengths, Weaknesses, Opportunities, and Threats) analysis matrix is commonly used. Some researchers consider the origin of SWOT to be obscure, but Puyt et al. [4] believe that SWOT originated in the early 1950s. Although SWOT analysis is a long-established method, it is still a widely used strategic planning tool [5,6,7,8,9].
In 2019, Dalpiaz and Parente proposed a tool, known as RE-SWOT, which utilizes user reviews to generate a SWOT matrix that is then used for the elicitation of requirements [10]. They retrieved the user reviews and ratings of the target product to identify strengths and weaknesses, and determined opportunities and threats from the user reviews and ratings of competing products. They used mobile apps as the research object to illustrate how the proposed method generates the SWOT matrix. Their study is very useful because it provides a way to automatically generate a SWOT matrix based on users’ comments. This study investigates how to utilize Dalpiaz and Parente’s method to build the SWOT matrix for KKday products. The user reviews and ratings needed to generate the SWOT matrix are available on the KKday website, but they may be less objective because the company has the authority to delete or retain comments. User reviews and ratings are therefore preferably obtained from third-party websites, such as the TripAdvisor forum. TripAdvisor is the largest travel forum, with users all over the world, and it has become the main source of data for many travel-related studies because its online reviews are less likely to be changed by the owners of a tourism company [11]. However, the TripAdvisor forum lacks user ratings, which Dalpiaz and Parente’s method requires to create a SWOT matrix. If the lack of user ratings can be overcome, Dalpiaz and Parente’s method could be widely applied in various scenarios.
Recently, some companies have tried to extract comments on their own products from the web and use sentiment analysis to determine the polarity or general feeling of the comments [12]. Sentiment analysis, also known as sentiment classification or opinion mining, refers to the use of natural language processing (NLP), text analysis, and computational linguistics to systematically identify, extract, and quantify the subjectivity of text [13]. The application of sentiment analysis to the tourism industry has previously been proposed [14,15,16,17,18,19]. Polarity is the quantified value of the sentiment, and its classification can be binary (positive or negative), ternary (positive, negative, or neutral), or finer-grained. Sentiment analysis therefore offers a possible solution to the problem that Dalpiaz and Parente’s method does not work when user ratings are missing: we can use sentiment analysis to determine the polarity of the sentiment in user reviews and then use the polarity to represent the user rating. In summary, Dalpiaz and Parente’s method is a good way to automatically generate a SWOT matrix, which is an important tool for competitive analysis. This study applied Dalpiaz and Parente’s method to create a SWOT matrix for an OTA, using KKday as an example. The user reviews were collected from the TripAdvisor forum, and the user ratings, which are required by Dalpiaz and Parente’s method but are not available in the TripAdvisor forum, were replaced by the polarity of sentiment in the reviews. In this way, we made Dalpiaz and Parente’s method applicable to the OTA.
The remainder of this paper is organized as follows: In Section 2, the online travel agency is introduced, and Dalpiaz and Parente’s method and sentiment analysis are reviewed; in Section 3, the method proposed to integrate sentiment analysis into the RE-SWOT tool is described in detail; in Section 4, the experiment results are presented; and finally, discussion, implications, and conclusions are provided in Section 5.
2. Related Works
2.1. SWOT Analysis
SWOT is an acronym for strengths, weaknesses, opportunities, and threats. It is mainly used to analyze the strengths and weaknesses of a company as well as to identify the opportunities and threats faced in the presence of competitors. SWOT analysis is used to conduct an in-depth and comprehensive analysis of the positioning of a company’s own competitive advantages before formulating development strategies. Depending on whether an issue involves a controllable element of the organization and whether it is beneficial to achieving a certain goal, a 4-quadrant matrix can be drawn, as shown in Figure 1. Strengths represent the unique advantages of the organization, which are factors that can be controlled within the organization. Weaknesses are those elements that weaken the strength of the organization. If organizations want to remain competitive, they must remove weaknesses or at least reduce their negative impacts. Opportunities are external factors that help a company to grow due to industrial progress or environmental changes. Threats are external factors that organizations cannot control, but organizations can take necessary precautions against losses and injuries caused by emergencies.
2.2. SWOT Development Based on User Reviews
Dalpiaz and Parente [10] proposed a method for requirements engineering based on SWOT analysis. Requirements engineering refers to the application of effective technologies to help companies understand and determine customer requirements. Dalpiaz and Parente believe that user comments are an important basis for understanding customer requirements, so they designed a process to extract product features mentioned by users and the user ratings about the products. The final output of the process is the SWOT matrix, and requirements engineering is based on this matrix. Here, we briefly describe Dalpiaz and Parente’s SWOT matrix before moving on to a closer examination of the process. Dalpiaz and Parente called the matrix built by this process the RE-SWOT matrix, as illustrated by Figure 2.
Features are strengths of a company’s own product if the performance of this product has a positive or above average evaluation compared with competing products. Similarly, weaknesses are features of a product that perform negatively or below average when compared with competing products. Threats are features of competing products that perform positively or above the market average, and opportunities come from the features of the competitor that perform negatively or below the market average.
Having described the RE-SWOT matrix, we now explain the steps taken to generate the RE-SWOT matrix:
First, the user reviews and ratings of the target product and its competing products are retrieved. Then, those user reviews are preprocessed using natural language processing (NLP) techniques to identify the mentioned features. The original 5-point rating is transformed into a number on a scale from −2 to 2.
For a feature $j$ of a product $i$, the feature performance score (FPS) is calculated as shown in Equation (1).
In Equation (1), $m$ is the number of products, $V_{i,j}$ is the number of reviews mentioning feature $j$ related to product $i$, and $r_{i,j}[k]$ is the transformed rating of the $k$-th user review mentioning feature $j$ of product $i$, where $k = 1, \ldots, V_{i,j}$.
According to the FPS score, each feature is classified as positive or negative and above the market average or below the market average. A feature j is considered a positive element of product i if FPSi,j ≥ σ or negative if FPSi,j ≤ −σ. The performance of feature j of product i is above the market average if or below market average if . In addition, a feature is considered unique if its FPS score is 1 or −1. In the experiment, Dalpiaz and Parente set σ = 0.1.
The positive and above market average features of a company’s own product can be considered strengths, and those that are negative or below the market average can be considered weaknesses. On the contrary, positive and above market average features of competing products can be considered threats to the company’s own product, and negative and below market average features of competing products can be considered opportunities for the company’s own product.
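The rating shift and the σ-based polarity test described above can be sketched as follows. This is a minimal illustration rather than Dalpiaz and Parente’s implementation: the function names are hypothetical, the linear shift `stars - 3` is an assumed way of mapping a 5-point rating onto the −2 to 2 scale, and the label "neither" for scores strictly between −σ and σ is a placeholder of ours.

```python
def transform_rating(stars):
    """Shift a 1..5 star rating onto the -2..2 scale (assumed linear shift)."""
    if stars not in (1, 2, 3, 4, 5):
        raise ValueError("expected a 5-point rating between 1 and 5")
    return stars - 3


def polarity(fps, sigma=0.1):
    """Classify an FPS value: positive if FPS >= sigma, negative if FPS <= -sigma."""
    if fps >= sigma:
        return "positive"
    if fps <= -sigma:
        return "negative"
    return "neither"  # placeholder label; the text does not name this case


print([transform_rating(s) for s in (1, 3, 5)])
print(polarity(0.42), polarity(-0.3), polarity(0.05))
```

The default `sigma=0.1` mirrors the value reported for Dalpiaz and Parente’s experiment.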
In Dalpiaz and Parente’s study, the products were mobile apps, so the user reviews and ratings were retrieved from the app store. User reviews and ratings are obtained from different platforms for different products. In principle, credible platforms should be used and the product’s official website should be avoided. The app store is a third-party platform for mobile apps and does not belong to any app company, so the reviews and ratings on it can be regarded as objective. As described above, generation of the RE-SWOT matrix depends on the FPS scores, which are calculated from user ratings. If the platform we want to use only has user reviews and no user ratings, the RE-SWOT matrix cannot be created. The current study investigated tourism products, and, as mentioned in the previous section, the TripAdvisor forum was an appropriate platform for this. However, the TripAdvisor forum has user reviews but no user ratings. This study intended to address this issue using sentiment analysis.
2.3. Sentiment Analysis
Sentiment analysis is a method of opinion detection. The composition of opinions includes the source of the opinion (that is, the person who expresses the opinion), the target object of the opinion, the polarity of the opinion, and the text containing the opinion. The simplest sentiment analysis simply judges the polarity of the opinion; a more advanced analysis will give a quantitative value to the intensity of the opinion’s polarity. A more complex sentiment analysis will try to judge the source of the opinion and the target object mentioned and may even analyze which aspect of the target object the comment refers to as well as the emotions implicit in the opinion.
Research on sentiment analysis has a wide range of applications, covering reviews of films, catering, tourism, and other domains. For example, Zhang [20] collected reviews from the bulletin board system PTT and analyzed the sentiment scores of adjectives to recommend movies. Ready-made sentiment analysis tools are now available that can read text messages and determine the implied emotional polarity and intensity of the message. Of these, VADER (Valence Aware Dictionary and sEntiment Reasoner) is one of the most widely used [21,22,23]. VADER is a rule-based sentiment analysis tool that has been reported to be more accurate than other sentiment analysis methods when dealing with text from social media, movie reviews, and product reviews. VADER not only identifies emotional polarity but also gives a quantitative value for the intensity of that polarity [13,24].
The advantages of VADER include the following [13]:
It performs well with social media text covering various fields;
It does not require any training data but uses a generalized sentiment dictionary based on the psychological valence and gold-standard sentiment lexicon;
It has a satisfactory level of efficiency for online streaming data and does not seriously impact the balance between speed and performance.
Given a sentence, VADER gives scores in four categories: positive, negative, neutral, and compound. The compound score is derived by summing the valence scores of the words in the sentence, adjusted according to VADER’s rules, and normalizing the sum. This score ranges from −1 (most extreme negative) to +1 (most extreme positive) [25]. Elbagir and Yang [21] proposed a method to transform the VADER score into a rating on a scale of −2 to 2. The transformation rule is as follows:
If compound score > 0.001, then
If compound score < −0.001, then
If compound score is between −0.001 and 0.001, the rating is set to 0.
We can now propose a solution to the RE-SWOT problem that we posed in the previous subsection. If the user ratings are required but are lacking, we can utilize VADER to calculate the sentiment scores of user reviews and use Elbagir and Yang’s method to generate user ratings.
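A mapping of this kind can be sketched in a few lines of Python. The neutral band of ±0.001 is taken from the rule above; the boundary that separates ratings of ±1 from ±2 is not stated in the text, so the sketch exposes it as a hypothetical parameter `strong_cutoff`, whose default of 0.5 is an assumption chosen purely for illustration. The function name is likewise our own.

```python
def compound_to_rating(compound, neutral_band=0.001, strong_cutoff=0.5):
    """Map a VADER compound score in [-1, 1] to an integer rating in [-2, 2].

    neutral_band follows the rule in the text; strong_cutoff is an assumed,
    illustrative boundary between mild (+/-1) and strong (+/-2) sentiment.
    """
    if -neutral_band <= compound <= neutral_band:
        return 0
    magnitude = 2 if abs(compound) > strong_cutoff else 1
    return magnitude if compound > 0 else -magnitude


for score in (-0.8, -0.3, 0.0, 0.3, 0.8):
    print(score, compound_to_rating(score))
```

In practice, the compound score would come from a VADER analyzer applied to each review; here the scores are hard-coded so that the sketch stays self-contained.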
2.4. KKday
KKday was founded in 2014, and its headquarters are in Taipei, Taiwan. It is Taiwan’s first e-commerce platform dedicated to providing in-destination tour experiences and itineraries. KKday provides diversified local experience itineraries, allowing travelers to go deeper into the local area and plan their itineraries more freely and easily. Currently, the KKday platform is used in more than 80 countries and 500 cities around the world and provides more than 20,000 travel itineraries and experiences. The service languages include not only traditional and simplified Chinese but also English, Vietnamese, Thai, Japanese, and Korean. KKday has established branches around the world, including in Taiwan, Hong Kong, Singapore, Japan, South Korea, Malaysia, Thailand, the Philippines, Vietnam, Shanghai, and other places. It is continuing to expand across North America, Europe, New Zealand, and Australia. In 2016, KKday formed an alliance with Asia Miles.
This study intends to use the RE-SWOT tool to analyze the competitiveness of KKday’s products. At present, little research has been conducted on local tour experience platforms such as KKday, and as far as we know, there has been no research on competitiveness analysis. Lu [26] tried to determine success factors related to KKday using field observations and interviews. Hsu [27] analyzed the business model of the KKday platform. Hou [28] conducted a survey of KKday customers to verify the impacts of its innovative service characteristics on its functional value and emotional value.
3. The Proposed Method
Figure 3 shows the process used in the proposed method, which can be broken down into four phases. We discuss these in detail.
At this stage, English text messages related to KKday and Klook were extracted from the TripAdvisor forum using a web crawler written in Python. The extracted text messages had to go through data cleaning before being used in the next stage. As the data collection and transfer process may cause data format errors, data loss, or other problems, it was necessary to preprocess the data prior to analysis [29]. Data cleaning is the process of detecting and deleting inaccurate or useless records, such as those with spelling errors or in non-target languages, leaving only valuable and relevant data. In this study, very short messages and messages repeatedly posted by a single user were deleted. Because repeated messages may cause statistical bias, we kept only one copy of each repeated message to preserve the authenticity and accuracy of the data sample [28].
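The two cleaning rules described above (dropping very short messages and keeping a single copy of messages repeated by the same user) can be sketched as follows. The function name, the `(user, text)` input format, and the `min_words` threshold are hypothetical; the text does not state what length counts as too short.

```python
def clean_messages(messages, min_words=5):
    """Drop short messages and keep one copy of messages repeated by a user.

    messages: iterable of (user, text) pairs.
    min_words: assumed length threshold, not taken from the study.
    """
    seen = set()
    cleaned = []
    for user, text in messages:
        # Normalize whitespace and case so trivially different copies match.
        normalized = " ".join(text.split()).lower()
        if len(normalized.split()) < min_words:
            continue  # too short to be informative
        key = (user, normalized)
        if key in seen:
            continue  # same user already posted this message
        seen.add(key)
        cleaned.append((user, text))
    return cleaned


sample = [
    ("amy", "Great tour!"),
    ("bob", "The KKday tour guide was friendly and helpful."),
    ("bob", "The KKday  tour guide was friendly and helpful."),
    ("cat", "The KKday tour guide was friendly and helpful."),
]
print(clean_messages(sample))
```

Deduplication is keyed on the user as well as the text, so the same sentence posted by two different users is kept twice, in line with the rule that only repeats by a single user are removed.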
- 2. Feature extraction
Referring to Dalpiaz and Parente’s study, we identified features of the user reviews derived from the previous stage through the following procedures:
- a. Tokenization
Tokenization is the act of splitting a string of text into a list of tokens, from which certain characters, such as punctuation, may be discarded. We provide the following message as an example:
“I buy ticket in KKdays, and they will send me QR code through email. Use QR code to go in, very convenience, easy, and save time.”
The message can be broken into two sentences: “I buy ticket in KKdays, and they will send me QR code through email.” and “Use QR code to go in, very convenience, easy, and save time.” The two sentences can be further tokenized into “I”, “buy”, “ticket”, “in”, “KKdays”, “and”, “they”, “will”, “send”, “me”, “QR”, “code”, “through”, “email”, “Use”, “QR”, “code”, “to”, “go”, “in”, “very”, “convenience”, “easy”, “and”, “save”, and “time”.
- b. Lowercase transformation
The purpose of converting uppercase to lowercase is to prevent words from being recognized as different words due to the difference in capitalization. Take the tokens “I”, “KKdays”, and “QR” in the previous step as examples. These tokens would be converted to lowercase: “i”, “kkdays”, and “qr”.
- c. Removal of stop words
Some words have little lexical meaning or only express grammatical meaning within sentences, such as “a”, “the”, and “he”. Other words, such as “want”, appear so frequently that they do not help search engines narrow the search scope and provide truly relevant results. Such words are called stop words and can be ignored without losing the meaning of the sentence. In this study, we used the Python NLTK library to remove stop words. Taking the above sentences as an example, after removing stop words, we were left with “buy”, “ticket”, “kkdays”, “send”, “qr”, “code”, “email”, “qr”, “code”, “go”, “convenience”, “easy”, “save”, and “time”.
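Tokenization, lowercase transformation, and stop-word removal can be sketched together on the example message. The study used the Python NLTK library for these steps; to keep the sketch free of external dependencies, it substitutes a simple regular-expression tokenizer and a tiny hand-written stop-word list. Both are stand-ins of ours, and the stop-word list is hypothetical, chosen only so that the example reproduces the tokens listed above.

```python
import re

# Hypothetical, deliberately tiny stop-word list (the study used NLTK's list).
STOP_WORDS = {"i", "in", "and", "they", "will", "me", "through", "use", "to", "very"}


def tokenize(text):
    """Split text into alphanumeric tokens, discarding punctuation."""
    return re.findall(r"[A-Za-z0-9]+", text)


def preprocess(text):
    """Tokenize, lowercase, and remove stop words."""
    lowered = [token.lower() for token in tokenize(text)]
    return [token for token in lowered if token not in STOP_WORDS]


message = (
    "I buy ticket in KKdays, and they will send me QR code through email. "
    "Use QR code to go in, very convenience, easy, and save time."
)
print(preprocess(message))
```

With a real stop-word list the surviving tokens may differ slightly, since membership of words such as “use” depends on the list chosen.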
- d. POS (part-of-speech) tagging
Features are usually described by nouns, verbs, and adjectives. Therefore, we used the part-of-speech tagging method in the Python NLTK library to assign a label to each token to indicate its part of speech (POS). Table 1 lists some POS tags. Taking the tokens without stop words as an example, after tagging, we were left with “buy, VB”, “ticket, NN”, “kkdays, NNS”, “send, VB”, “qr, NN”, “code, NN”, “email, NN”, “qr, NN”, “code, NN”, “go, VB”, “convenience, NN”, “easy, JJ”, “save, VB”, and “time, NN”.
- e. Lemmatization
Tokens may be recognized as different keywords because they appear in different inflected forms, which results in the extraction of repeated features. This study used the lemmatizer in the Python NLTK library to remove inflectional endings and return tokens to their base form. For example, “kkdays” was returned to “kkday” and then tagged as “NN”.
- f. Collocations
A collocation is a sequence of two or more consecutive words that occurs more frequently than would be expected by chance. Thanopoulos et al. [30] compared several statistical metrics for finding collocations. Their experimental results showed that the likelihood ratio performs well relative to other metrics; therefore, this study adopted the likelihood ratio to identify collocations. The collocations were then sorted in descending order of frequency, and the top 20 were selected as features.
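The collocation step can be illustrated with a small, dependency-free sketch for two-word collocations. It scores each bigram with Dunning’s log-likelihood ratio computed from a 2×2 contingency table, keeps bigrams whose score reaches a threshold, and then ranks the survivors by frequency. The function names and the threshold `min_llr=3.84` (the 5% critical value of a chi-square distribution with one degree of freedom) are assumptions for illustration; the study does not report a cut-off, and its actual implementation may differ.

```python
import math
from collections import Counter


def log_likelihood_ratio(k11, k12, k21, k22):
    """Dunning's G^2 statistic for a 2x2 contingency table of bigram counts."""
    n = k11 + k12 + k21 + k22
    cells = (
        (k11, k11 + k12, k11 + k21),
        (k12, k11 + k12, k12 + k22),
        (k21, k21 + k22, k11 + k21),
        (k22, k21 + k22, k12 + k22),
    )
    total = 0.0
    for observed, row_total, col_total in cells:
        if observed > 0:  # empty cells contribute nothing to the sum
            expected = row_total * col_total / n
            total += observed * math.log(observed / expected)
    return 2.0 * total


def top_collocations(token_lists, top_n=20, min_llr=3.84):
    """Return up to top_n (bigram, frequency) pairs, most frequent first."""
    bigrams = Counter()
    for tokens in token_lists:
        bigrams.update(zip(tokens, tokens[1:]))
    n = sum(bigrams.values())
    first, second = Counter(), Counter()
    for (w1, w2), count in bigrams.items():
        first[w1] += count   # bigrams starting with w1
        second[w2] += count  # bigrams ending with w2
    kept = []
    for (w1, w2), k11 in bigrams.items():
        k12 = first[w1] - k11
        k21 = second[w2] - k11
        k22 = n - k11 - k12 - k21
        if log_likelihood_ratio(k11, k12, k21, k22) >= min_llr:
            kept.append(((w1, w2), k11))
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:top_n]


reviews = [
    ["qr", "code", "email"],
    ["qr", "code", "go"],
    ["qr", "code", "easy"],
    ["high", "speed", "rail"],
]
print(top_collocations(reviews))
```

Bigrams are counted within each review only, so no collocation spans two different reviews.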
- g. Merging similar features
Some of the features selected in the above step could be similar. Therefore, we checked the similarity manually and unified similar features. For example, the phrases “high speed” and “speed rail” both describe high-speed rail, so we always used “high speed”.
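Once the manual check has decided which features are equivalent, applying the decision is a simple lookup. The sketch below assumes the outcome of the manual check is stored in a dictionary; the dictionary name, the function name, and the sample feature list are hypothetical, and only the “speed rail” to “high speed” entry comes from the text.

```python
# Result of the manual similarity check: variant -> canonical feature name.
CANONICAL = {"speed rail": "high speed"}


def merge_similar(features):
    """Replace each feature by its canonical name and drop duplicates in order."""
    merged = []
    for feature in features:
        canonical = CANONICAL.get(feature, feature)
        if canonical not in merged:
            merged.append(canonical)
    return merged


print(merge_similar(["high speed", "qr code", "speed rail"]))
```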
- 3. Sentiment Analysis
As mentioned previously, the user reviews extracted from the TripAdvisor forum were only text messages without user ratings. To calculate the FPS at the next stage, we first used VADER to generate a sentiment score for each user review and transformed the scores into a rating on a scale of −2 to 2 according to Elbagir and Yang’s method [21], as described in Section 2.3.
- 4. SWOT analysis
To generate the SWOT matrix, we first needed to calculate the FPS for each feature. Suppose that there are $m_1$ and $m_2$ user reviews related to the products of KKday and Klook, respectively, and that $n$ features are generated at the stage of feature extraction. Let $M_k$ be an $m_k \times n$ binary matrix, $R_k$ be a column vector of $m_k$ entries, and $B_k$ be a column vector in which all entries are the same constant, 2, where $k \in \{1, 2\}$. $M_k[i, j] = 1$ if feature $j$ is mentioned in review $i$, and $R_k[i, 0]$ is the rating of user review $i$, where $i = 0, \ldots, (m_k - 1)$ and $j = 0, \ldots, (n - 1)$. The FPS of feature $j$ is calculated using Equation (2):
where
Based on the FPS, the four parts of the SWOT matrix can be determined as follows. Let $f_s$, $f_w$, $f_o$, and $f_t$ be the sets of features belonging to the strengths, weaknesses, opportunities, and threats, respectively, defined as
The symbol σ is a user-defined parameter. The cardinality of the above sets is negatively related to the value of σ; therefore, the setting of σ depends on how many features the manager expects to see.
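One reading consistent with the definitions of $M_k$, $R_k$, and $B_k$ above is that the FPS of feature $j$ is the sum of the ratings of the reviews mentioning $j$, divided by twice the number of such reviews, which keeps the score within −1 to 1. The plain-Python sketch below implements that reading only; it is an assumption made for illustration and may differ from Equation (2). The function name, the `scale` parameter (standing in for the constant 2 in $B_k$), and the choice to return `None` for features that no review mentions are all hypothetical.

```python
def feature_performance_scores(mentions, ratings, scale=2):
    """Compute one score per feature under the assumed reading of the FPS.

    mentions: list of rows, one per review; mentions[i][j] is 1 if review i
              mentions feature j and 0 otherwise (the binary matrix M_k).
    ratings:  list of transformed ratings in [-2, 2], one per review (R_k).
    scale:    the constant entry of B_k, used to normalize the score.
    """
    if len(mentions) != len(ratings):
        raise ValueError("mentions and ratings must describe the same reviews")
    n_features = len(mentions[0]) if mentions else 0
    scores = []
    for j in range(n_features):
        rating_sum = sum(row[j] * rating for row, rating in zip(mentions, ratings))
        normalizer = sum(row[j] * scale for row in mentions)
        scores.append(rating_sum / normalizer if normalizer else None)
    return scores


# Three reviews, two features: the first feature is mentioned in reviews 0 and 1.
mentions = [[1, 0], [1, 1], [0, 1]]
ratings = [2, 1, -2]
print(feature_performance_scores(mentions, ratings))
```

The same function would be applied once to the KKday reviews and once to the Klook reviews, giving one score vector per company.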
4. Experimental Results
In our experiment, we used the RE-SWOT tool to analyze the competitiveness of KKday’s products. As mentioned previously, we collected user reviews of KKday’s products from the TripAdvisor forum and used Elbagir and Yang’s method to generate user ratings. One other point is important for generating the RE-SWOT matrix: opportunities and threats are external factors that come from competitors. According to Owler [31], the main competitor of KKday is Klook. Therefore, we also collected user reviews related to Klook from the TripAdvisor forum and used the same method to generate user ratings for Klook’s products. In addition, we wanted to find out how foreigners feel about local tours in Taiwan. Therefore, this study extracted reviews on local tours in Taiwan provided by KKday and Klook from the TripAdvisor forum. Repeated and overly short messages were removed, leaving 206 messages related to KKday and 158 messages related to Klook. Next, the sentiment score of each review was generated using the VADER tool. In addition, 20 features were extracted from these reviews through the process of feature extraction. We then calculated the FPS for each feature using Equation (2); the results are listed in Table 2. To generate the SWOT matrix for KKday, we categorized the features into four groups (strengths, weaknesses, opportunities, and threats) according to Equations (4)–(7). The user-defined parameter σ was set to 0.1, the same value as that used by Dalpiaz and Parente [10]. The SWOT matrix for KKday is presented in Figure 4.
The features presented in Figure 4 can be explained based on the part of the SWOT matrix to which they belong and the comments from which these features originate.
- 1. Strengths
Features with above-average FPS scores that originate from positive comments on KKday were regarded as strengths of KKday. We checked the comments related to these features to determine what they meant, as follows: (a) tour guide—customers feel that tour guides are friendly and introduce tourist attractions by telling stories; (b) observation deck—customers mentioned that the ticket price of the observation deck was reasonable and that they enjoyed the scenery; (c) pocket wifi—customers thought that the process of renting the wifi online and claiming it at the airport was convenient; (d) credit card—customers were satisfied with the offer for credit card payments; (e) qr codes—customers thought that the qr code tickets were convenient; (f) day tour—customers were satisfied with the range of day tours; (g) souvenir shop—customers mentioned that items purchased online could be picked up at the airport; (h) rail bike—customers mentioned that the rail bike experience in South Korea was interesting; and (i) cable car—customers were satisfied with the price of cable car tickets.
- 2. Weaknesses
Features with below-average FPS scores, representing negative comments about KKday, were regarded as weaknesses of KKday. We checked the comments related to these features to see what they meant. Some reviews showed that users were unsatisfied with the Sun Moon Lake Tour, for example, the meeting place was not clearly marked, or the tour guide was late. Now that KKday is aware of these weaknesses, it can try to remove or minimize them.
- 3. Threats
Features with above-average FPS scores, represented by positive comments about Klook, were regarded as threats to KKday. We checked the comments related to these features to see what they meant: (a) high speed—customers like the discount for the high speed rail ticket booking available on the Klook platform; (b) food court—customers mentioned that it was easy to reach the food court from the tourist attractions; (c) front desk—the customers’ experience of the Klook service was better than that of front desk service; (d) hot spring—customers mentioned that Klook provided a satisfactory discount on the activity fee; (e) mrt station—customers mentioned that the pick-up points were near the MRT station; and (f) rock formation—customers were satisfied with the shuttle bus tickets to Yehliu Geopark. KKday should be aware of these threats coming from its competitor, and it could utilize its own strengths to lessen these threats.
- 4. Opportunities
Features with below-average FPS scores, representing negative comments about Klook, were regarded as opportunities for KKday. Among the comments collected so far, no features met these conditions. One possible reason is that the number of reviews collected was insufficient. Another is that most customers are satisfied with Klook.
In this study, VADER was used to analyze the sentiment polarity of user reviews. Therefore, we investigated the extent to which VADER can correctly predict the sentiments associated with user reviews. Assessing the effectiveness of VADER relies on human judgement. We asked two people, both of whom had travel experience and no relationship with this research, to act as human coders. One was responsible for annotating user reviews related to KKday, and the other for annotating those related to Klook. Because fine-grained sentiment polarity is too challenging for human coders, we asked them to annotate sentiments as positive, negative, or neutral. The manual annotations were used as the ground truth to measure the validity of VADER.
Table 3 shows the annotation statistics of VADER and the human coders. The accuracy, which is the ratio of the number of correct predictions to the total number of reviews, was (200 + 15 + 21)/297 = 0.7946. The performance measured in terms of precision, recall, and F1-score is shown in Table 4. VADER performed better in the classification of positive and negative reviews than in that of neutral reviews. This is presumably due to the relatively narrow numerical interval of VADER scores defined as neutral. Another possible reason is that human coders tend to annotate a sentiment as neutral when positive or negative sentiments are less obvious in the review. It can be seen from Table 3 that 54 reviews were annotated by the human coders as neutral, which was twice as many as those annotated by VADER. To validate the features extracted by our method, we also asked the two human coders to select up to ten features from each review. In a total of 296 reviews, the total number of features selected by our method was 2960, while the total number selected by the human coders was 2353. For the same review, a feature selected by our method was considered a match if it also appeared in the feature set selected by the human coder. Among the 2960 features, there were 1718 matching features. Therefore, the recall, precision, and F1-score were 0.7301, 0.5804, and 0.6467, respectively. Because our method selected more features than the human coders did, many of its features had no manual counterpart and counted as false positives, so the precision was lower than the recall. If the set of manually selected features were expanded, the precision would improve.
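The reported figures can be checked with a few lines of arithmetic. The sketch below recomputes the accuracy from the three diagonal counts quoted above and derives precision, recall, and F1-score for the feature validation from the counts of matching, machine-selected, and manually selected features. The function name is our own; all numbers come from the text.

```python
def precision_recall_f1(matching, predicted, reference):
    """Precision, recall, and F1 from match counts.

    matching:  features selected by both the method and the human coders.
    predicted: features selected by the method.
    reference: features selected by the human coders.
    """
    precision = matching / predicted
    recall = matching / reference
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Sentiment validation: correct predictions over all annotated reviews.
accuracy = (200 + 15 + 21) / 297

# Feature validation: 1718 matches, 2960 machine features, 2353 manual features.
precision, recall, f1 = precision_recall_f1(1718, 2960, 2353)

print(f"accuracy={accuracy:.4f}")
print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

Since the F1-score is the harmonic mean of precision and recall, it can equivalently be computed as 2 × 1718 / (2960 + 2353).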
5. Discussion, Implications, and Conclusions
The SWOT matrix is a tool that is widely used to help companies conduct competitive analyses. To generate a SWOT matrix, a company needs to collect detailed data to understand internal and external environmental factors. Extracting data from the web has become popular thanks to the advancement of information technology and the development of social media. Dalpiaz and Parente [10] proposed a method, called RE-SWOT, to automatically generate a SWOT matrix from online user reviews and ratings. Dalpiaz and Parente took mobile apps as an example and collected user reviews and ratings from the app store. Their method provides good direction if a company has difficulty generating a SWOT matrix. However, if user ratings are not available, it becomes difficult to use RE-SWOT in practice. Therefore, this study aimed to solve this problem by using sentiment classification. We used an online travel agency as an example to illustrate the feasibility of our idea.
5.1. Discussion
SWOT analysis is an important technique used by organizations to identify their strengths, weaknesses, opportunities, and threats. Strengths and weaknesses are internal elements, while opportunities and threats are external elements, usually coming from the company’s competitors. Through SWOT, organizations can understand their positions and develop competitive strategies accordingly. SWOT analysis has proven to be a useful tool in various fields [32]. Beyond the enterprise level, SWOT can be applied at the application level, such as in tourism destination planning or for analyzing products or services [33]. SWOT analysis can facilitate software requirements as well [34,35]. Recently, eliciting requirements from online user reviews has become a software engineering trend [36,37]. With the help of NLP, Groen et al. [38] found that the automatic analysis of user requirements has better scalability than manual analysis. The Requirements Engineering Lab (RE-Lab) at Utrecht University continuously conducts research on NLP for software requirements engineering [39]. Inspired by SWOT analysis, Dalpiaz and Parente [10] proposed the RE-SWOT method to automatically generate a SWOT matrix for mobile apps with online user reviews gathered from app stores. RE-SWOT is a novel tool that can reduce the burden associated with generating a SWOT matrix. RE-SWOT requires user ratings to calculate the FPS, which is used as the basis to determine which features belong to which part of the SWOT matrix. However, Dalpiaz and Parente did not mention what to do if there are no user ratings. van’t Hul’s Master’s thesis focused on fostering the creativity of app designers [40]. Vliet et al. [41] used a crowdsourcing approach to improve the elicitation of requirements with existing NLP techniques. Considering the difficulties involved in conducting RE within a governmental organization, Wouters et al. [42] developed a crowd-based method, called KMar-Crowd, to encourage user engagement in requirement elicitation. Therefore, the problem of missing user ratings in RE-SWOT was still left unresolved.
Because the mobile industry is one of the fastest growing and most dynamic IT industries [
43], understanding essential factors associated with mobile app requirements is an important issue [
44]. Building on the idea of Dalpiaz and Parente, other researchers have also conducted competitive analyses of mobile apps using online user reviews. Shah et al. [
45] developed a tool, called RevSUM, that can automatically extract the most relevant features from user reviews for an app developer. Assi et al. [
46] proposed a review analysis method, called FeatCompare, that can automatically identify high-level features by comparing user reviews among competing apps. Lee [
47] utilized an explainable neural network to establish a classification model to extract product features that lead to differentiation from competitors. Later, Han and Lee [
48] established a more suitable model to enhance the performance of Lee’s model. Because of the semantic gap between end users and developers, the priorities of requirements perceived by users usually differ from those of developers. Kifetew et al. [
49] extracted quantifiable properties from user feedback and used domain ontology to narrow the gap to prioritize requirements according to user preferences. However, no study has improved RE-SWOT itself.
Because Dalpiaz and Parente studied mobile apps, and app stores offer abundant reviews and ratings, the missing-rating problem of RE-SWOT appeared minor to them. However, since RE-SWOT provides an easy way to conduct a SWOT analysis, making it practicable for any kind of product is valuable. With this in mind, this study aimed to propose a way to cope with a lack of user ratings when using RE-SWOT. Sentiment classification is the process of automatically recognizing opinions in text and determining their emotional polarity according to the emotions expressed by users. To a certain extent, user ratings express the emotional polarity of comments. Therefore, this study employed sentiment classification to generate sentiment polarity when user ratings are not available. The sentiment classification tool used in this study was VADER, because it is especially suitable for sentiment expressed in social media. Given a sentence, VADER returns four scores: positive, negative, neutral, and compound. To fit the original RE-SWOT method, the VADER scores were transformed into numbers on a scale of −2 to 2. Consequently, we were able to incorporate sentiment classification into the RE-SWOT method smoothly, without changing the original procedures.
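The transformation of VADER scores into the −2 to 2 scale can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the exact procedure of this study: the function names `compound_to_rating` and `rate_review` are hypothetical, only the compound score is used, the ±0.05 band is the neutral band conventionally used with VADER, and the ±0.5 cut points for strong sentiment are assumed purely for the example.

```python
from typing import Sequence

# Hypothetical cut points on VADER's compound score (range -1 to 1).
# The +/-0.05 band is the neutral band conventionally used with VADER;
# the +/-0.5 cuts for "strong" sentiment are assumptions for illustration.
DEFAULT_CUTS: Sequence[float] = (-0.5, -0.05, 0.05, 0.5)


def compound_to_rating(compound: float, cuts: Sequence[float] = DEFAULT_CUTS) -> int:
    """Map a compound score in [-1, 1] to an integer rating in {-2, ..., 2}."""
    if not -1.0 <= compound <= 1.0:
        raise ValueError("compound score must lie in [-1, 1]")
    low_strong, low_weak, high_weak, high_strong = cuts
    if compound <= low_strong:
        return -2  # strongly negative
    if compound <= low_weak:
        return -1  # mildly negative
    if compound < high_weak:
        return 0   # neutral
    if compound < high_strong:
        return 1   # mildly positive
    return 2       # strongly positive


def rate_review(text: str) -> int:
    """Rate one review with VADER; requires the third-party vaderSentiment package."""
    # Imported lazily so that compound_to_rating works without the package.
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()
    return compound_to_rating(analyzer.polarity_scores(text)["compound"])
```

Because the output is an integer on a five-point scale, a rating produced this way can stand in for a missing user rating without altering the later RE-SWOT steps. Changing `cuts` is one simple way to experiment with different quantization thresholds.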
To sum up, RE-SWOT provides an easy way to automatically generate a SWOT matrix, and the results of this research help to enable RE-SWOT to be used for any kind of product. Traditionally, SWOT analysis required interviews to be conducted with stakeholders and market research to be performed to gather necessary information to generate a SWOT matrix. With the advent of social media, more and more users are expressing their opinions online. Online user reviews provide valuable information that can be used for SWOT analysis. With sentiment classification, SWOT analysis can be conducted for any product with online user reviews by means of RE-SWOT. Using the proposed method, we conducted SWOT analysis for OTAs with data collected from the TripAdvisor forum. Even though the forum lacks user ratings, the experimental results show that the proposed method was able to generate a SWOT matrix successfully.
5.2. Implication
To retain a competitive advantage, organizations need to carry out various management activities continuously. Strategic planning is an important organization and management activity that aims to provide a clear road map for implementing strategies and achieving business goals. There are various tools to help organizations make strategic plans, and the SWOT matrix is one of them. The first step in strategic planning is to collect the information necessary to understand and determine the problems, challenges, and trends forming the strategy. Many researchers and industry experts have realized the value of online user comments. Various efforts have been made to mine valuable information from online user reviews, but, to the best of our knowledge, Dalpiaz and Parente’s study was the first to link online user reviews with a known strategic planning tool. Their procedure is clear and easy to follow. We employed sentiment classification to make the RE-SWOT method proposed by Dalpiaz and Parente more practical. As the SWOT matrix is a well-known tool, managers can easily apply RE-SWOT results to strategic planning as usual.
With the improved RE-SWOT, SWOT analysis using online user comments is no longer limited to assessing mobile apps. SWOT analysis is a framework that facilitates data-driven strategic decisions. Therefore, the data fed into the process of building the SWOT matrix determine the value provided by the SWOT analysis. Even though RE-SWOT helps to semi-automate the process of building the SWOT matrix, organizations must keep this in mind. Depending on the field of application, organizations have to consider appropriate data sources when using the improved RE-SWOT. The quality and quantity of data are both important for RE-SWOT. A large amount of data can help organizations identify as many features as possible, thereby deriving critical features. In some fields, such as the tourism and software industries, it is easier to find relevant and well-known online forums that are big enough to be data sources. The contents of these forums are highly relevant to the field, so organizations can obtain rich, high-quality data from them. However, in some areas, the relevant online forums may not be large enough, or there may not be any relevant online forums. In such cases, it is very difficult to obtain high-quality data. Therefore, the improved RE-SWOT method can be used to aid SWOT analysis, but this does not mean that it can completely replace the manual SWOT analysis method.
The online travel agent market has declined by 20.0% due to the COVID-19 pandemic, but it still has great potential and is expected to grow at a compound annual growth rate of 14.8% from 2021 and reach USD 902.2 billion by 2023. The online travel agent market of the Asia Pacific region was the largest in 2019 and is expected to experience the fastest growth in the future [
50]. Hence, new OTAs are springing up in this growing market. Start-up OTAs need to analyze their own market positions and competitive advantages to survive in the fiercely competitive market. Some researchers have shown that online reviews have a significant impact on the tourism business [
51,
52,
53]. Others have conducted sentiment analysis on online tourism reviews [
54,
55,
56]. These studies usually utilized sentiment analysis to classify words or features in the reviews as positive, negative, or neutral. Some further compared the classification results of different sentiment analysis methods. This classification may help an OTA understand which features are more attractive to customers. To shape a competitive strategy, however, the OTA still needs to rely on well-established strategic planning tools. The improved RE-SWOT could be an effective strategic planning tool for OTAs.
5.3. Limitations and Future Work
The limitations of this study may be considered future research directions. First, following Dalpiaz and Parente’s method, the extracted features were fine-grained and expressed as word pairs. When there are more competitors to compare, the number of reviews becomes large, and the comparison of fine-grained features becomes challenging and time-consuming [
46]. Therefore, Assi et al. suggested that the comparison should be performed on high-level features. A so-called high-level feature represents the main function and characteristic of a product. Assi et al. used a weather forecasting app as an example. If three different word pairs extracted from user reviews are “Terrible accuracy”, “Inaccurate Forecast”, and “Excellent accuracy”, these three fine-grained features represent the same high-level feature: “Forecast accuracy”. The use of high-level features can thus reduce the workload of comparisons. Therefore, we may employ a high-level feature extraction model in future work. Second, some researchers have found that combining SWOT analysis with another strategic planning framework, such as the PESTEL (political, economic, sociological, technological, environmental, and legal) framework, AHP (analytic hierarchy process), or five forces model, can provide more accurate information and allow a more powerful strategy to be formed [
32,
33]. In future research, we may study how to add another strategic planning framework to our procedure and propose a hybrid RE-SWOT tool. Third, some of the settings in this study were inherited from Dalpiaz and Parente’s method. For example, this study used five-point scale ratings, and the parameter σ was set at 0.1. We expect that different settings would lead to different results. It may be worth trying a different threshold to quantize the ratings or using a greater number of quantization levels. Additionally, we could use a different σ value and observe the results. Through such exploratory experiments, we may identify a rule of thumb for these settings. Fourth, in addition to lexicon- and rule-based sentiment analysis tools, such as VADER, there are tools based on machine learning. VADER is easy to use, but it may be worth trying other sentiment analysis techniques and observing the results. Different sentiment analysis techniques perform differently in terms of accuracy, time complexity, and other metrics [
24]. It is worth noting that some of these metrics trade off against one another. In addition, the performance of a sentiment analysis technique may also be affected by the dataset. Fifth, this study focused on English user reviews, but user reviews in other languages would also provide valuable information for OTAs, because tourists come from all around the world. However, existing NLP tools do not support non-English languages sufficiently. Although there are more than 7000 languages in the world, most NLP processes use seven major languages: English, Chinese, Urdu, Persian, Arabic, French, and Spanish [
57]. The design of a non-English NLP tool is beyond the scope of this study. It is expected that NLP tools will support more languages in the future, so that we will be able to extend our method to other languages. Sixth, this research mainly focused on the use of sentiment analysis to solve the problem of RE-SWOT and paid less attention to the research topics of sentiment analysis itself. There are some common and important issues in sentiment analysis, such as how to deal with opposite sentiment polarities in the same review. This study relied on VADER for sentiment analysis, and no additional algorithm was designed to deal with such situations. This may affect the validity of the research results, and further research on this topic should be considered in the future. Finally, a performance evaluation of the results achieved in this paper requires the help of domain experts in a rigorous process and is a topic for future work.
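To make the first limitation more concrete, the grouping of fine-grained word pairs under a high-level feature can be sketched in a few lines of Python. This is only a toy illustration of the idea, not the model used by Assi et al.: the keyword lookup `KEYWORD_TO_FEATURE` and the function `group_by_high_level_feature` are hypothetical and hand-written for the weather forecasting example, whereas a real high-level feature extraction model would learn such groupings from the reviews.

```python
from collections import defaultdict

# Hypothetical, hand-written lookup from keywords to a high-level feature.
# A real extraction model would learn these groupings instead.
KEYWORD_TO_FEATURE = {
    "accuracy": "Forecast accuracy",
    "inaccurate": "Forecast accuracy",
    "forecast": "Forecast accuracy",
}


def group_by_high_level_feature(word_pairs):
    """Group fine-grained word pairs under the first matching high-level feature."""
    groups = defaultdict(list)
    for pair in word_pairs:
        words = pair.lower().split()
        # Use the first keyword that has a known high-level feature, else "Other".
        feature = next(
            (KEYWORD_TO_FEATURE[w] for w in words if w in KEYWORD_TO_FEATURE),
            "Other",
        )
        groups[feature].append(pair)
    return dict(groups)


pairs = ["Terrible accuracy", "Inaccurate Forecast", "Excellent accuracy"]
grouped = group_by_high_level_feature(pairs)
```

In this example, all three word pairs fall under the single key "Forecast accuracy", so competitors would be compared on one high-level feature rather than on three separate word pairs.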