1. Introduction
During the 2010s, user-generated content (UGC) increased notably, as did its use as a data source for researchers [
1] and an information source for prospective customers [
2]. UGC is usually disseminated through electronic word-of-mouth communication (eWoM), as users and consumers share their comments and ratings on social media. In the field of travel, tourism, and hospitality, UGC evolved similarly [
3,
4]. Given that social media content generated by visitors coexists with content generated by destination stakeholders, the data source used in this study consists solely of traveler-generated content (TGC), understood to be narratives, opinions, and ratings shared on social media and based on visitors’ experiences in travelling, sightseeing, entertaining, shopping, lodging, and dining at tourist destinations [
5]. According to Marine-Roig [
6], this TGC constitutes a new and unsolicited organic image-formation agent in Gartner’s [
7] model (p. 15), with penetration into the market through eWoM greater than that of the induced and autonomous sources [
8].
Currently, most researchers use online travel reviews (OTRs) hosted on travel-related platforms as sources of TGC [
9]. OTRs are characterized by a large diversity of languages, which requires the use of big data analytics [
10] and natural language processing (NLP) techniques [
5] for study. Most of these studies are dedicated to the accommodation sector; however, research on the contribution of lodging OTRs to online destination image construction is scarce, and even rarer are studies based on peer-to-peer lodging OTRs. Some researchers did not consider accommodation and gastronomy as attributes of the tourist destination [
11], but it seems clear that tourists damage the destination image online, for example, when they share on social media that they found insects in their rooms or in their food, and that poor experiences with destination services could lead to an overall poor experience at that destination [
12].
The objective of this study is to advance one more step in a line of research that began with a doctoral thesis [
13] on destination image analytics through TGC. The most prominent milestones of the research were several publications on the following topics: methods for selecting the most suitable web sources of tourism data [
14]; the roles of identity and authenticity in tourist destination image construction [
15]; tourism analytics for a special issue on smart destinations [
16] and on religious tourism [
17,
18]; methods for selecting, downloading, arranging, and debugging tourist data from websites [
19]; destination attribute assessment versus top countries of residence of bloggers and reviewers [
20]; affective component of the destination image [
21]; methods to analyze multiscale destinations through spatial coefficients [
22]; methods for extracting information from paratextual elements [
23] and HTML meta-tags [
24] of online travel reviews; analysis of territorial tourist brands segmented by languages and countries [
25]; measurement of the gap between projected and perceived images through compositional data analysis [
26]; framework approach for measuring images through online travel reviews on sightseeing, lodging, and dining experiences [
6]; measurement of online gastronomic images [
27]; impact of personal safety on online destination image through natural language processing segmented by language [
5]; and destination image analytics for design of experiences and tourist products [
28].
Specifically, the purpose of this study is threefold: the first is to build a theoretical and methodological framework to measure online destination images and visitors’ satisfaction and loyalty through TGC categories, metrics and rankings; the second, to introduce into the model, as an element of discussion, the relationships between semiotics and consumer behavior; and, finally, to explore whether the TGC big data allows us to demonstrate complex relationships between the aforementioned constructs. The case study is a common method for testing a conceptual model in the field of tourism and hospitality [
29]. In this research, the model is checked by a comparison between districts of the city of Barcelona (Catalonia) through 753,366 Airbnb OTRs collected just before the outbreak of the COVID-19 pandemic [
30].
2. Online Destination Image, Satisfaction, and Loyalty Relationships
Disconfirmed prior expectations regarding the performance of a product or service are the concepts that best capture the formation of consumer satisfaction [
31,
32]. For decades, researchers have shown that customer satisfaction impacts and drives customer loyalty: a satisfied customer is loyal [
33] to a greater or lesser degree according to their personal characteristics [
34]. For the same constructs online, research findings indicate that e-satisfaction impacts on e-loyalty [
35,
36]. Regarding online and offline environments, customer loyalty is higher when the service is chosen online than when it is chosen offline, and the relationship between overall satisfaction and loyalty is reinforced even more online [
37]. However, the increase in competition on the Internet makes it easier for customers to be less loyal [
38], and satisfaction acquires a greater weight in the online satisfaction-loyalty relationship.
In the field of tourism and hospitality, researchers have demonstrated relationships between destination image, satisfaction, and loyalty mostly through surveys. The most commonly used keywords in destination image definitions are ‘impression’, ‘perception’, ‘belief’, and ‘idea’ [
39]. For example, Crompton [
40] defines an image as ‘the sum of beliefs, ideas, and impressions that a person has of a destination’ (p. 18). Other keywords are ‘expectations’ and ‘feelings’ [
41]. According to Chon [
42], the (dis)satisfaction of tourists depends to a large extent on their expectations regarding the destination and/or the image perceived prior to the trip, in contrast to their experiences during the visit. The theory adds, as antecedents of satisfaction, the perceptions of price and quality, or the construct ‘value for money’, which consists of a combination of both perceptions [
43] as a high price can be both a positive (product quality indicator) and a negative (economic sacrifice) sign. Generally speaking, the degree of destination loyalty is measured by the intention to visit or revisit a tourist destination and by the willingness to recommend it [
44]. As used in this article, tourist loyalty is understood in line with the concept of TGC discussed above. That is, tourist loyalty comprises the intention to revisit a place, to buy a tourist product again, to return to a restaurant or accommodation, or to reuse a means of transportation, and/or to recommend it or show a predisposition to recommend it.
With regard to the data source used in this study, Lam et al. [
45] found, through surveys, a significant and positive relationship between UGC platform co-creation experiences and the cognitive and affective components of the image; and that these images impacted traveler overall satisfaction. Through meta-analyses, two teams of researchers [
46,
47] showed that the impact of destination images on loyalty was significant to a greater or lesser degree, depending on the image dimensions and the manifestations of tourist loyalty considered: that is, designative and appraisive images, and intentions to visit or revisit an attraction or a place (behavioral loyalty) and to recommend it (attitudinal loyalty). Using integrated models, several authors [
48,
49] demonstrated the impact of destination images on tourist loyalty through satisfaction. Other authors [
50,
51] reached the same conclusion through intermediate constructs.
For the purposes of this study, research by Gim [
52] is of great interest; it produced a model to compare three neighboring areas of Korea based on a survey sample of 3756 tourists who had only visited one of the three areas. Based on visitor satisfaction in relation to the destinations’ attributes, Gim calculated overall satisfaction and demonstrated its impact on the post-visit image and on the intention to revisit and recommend the site, as well as the influence of the post-visit image on tourist loyalty. He then found that overall satisfaction has a stronger direct effect on the image than on loyalty, but if its indirect effect is considered, the overall effect on loyalty outweighs that of the image. The tourist destination image model [
6], proposed below from a holistic perspective, also considers that the experience lived by tourists is the central element of the hermeneutical circle of image formation and a precedent of tourist satisfaction and loyalty.
2.1. Destination Image Formation
In the 1990s, several authors [
7,
53,
54,
55,
56,
57,
58] laid the theoretical and methodological foundations to analyze the formation and modification of tourist destination images. Recently, other authors [
59,
60] have proposed a holistic image formation framework, distinguishing between induced and organic images [
61], between primary and secondary images [
62], and between cognitive, affective, and conative images [
63], but the model does not consider Gartner’s division [
7] between induced, autonomous, and organic tourism image formation agents.
Marine-Roig [
6] proposed an all-encompassing model of building tourist images represented by a hermeneutical circle (
Figure 1), with the tourist experience in the center. The flow of information circulates from the images projected by the agents (representations) to the images perceived by the tourists, and these perceived images are transmitted (feedback) and become images projected through word-of-mouth communication (WoM) and eWoM. Tourists evaluate their experiences based on expectations derived from the projected (re)presentations of destinations, and there are usually discrepancies between expectations and experiences [
8,
26,
64]. While there are images perceived by tourists before the visit, the experience itself is the essential image source. According to Gim’s study [
52], visitors’ satisfaction and loyalty come from experience and are incorporated into the destination image circuit through the feedback arc (
Figure 1).
Most authors [
65] have used the cognitive-affective model to analyze images and, in some cases, have included a combination of both components, known as overall or global image [
56,
66]. Many authors, such as Rapoport [
63] and Gartner [
7], included the conative component of destination images in the previous model, resulting in a tripartite cognitive-affective-conative model.
Marine-Roig [
6] adapted the model of Pocock et al. [
67] to analyze destination images through traveler-generated content. In short, she added ‘facilities’ within the designative dimensions to accommodate the mental pictures that Lynch [
68] called relatively abstract, when a visitor identifies a structure such as a museum, hotel, restaurant or station. She also included a temporal dimension [
69], and divided the prescriptive aspect (response to previous designative and appraisive stimuli) into two dimensions: attitudinal and behavioral responses.
Figure 2 shows an adaptation of the model [6] that includes semiotic nomenclature.
2.2. Destination Image Semiotic Aspects
In his prolific work on signs and behavior, Morris [
70,
71] distinguished three main types of signs (designative, appraisive, and prescriptive) and three types of use (informative, valuative, and incitive). A fourth type of sign and use (formative-systemic) is not included in this study. Within sign science, Morris defined three subdivisions: syntactics (sign-sign relations), semantics (sign-object relations), and pragmatics (sign-interpretant relations). Each of these subdivisions of semiotics, as a whole, can represent pure semiotics (language to talk about signs), descriptive semiotics (actual signs), and applied semiotics (use of knowledge about signs to achieve various aims). The sign’s semantic dimension (designative-appraisive-prescriptive) is hierarchical [
72]: ‘a kind of rudimentary hierarchy of effects in which prescriptive modes of signifying depend on appraisive modes which, in turn, draw upon designative modes’ (p. 6).
Although Mick [
73] argued the implications of semiotics for research on consumer behavior, as well as the relationships between brand image, purchase willingness and consumer satisfaction [
74], the applications of Morris’s trichotomies in destination image studies are rare: a book [
67] and a book chapter [
75] on images in urban environments, and an article on country images [
76] are highlighted.
The tripartite model (
Figure 2) adapted from Marine-Roig [
6] represents the semantic and pragmatic aspects of Morris’s signs: designative (informative use), appraisive (valuative use), and prescriptive (incitive use). For example, a summer visitor (temporal dimension) walks along the promenade of a tourist destination (spatial dimension) and observes a building (structure) that he/she identifies as a restaurant (facilities). The visitor thinks the atmosphere is pleasant and that the restaurant has desirable features (affective dimension). He/she decides to enter and consume (behavioral response). Then, the visitor shares his/her gastronomic experience through an online travel review, in which he/she evaluates the experience (evaluative dimension), expresses his/her intention to return to the place (behavioral response), and recommends the restaurant to other visitors (attitudinal response). The model [
6] in
Figure 2 was recently adapted by Lojo et al. [
77] to deduce incongruities between online projected and perceived destination images from textual and visual UGC.
Perussia [
78] proposes a method based on semiotics to analyze peculiar types of images or representations, such as the image of a place, through a survey of individuals who have received verbal stimuli. By contrast, the semiotic aspects of
Figure 2 allow us to deduce a parallelism between the place or destination image and the tourist satisfaction and loyalty constructs as defined above from TGC. The image perceived before the visit derives from the destination’s attributes and attraction factors [
40] contemplated in the designative aspect (informative use). The ‘pull motive(s)’ [
79] for travelling are associated with those qualities and features of a tourist destination that attract tourists [
20,
80]. Motivation to travel leads an individual to choose a destination that can bring satisfaction [
81]. That is, the image perceived in the designative phase leads to motivation to travel, and the tourist’s motivation generates expectations of satisfaction. The degree of tourist (dis)satisfaction derives from the disconfirmed pre-trip expectations regarding in-situ experiences, and it is measured through the appraisive aspect (valuative use: affective and evaluative dimensions). Finally, the prescriptive aspect (incitive use) responds to the previous stimuli and enables the measurement of the tourist’s loyalty through the attitudinal and behavioral dimensions.
2.3. Perspective of the Conceptual Model as a Whole
The hermeneutical model in Figure 1 is intended to be holistic because it includes the relationships between the main concepts and constructs that influence destination image formation. That is, no single element of the model can represent the overall image. By contrast, the semiotic model in Figure 2 is intended to analyze the image using TGC as a data source, although part of the model is also useful for analyzing induced and autonomous sources, and the entire model can be implemented through surveys. Furthermore, as shown in Figure 1 and explained in the previous paragraph, the semantic and pragmatic aspects of the model in Figure 2 allow us to infer the satisfaction and loyalty of visitors.
3. Materials and Methods
Barcelona, the capital of Catalonia, is a smart city [
82] and an outstanding Mediterranean destination [
16]. Catalonia is the second-most-visited Spanish region by tourists, after the Canary Islands [
83]. Barcelona is one of the world’s leading cities for hosting international conferences and cruise ships, and it has been the setting for numerous films [
84,
85]. According to official figures [
86], during 2019 Barcelona hosted 10,242,713 visitors in hotels, guesthouses, and inns (21,593,378 overnight stays) and 3,480,060 visitors in homes for tourist use (11,433,427 overnight stays). The main tourist attraction of the city is the work of the Catalan architect Antoni Gaudí, declared a World Heritage Site [
87]. Of the 12,875,386 visits to works of architectural interest during 2019, 10,798,386 were to Gaudí’s masterpieces, highlighting the Basilica of La Sagrada Familia with 4,717,796 visitors and Park Güell with 3,154,349 visitors [
86]. As an example of the abundance of TGC available on Barcelona, the Basilica of La Sagrada Familia currently has more than 163,000 OTRs and 119,000 photos shared on TripAdvisor.
Barcelona is divided into 10 districts (
Figure 3a) whose residents have disparate household incomes (
Figure 3b). The names and codes of the 10 districts are Ciutat Vella (D01), Eixample (D02), Sants-Montjuïc (D03), Les Corts (D04), Sarrià-Sant Gervasi (D05), Gràcia (D06), Horta-Guinardó (D07), Nou Barris (D08), Sant Andreu (D09), and Sant Martí (D10). This administrative division of Barcelona is suitable for checking the conceptual model because its diverse districts group bordering neighborhoods with similar urban characteristics. The inner districts house Gaudí’s masterworks (D02 and D05). Three of the peripheral districts are on the coast (D01, D03 and D10) and three are bordered by a mountain range (D05, D07 and D08).
For the reasons expressed above, the peer-to-peer (P2P) lodging platform Airbnb is the most suitable TGC source for the case study. Due to its sudden growth in tourist cities [
88], Airbnb caught the researchers’ attention in preference to other P2P lodging platforms [
89] as its expansion was highly controversial in the press [
90]. Compared with other types of accommodation, P2P lodging has the particularity that the close host-guest relationship influences the image perceived by visitors [
91], and the lengths of stay in Barcelona are longer [
86]. Much public opinion held that Airbnb contributed to the touristification and gentrification of Barcelona [
92] and to inequalities between its neighborhoods [
93]. Generally, the content of Airbnb OTRs focuses on assessing the relationship with the host and the accommodation features and amenities. The reviewers add narrations and assessments of other experiences at the tourist destination that they consider noteworthy. These narratives are far more persuasive than rational or logic-based communications [
94].
Airbnb OTRs were downloaded from the InsideAirbnb website [
95]. This non-profit portal is highly regarded among researchers [
96] for the abundance, accuracy, and continuous updating of data collected from the Airbnb platform. After removing internal line breaks that made text processing difficult and segmenting by districts and years, the data set collected on 20 February 2020 is as listed in
Table A2 of
Appendix A (753,366 OTRs in various languages).
The language recognition method used is based on the naïve Bayes classifier. The usual patterns (unigrams, bigrams, and trigrams) have low accuracy when detecting the language of very short sentences, which forces some OTRs to be classified semi-manually. Applying Bayes’ equations [5], the present study used n-grams of lengths one to five extracted from Wikimedia with the help of a natural language detection library [97], which significantly increased the recognition accuracy for OTRs with very few words; even some OTRs consisting of a single word (e.g., ‘Worth’) were correctly classified. After refining the classification, 497,752 OTRs were found in English (EN), 110,186 in Spanish (ES), 67,892 in French (FR), 28,232 in German (DE), 19,451 in Italian (IT), and 29,853 in other or unclassified languages. English thus accounted for about two-thirds of the data set.
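The classification principle described above can be illustrated with a minimal sketch. It is not the library cited in the study: the class name, the use of character n-grams taken inside each word, the add-one smoothing, the uniform language priors, and the two tiny training strings are all assumptions made for illustration only, whereas the study relied on n-gram profiles extracted from Wikimedia.

```python
import math
from collections import Counter


def char_ngrams(text, n_min=1, n_max=5):
    """Yield character n-grams of lengths n_min..n_max taken inside each word."""
    for word in text.lower().split():
        for n in range(n_min, n_max + 1):
            for i in range(len(word) - n + 1):
                yield word[i:i + n]


class NgramNaiveBayes:
    """Toy naive Bayes language identifier (hypothetical, for illustration)."""

    def __init__(self, n_max=5):
        self.n_max = n_max
        self.counts = {}    # language -> Counter of n-gram frequencies
        self.vocab = set()  # all n-grams seen in any language

    def fit(self, corpora):
        for lang, text in corpora.items():
            self.counts[lang] = Counter(char_ngrams(text, 1, self.n_max))
            self.vocab.update(self.counts[lang])
        return self

    def classify(self, text):
        grams = list(char_ngrams(text, 1, self.n_max))
        v = len(self.vocab)
        best_lang, best_score = None, -math.inf
        for lang, counts in self.counts.items():
            total = sum(counts.values())
            # Sum of log-likelihoods with add-one smoothing; uniform priors assumed.
            score = sum(math.log((counts[g] + 1) / (total + v)) for g in grams)
            if score > best_score:
                best_lang, best_score = lang, score
        return best_lang


# Invented miniature corpora; real profiles would come from large text collections.
TOY_CORPORA = {
    "en": "great location worth the stay lovely host clean apartment",
    "es": "gran piso muy limpio gracias anfitrion ubicacion perfecta",
}
model = NgramNaiveBayes().fit(TOY_CORPORA)
print(model.classify("Worth"), model.classify("gracias"))
```

With longer n-grams, a one-word review contributes several distinctive features instead of a handful of single letters, which is consistent with the gain in accuracy reported for very short OTRs.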
According to Roberts [
98], ‘content analysis is a class of techniques for mapping symbolic data into a data matrix suitable for statistical analysis’ (p. 2697). Thematic text analysis produces arrays of counts of words or phrases. In this study, the quantitative content analysis is based on the count and categorization of key terms, where a term is the minimum unit of analysis formed by a keyword (e.g., ‘Barcelona’, ‘great’) or group of consecutive words with their own meaning (e.g., ‘Basilica of La Sagrada Familia’, ‘would not stay anywhere else’) [
5]. The Marine-Roig algorithm [
6], implemented in Java, was used to count the key terms.
3.1. Categorization
Categories are groupings of key terms with similar meaning or connotation. Categories can be constructed a priori based on some theory, or they can emerge from the most frequent words in the analyzed text [
99]. To classify the terms, it is necessary to account for the context of the OTRs. For example, reviewers use ‘amazing’ mostly in a positive sense; however, although ‘mean’ may have a negative polarity, it is not a useful word because it appears in OTRs with multiple meanings.
Categorization allows the extraction of a data matrix from the unstructured text for quantitative analysis of the different aspects and dimensions of the model represented in
Figure 2. The measurement of some dimensions does not require categorization because the data set already contains structured information. Thus, the number of OTRs for each Airbnb property is useful in estimating the popularity of P2P accommodations in neighborhoods and districts; the dates of the OTRs inform the temporal dimension; the location of the properties in neighborhoods and districts allows the spatial segmentation of the OTRs; and the ratings of the properties given by reviewers are useful to calculate the evaluative dimension.
3.1.1. Designative Aspect
According to Quan and Wang [
100], ‘The tourist experience consists of two dimensions, namely, the dimension of the peak touristic experience and the dimension of the supporting consumer experience’ (p. 300). The ‘peak’ dimension is useful for configuring cognitive categories including ‘food and wine’, although the consumption of food is related to both dimensions [
100], and the subcategory ‘Gaudi’ is extracted from ‘tangible heritage’ because the work of this Catalan architect carries considerable weight in Barcelona. The cognitive categories proposed in this study demonstrated their effectiveness by correctly classifying the territorial tourist brands of a multiscale destination using spatial coefficients (location quotient, and localization, specialization and diversification coefficients) applied to TGC from four travel-related websites: TripAdvisor.com, TravelPod.com, TravelBlog.org, and VirtualTourist.com [
22]. Categories have a priori key terms and emergent key terms [
99]. For example, there is ‘basilica’ in the ‘Tangible heritage’ category and ‘Basilica of La Sagrada Familia’ in the ‘Gaudi work’ subcategory. For this reason, the algorithm [
6] gives priority to compound terms over simple ones. The categories were constructed by two researchers using the ‘intercoder reliability’ method to eliminate ‘intraobserver inconsistencies’ and ‘interobserver disagreements’ by consensus [
101]. The process was facilitated by the use of the Marine-Roig algorithm [
6] to treat the most frequent key terms as a priority. The a priori key terms are always the same for each language. The emerging key terms depend on each tourist destination. For example, the geographic and attraction names can be downloaded from the official websites or from TripAdvisor, which, in addition to the attractions, activities, restaurants and hotels, has encoded all towns and parishes including those that are uninhabited. Finally, there are nine cognitive categories that configure the designative aspect of the image together with the spatial dimension previously described.
Cognitive categories: sun, sea, sand; nature and landscape; Gaudi work; tangible heritage; intangible heritage; food and wine; urban environment; leisure and recreational activities; sports.
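The priority of compound terms over simple ones can be sketched as a greedy longest-match count. This is only an illustration under stated assumptions, not the Java implementation used in the study: the function name, the word tokenization by regular expression, and the two-entry dictionary are hypothetical.

```python
import re
from collections import Counter


def count_categories(text, term_to_category):
    """Count category hits, always trying the longest key term first.

    Returns the per-category counts and the total number of words,
    so that percentages over all words can be derived afterwards.
    """
    tokens = re.findall(r"\w+", text.lower())
    # Key terms stored as tuples of lowercase words for multi-word matching.
    terms = {tuple(t.lower().split()): c for t, c in term_to_category.items()}
    max_len = max(len(t) for t in terms)
    counts = Counter()
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            category = terms.get(tuple(tokens[i:i + n]))
            if category is not None:
                counts[category] += 1
                i += n  # skip the matched words so they are not counted twice
                break
        else:
            i += 1  # no key term starts at this word
    return counts, len(tokens)


# Hypothetical two-term dictionary based on the example in the text.
TERMS = {
    "basilica": "Tangible heritage",
    "Basilica of La Sagrada Familia": "Gaudi work",
}
review = "We visited the Basilica of La Sagrada Familia and another basilica."
counts, total_words = count_categories(review, TERMS)
print(dict(counts), total_words)
```

In this sketch the first occurrence is assigned to ‘Gaudi work’ and only the standalone word is assigned to ‘Tangible heritage’, so the two categories remain mutually exclusive.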
3.1.2. Appraisive Aspect
Evaluative dimension: positive scores (Score+), negative scores (Score−), and average overall score (AvgScore).
Affective dimension: positive feelings and moods (Feeling+) such as ‘great’ and ‘happy’ and negative feelings and moods (Feeling−) such as ‘unfriendly’ and ‘disappointed’.
3.1.3. Prescriptive Aspect
Attitudinal response: positive recommendations (Recom+) such as ‘recommended’ and ‘unmissable’ and negative recommendations and warnings (Recom−) such as ‘avoid’ and ‘beware’.
Behavioral response: positive behaviors (Behav+) such as ‘return next time’ and ‘would not stay anywhere else’ and negative behaviors (Behav−) such as ‘not stay again’ and ‘will not be back’.
3.2. Metrics
Because data sets can contain different numbers of reviews, and these can be more or less extensive, the primary metric is the percentage of terms in the category relative to the total number of words in each data set, including stop words. In terms of scores, most portals only allow consumers to rate amenities from one to five (stars or bubbles). Therefore, the metrics are the percentage of positive (from 3 to 5) and negative (less than 3) marks and the average of the overall score of the properties (from 20 to 100).
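As a minimal sketch of these primary metrics, the following functions compute the share of category terms over all words and the score metrics from a list of one-to-five ratings. The function and key names are hypothetical, and the multiplication by 20 to reach the 20 to 100 scale is an assumption about how star ratings map onto the overall score.

```python
def category_share(term_count, total_words):
    """Percentage of category terms over all words, stop words included."""
    return 100.0 * term_count / total_words


def score_metrics(ratings):
    """Score metrics from 1-to-5 ratings (3 to 5 positive, below 3 negative)."""
    n = len(ratings)
    positive = sum(1 for r in ratings if r >= 3)
    return {
        "score_pos": 100.0 * positive / n,        # Score+ in the text
        "score_neg": 100.0 * (n - positive) / n,  # Score- in the text
        # Assumption: a 1-to-5 rating times 20 gives the 20-to-100 scale.
        "avg_score": 20.0 * sum(ratings) / n,
    }


print(category_share(25, 1000))
print(score_metrics([5, 4, 3, 2, 1]))
```

Normalizing by the total number of words makes data sets of different sizes comparable, which is the reason given above for choosing percentages as the primary metric.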
The secondary metric is the ranking of the metrics defined in the previous paragraph. The global ranking of each aspect or dimension of the model is the aggregate of rankings based on de Borda’s [102] counting function. Assuming that all rankings have the same length, a candidate receives, in each descending ranking, the final position minus the candidate’s own position and, in each ascending ranking, the candidate’s own position minus one. For example, in a ranking of 10 candidates, the second in descending rank receives eight points (10 − 2), and the second in ascending rank receives one point (2 − 1). The sum of the points obtained by each candidate determines the final ranking. In the event of a tie, the function assigns the tied candidates the intermediate position. For example, if two candidates are tied on points for the second and third positions, the function assigns position 2.5 to both.
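The counting rule and the treatment of ties can be sketched as follows. The function names and the input layout (each ranking as an ordered list plus a flag for descending or ascending) are assumptions for illustration, and the sketch assumes that the individual rankings themselves contain no ties.

```python
from collections import defaultdict


def borda_points(position, n, descending=True):
    """Points for a 1-based position in a ranking of n candidates."""
    return n - position if descending else position - 1


def aggregate(rankings):
    """Sum Borda points over several (ordered_candidates, descending) rankings."""
    totals = defaultdict(int)
    for order, descending in rankings:
        n = len(order)
        for position, candidate in enumerate(order, start=1):
            totals[candidate] += borda_points(position, n, descending)
    return dict(totals)


def final_positions(totals):
    """Final ranking by total points; tied candidates share the mean position."""
    ordered = sorted(totals, key=lambda c: -totals[c])
    positions = {}
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and totals[ordered[j + 1]] == totals[ordered[i]]:
            j += 1
        shared = (i + 1 + j + 1) / 2  # mean of the positions spanned by the tie
        for k in range(i, j + 1):
            positions[ordered[k]] = shared
        i = j + 1
    return positions


print(borda_points(2, 10, descending=True), borda_points(2, 10, descending=False))
print(final_positions({"A": 10, "B": 7, "C": 7, "D": 1}))
```

The first line of output reproduces the example in the text (eight points and one point), and the second shows two candidates sharing position 2.5.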
3.2.1. Designative Aspect Ranking
Designative ranking is an indicator of the destination image perceived by visitors. It is formed by the aggregation by districts of the nine cognitive rankings.
3.2.2. Appraisive Aspect Ranking
The appraisive ranking represents an indication of visitor satisfaction. It is the result of adding the evaluative and affective rankings, that is, the overall score rankings (Score+, Score−, and AvgScore) and the polarity rankings from the experience narrative (Feeling+ and Feeling−).
3.2.3. Prescriptive Aspect Ranking
The prescriptive ranking represents an indication of visitor loyalty. It is formed by the aggregation of the four rankings seen above (Recom+, Recom−, Behav+, and Behav−).
5. Concluding Remarks
There has been much research published on the relationship between destination image and tourist satisfaction and loyalty based on surveys and interviews. Most researchers agree that a random sample of 400 respondents is representative of an indeterminate population, assuming a 5% error rate [103]. For instance, the researchers in two of the studies discussed above used samples of 345 [48] and 550 [49] respondents, respectively. Although surveys have drawbacks, they have the advantage that the questions can be designed to support or refute the hypotheses raised.
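The figure of about 400 respondents is consistent with the standard sample-size formula for a proportion in a very large population, n = z²p(1 − p)/e². The sketch below is a worked check only; the 95% confidence level (z = 1.96) and the maximum-variance assumption (p = 0.5) are not stated in the text and are assumed here.

```python
import math


def sample_size(margin=0.05, z=1.96, p=0.5):
    """Minimum sample size for a proportion in a very large population.

    Assumptions (not stated in the source): 95% confidence (z = 1.96)
    and maximum variance (p = 0.5).
    """
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


print(sample_size())
```

Under these assumptions the minimum is 385 respondents, which is commonly rounded up to 400.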
Big data analytics also has drawbacks, but it has the advantage of being able to process a large number of opinions. For example, this case study used about 750,000 OTRs in several languages for a single city, of which about 500,000 were in English. In addition, the narratives, opinions, and ratings in the OTRs are spontaneous because they are not conditioned by the items on a questionnaire. However, some authors have shown that, in the case of P2P lodging, the close host-guest relationship can affect the image perceived by visitors [91] and the ratings they give the accommodation [104].
5.1. Theoretical Implications
Scholars have been studying the image of the tourist destination for more than 50 years [
42,
65], but research that uses UGC as a data source is scarce. The vast majority of studies have utilized, as a theoretical basis, variations of the cognitive-affective-conative model inherited from the field of psychology. The main contribution of this study to the body of knowledge of tourism and hospitality is to introduce an element of discussion based on semantic and pragmatic semiotic aspects of images. However, the semiotic treatment of images introduced in this article is superficial compared to the extensive literature on semiotics and consumer behavior that has been published in other fields.
5.2. Methodological Implications
This research explored whether reviews shared by guests on Airbnb are useful for analyzing complex relationships between constructs such as image, satisfaction, and loyalty. Bearing in mind that the designative aspect represents the image, that the appraisive aspect represents satisfaction, and that the prescriptive aspect represents loyalty, the results partially confirmed these relationships. For example, the first district (D06) in the ranking of cognitive attributes is the first in overall average score and in positive feelings. However, the last two districts (D08 and D09) in the cognitive ranking are the first in the appraisive ranking and first in the ranking of positive recommendations. In this case, the property price attribute [
105] probably carried more weight in the relationship than the other aspects of the tourist destination since both districts are first in the ranking of the cheapest average prices (
Table A3). It may also be for this reason that district D05 has the worst scores and occupies the last position in the prescriptive rankings as it occupies the second position in the most expensive average price ranking (
Table A3) and the first in residents’ household income ranking (
Figure 3b).
5.3. Managerial Implications
The findings of this research confirm the high confidence asserted by other authors [
104] in the scores given by reviewers to the hosts and amenities of the P2P accommodations. Additionally, they show that the location of the P2P accommodation significantly conditions the subject of the narratives. This result confirms the usefulness of TGC to classify the zones of a tourist destination as, with similar cognitive categories, Marine-Roig and Anton Clavé [
22] classified the territorial tourist brands of a multiscale region using spatial coefficients. The proposed cognitive and polarity categories, metrics and rankings can be useful for analyzing other aspects of tourist destinations from any TGC source. The TGC allows segmenting the data by space and time [
106] and, thus, the ability to analyze the temporal evolution of tourism in any area from the demand-side perspective. It is also possible to segment according to languages and countries of origin of visitors. With the categories built and the data arranged, destination management and marketing organizations (DMOs) can acquire results almost in real time.
5.4. Limitations and Future Work
The results represent an initial approximation for studying and understanding the relationship between destination image, satisfaction and loyalty using TGC big data analytics, but more corroboration would be necessary in future studies using regression analysis or experimental designs. Furthermore, OTRs on lodging experiences may not be the most suitable source for a study of this type; OTRs on sightseeing experiences would presumably provide more consistent results.
The main limitation in this case study is the impossibility of constructing exhaustive categories out of more than 31 million words in English due to misspellings and grammatical varieties in different English-speaking regions, as well as the polysemy of words and phrases aggravated by informal uses of the language, irony, sarcasm, etc. The categories must be mutually exclusive and can only contain univocal terms in the context of the TGC. Future work should refine the categories and apply the proposed framework to other tourist destinations and other TGC sources, such as OTRs hosted on TripAdvisor [
107,
108], to compare results and find whether it is possible to confirm the relationships between destination image and visitor satisfaction and loyalty demonstrated by other researchers through surveys.