Social Media Data Analytics to Enhance Sustainable Communications between Public Users and Providers in Weather Forecast Service Industry

Lee, Ki-Kwang; Kim, In-Gyum

doi:10.3390/su12208528

Open AccessArticle

Social Media Data Analytics to Enhance Sustainable Communications between Public Users and Providers in Weather Forecast Service Industry

by

Ki-Kwang Lee

¹ and

In-Gyum Kim

^2,*

¹

School of Business Administration, Dankook University, Yongin-si, Gyeonggi-do 16890, Korea

²

Planning and Finance Division, National Institute of Meteorological Sciences, Seogwipo-si, Jeju-do 63568, Korea

^*

Author to whom correspondence should be addressed.

Sustainability 2020, 12(20), 8528; https://doi.org/10.3390/su12208528

Submission received: 11 September 2020 / Revised: 8 October 2020 / Accepted: 13 October 2020 / Published: 15 October 2020

(This article belongs to the Special Issue Services Management and Digital Transformation)

Download

Browse Figures

Review Reports Versions Notes

Abstract

:

The weather forecast service industry needs to understand customers’ opinions of the weather forecast to enhance sustainable communication between forecast providers and recipients particularly influenced by inherent uncertainty in the forecast itself and cultural factors. This study aims to investigate the potential for using social media data to analyze users’ opinions of the wrong weather forecast. Twitter data from Korea in 2014 are analyzed using textual analysis and association rule mining to extract meaningful emotions or behaviors from weather forecast users. The results of textual analysis show that the frequency of negative opinions is considerably high compared to positive opinions. More than half of the tweets mention precipitation forecasts among the meteorological phenomena, implying that most Koreans are sensitive to rain events. Moreover, association rules extracted from the negative tweets reveal a pattern of user criticism according to the seasons and types of forecast errors such as a “false alarm” or “miss” error. This study shows that social media data can provide valuable information on the actual opinion of the forecast users in almost real time, enabling the weather forecast providers to communicate effectively with the public.

Keywords:

customer opinion; types of forecast errors; text mining; sustainable meteorological service industry

1. Introduction

Understanding customers’ perceptions of products and services is one of the most important tasks in marketing sector. This is because users’ perceptions affect their purchasing patterns and can ultimately be linked to company performance such as sales [1,2,3,4]. Therefore, the data on user perception are used to establish and revise the sales strategy by grasping the customer needs in the enterprises. Furthermore, users’ satisfaction data can be utilized for planning the better service and budget allocation in the public service sector [5]. The importance of information related with users’ perception can also be applied to the meteorological community that provides weather forecasting services. In most countries around the world, weather forecasts are delivered to the public daily through a variety of media. While these forecasts are beneficial to the public, meteorological forecasts are not always recognized as useful and valuable information. This is partly due to unavoidable uncertainties in weather forecasts [6], or partly because forecast users are unable to respond effectively to forecasts or warnings caused by communication errors [7].

Many studies have not only focused on improving the accuracy of weather forecasts, but also on enhancing communication between forecast providers and users through a better understanding of user behaviors and perceptions or attitudes toward weather forecasts. It is desirable that the two approaches proceed in an iterative and dynamic manner to connect understanding forecast users with forecast production process [6,8]. Many researchers have tried to comprehend user perceptions and satisfaction of weather forecasts using either analytical or empirical methods. The analytical approach, from the field of mathematics, focuses on maximizing user satisfaction and the economic benefits of weather forecast information by using mathematical models in various decision-making situations from daily life to industries such as agriculture and distribution [9,10,11,12,13]. However, the empirical approach, from the social sciences, is focused primarily on the knowledge needed to communicate with the public more efficiently via interviews or surveys. Semi-structured interviews based on prepared questions have been conducted to elicit in-depth knowledge on end-users’ attitudes and behaviors regarding weather forecasts [14,15,16]. Empirical research also uses surveys based on close-ended questionnaires to obtain a better understanding of general user opinions on the sources, uses, and perceptions of weather forecasts [6,17,18,19,20,21,22,23]. The figure itself obtained by a survey with close-ended questions is an answer which is derived from the set of alternatives being offered. Asking open-ended questions may be a possible solution for this problem. Open-ended questions also have disadvantages such as non-response problem [24]. In this case, the voluntary comments posted and communicated on social media by active users offer beneficial information source to trace the public thinking behind the perception.

According to the report, the ratio of internet users who accessed social media was 65.2% in 2015 [25]. Social media enables to communicate between firms and their customers, and allows users to exchange information with each other [26]. Because of the characteristic of having subjective opinions recorded in real-time, social media is widely used in a variety of fields to try to understand people’s awareness [27]. The rapid increase in social media use in recent years has driven social scientists to use social media data to understand public opinion on specific issues. Fields ranging from politics, public health, to business have been using information from social media to obtain meaningful insights by gathering public opinions on specific government policies, products, and public services [28,29,30,31,32,33,34]. In the meteorological sector, the National Weather Service (NWS) increasingly communicates through social media to help prepare people for weather phenomena. Olson et al. (2019) observed how the NWS use social media to facilitate engagement with users during threat and ordinary periods, and founded the Twitter messages sent by the organization were different from each other during two periods [35]. Silver and Andrey (2019) explored the activity between professionals and citizens using Twitter data during a severe weather, and they proved usability of Twitter platform for the spread storm-related information [36]. Sun et al. (2019) investigated the correlation between haze and negative emotion of the public by analyzing historical weather records and microblog data to understand the negative impact of severe weather on the emotion of people [37]. Precedent studies show the utilization and scalability of social media that the meteorological industry should consider. Wang et al. (2020) analyzed individuals’ sentiment by coupling weather conditions with social media posts to see the daily impacts of weather on people [38]. Moreover, the Twitter data can be analyzed to measure how the topic of discussion on Twitter changes as severe weather approaches [39].

Despite the inherent advantages of social media, however, Korea’s meteorological community has responded passively to communicate with users using social media. They have focused solely on using social media to provide users with one-way weather information (twitter.com/kma_skylove). They just rely on closed-ended question surveys to investigate user perceptions [40]. An analysis of social media data acquired from the Internet may not guarantee a completely random population sample [41]. Nonetheless, this limitation can be mitigated because social media offers very large sample sizes, and is a relatively cheap and fast method by which to collect and analyze public opinion. Furthermore, researchers can understand how many people are interested in specific issues and where these issues are being discussed by analyzing comments on social media [42]. This study uses publicly available daily tweet data from Twitter, which is one of the most popular social media platforms, to investigate whether social media can be used to obtain a more in-depth picture of public attitudes towards weather forecasts from the Korean Meteorological Administration (KMA). To do this, Twitter data were analyzed in two steps: a basic sentiment analysis and association rule mining. First, a macroscale text analysis of day-to-day tweets was performed to obtain a daily time series of public sentiments. Then, a microscopic analysis of the peaks in the negative sentiment frequencies was performed to determine specific cases with which to explain the mood of the users. Second, attempts were made to develop comprehensive correlations between Twitter users’ angry sentiments and the characteristics of weather forecast errors by utilizing association rule mining.

2. Material and Method

2.1. Data Collection

Public tweets containing the keyword “Korea Meteorological Administration (KMA)” in South Korea were collected from 1 January to 31 December 2014. Our tweet data consist of 15,783 posts and of these, 2921 containing sentiments, comments, or opinions were selected to form the data set used in our analysis. Figure 1 shows the percentages of the selected tweets posted in each month. A little more than 3% of the tweets were posted in May, in contrast to about 20% in July and August. The frequencies of the tweets related to the KMA and the number of rain events followed a very similar pattern, with a positive Pearson correlation coefficient of r = 0.85 (p-value = 0.001). It is not surprising that a relatively high percentage of people tweeted about the weather in periods with more precipitation, as this is one of the most common adverse weather phenomena in Korea. In fact, according to a survey conducted in 2011, 55.5% of the general public responded to precipitation as the most interesting forecast element of the KMA [43]. This suggests that Korean weather agencies must prioritize forecasts related to meteorological phenomena that occur in summer to boost positive public opinions of the weather forecast service.

2.2. Analysis Method

The selected tweets were classified into categories such as season, sentiment, meteorological phenomenon, type of forecast error, and purpose of forecast usage. The set of hierarchical labels for the categories depicted in Figure 2 were attached to each of the 2921 tweets based on their content. One tweet can be coded with multiple labels according to its content regarding the weather forecast.

Table 1 shows that about 75% of the aggregate sentiments for weather forecasts were negative. More than half of the negative tweets mentioned “rain” or “downpour”, indicating that Koreans are sensitive to precipitation events. Just less than 4% of the sentiment for forecasts were positive. A small percentage (22%), most of which corresponded to weather information seeking or sharing between users, was neutral.

The substantial percentage of negative sentiments does not correspond with the findings of Anderson [44], who showed that extremely satisfied and dissatisfied customers are more likely to express their experiences relative to customers with moderate levels of contentment. Users are inclined to share positive experiences so that they gain social or self-approval for their purchases [45]. However, the fact that weather forecast services in Korea are free could potentially explain the overwhelming majority of negative tweets, because Koreans have no incentive to express their satisfaction with the service. As Skowronski & Carlston [46] showed, negative comments have a greater impact than positive ones in terms of making impressions. Therefore, weather agencies need to be more concerned about reducing negative tweets to improve the public impression of their services. In accordance with these requirements, this study focuses on obtaining a better understanding of the public negative perceptions.

We conducted textual analysis of the negative comments posted on Twitter in two ways. First, four days with the most frequent negative comments were selected and the contents of the negative tweets were examined in depth. For the selected days, the responses of the public on the tweets were analyzed by comparing the weather forecasts with the actual weather phenomena by time. Second, a data mining technique was adopted to present the possibility of extracting beneficial information on the characteristics of public opinion of the weather forecast. This study uses association rule mining as an effective tool to generate a set of rules from a given data pattern.

Association rule mining is a method for extracting causal relationships or frequent patterns between item sets in large databases [47]. A typical application of this is market basket analysis, which investigates whether two or more products are likely to be purchased together, and whether the purchase of one product increases the likelihood of the other being purchased. The output of the market basket analysis, represented by a set of association rules, is used to help determine appropriate product marketing strategies. An association rule is denoted “A→B,” where A and B are two independent items, referred to, respectively, as the left-hand side (antecedent) and right-hand side (consequent). The arrow means “is related to.” The three measures used to estimate the strength of association implied by a rule are “support,” “confidence,” and “lift.” An association may be thought of as more frequent and interesting as the values of “support” and “confidence” are closer to 1. A “lift” of an association rule greater than 1 suggests that the rule has considerable usefulness or strength [48]. This study adopted the a priori algorithm [49] implemented in a package in R to obtain a useful set of association rules with large lifts.

3. Results

3.1. Textual Analysis of Negative Opinions about the Weather Forecast Errors

Figure 3 depicts the number of daily tweets with negative opinions of the weather forecast service provided by KMA in 2014. The pattern is consistent with that in Figure 1, in that most of the daily spikes were during the summer. Some examples of tweet comments posted on four days of the most negative tweets are also provided in Figure 3.

Table 2 shows that the frequency of negative sentiments, frequency of comments on meteorological phenomena, and types of forecast error for four days with the highest frequency of negative opinions. There are two concepts, “False alarm” and “Miss”, in types of forecast error. “False alarm” is an occasion when meteorological agency provides wrong information that particular weather phenomenon will happen, and “Miss” is the opposite case.

The negative comment rate was significantly high on all four days, and people were dissatisfied mostly with the forecast errors related to rain. Most postings related to Case A in Table 2 referred to “False alarms” regarding the precipitation forecast, and many of the complaints also discussed heat. The maximum rainfall (100 mm) was forecasted the day before the event, while the actual precipitation levels of 53 mm and only 9.5 mm were recorded from midnight to 3:00 a.m. and 9:00 p.m. to 11:00 p.m. in Seoul, making some people disappointed with the failure to forecast this precipitation. On that day, the actual maximum temperature was 34 °C and the humidity was 54%. The discomfort index (DI) calculated based on the temperature and humidity was 83.35, which is a “very high” level of discomfort according to KMA standards.

Case B is similar to Case A in that it also had a high frequency of comments about “False alarms” regarding the precipitation forecast. Rain was predicted owing to the indirect effects of a typhoon. In Seoul, 13 mm of rainfall was recorded, but the precipitation was concentrated at dawn and in the nighttime, not during the day. Therefore, it is possible that people felt there was almost no rain despite the heavy rainfall expected due to the typhoon. This led to a high feeling of discontent owing to the “False alarm”.

In Case C, unlike the other cases, there was not much of a difference between the frequencies of negative opinions about “False alarms” (19) and “Misses” (12). The forecast for July 18^th predicted that it would rain in Seoul, but only 1.5 mm of rainfall recorded since the rainfall actually stopped after dawn. Meanwhile, there was heavy rainfall (28.7 mm) from 3:00 p.m. in Busan despite a forecast indicating no rain. Hence it was assumed that the negative tweets about the “False alarm” were posted from Seoul, while users near Busan might have posted about the “Miss” error.

Finally, Case D indicated an overwhelming mention of “Misses” in relation to precipitation. No rain was forecast on September 12^th, but there was a sudden occurrence of thunderstorms and lightning showers, with more than 50 mm of rain, in the northern part of Seoul; this led to several people being stranded because of floods near rivers. At that time, the KMA received a lot of criticism due to the unexpected event.

These four cases indicate the advantages and potential usages of weather-related social media data. It was observed that people tweeted about real-world meteorological phenomena. The veracity of the content of the tweets suggests that forecast providers can use a real-time analysis of social media data to understand current weather phenomena that people are experiencing. In addition, as shown in Cases A, B, and C, there may be a slight difference between the accuracy of the forecast calculated by meteorological agencies and the actual experience of the public depending on the time, region, and amount of precipitation. Therefore, the results indicate a need to improve forecast accuracy models, evaluated from a user’s point of view. For instance, the accuracy of daily precipitation forecasts could be calculated by dividing a day into two times: a busy period that includes traditional working hours (e.g., 08:00–20:00) and the less busy period (e.g., 21:00–07:00) rather than the usual time period of the entire day (00:00–23:00).

3.2. Interpretation of Association Rules

This study seeks to deduce reasons of negative opinions, which have a significant impact on user attitudes toward certain products or services [50]. As with the customer segmentation method in online marketing, it is more cost-effective to focus on the most dissatisfied users than to deal with all situations with negative comments. Therefore, this study divided users who had negative comments into normal and critical groups and focused on analyzing the most dissatisfied critical user group as the target segment.

A dataset used for mining association rules was prepared with 2177 tweets associated with negative labels. The negative opinions were classified into two levels, i.e., normal and critical, according to the severity of the negative comments. Among the negative tweets, 1430 comments were assigned to the “normal” category, which encompasses inconveniences caused by incorrect forecasts. The remaining 747 comments were labeled “censure” and classified in the “critical” category, which included posts that were very angry with, slanderous of, or blameful of the KMA. An R package was used to generate a set of association rules with the consequents of “censure,” for which the support, confidences, and lifts were greater than 0.1, 0.5, and 1.4, respectively. In Figure 4, which presents information on each association rule, words other than the “censure” are items corresponding to the antecedent of an association rule. An association rule is depicted as a circle whose size is proportional to the confidence value of the corresponding rule arrows from an item point toward a circle, and arrows from a circle point to the “censure.” The color of a circle is equivalent to the lift value of its rule, as follows: the darker the color, the greater the lift value. For example, the circle marked “A” in Figure 4 represents an association rule of {Summer, Rain, FA (False Alarm), Heat}→{Censure}. It can be seen that the lift value is large because the circle is darker than others.

The typical circles marked “A” to “H” in Figure 4 were transformed into the equivalent association rules of the related items that led to the users’ accusations, as shown in Table 3. It should be noted that all the rules, except rule G, include precipitation as a factor leading to criticism. This indicates that Koreans are highly sensitive to incorrect rainfall forecasts. Most Koreans dislike getting wet. In addition, since more than 40% of Koreans are likely to use public transportation owing to the high cost of parking and traffic jams in cities, they pay keen attention to the rainfall forecast to decide whether to take umbrellas or raincoats before going out.

Whereas in spring and summer, as seen in Rules A, C, and D, there were high criticisms of the "False alarm" error in the precipitation forecast, in autumn, there were also high concerns about the "Miss" error. Since the frequency of complaints caused by the error in the rain forecast was relatively low during the winter when it usually snows rather than rains, no related rule was generated. The reason why there are more severe complaints about "False alarm" than "Miss" in spring is due to seasonal characteristics in Korea. In other words, some regions frequently suffer crop damage and water shortage around spring, as the decrease in precipitation starting from winter is continued until spring as shown in Table 3. In addition, as the particulate matter concentration is the highest in spring, a strong social atmosphere to wait for the rain to remove it makes it more disappointing not having the expected rain than having the unexpected rain in spring. In hot summer, as many Koreans also look forward to a cool rain to bring temperature down, the accusation against "False alarm" is higher than "Miss", like in spring. However, the unusual point is that in summer, as in Rule A and Rule D, "False alarm" precipitation forecasts are often accused with the term ‘heat’ and ‘umbrella’. Here are some examples of actual tweets related to this.

Rule A:

Hey, KMA... When is the shower coming? It’s hot enough.
The forecast said it would rain heavily, but sweat is pouring like rain. I was tricked again by Korea Meteorological Administration.

Rule D:

The weather service gave me muscle in my arm. What about my umbrella? It is only sunny.
I even brought an umbrella and I’m wearing boots, believing it would rain. Instead of rain, the sun is sizzling ... Weather agency, are you kidding?

Rule A can be interpreted to mean that the complaints were numerous enough to blame KMA if the rain did not fall and temperatures were high despite forecasts that rainfall would occur in the summer. This is similar to Case A in Figure 3; about 50% of tweets complaining about the situation related to the above rule were written on 25 July, when Case A occurred. This shows that Case A can be generalized as Rule A, which implies that a “False alarm” error tends to cause greater user dissatisfaction under summer weather conditions in which the temperature and humidity are high simultaneously in Korea. Rule D is similar to Rule A in that it describes a “False alarm” forecasting error in summer, except that it does not rain when people take their umbrellas or raincoats in anticipation of predicted rainfall. Rule D implies that it is very annoying to carry an umbrella that would not be used on a hot summer day.

Rules B, E, F, and H describe public frustration with the “Miss” error in precipitation forecasts due to the characteristics of Koreans who are reluctant to be rained. In other words, the four rules contain the annoying situations of rain exposures caused by, in turn, unexpected rain forecasts in autumn, no prepared umbrellas, the wrong expected time of precipitation, and the absences of torrential rain forecasts. Specific examples of the complaints about the cases are as follows.

Rule B:

It’s a lot of rain for the fall. The KMA said it’s 0.1mm per hour, but it doesn’t look like this.

Rule E:

Hey, KMA guys. You said the weather would be mild and nice today, didn’t you? I got wet without an umbrella. If I have a lousy head, I’ll sue you.

Rule F:

It’s raining. I didn’t bring an umbrella with me because it said it would rain around 9 p.m. So, I got rained on my way home from work. The weather agency’s supercomputer is worth 10 billion won? The salary of its employees is our tax.

Rule H:

The rain is so severe that it is almost invisible like typhoon, but did you say it would just rain once or twice?

Rule B indicates that the “Miss” error in rain forecasts is more upsetting in the fall; this contrast to Rule C, which emphasizes the negative effect of a “False alarm” error in the spring. The reason for this is because although there are many days in spring on which air quality is bad, including days on which particulate matter and pollen levels are high, the most favorable weather to go out occurs in the fall, making the people perceive rainfall as a phenomenon that causes considerable inconvenience. Rule E, which is the opposite of Rule D, indicates complaints in which rainfall occurred but people went out without umbrellas because no rain was forecasted. Rule F reflects the fact that although weather agencies consider daily rain forecasts to be correct, the public may consider it a “Miss” error if they are caught in rain due to an incorrect forecast. This suggests that it is desirable to consider the forecast accuracy for each forecast time zone in addition to the traditional forecast verification method in which the accuracy of the 24 h forecast is forecast on a daily basis. Rule H represents a situation where people caught in heavy rainfall consider even correct rain forecasts to be false if the severity of the predicted rainfall differed. This case shows that, while the accuracy-based evaluation method classifies a forecast that predicts a small amount of rainfall as a “Hit,” actual users of the forecasts consider it a “Miss.” Accordingly, to communicate more effectively with their users, meteorological agencies should consider the deviation between the forecasted and actual precipitation, and the extent of accuracy, as a forecast verification measure.

Rule G indicates that the long-term forecasts released in the fall are associated with a high level of dissatisfaction owing to the failure to predict frequent severely cold conditions in the coming winter. Of 76 tweets related to the cold in winter, 60 were determined to correspond to Rule G, despite the precise 24 h forecast of the cold as follows.

Rule G:

Korea Meteorological Administration said this winter is supposed to be warm, but it is cold from early December! The weather forecast from the KMA is wrong again!

Actually, in the case of the above comment, the 24-hour weather forecast accurately predicted a cold weather and issued even a cold wave watch, but the people ignored the accuracy of the forecast and accused the long-term prediction of the overall mild winter. The reason for this is that the cold weather came earlier than anticipated before being fully prepared due to the false long-term forecast. Rule G reflects that, in relation to the cold, users are more sensitive to the long-term forecasts announced before winter than the 24-hour forecasts, suggesting the importance of precise long-term forecast of the cold for the upcoming winter.

Considering the association rules shown in Figure 4 and Table 3, four ways to reduce the negative opinion of users about the weather forecast service in Korea were established. First, the “False alarm” rate in precipitation forecasts should be reduced both in summer and in spring. Second, rain forecasts should be made accurately to reduce the “Miss” rate in fall. Third, the accuracy of long-term forecasts for cold, rather than precipitation, should be improved in winter. Finally, efforts should also be made to better predict the frequency and level of rainfall, as well as whether any will occur at all. The last two involve additional costs and time for technological development, but the first two can be implemented immediately because weather forecasters can adjust the rate of “False alarms” and “Misses” by tuning the threshold of precipitation forecasts based on the season.

4. Discussion

Prior to this study, most of survey-based research could understand only the superficial or partial users’ opinions about weather forecasts service due to limitations in the scope or depth the questions could deal with. Thus, there is a need for a new method to identify the forecast users’ attitudes and behaviors in more detail, replacing the question-based survey approach. This study investigated how social media data could be used as a new method to explore more specifically and realistically how the public users perceive the weather forecast service. We conducted a textual analysis and association rule mining by using Twitter data created in Korea during 2014 to derive the beneficial lessons to enhance the satisfaction of Korean weather forecast users.

The results of textual analysis showed that the proportion of negative opinions was relatively high (75%) compared to positive, because the motivation for users to deliberately write positive comments is lower than that of paid goods in the sense that weather forecast is a free service in Korea. Thus, it is not necessary to judge the quality of the forecast service relatively low even if there are more negative opinions than other products, and the research of the content analysis of social media on free goods such as public service is needed to be carried out more deeply. As a result of analyzing daily frequency of negative Tweets, we found that most of the days of negative comments were gathered in July and August. Most of the negative comments written on the four days with the largest number of daily negative tweets were about the precipitation forecast errors. The complaints about "False alarm" and "Miss" were clearly distinguished from each other depending on the weather conditions. If the forecasted rain did not come during the hot and humid summer season, the discomfort index rose, making the accusation against "False alarm" more apparent than in other seasons. On the other hand, there was a lot of criticism regarding "Miss" when unpredicted downpour caused major or minor damage due to the failure to prepare for it. Since these tweet articles are written almost in real time and accurately reflect the actual weather conditions of the users, it can be used as data that can enhance the reliability of the forecast evaluation rather than the measurement results obtained at observation points installed only in limited places.

The association rule analysis was able to identify several major factors for the phenomena in which the degree of user complaint, among other negative comments, is very high. With the exception of winter, discontent caused by errors in precipitation forecasts was the most common pattern, varying from season to season. Blame for "False alarm" errors were mainly expressed in spring and summer, and criticism of "Miss" errors occurred significantly in autumn. In addition, we also found out blame patterns in which the ill-predicted forecasts of the time or amount of the precipitation would be regarded to be at fault. On the other hand, the conventional verification criteria would classify the forecasts which accurately predicted only precipitation itself as “Hit”, even if they had miscalculated the time of amount of the rain. In winter, there was a high level of dissatisfaction with forecast errors related to cold weather rather than precipitation forecasts. In particular, even when the severe cold was predicted accurately in the day ahead forecast, it was highly dissatisfied with the long-term forecast of overall mild winter temperatures. Therefore, the users do not form the satisfaction of the forecast service in the form of evaluating and averaging all kinds of forecasts as a whole. Instead, we can see that they tend to alter the image of the entire service to negative, if they experience inconvenience or damage to their life or schedule by at least one forecast.

5. Conclusions

From the Korean specific cultural characteristics discovered in the study so far, we can infer the following points to be considered by the KMA in order to enhance sustainable relationships between public users and the meteorological society.

It is crucial to consider the behaviors of Korean people and improve the recognized accuracy of precipitation forecasts. It is particularly necessary to make even minor corrections to precipitation forecasts for each period to reduce the frequency of “False alarm” errors in spring and summer, and to prevent “Miss” errors in fall (see Rule A~D).
In winter, the temperature forecast is more important than the precipitation forecast. The technical aspects of the long-term forecast related to winter cold, which is announced late in fall (the preparation time for winter) need improvement because this forecast has a greater impact on public impressions compared to 24-h forecasts (see Rule G).

Until recently, the KMA only interacted with the public in relation to the accuracy of its precipitation predictions, creating a large gap between the forecast providers and the public. The differences in the points of view held by these groups must be reduced with the introduction of a customized forecast evaluation model that considers accurate predictions of the time, location, and quantities of rainfall. In other words, there is a need to try to switch from the conventional forecasting system, which focused on the extent of accuracy, to forecast production, provision, and evaluation from a user perspective, which could reflect patterns of urban life that involve complicated and diverse needs. Of course, social media analysis cannot be used to assess the emotions of all users. However, it can provide valuable information that traditional survey methods cannot, owing to the large amount of data collected via widespread real-time feedback. The generalization problem caused by potential sampling biases can be mitigated if data collected from more than one social media platform are analyzed. Enhanced science communication based on useful information provided by analyses of social media can help forecast agencies to improve the existing forecast system, enabling them to ultimately produce more valuable weather forecast services that could provide high satisfaction to the public.

Author Contributions

Conceptualization, K.-K.L.; Data curation, K.-K.L. and I.-G.K.; Formal analysis, K.-K.L. and I.-G.K.; Investigation, I.-G.K.; Methodology, K.-K.L. and I.-G.K.; Project administration, K.-K.L.; Software, K.-K.L. and I.-G.K.; Supervision, K.-K.L.; Writing—Original draft, I.-G.K.; Writing—Review & Editing, K.-K.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the Korea Meteorological Administration Research and Development Program “Support to Use of Meteorological Information and Value Creation” under Grant (KMA2018-00122).

Conflicts of Interest

The authors declare no conflict of interest.

References

Vo, L.V.; Le, H.T.T.; Le, D.V.; Phung, M.T.; Wang, Y.-H.; Yang, F.-J. Customer satisfaction and corporate investment policies. J. Bus. Econ. Manag. 2017, 18, 202–223. [Google Scholar] [CrossRef] [Green Version]
Lam, S.Y.; Shankar, V.; Erramilli, M.K.; Murthy, B. Customer value, satisfaction, and switching costs: An illustration from business-to-business service context. JAMS 2004, 32, 293–311. [Google Scholar] [CrossRef]
Fornell, C.; Rust, R.T.; Dekimpe, M.G. The effect of customer satisfaction on consumer spending growth. J. Mark. Res. 2010, 47, 28–35. [Google Scholar] [CrossRef]
Baghestani, H.; Williams, P. Does customer satisfaction have directional predictability for U.S. discretionary spending? Appl. Econ. 2017, 49, 5504–5511. [Google Scholar] [CrossRef]
Walle, S.V.D.; Ryzin, G.G.V. The order of questions in a survey on citizen satisfaction with public services: Lessons from a split-ballot experiment. Public Admin. 2011, 89, 1436–1450. [Google Scholar] [CrossRef]
Morss, R.E.; Demuth, J.L.; Lazo, J.K. Communicating uncertainty in weather forecasts: A survey of the U.S. public. Weather Forecast. 2008, 23, 974–991. [Google Scholar] [CrossRef]
Sorensen, J.H. Hazard warning systems: Review of 20 years of progress. Nat. Hazards Rev. 2000, 1, 119–125. [Google Scholar] [CrossRef] [Green Version]
Compton, J. When weather forecasters are wrong: Image repair and public rhetoric after severe weather. Sci. Commun. 2018, 40, 778–788. [Google Scholar] [CrossRef]
Murphy, A.H. Decision making and the value of forecasts in a generlized model of the cost-loss ratio situation. Mon. Weather Rev. 1985, 113, 362–369. [Google Scholar] [CrossRef] [Green Version]
Katz, R.W.; Murphy, A.H. Economic Value of Weather and Climate Forecasts; Cambridge University Press: Cambridge, UK, 1997. [Google Scholar]
Mylne, K.R. Decision-making from probability forecasts based on forecast value. Meteorol. Appl. 2002, 9, 307–315. [Google Scholar] [CrossRef]
Lee, K.-K.; Lee, J.W. The economic value of weather forecasts for decision-making problems in the profit/loss situation. Meteorol. Appl. 2007, 14, 455–463. [Google Scholar] [CrossRef]
Kim, I.-G.; Kim, J.-Y.; Kim, B.-J.; Lee, K.-K. The collective value of weather probabilistic forecasts according to public threshold distribution patterns. Meteorol. Appl. 2014, 21, 795–802. [Google Scholar] [CrossRef]
Zeigler, D.J.; Brunn, S.D.; Johnson, J.H., Jr. Focusing on Hurricane Andrew through the eyes of the victims. Area 1996, 28, 124–129. [Google Scholar]
Moore, S.; Daniel, M.; Linnan, L.; Campbell, M.; Benedict, S.; Meier, A. After Hurricane Floyd passed: Investigating the social determinants of disaster preparedness and recovery. Fam. Commun. Health 2004, 27, 204–217. [Google Scholar] [CrossRef] [PubMed]
Hoss, F.; Fischbeck, P. Increasing the value of uncertain weather and river forecasts for emergency managers. Bull. Am. Meteorol. Soc. 2016, 97, 85–97. [Google Scholar] [CrossRef]
Zhang, F. An in-person survey investigating public perceptions of and response to Hurricane Rita forecasts along the Texas coast. Weather Forecast. 2007, 22, 1177–1190. [Google Scholar] [CrossRef]
Lazo, J.K.; Morss, R.E.; Demuth, J.L. 300 billion served: Sources, Perceptions, Uses, and Values of weather forecasts. Bull. Am. Meteorol. Soc. 2009, 90, 785–798. [Google Scholar] [CrossRef] [Green Version]
Joslyn, S.; Savelli, S. Communicating forecast uncertainty: Public perception of weather forecast uncertainty. Meteorol. Appl. 2010, 17, 180–195. [Google Scholar] [CrossRef]
Silver, A.; Conrad, C. Public perception of and response to severe weather warnings in Nova Scotia, Canada. Meteorol. Appl. 2010, 17, 173–179. [Google Scholar] [CrossRef]
Demuth, J.L.; Lazo, J.K.; Morss, R.E. Exploring variations in people’s sources, uses, and perceptions of weather forecasts. Weather Clim. Soc. 2011, 3, 177–192. [Google Scholar] [CrossRef]
Drobot, S.; Anderson, A.R.S.; Burghardt, C.; Pisano, P.U.S. public preferences for weather and road condition information. Bull. Am. Meteorol. Soc. 2014, 95, 849–859. [Google Scholar] [CrossRef]
Zabini, F.; Grasso, V.; Magno, R.; Meneguzzo, F.; Gozzini, B. Communication and interpretation of regional weather forecasts: A survey of the Italian public. Meteorol. Appl. 2015, 22, 495–504. [Google Scholar] [CrossRef] [Green Version]
Reja, U.; Manfreda, K.L.; Hlebec, V.; Vehovar, V. Open-ended vs. close-ended questions in web questionaries. Dev. Appl. Stat. 2003, 19, 159–177. [Google Scholar]
KISA. 2016 Survey on the Internet Usage Summary Report; Korea Internet & Security Agency: Naju, Korea, 2016. [Google Scholar]
Mangold, W.G.; Faulds, D.J. Social media: The new hybrid element of the promotion mix. Bus. Horiz. 2009, 52, 357–365. [Google Scholar] [CrossRef]
Hutto, C.J.; Gilbert, E. VADER: A parsimonious rule-based model for sentiment analysis of social media text. In Proceedings of the 8th International AAAI Conference on Weblogs and Social Media (ICWSM), Ann Arbor, MI, USA, 1–4 June 2014. [Google Scholar]
Jansen, B.J.; Zhang, M.; Sobel, K.; Chowdury, A. Twitter power: Tweets as electronic word of mouth. J. Am. Soc. Inf. Sci. Technol. 2009, 60, 2169–2188. [Google Scholar] [CrossRef]
Culnan, M.J.; McHugh, P.J.; Zubilaga, J.I. How large U.S. companies can use Twitter and other social media to gain business value. Mis Q. Exec. 2010, 9, 243–259. [Google Scholar]
Bollen, J.; Mao, H.; Zeng, X. Twitter mood predicts the stock market. J. Comput. Sci. 2011, 2, 1–8. [Google Scholar] [CrossRef] [Green Version]
Dodds, P.S.; Harris, K.D.; Kloumann, I.M.; Bliss, C.A.; Danforth, C.M. Temporal patterns of happiness and information in a global social network: Hedonometrics and twitter. PLoS ONE 2011, 6, e226752. [Google Scholar] [CrossRef]
Golbeck, J.; Hansen, D. A method for computing political preference among Twitter followers. Soc. Netw. 2014, 36, 177–184. [Google Scholar] [CrossRef]
Dong, J.Q.; Wu, W. Business value of social media technologies: Evidence from online user innovation communities. J. Strategic Inf. Syst. 2015, 24, 113–127. [Google Scholar] [CrossRef]
Baylis, P.; Obradovich, N.; Kryvasheyeu, Y.; Chen, H.; Coviello, L.; Moro, E. Weather impacts expressed sentiment. PLoS ONE 2018, 13, e0195750. [Google Scholar] [CrossRef]
Olson, M.K.; Sutton, J.; Vos, S.C.; Prestley, R.; Renshaw, S.L.; Butts, C.T. Built community before the storm: The National Weather Service’s social media engagement. J. Contingencies Crisis Manag. 2019, 27, 359–373. [Google Scholar] [CrossRef]
Silver, A.; Andrey, J. Public attention to extreme weather as reflected by social media activity. J. Contingencies Crisis Manag. 2019, 27, 346–358. [Google Scholar] [CrossRef]
Sun, X.; Yang, W.; Sun, T.; Wang, Y.P. Negative Emotion under Haze: An Investigation Based on the Microblog and Weather Records of Tianjin, China. Int. J. Environ. Res. Public Health 2019, 16, 86. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Wang, J.; Obradovich, N.; Zheng, S. A 43-Million-Person Investigation into Weather and Expressed Sentiment in a Changing Climate. One Earth 2020, 2, 568–577. [Google Scholar] [CrossRef]
Spurce, M.; Arthur, R.; Williams, H.T.P. Using social media to measure impacts of named storm events in the United Kingdom and Ireland. Meteorol. Appl. 2020, 27, e1887. [Google Scholar] [CrossRef]
KMA. Public Satisfaction Survey on National Weather Service in 2014; Hyundae Research Institute: Seoul, Korea, 2014. [Google Scholar]
Fricker, R.D. Sampling Methods for Web and E-mail Surveys—The SAGE Handbook of Online Research Methods; SAGE Publications Ltd.: London, UK, 2008. [Google Scholar]
Government Social Research. The Use of Social Media for Research and Analysis: A Feasibility Study; Department for Work & Pensions: London, UK, 2014. [Google Scholar]
KMA. Public Satisfaction Survey on National Weather Service in 2011; Hyundae Research Institute: Seoul, Korea, 2011. [Google Scholar]
Anderson, E.W. Customer satisfaction and word of mouth. J. Serv. Res. 1998, 1, 5–17. [Google Scholar] [CrossRef]
Richins, M.L. Negative word-of mouth by dissatisfied consumers: A pilot study. J. Mark. 1983, 47, 68–78. [Google Scholar] [CrossRef] [Green Version]
Skowronski, J.J.; Carlston, D.E. Negativity and extremly in impression formation: A review of explanations. Psychol. Bull. 1989, 105, 131–142. [Google Scholar] [CrossRef]
Ledolter, J. Data Mining and Business Analytics with R; John Wiley & Sons: Hoboken, NJ, USA, 2013. [Google Scholar]
EMC Education Services. Data Science and Big Data Analytics—Discovering, Analyzing, Visualizing and Presenting Data; John Wiley & Sons: Indianapolis, IN, USA, 2015. [Google Scholar]
Agarwal, R.; Srikant, R. Fast algorithms for mining association rules in large databases. In Proceedings of the 20th International Conference on Very Large Data Bases, Santiago de, Chile, Chile, 12–15 September 1994; pp. 487–499. [Google Scholar]
Sen, S.; Lerman, D. Why are you telling me this? An examination into negative consumer reviews on the web. J. Interact. Mark. 2007, 21, 76–94. [Google Scholar] [CrossRef]

Figure 1. Percentage of tweets about the weather forecast service and number of rain events for each month in 2014.

Figure 2. Hierarchy of labels for five categories, i.e., season, sentiment, weather phenomena, type of forecast error, and application of forecast. One tweet can be coded with multiple labels according to its content regarding the weather forecast.

Figure 3. Daily tweets with negative sentiments regarding the weather forecast service in Korea in 2014. The four boxes show sample content for selected days with the most number of negative tweets.

Figure 4. Diagram of set of association rules with “censure” consequents. The Diagram is generated using a package in R, whose support, confidence, and lift values are greater than 0.1, 0.5, and 1.4 respectively. The size of a circle, which corresponds to an association rule, represents the confidence value. The color of a circle expresses the lift value of its rule; a darker color indicates a larger lift value.

Table 1. Analysis of individual tweets of sentiments about the weather forecast service in Korea.

Sentiment	Occurrences	Percentage
Negative	2177	74.5%
Neutral	637	21.8%
Positive	107	3.7%
Total	2921	100.0%

Table 2. Frequency analysis of tweet content with the most negative daily comments in terms of weather phenomena and type of forecast error; “FA” denotes “False Alarm”.

Case	Date	Observed Sentiment of Tweets		Weather Phenomena Causing Negative Sentiment				Type of Forecast Error
Case	Date	Total	Negative	Rain	Heat	Downpour	Typhoon	FA	Miss
A	25 Jul	90	86	54	22	-	-	49	1
B	3 Aug	61	53	27	3	-	16	24	3
C	18 Jul	59	51	27	7	8	-	19	12
D	12 Sep	48	43	31	-	8	-	-	36
Total		258	233	139	32	16	16	100	74

The sum of the frequencies of the four meteorological phenomena is not equal to the frequency of all the negative comments (233) because some of the tweets contain only negative comments without referring to specific weather phenomena. For the same reason, the sum of the frequencies for the two error types is less than the total frequencies of the negative comments.

Table 3. Association rules obtained from the circles marked “A” to “H” in Figure 4. The rules are listed in order of their lift values.

Item	Association Rule	Support	Confidence	Lift	Comparing the Rule with each Case in Table 3
Item	Association Rule	Support	Confidence	Lift	A	B	C	D
A	{FA, Heat, Rain, Summer}→{Censure}	0.013	0.674	1.958	○	○	-	-
B	{Miss, Rain, Autumn}→{Censure}	0.025	0.625	1.814	-	-	-	○
C	{FA, Rain, Spring}→{Censure}	0.011	0.605	1.757	-	-	-	-
D	{FA, Paraphernalia, Rain, Summer}→{Censure}	0.013	0.583	1.693	-	-	○	-
E	{Miss, Paraphernalia, Rain}→{Censure}	0.015	0.569	1.652	-	-	○	-
F	{Miss, Rain, Time}→{Censure}	0.010	0.512	1.485	-	-	○	-
G	{Miss, Cold, Extended-Forecast, Winter}→{Censure}	0.017	0.507	1.472	-	-	-	-
H	{Miss, Downpour}→{Censure}	0.010	0.500	1.451	-	-	-	○

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Lee, K.-K.; Kim, I.-G. Social Media Data Analytics to Enhance Sustainable Communications between Public Users and Providers in Weather Forecast Service Industry. Sustainability 2020, 12, 8528. https://doi.org/10.3390/su12208528

AMA Style

Lee K-K, Kim I-G. Social Media Data Analytics to Enhance Sustainable Communications between Public Users and Providers in Weather Forecast Service Industry. Sustainability. 2020; 12(20):8528. https://doi.org/10.3390/su12208528

Chicago/Turabian Style

Lee, Ki-Kwang, and In-Gyum Kim. 2020. "Social Media Data Analytics to Enhance Sustainable Communications between Public Users and Providers in Weather Forecast Service Industry" Sustainability 12, no. 20: 8528. https://doi.org/10.3390/su12208528

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Social Media Data Analytics to Enhance Sustainable Communications between Public Users and Providers in Weather Forecast Service Industry

Abstract

1. Introduction

2. Material and Method

2.1. Data Collection

2.2. Analysis Method

3. Results

3.1. Textual Analysis of Negative Opinions about the Weather Forecast Errors

3.2. Interpretation of Association Rules

4. Discussion

5. Conclusions

Author Contributions

Funding

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI