Article
Peer-Review Record

Geo-Tagged Photo Metadata Processing Method for Beijing Inbound Tourism Flow

ISPRS Int. J. Geo-Inf. 2019, 8(12), 556; https://doi.org/10.3390/ijgi8120556
by Wen Chen 1,2, Zhiyun Xu 1, Xiaoyao Zheng 3 and Yonglong Luo 1,3,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Reviewer 4: Anonymous
Submission received: 24 October 2019 / Revised: 26 November 2019 / Accepted: 2 December 2019 / Published: 3 December 2019

Round 1

Reviewer 1 Report

Dear authors,

Thank you very much for the submission of the revised manuscript and the adaptation of the manuscript based on the comments and feedback provided.

Please see my comments attached that I hope will provide further support.

 

 

Comments for author File: Comments.pdf

Author Response

Dear reviewer, we have put the revised article and our responses to the review comments in a Word document. The text marked in red in the article reflects changes made in response to the comments of multiple reviewers. Please refer to the attachment. Thank you for your guidance!

Author Response File: Author Response.pdf

Reviewer 2 Report

Research on geo-tagged metadata for forecasting urban inbound tourism flow

 

The subject of the article is interesting and the methodology may be applicable in further research, but some changes are necessary to improve its usefulness.

 

Major changes:

 

From line 50, you should add some statistics that demonstrate the wide availability of user-generated content (UGC), particularly in the field of travel-related photography. Next, explain the advantages and disadvantages of using big data analytics in tourism and hospitality research. In addition to articles [14, 35], you can build on recent studies: "Business intelligence and big data in hospitality and tourism: a systematic literature review" and "Managing customer knowledge through the use of big data analytics in tourism research".

 

In the literature review section, you should also critically comment on some recent work on Flickr: “Store buildings as tourist attractions: Mining retail meaning of store building pictures through a machine learning approach” and “Machine learning and points of interest: typical tourist Italian cities”. In the discussion section, according to the recommendations of IJGI, authors should discuss the results and how they can be interpreted in perspective of previous studies and of the working hypotheses.

 

“Establish the tourism text corpus” (line 286): I don't understand why you use TripAdvisor.in and then translate the travel reviews about Beijing into English. Why not download the data directly from TripAdvisor’s English-language website?

 

“Establish the stop word dictionary”: The process of building the stop word dictionary (line 192) needs to be described in more depth, and the list of stop words used in this case study (line 290) should be included.

 

Minor changes:

 

Abstract. “mass research data research data”: ?

 

Keywords (three to ten pertinent keywords). Please add other significant keywords such as: big data analytics; machine learning algorithm; support vector regression; extreme learning machine; data correlation analysis

 

In section 2.1, the first time Beijing appears, you should add: (formerly romanised as Peking).

 

Abbreviations and acronyms should be defined in parentheses the first time they appear and used consistently thereafter. For example, RBF.

 

Section 3.2.1. What percentage of photos was taken by foreigners residing in Beijing?

 

The bibliography does not have the style of MDPI journals. I use Zotero.org with the style “Multidisciplinary Digital Publishing Institute”.

 

Author Response

Dear reviewer, we have put the revised article and our responses to the review comments in a Word document. The text marked in red in the article reflects changes made in response to the comments of multiple reviewers. Please refer to the attachment. Thank you for your guidance!

Author Response File: Author Response.pdf

Reviewer 3 Report

Research on geo-tagged photo metadata for forecasting urban inbound tourism flow

Geo-tagged photo metadata has provided a new source of mass research data for tourism studies.

This paper introduces and designs several methods, including data screening, text data similarity calculation, geographical location clustering, and time series data modelling, in order to realize a data preprocessing model for inbound tourist flows in cities based on geo-tagged photo metadata. The issue is tourist flow prediction.

 

1. Introduction

The current hypothesis is: the geo-tagged photo metadata contained in EXIF (exchangeable image file format), including shooting time and geographic coordinates, can also provide a large body of research data for tourism studies [13-18].

Many research works deal with the value of geo-tagged photo metadata. They process and analyze different types of data contained in geo-tagged photo metadata: namely, text tags, geotags, image contents and time of image capture [19]. They detect places, events, trends, hotspots, specific tourism behaviors, etc.

 

Page 2: “However, these authors’ work lacks in-depth research of processing methods for geo-tagged photo metadata in tourist flow prediction.”

Could you better explain what, in this specific research work, doesn’t work? What are your goals for “forecasting urban inbound tourism flow”, and which of those goals do the previous research works fail to achieve?

 

Page 2 and 3: You mention “reliable tourism demand forecasts”, “tourist flow prediction”, “inbound tourist flows in cities”, “whether a user whose location value is empty is an inbound visitor”, “whether foreigners’ activities in a certain region are tourism-related”, “prediction model for inbound tourist flow in cities”, “three prediction models are used to verify that the preprocessed data can provide effective prediction results”: could you distinguish the general goals of current works from your particular goals?

Moreover, could you mention which current works could be reused in your specific context to achieve your particular goals?

 

2. Description of the Methodology

Process:

i. Data from a certain region and a certain period can be screened out

ii. Data from local residents can be deleted: Entropy-filtering method

iii. Judgment can be made on whether the photo has been posted by an inbound person: calculate the tourism correlation (i.e. are the terms/tags of the photo related to tourism according to a dictionary of tourism terms)

iv. The number of tourists inbound to a certain region and over a certain period can be summarized

Three machine learning prediction algorithms are introduced to evaluate the feasibility and accuracy of the data processing methods

 

2.1. Screening of domestic and foreign tourists

Page 4: “Therefore, we will analyze the timespan in which the pictures are taken. If photos are taken by a user in the same location for more than one year, these users can be defined as local residents.”

What about tourists who visit a city two times: a first visit, and then a second visit a year later?
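
The concern can be made concrete with a minimal Python sketch. It assumes the screening rule reduces to "the time between a user's first and last photo in the city exceeds 365 days"; the function name `is_local_resident`, the threshold constant, and the sample dates are hypothetical and are not taken from the manuscript.

```python
from datetime import date

# Hypothetical threshold paraphrasing the quoted rule "more than one year".
MAX_TOURIST_SPAN_DAYS = 365


def is_local_resident(photo_dates):
    """Return True if the span between the first and last photo exceeds the threshold."""
    span_days = (max(photo_dates) - min(photo_dates)).days
    return span_days > MAX_TOURIST_SPAN_DAYS


# One week-long trip: the user is kept as a tourist.
one_trip = [date(2015, 5, 1), date(2015, 5, 7)]

# Two short trips about fourteen months apart: the user is flagged as a
# resident although only a few days were spent in the city each time.
two_trips = [date(2015, 5, 1), date(2015, 5, 3), date(2016, 7, 10), date(2016, 7, 12)]

print(is_local_resident(one_trip))   # False
print(is_local_resident(two_trips))  # True
```

Under this assumed rule a returning visitor is indistinguishable from a resident, which is the case the question asks the authors to address.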

 

2.3. Data statistics and data normalization

Page 6: formula (5)

Not so clear. Could you illustrate this with an example?

 

2.4. Prediction model

Page 6: “The machine learning prediction…”

Not so clear. What are your prediction hypotheses? What data are you going to use in such approaches? Could you illustrate this stage with examples?

 

3. Case Study

Flickr API. In total, 349,665 pieces of photo data from 2007 until today were collected.

 

Page 7: “According to the idea proposed in [39]”

Could you briefly recall this idea?

 

Page 7: “in Table 3 that 2.0 is the optimal threshold value”

Why is this threshold optimal?

 

Page 7: “Some travel review texts pertaining to Beijing were acquired from TripAdvisor to make up tourism text corpus, and the text was translated into English”

Could you explain this process (i.e. TripAdvisor text selection, translation, word indexing, etc.)?

 

Page 7: “Stop words in the tag text were identified through matching”

Isn’t the stop-word removal process going to cause the loss of significant n-grams like “great wall of muntianyu” (which will probably become “wall”, “muntianyu”)?
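
A short Python sketch illustrates the effect described in this question. The stop-word set below is hypothetical (the manuscript's dictionary is not shown in this record) and includes "great" only so that the output mirrors the reviewer's example; the phrase-protection helper is one possible workaround, not the authors' method.

```python
# Hypothetical stop-word list; "great" is included only to mirror the example above.
STOP_WORDS = {"of", "the", "a", "in", "great"}


def remove_stop_words(tag_text, stop_words=STOP_WORDS):
    """Lowercase, split on whitespace, and drop every stop word."""
    return [t for t in tag_text.lower().split() if t not in stop_words]


def remove_stop_words_keep_phrases(tag_text, phrases, stop_words=STOP_WORDS):
    """Join known multi-word phrases with underscores before filtering."""
    text = tag_text.lower()
    for phrase in phrases:
        text = text.replace(phrase, phrase.replace(" ", "_"))
    return [t for t in text.split() if t not in stop_words]


tag = "great wall of muntianyu"

print(remove_stop_words(tag))
# ['wall', 'muntianyu']

print(remove_stop_words_keep_phrases(tag, ["great wall of muntianyu"]))
# ['great_wall_of_muntianyu']
```

Protecting phrases requires a list of known place names, so it moves the problem to building that list rather than removing it.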

 

Page 7, Table 4: second row, third column

Why do you only have “landmark”? Isn’t “architecture” relevant too?

 

Page 8: “the cosine distance between two words”

Is this distance computed between two words or between two vectors composed of words?

 

Page 8: “text similarity calculation was set to 0.83629805”

Why? Could you explain this choice?

 

Page 8: “a maximum value of 90.17%”

How did you measure this value?

 

Page 8, Table 5: third column

Did you verify the effectiveness of those “supposed” matching records?

 

Page 8: “the cosine of two vector angles in the vector space to measure the difference between two individual points”

Are you sure that cosine measures the difference between two individual points?
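
The distinction behind this question can be checked numerically. The sketch below uses made-up vectors (not data from the manuscript) to show that cosine similarity depends only on the angle between two vectors, so two points that are far apart can still have a cosine close to 1.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Straight-line distance between the points u and v."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Illustrative values only: the second vector is the first scaled by 10.
small = [1.0, 2.0, 3.0]
large = [10.0, 20.0, 30.0]

print(cosine_similarity(small, large))   # close to 1.0: same direction
print(euclidean_distance(small, large))  # large: the points are far apart
```

A high cosine value therefore indicates that two series have a similar shape, not that their magnitudes agree.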

 

Page 9: “we calculated that the cosine distance between the preprocessed data and the actual statistical yearbook data is 0.9198486”

The way you measure this correlation is not clear. Could you detail how you compare words of each corpus?

 

Page 10: “Moreover, the ELM algorithm's prediction accuracy is higher than that of the other algorithms”

Ok, I agree with this conclusion

However, what does your processing chain currently produce as forecast results on this test sample? Is the objective of this work only to select sets of Flickr photos linked to inbound tourism? Or to produce other forecast results?

 

4. Discussion

Isn’t it better to title this section “Conclusion”?

 

Finally, it seems that your processing chain filters Flickr photos to select those related to inbound tourism. If so, you should state that everywhere in the paper: title, introduction, etc. Is the issue of the paper “tourist flow prediction/forecast” or “filtering of photos related to inbound tourism”?

Do you propose other Flickr metadata analyses to produce new tourism forecast results? Or specific tourism indicators?

I suggest changing the title of the paper to something like:

“Research on geo-tagged photo metadata for filtering photos related to urban inbound tourism”

I think this better fits the content of the paper.

 

Typos:

Page 1:

“a new source of mass research data research data for tourism studies”?

Page 1:

“this paper has introduced and designed several” -> “this paper introduces and designs several”?

Page 2:

“Studies utilizing” -> “Studies using”?

Page 3:

“to explore the characteristics of and existing problems” -> “to explore the characteristics of existing problems”?

Page 4:

“?i(?)is denotes the days on which” -> “?i(?) denotes the days on which”?

Page 4:

“the numbers of photos” -> “the number of photos”?

 

 

Author Response

Dear reviewer, we have put the revised article and our responses to the review comments in a Word document. The text marked in red in the article reflects changes made in response to the comments of multiple reviewers. Please refer to the attachment. Thank you for your guidance!

Author Response File: Author Response.pdf

Reviewer 4 Report

This is an interesting paper of great methodological significance. The approach proposed by the authors can be of practical importance for making forecasts in tourist destinations. It is well written, well structured, and cites enough literature. I recommend it for consideration in the journal after minor improvements (see recommendations below).

The title should avoid the word "research"; it should explain that the paper proposes a new method that is tested with the example of Beijing. The abstract is too declarative. Tell more about your method and your results, please. Introduction: state your objective clearly and delete the unnecessary last paragraph. I strongly encourage you to include a simple table explaining the considered approaches and giving the links with which one can access the necessary data/algorithms. E.g., what is Flickr and where can one see it? What is DBSCAN? Note that your article may be VERY interesting to specialists in tourism (i.e., to sociologists and economists) with very limited knowledge of GIS and other advanced technologies. So, the success of your article depends strongly on its clarity to such readers. Discussion: please, cite more published works there. The writing is ok, but, please, check the language once again. E.g., Line 118: is -> are ("data" is a plural form). I propose (but do not insist!) including 2-3 images illustrating how your approach works.

Author Response

Dear reviewer, we have put the revised article and our responses to the review comments in a Word document. The text marked in red in the article reflects changes made in response to the comments of multiple reviewers. Please refer to the attachment. Thank you for your guidance!

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

I think you have improved the article considerably. Congratulations!

Please, on line 142, add (Table A1).

Author Response

Response to Reviewer 2 Comments

Point 1: Please, on line 142, add (Table A1).

 

Response 1: Thanks to the reviewer and editor for the guidance. We have moved Table A1 to line 142 of the original text and renamed it Table 1. We have therefore adjusted the numbering of the other tables in the article.

 

Author Response File: Author Response.docx

Reviewer 3 Report

Thank you for your answers. Thanks for your modifications, which take into account most of my proposals.

I am happy to accept this new version of your paper.

Author Response

Thanks very much for your comments!

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

Thank you for your contribution. Overall, the paper describes an interesting topic. However, the manuscript needs major improvements to make it more coherent and understandable and to illustrate the significance of the research.  

Structure: The overall structure of the paper has to be revised fundamentally. The introduction could, for example, be separated into introduction, background and motivation to make it easier to follow and to explain the motivation and significance of the research. The same goes for the presentation of the approach, results and discussion. The last chapter should be extended and could be split into summary, discussion, conclusion and future work. In its current form, the structure makes the paper hard to read and the research at hand difficult to evaluate.

Spelling: There are several typos and spelling mistakes as well as formatting errors that need to be fixed. These affect the overall text, figures and tables.

Introduction: Many references are missing for statements that are needed to prove the significance of and need for the research at hand (e.g. lines 32-38, 49). Line 71f.: qualitative, quantitative etc.: needs further examples or references. Line 86: Specify what fields the algorithm was used for and why it is also applicable to the use case at hand. Line 97: After describing the research gap and aim, you could include a separate chapter to describe the approach. Line 100: “other data”: please specify.

Chapter 2.1: Figure 1: Needs to be redone, as it uses different fonts and different text alignment. The system framework paragraph should be redone, as the structure and contents are not clear. Figure 2: Should be redone and could be split into two sections, or it should be better highlighted that there are two different objectives: show the structure and explain the determination based on the information provided.

Chapter 2.2.1: The title needs to be revised, as the paragraph also seems to cover deletion of non-travel-related data (see title 2.2.2). What are a, b, c and d? Reference and link not clear.

Chapter 2.2.2: Clarify why a tourist would only go to scenic areas. Tourists could also be visiting towns? Not clear and would need proof/a reference if this is the case. Line 165: POI = “Point of Interest”, not “Point of Interesting”. 2.1 and the first paragraphs of 2.2.2 overlap considerably and could be combined to make the approach easier to understand. Line 168: Add the Algorithm 1 description directly here. Line 172/179: Another subchapter, therefore rather 2.2.2.1 and 2.2.2.2? Line 179: Another subdivision/listing. The subdivisions and listings make the text hard to read, and the structure and setup should be rethought, e.g. by covering some parts in a table or a figure. Line 200/201: Another enumeration, see previous points.

Chapter 2.3: The title needs to be more precise: What does “other data” mean? And where does the definition of tourists come from? A very short chapter in comparison to the others. Here again, the overall structure could be improved.

Chapter 3: Chapter 3.1: Needs references to prove the statements made. Chapter 3.2: Figure 3 shows inbound tourists' locations in Beijing; this sounds like the overall result already. Needs to be better described. Chapter 3.2.1: Put Table 1 after the paragraph. Chapter 3.2.2: Again many subdivisions and enumerations; rethink the structure. Line 287/288: Better explain why these parameters are best. Chapter 3.3: Line 327: Enumeration (8): What reference, and where does the number come from? Chapter 3.5: Table 6 needs to be redone, as it uses too much space due to the current formatting.

Chapter 4: The chapter title is “Discussion”, but a discussion is missing; it is more a short summary and future work. Very short and missing a discussion of limitations, lessons learned, etc.

 

Reviewer 2 Report

This article reports on the use of photographs derived from Flickr to predict tourism activity. Geo-tagged photographs are becoming an increasingly popular source of data in a variety of fields of study and contemporary GIS so this subject would seem to fit the journal. The article is generally well-written and easy to follow. There are, however, some concerns:


The abstract suggests a scientifically strong and interesting paper will follow, but this is misleading. The article is actually a weak investigation of a relatively new data source to yield unsurprising results and adds nothing of significance to understanding (and hence science).

The introduction to the article raises many interesting issues but much is often asserted (e.g. where is the evidence of the need for tourism prediction? What exactly is wanted – and surely something more detailed and specific than investigated in the case study?). Many of the interesting issues are also well-known – such as missing data and how it can be handled. The introduction does set the scene for the research but more as assertion than rigorous evidence-based analysis.

The stated originality of the work (see paragraph beginning line 87) is weak. It reminds me of past work that was common in remote sensing. The argument used to go along the lines of ‘researchers have studied crops in general but not lettuce’, and then a few months later virtually the same paper would be submitted on barley, followed by one on wheat etc. etc. Each paper told essentially the same story but had novelty in the specific target. There is some validity to this (but after adding carrots, potatoes, rice and spring onions to the list there must come a point at which greater novelty, for specific reason, is apparent). The core focus of the paper is then suggested to be on pre-processing (line 95) yet the paper does very basic and rather crude pre-processing.

Parts of the research are very poorly described and certainly cannot be replicated (a key part of rigorous science). For example, the use of the tag information (lines 136-140). This is important but the authors don’t actually say how they do this. They will face challenges and need to make decisions; these need to be stated and the decisions justified.

The article appears to be mostly based on an implicit assumption that people might travel for tourism purposes once a year. This is certainly not always the case. Many will travel more than once. People may travel for business but include a short tourism component in their trip and have a regular holiday. The retired may holiday several times a year. Such issues make the approach to data deletion (one of the key pre-processing steps) poor; this issue is noted on line 155 but passed over too briefly. The issue of multiple-purpose travel is also a challenge in this article. People may travel for multiple reasons. A trip might involve a wedding but allow meeting work colleagues in a nearby town and then a cultural event later. This can give multiple transitions through one particular location over time that might potentially result in the foreign traveller being identified as a native.

Does Beijing have the most heritage sites? It may do but again is simply asserted. How does it compare to, say, Rome (including the Vatican)?

The case study is very poor. The end result is that historic Flickr photography indicates where people may go in the future. The work is not in the least bit informative or unusual. The same prediction could be made using just a city tourist map. The analysis is also weak; surely the authors need to show something difficult to predict (e.g. visitor numbers to a specific location in, say, a month when another major event is happening shortly after, such as how many people will visit the Emperor’s palace in Tokyo in September 2019 when the rugby world cup is in progress). All they have shown is that a city that has lots of historic sites will get lots of visitors in the future.

Overall, this is a poor article that uses contemporary data and methods but only to yield rather unsurprising and uninteresting results.

Reviewer 3 Report

Research on Flickr metadata preprocessing method for forecasting urban inbound tourism demand

 

The subject of the article is interesting and may be useful for further research, but major changes are necessary to broaden the scope of the investigation.

 

According to the Instructions for authors, the introduction should briefly place the study in a broad context. However, you have focused all research on a specific case study (Flickr and Beijing).

 

Title:

In this sense, the title is not correct. On the one hand, "pre-processing method" does not reflect your research because, in addition to applying pre-processing methods, you make a predictive calculation of tourist demand. On the other hand, considering that the methods proposed by you may be useful for other research on geotagged photo metadata, it makes no sense to limit your application to Flickr. Therefore, a more explanatory title could be: “Research on geotagged photo metadata for forecasting urban inbound tourism demand”.

 

Keywords:

So, you can add Flickr and Beijing as keywords.

 

Background:

In addition to the predictive calculation of tourism demand, the geotagged photo metadata pre-processing methods that you propose may be useful for other research on destination images published in top journals, such as: “Visual destination images of Peru ...” (Flickr); “Spatiotemporal analysis of photo contribution patterns ...” (Panoramio); “Characterizing the location of tourist images in cities ...” (Instagram); and “A pictorial analysis of destination images ...” (Pinterest). It is important to highlight the metadata that different geotagged photos have in common, for example, standards such as EXIF (exchangeable image file format).

 

Materials and Methods:

Once you have completed the theoretical background together with the related literature review, you can describe the case study (Flickr photos about the capital city Beijing) and the methods for processing in the Materials and Methods section.

 

Discussion:

It is necessary to expand academic and managerial implications.

 

Minor changes:

Abbreviations and acronyms should be defined in parentheses the first time they appear in the abstract, main text, and in figure or table captions and used consistently thereafter. For example, DBSCAN, TF-IDF, etc.

 
