Peer-Review Record

GeoBERT: Pre-Training Geospatial Representation Learning on Point-of-Interest

Appl. Sci. 2022, 12(24), 12942; https://doi.org/10.3390/app122412942
by Yunfan Gao 1, Yun Xiong 1, Siqi Wang 2 and Haofen Wang 2,*
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3:
Reviewer 4: Anonymous
Submission received: 24 November 2022 / Revised: 12 December 2022 / Accepted: 12 December 2022 / Published: 16 December 2022
(This article belongs to the Special Issue Natural Language Processing (NLP) and Applications)

Round 1

Reviewer 1 Report

1. Abstract: Please focus the abstract on your study and your results. In particular, the summary of results should be given at the end of the abstract.

2. In Figure 1, the explanation could be more elaborate.

3. In Table 1 and Table 2, the data sources must be specified clearly.

4. In Equations 1–9, all terms must be defined.

5. Section 4.1.2 (Setup) is clearly explained; only the specifications of the executing machine need to be added.

6. An overall comparative analysis may be added to the discussion section.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

1. Summary and contributions

This work proposes GeoBERT, a pre-trained geospatial representation learning model based on POI data and the BERT model. The experiments show that GeoBERT outperforms other advanced models on five downstream tasks.

2. Questions

   a) It may be natural to consider a grid with POIs as an image rather than a sequence. Compared with Vision Transformers [30], what are the differences in how the sequences are generated in the data-processing part?

   b) Line 195: the max length is set to 64. The long-term and short-term dependencies in the POI sequence are still unclear, since this length is shorter than common sentences in NLP. It would be better to draw attention visualizations as the "Attention Is All You Need" paper [29] did (see the sketch after this list).

   c) In Section 3.4, it is not clear what kind of information is encoded in the POI sequence for the three different paths, considering that the different POI sequences obtain similar experimental results.

   d) In Section 3.4.2, how is the grid center in Figure 4 determined?

   e) In Section 3.6.1, how many classes are there for POI Number Prediction?
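
To illustrate point b), below is a minimal sketch of the kind of attention visualization the reviewer asks for, assuming a HuggingFace-style BERT checkpoint; the checkpoint name and the POI token sequence are placeholders, not the authors' actual model or vocabulary:

```python
import torch
import matplotlib.pyplot as plt
from transformers import BertTokenizerFast, BertModel

# Placeholder checkpoint; in practice this would be the pre-trained GeoBERT weights.
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

# Hypothetical POI "sentence" for one grid (max length 64, as in the manuscript).
poi_sequence = "restaurant hotel bank school pharmacy convenience store"
inputs = tokenizer(poi_sequence, return_tensors="pt", truncation=True, max_length=64)

with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions is a tuple with one (batch, heads, seq_len, seq_len) tensor per layer.
attn = outputs.attentions[-1][0]            # last layer, single example
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])

fig, ax = plt.subplots(figsize=(6, 5))
ax.imshow(attn.mean(dim=0), cmap="viridis")  # average over attention heads
ax.set_xticks(range(len(tokens))); ax.set_xticklabels(tokens, rotation=90)
ax.set_yticks(range(len(tokens))); ax.set_yticklabels(tokens)
ax.set_title("Mean self-attention, last layer")
plt.tight_layout()
plt.show()
```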

3. Weaknesses

   a) Line 239: "after many attempts, the mask ratio of 15% gives the best overall"; it would be better to compare the candidate ratios in an ablation study (see the sketch after this list).

   b) Line 244: the authors design five geospatial downstream tasks and validate them on the urban data of Shanghai. This means that both the pre-training and fine-tuning data are related to Shanghai; it would be better to use data from different cities to verify the generalization of the learned representations.

   c) Line 287: the authors use the grid embedding learned by GeoBERT as the basic features and integrate additional grid features. Why use additional features (131 features) if the pre-training step already yields a good enough representation?

   d) The comparison with other methods (Word2vec and GloVe) is not fair. For example, in Line 342, GeoBERT is trained for 100 epochs, but Word2vec for 20 epochs and GloVe for 10 epochs. For a fair comparison, all experiments should be performed under comparable settings. All model details should also be listed, such as the model architecture and the number of parameters.
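
For point a), a minimal sketch of what a mask-ratio ablation could look like with a HuggingFace-style masked-language-modeling setup; the corpus file, tokenizer, and training settings below are hypothetical placeholders, not the authors' actual pipeline:

```python
from datasets import load_dataset
from transformers import (BertConfig, BertForMaskedLM, BertTokenizerFast,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

# Hypothetical corpus: one grid's POI sequence per line.
dataset = load_dataset("text", data_files={"train": "poi_sequences.txt"})
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")  # placeholder vocabulary

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=64)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

results = {}
for mask_ratio in (0.10, 0.15, 0.20, 0.30):          # candidate masking ratios
    collator = DataCollatorForLanguageModeling(tokenizer, mlm=True,
                                               mlm_probability=mask_ratio)
    model = BertForMaskedLM(BertConfig(vocab_size=tokenizer.vocab_size))
    args = TrainingArguments(output_dir=f"mlm_{mask_ratio}", num_train_epochs=1,
                             per_device_train_batch_size=32, report_to="none")
    trainer = Trainer(model=model, args=args, data_collator=collator,
                      train_dataset=tokenized["train"])
    trainer.train()
    # Compare MLM loss (or, better, downstream-task metrics) across mask ratios.
    results[mask_ratio] = trainer.evaluate(tokenized["train"])["eval_loss"]

print(results)
```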

4. Some Typos

  a) Figure 2, "On the right is a slice that covers 16 grids": should it be 20 grids, as shown in the figure?

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

In this paper, a novel geospatial representation learning model is designed, and five practical downstream tasks for this model are proposed. The manuscript could be published once the following points are improved:

Some grammar problems or mistakes should be corrected, such as:

Line 346, “for classification tasks, the higher the indicator, the better”: this does not look like a complete sentence.

Line 372, “The results of store site recommendation with additional features are illustrated in Table 8.”: this does not seem to refer to Table 8. Please check.

Line 142, should be “in the urban domain”

Line 53, should be “small amount of labelled data”

Line 128, should be “ERNIE-GeoL is a geography-and-language pre-trained model”

Line 398, should be “three different kinds of POI sequences”

Line 388, should be “on the shortest path gives better results”

Regarding the last paragraph of Section 2.1 (lines 98–105): is it possible to explain the differences between the authors’ work in this article and the previous related work?

In Section 3.1, can you provide the basis or reasoning behind the design of this model structure?

Line 340, “Other re-training parameters are set to default according to BERTbase”: is it possible to provide relevant references here?

Line 374, “Among different combination methods, individual method gets the best performance”: why does the individual method get the best performance? Is it possible to give some explanation or discussion here?

In Section 3.1, the process is summarized in four steps, but Figure 1 includes three parts. Is it possible to make them match?

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 4 Report

The authors combined geospatial representation learning with a pre-trained language model and proposed the first large-scale pre-trained geospatial representation learning model, GeoBERT. This research is innovative and interesting, and the authors may consider improving the manuscript in the following aspects:

1. In the fine-tuning stage, the architecture of the five downstream task heads added on the output of GeoBERT should be described.

2. In the pre-training stage, the authors should provide the evaluation indicator used to select GeoBERT’s best mask ratio.

3. In Figure 2, the right-hand side shows 20 grids, while the manuscript states 16 grids.

4. In the experiments, the authors should give more detailed configuration parameters for both the pre-training and fine-tuning stages, including the deep learning framework, CUDA version, Python version, and so on.

5. The authors mention many models for geospatial representation learning in the related work, but in Section 4.1.1 they only use Word2vec and GloVe as baselines for comparison with GeoBERT. The number of parameters of these two models is much lower than that of GeoBERT, and it is impossible to know whether these two models are the current state of the art on the five downstream tasks. The authors should add experiments comparing against the current state-of-the-art models on the different tasks.

6. Can the 131 features be used directly for prediction, instead of GeoBERT, to indicate the necessity of introducing this model? (A sketch of such a comparison follows this list.)

7. In the pre-training stage, the authors could add the detection of non-existent POIs in a grid, analogous to BERT’s NSP task, to improve the performance of GeoBERT.
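
For points 5 and 6, a minimal sketch of how one could test whether the 131 handcrafted features alone are sufficient, comparing them with the GeoBERT grid embeddings on a downstream task; the file names, classifier, and metric are illustrative assumptions, not the authors’ actual setup:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Hypothetical arrays, one row per grid.
grid_emb = np.load("geobert_grid_embeddings.npy")   # e.g. (n_grids, 768), from GeoBERT
extra_feat = np.load("grid_features_131.npy")       # e.g. (n_grids, 131) handcrafted features
labels = np.load("grid_labels.npy")                 # downstream-task labels

for name, X in {
    "131 features only": extra_feat,
    "GeoBERT embedding only": grid_emb,
    "embedding + 131 features": np.hstack([grid_emb, extra_feat]),
}.items():
    clf = LogisticRegression(max_iter=1000)
    scores = cross_val_score(clf, X, labels, cv=5, scoring="f1_macro")
    print(f"{name}: macro-F1 = {scores.mean():.3f} +/- {scores.std():.3f}")
```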

Comments for author File: Comments.pdf

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

The authors added more solid experiments, which answered the questions well. I recommend that the manuscript be accepted without further revision.

Author Response

Thank you. Your opinions are very valuable.

Reviewer 4 Report

Before acceptance, the author only needs to check the manuscript to ensure it is well-written with no spelling or grammar mistakes.

Author Response

Thank you. Your opinions are very valuable. We have checked and corrected the grammatical errors in the manuscript.
