Article
Peer-Review Record

Entity Alignment Method Based on Joint Learning of Entity and Attribute Representations

Appl. Sci. 2023, 13(9), 5748; https://doi.org/10.3390/app13095748
by Cunxiang Xie 1, Limin Zhang 1 and Zhaogen Zhong 2,*
Reviewer 1:
Reviewer 2: Anonymous
Submission received: 8 February 2023 / Revised: 25 April 2023 / Accepted: 27 April 2023 / Published: 6 May 2023

Round 1

Reviewer 1 Report

The author of the paper has identified a problem in the existing entity embedding process, which requires pre-defined structural information such as entity pairs. To address this issue, the author has proposed a method that utilizes embedding techniques such as TransE to interpret structural information and applies an N-gram-based attribute-character embedding technique to interpret attribute value information simultaneously. The paper provides a detailed analysis of related KG representation learning papers, as well as a thorough description of the research motivation and background. However, the paper lacks adequate justification and reasoning for the proposed ideas to overcome the problems or limitations of existing research.

  1. The paper needs to provide further clarification regarding why relationships and attribute information can be used in representation learning without pre-aligned information to find relationships between different KGs. In environments where different KGs may have significant differences in abstraction, the heterogeneity of the semantic information they contain is naturally intensified. In such scenarios, can semantic embeddings and cosine similarity reliably discriminate between entity pairs? As semantic information is highly complex, it may be challenging to discriminate properly through methods such as embeddings. Therefore, recent attempts have used pre-trained models such as BERT to encode vectors. What is the novelty of the proposed method compared to this?

  2. The paper's second contribution contains redundant content. Are there any solutions to overcome the issues that must be considered when finding a joint space? Are TransE or similar embedding techniques too shallow to find a complex joint space? The paper should provide a brief description of its unique way of identifying and overcoming potential issues.

  3. The paper should provide a more detailed explanation of the loss function proposed in section 2.3.1. L_SE-direct and L_SE-multistep are expected to exhibit different loss trends: L_SE-direct may exhibit more varied and continuously changing values during learning, while the value of L_SE-multistep may not change as the p value is increased. The normalization factor is also a concern that may further solidify the value. Due to these concerns, L_SE is expected to be biased towards L_SE-direct and to produce results similar to MTransE. The paper should provide a design rationale or justification for equation (9).

  4. The paper should provide a detailed explanation of the attribute value vector in equation (11). Attributes in a KG can be represented as text or numeric values expressed in XSD. Has the paper considered this? Additionally, the proposed method does not appear to resemble an embedding that takes into account the characteristics or patterns of attributes. The paper should describe the characteristics of attributes and present the rationale for the proposed method.

  5. The paper requires a design rationale explanation for equation (17). The proposed model appears to significantly sacrifice spatial information that can discriminate unique entities to reduce the difference between structural embedding and attribute character embedding.

  6. The paper should provide a more detailed explanation and justification for the proposed equation in entity alignment. The proposed equation seems to be susceptible to the risk of Majority dominance, where very few entity pairs that occur frequently dominate the results. This is because the design only reduces the difference between different types of entity embeddings, as in Comment 5, so the discrimination power of individual entities may be significantly lost.

Author Response

Reviewer#1, Concern # 1: The paper needs to provide further clarification regarding why relationships and attribute information can be used in representation learning without pre-aligned information to find relationships between different KGs. In environments where different KGs may have significant differences in abstraction, the heterogeneity of the semantic information they contain is naturally intensified. In such scenarios, can semantic embeddings and cosine similarity reliably discriminate between entity pairs? As semantic information is highly complex, it may be challenging to discriminate properly through methods such as embeddings. Therefore, recent attempts have used pre-trained models such as BERT to encode vectors. What is the novelty of the proposed method compared to this?

Author response: 

1)     We propose a relation and attribute alignment method based on cosine similarity. The cosine similarities of the semantic embeddings of relations and attributes are calculated, followed by manual inspection and amendment of the results. Relation-attribute alignment is then performed by renaming relations and attributes with the same meaning to the same name. A minimal illustrative sketch of this matching step is given after this list.

2)     A learning model is proposed to learn the structural and attribute-character embeddings from different KGs using their relation and attribute triples. Structural embeddings are learned via a triple-modelling method using TransE and PTransE, which use the semantic information in direct and multi-step relation paths to extract structural embedding vectors from the embedding vector space. Attribute-character embeddings are learned using an N-gram-based compositional function to encode the character sequences of attribute values; TransE is then used to model the attribute triples in the embedding vector space, yielding the attribute-character embedding vectors. Finally, the structural and attribute-character embeddings are jointly learned to transfer the structural embedding vectors of entities from different KGs into a unified vector space. An illustrative sketch of such an N-gram compositional encoding is also given after this list.

3)     We introduce a limit-based loss function that assigns absolutely low and high scores to positive and negative triples, respectively, to improve the loss function for embedding learning and prevent drift while mapping structural embeddings into the unified vector space.
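
The following is a minimal, purely illustrative Python sketch of the cosine-similarity matching step described in point 1); the names propose_alignments, embed_name and the threshold value are hypothetical and not taken from the paper. Candidates above the threshold are what would then be inspected manually, with matched relations and attributes renamed to a shared label.

    import numpy as np

    def cosine(u, v):
        # cosine similarity of two embedding vectors
        return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

    def propose_alignments(names_kg1, names_kg2, embed_name, threshold=0.9):
        # embed_name: any semantic embedding of a relation/attribute name (placeholder)
        candidates = []
        for n1 in names_kg1:
            v1 = embed_name(n1)
            best, best_score = None, -1.0
            for n2 in names_kg2:
                s = cosine(v1, embed_name(n2))
                if s > best_score:
                    best, best_score = n2, s
            if best_score >= threshold:
                candidates.append((n1, best, best_score))  # kept for manual inspection
        return candidates
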
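As a rough sketch of the N-gram-based compositional function mentioned in point 2), the snippet below sums the character embeddings inside each n-gram, averages those n-gram vectors for each n, and then averages across n = 1..N. The character-embedding dictionary char_emb, the dimension d, and N are illustrative assumptions; the exact compositional function used in the paper may differ.

    import numpy as np

    def ngram_compose(value, char_emb, d=64, N=3):
        # value: attribute value as a character string
        # char_emb: dict mapping a character to a d-dimensional embedding (assumed)
        per_n = []
        for n in range(1, N + 1):
            grams = [value[i:i + n] for i in range(len(value) - n + 1)]
            if not grams:
                continue
            # each n-gram vector is the sum of its character embeddings
            gram_vecs = [sum((char_emb.get(c, np.zeros(d)) for c in g), np.zeros(d)) for g in grams]
            per_n.append(np.mean(gram_vecs, axis=0))
        # average over the different n-gram orders
        return np.mean(per_n, axis=0) if per_n else np.zeros(d)
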
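Point 3) can be illustrated with the hedged sketch below: positive triples are pushed below an absolute threshold and negative triples above a higher one, rather than only enforcing a relative margin. The thresholds gamma_pos and gamma_neg, the weight mu, and the use of a TransE-style distance score (lower means more plausible) are illustrative assumptions, not the paper's exact formulation.

    import numpy as np

    def limit_based_loss(pos_scores, neg_scores, gamma_pos=1.0, gamma_neg=2.0, mu=1.0):
        # pos_scores / neg_scores: arrays of distance scores for positive and corrupted
        # triples, e.g. f(h, r, t) = ||h + r - t|| (lower = more plausible)
        pos_term = np.maximum(0.0, pos_scores - gamma_pos).sum()  # positives must score at most gamma_pos
        neg_term = np.maximum(0.0, gamma_neg - neg_scores).sum()  # negatives must score at least gamma_neg
        return pos_term + mu * neg_term
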

 

Reviewer#1, Concern # 2: The paper's second contribution contains redundant content. Are there any solutions to overcome the issues that must be considered when finding a joint space? Are TransE or similar embedding techniques too shallow to find a complex joint space? The paper should provide a brief description of its unique way of identifying and overcoming potential issues.

Author response: 

No solution has been found yet. We thank the reviewer for the suggestion and will investigate this further in future work.

 

Reviewer#1, Concern # 3: The paper should provide a more detailed explanation of the loss function proposed in section 2.3.1. L_SE-direct and L_SE-multistep are expected to exhibit different loss trends: L_SE-direct may exhibit more varied and continuously changing values during learning, while the value of L_SE-multistep may not change as the p value is increased. The normalization factor is also a concern that may further solidify the value. Due to these concerns, L_SE is expected to be biased towards L_SE-direct and to produce results similar to MTransE. The paper should provide a design rationale or justification for equation (9).

Author response: 

Thank you for your suggestion; we have provided a more detailed explanation.

Reviewer#1, Concern # 4: The paper should provide a detailed explanation of the attribute value vector in equation (11). Attributes in a KG can be represented as text or numeric values expressed in XSD. Has the paper considered this? Additionally, the proposed method does not appear to resemble an embedding that takes into account the characteristics or patterns of attributes. The paper should describe the characteristics of attributes and present the rationale for the proposed method.

Author response: 

Thank you for your suggestion; we have provided a more detailed explanation.

 

Reviewer#1, Concern # 5: The paper requires a design rationale explanation for equation (17). The proposed model appears to significantly sacrifice spatial information that can discriminate unique entities to reduce the difference between structural embedding and attribute character embedding.

Author response: 

Thank you for your suggestion; we have provided a more detailed explanation.

 

Reviewer#1, Concern # 6: The paper should provide a more detailed explanation and justification for the proposed equation in entity alignment. The proposed equation seems to be susceptible to the risk of Majority dominance, where very few entity pairs that occur frequently dominate the results. This is because the design only reduces the difference between different types of entity embeddings, as in Comment 5, so the discrimination power of individual entities may be significantly lost.

Author response: 

Thank you for your suggestion; we have provided a more detailed explanation.

 

 

Reviewer 2 Report

The article is devoted to describing a method of determining entity alignment in Knowledge Graphs. The method is described and evaluated using common datasets.

However, the article has significant problems in its review of related works and, consequently, in the comparison of the developed method with the state of the art.

1. The authors write "The most recent advances in KG entity alignment are embedding-based techniques, such as MTransE [13]". However, reference 17 is 5 years old (published in 2017), which can hardly be considered a "most recent advance" in 2023. The other models in the same paragraph are from the same years.

2. All the models chosen for comparison with the authors' method were published in 2017-2018. However, the field of entity alignment in knowledge graphs has developed further in the following years (see, for example, reviews such as https://arxiv.org/abs/2203.09280 and https://arxiv.org/abs/2205.08777). I strongly recommend using recent models (published in 2020-22) for comparison instead of the old models. It would also be good to increase the geographical coverage of the reference list.

3. This article is submitted to the Applied Sciences journal, but it does not discuss the possible applications of the proposed technique. The section "Featured Application" contains generic text. Please consider thoroughly discussing the practical applications of your method, given the journal you are submitting to.

English in the article is generally good, but there are some minor technical problems:

1. Formulas (3) and (4) are almost the same; there is no need to show both of them. Mentioning in the text that the same formula is used for the other parameters is enough.

2. In lines 221-222, it would be better to use subscripts to improve the readability of designations such as ese and ece.

3. There are typos in formula (30): it should be "precision", not "prcision".

4. Consider not repeating the well-known formulas for widely used metrics such as precision, recall, and the F1 measure; a reference to their description, with a brief explanation of how they were interpreted in this study, is enough.

Fixing these problems will increase the manuscript's scientific value and impact. The authors should also give their method a name so that future studies can refer to it easily.

Author Response

Reviewer#2, Concern # 1: The authors write "The most recent advances in KG entity alignment are embedding-based techniques, such as MTransE [13]". However, reference 17 is 5 years old (published in 2017), which can hardly be considered a "most recent advance" in 2023. The other models in the same paragraph are from the same years.

Author response: 

Although these models were proposed several years ago, they are classical models in the field of knowledge graph entity alignment, and it remains meaningful to compare them with our method.

 

Reviewer#2, Concern # 2: All the models chosen for comparison with the authors' method were published in 2017-2018. However, the field of entity alignment in knowledge graphs has developed further in the following years (see, for example, reviews such as https://arxiv.org/abs/2203.09280 and https://arxiv.org/abs/2205.08777). I strongly recommend using recent models (published in 2020-22) for comparison instead of the old models. It would also be good to increase the geographical coverage of the reference list.

Author response: 

Although these models were proposed several years ago, they are classical models in the field of knowledge graph entity alignment, and it remains meaningful to compare them with our method.

 

Reviewer#2, Concern # 3: This article is submitted to the Applied Sciences journal, but it does not discuss the possible applications of the proposed technique. The section "Featured Application" contains generic text. Please consider thoroughly discussing the practical applications of your method, given the journal you are submitting to.

Author response: 

In practical applications, different fields maintain their own knowledge graphs, such as financial, commodity, and medical graphs. The entity alignment technique can be applied to the fusion of multiple knowledge graphs within a single domain or even across domains.

Reviewer#2, Concern # 4: English in the article is generally good, but there are some minor technical problems:

  1. Formulas (3) and (4) are almost the same; there is no need to show both of them. Mentioning in the text that the same formula is used for the other parameters is enough.
  2. In lines 221-222, it would be better to use subscripts to improve the readability of designations such as ese and ece.
  3. There are typos in formula (30): it should be "precision", not "prcision".
  4. Consider not repeating the well-known formulas for widely used metrics such as precision, recall, and the F1 measure; a reference to their description, with a brief explanation of how they were interpreted in this study, is enough.

Author response: 

Thank you for your suggestion; we have made the corresponding modifications.

 


 

 

Round 2

Reviewer 1 Report

The authors' idea is reasonable and interesting, but it might be narrowed down to a more specific research topic; it seems somewhat too broad to be covered in a single paper.

Recent publications have proposed numerous approaches related to KG embeddings, and it may be beneficial for the authors to supplement their review with the latest methodologies. Although additional explanations have been provided, it is difficult to verify how the proposed methodology's key contributions are realized in the experimental section. It would be helpful to explain the experimental results with a focus on the content emphasized in Chapter 1, as this would make it easier for readers to understand how the authors attempted to demonstrate the superiority of their proposed method.

Author Response

We have supplemented the review of methods related to KG embedding with the latest methods.
