Article
Peer-Review Record

Systematic Approach for Measuring Semantic Relatedness between Ontologies

Electronics 2023, 12(6), 1394; https://doi.org/10.3390/electronics12061394
by Abdelrahman Osman Elfaki and Yousef H. Alfaifi *
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 27 January 2023 / Revised: 9 March 2023 / Accepted: 13 March 2023 / Published: 15 March 2023
(This article belongs to the Special Issue Advanced Ontologies and Semantic Web Technologies)

Round 1

Reviewer 1 Report

The authors address the question of how to efficiently compare two "ontologies", i.e. classes with a given list of attributes. This sort of problem often arises, e.g. when one needs to compare databases, so the question seems quite relevant. The authors suggest an asymmetric similarity measure between classes based on what fraction of attributes are considered common. The core of the approach is the algorithm to determine the commonality of the attributes, which includes several stages of data cleaning (removal of function words, stemming) and then the determination of synonyms and hypernyms using WordNet.

In principle, the suggested algorithm seems rather trivial. However, it shows quite good results in the benchmarking, so I think it is publishable.

I basically have one big question to the authors and several minor technical suggestions.

The big question is as follows. Classes with a single unstructured set of properties, as studied here, seem to be the simplest and most trivial example of a classification. It seems much more common that the set of properties is additionally structured (e.g., by forming nested subsets), or that the classification consists of a set of hierarchically organized classes, or something even more complex.
Just to give a random example consider a library classification consisting of 
I. Class Publication with properties {publication year, country, city, publishing house}
with further subclasses
Ia. Book with properties {author, title, number of pages, topic}
Ib. Newspaper with properties {date, title, number, size of paper, editor}
Ic. Scientific paper with properties {journal title, authors, affiliations, title, page number, PACS}
Now, if I want to compare this library classification to one from some other library (with probably a slightly different set of classes and attributes), is the algorithm suggested in this paper of any use to me? Is it possible to somehow generalize it to work with structured/hierarchical classes like that?

Minor comments:
1. In table 2 both classes include 7 attributes, in tables 3 and below both have 6, in the final discussion (lines 270-274) it is claimed that one class ("scientific article") has 6 attributes, and the other ("research") has 7. This is self-contradictory, please fix it.
2. The similarities in this example are 5/7 and 5/6. The typical uncertainty in this number is at least 1/6-1/7, i.e. more than 10%. Indeed, in this case there is an example of a false negative comparison - the algorithm was unable to understand that "author" and "researcher" are connected fields (by the way, I applaud the authors for including this false negative in the example). This means that it makes no sense whatsoever to give answers with 4 decimal places; 1 (or at most 2) should be enough:
"therefore, synonyms_similarity (scientific article, research) \approx 0.8 (or 83%). ... therefore, synonyms_similarity (research, scientific article) \approx 0.7 (or 71%)"
3. The design of the evaluation experiment seems very reasonable (albeit small-scale), and the results seem very good. However, please publish the full result of the evaluation as an appendix so the interested readers can dig deeper into the results: after all, 250 comparisons are not that much and can be easily inspected by naked eye.
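
The arithmetic behind minor comment 2 can be checked with a short sketch. It uses only the figures quoted in that comment (5 matches out of 6 and out of 7 attributes); the function name `describe` is a hypothetical helper, not something taken from the manuscript.

```python
def describe(matches: int, total: int) -> tuple[float, float]:
    """Return (score, step): the similarity and the change caused by a
    single extra or missing match, both rounded to 2 decimals."""
    return round(matches / total, 2), round(1 / total, 2)


# Values taken from the reviewer's comment: 5 matches out of 6 and out of 7.
cases = [
    ("synonyms_similarity(scientific article, research)", 6),
    ("synonyms_similarity(research, scientific article)", 7),
]
for label, total in cases:
    score, step = describe(5, total)
    print(f"{label} ~ {score}; one match shifts it by ~ {step}")
```

Since a single match moves either score by more than 0.1, digits beyond the second decimal place carry no information, which is the point the comment makes.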

 

Author Response

Comments and Answers:

 

Manuscript: Systematic Approach for Measuring Semantic Relatedness between Ontologies

Firstly, we are grateful and highly appreciate our esteemed reviewers for their valuable and well-considered comments that guided and assisted us in improving this research.

Comments and Suggestions for Authors:

The authors address the question of how to efficiently compare two "ontologies", i.e. classes with a given list of attributes. This sort of problem often arises, e.g. when one needs to compare databases, so the question seems quite relevant. The authors suggest an asymmetric similarity measure between classes based on what fraction of attributes are considered common. The core of the approach is the algorithm to determine the commonality of the attributes, which includes several stages of data cleaning (removal of function words, stemming) and then the determination of synonyms and hypernyms using WordNet.

In principle, the suggested algorithm seems rather trivial. However, it shows quite good results in the benchmarking, so I think it is publishable.

Comment: I basically have one big question to the authors and several minor technical suggestions.

The big question is as follows. Classes with a single unstructured set of properties, as studied here, seem to be the simplest and most trivial example of a classification. It seems much more common that the set of properties is additionally structured (e.g., by forming nested subsets), or that the classification consists of a set of hierarchically organized classes, or something even more complex.
Just to give a random example consider a library classification consisting of 
I. Class Publication with properties {publication year, country, city, publishing house}
with further subclasses
Ia. Book with properties {author, title, number of pages, topic}
Ib. Newspaper with properties {date, title, number, size of paper, editor}
Ic. Scientific paper with properties {journal title, authors, affiliations, title, page number, PACS}
Now, if I want to compare this library classification to one from some other library (with probably a slightly different set of classes and attributes), is the algorithm suggested in this paper of any use to me? Is it possible to somehow generalize it to work with structured/hierarchical classes like that?

Answer:

As detailed in Section 5 (Discussion and Conclusion), “The relatedness between two ontologies is measured on a class-based level, i.e., at a low level of the ontologies”. Therefore, we now consider the provided example: 

library classification:

Book with properties {author, title, number of pages, topic}
Newspaper with properties {date, title, number, size of paper, editor}

Scientific paper with properties {journal title, authors, affiliations, title, page number, PACS}

 

Will be divided into four classes:

Class 1: Publication with properties {publication year, country, city, publishing house}
Class 2: Book with properties {author, title, number of pages, topic}
Class 3: Newspaper with properties {date, title, number, size of paper, editor}

Class 4: Scientific paper with properties {journal title, authors, affiliations, title, page number, PACS}

Consider comparing the provided “library” example ontology with the "Scientific article" ontology from Example 1 in the paper; the comparison will be conducted at the class level. We now use our approach to measure the relatedness between the “library” ontology and the “Scientific article” ontology.

 

Ontology/class | Data properties
Scientific Article | article author; submission date; article title; decision of the article; article identifier; domain
Library.Publication | publication year, country, city, publishing house
Library.Book | author, title, number of pages, topic
Library.Newspaper | date, title, number, size of paper, editor
Library.Scientific paper | journal title, authors, affiliations, title, page number, PACS

 

Step 1: Remove the stop words.

Ontology/class | Data properties after removing stop words
Scientific Article | article author; submission date; article title; decision article; article identifier; domain
Library.Publication | publication year, country, city, publishing house
Library.Book | author, title, number pages, topic
Library.Newspaper | date, title, number, size paper, editor
Library.Scientific paper | journal title, authors, affiliations, title, page number, PACS
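
Step 1 can be illustrated with a minimal sketch. The stop-word list below is a deliberately tiny, hypothetical stand-in (the response does not say which list was used), and `remove_stop_words` is an illustrative name.

```python
# Hypothetical, deliberately small stop-word list; the list actually used
# by the authors is not given in the response.
STOP_WORDS = {"of", "the", "a", "an", "in", "for"}


def remove_stop_words(prop: str) -> str:
    """Drop stop words from a data-property name, keeping word order."""
    return " ".join(w for w in prop.split() if w.lower() not in STOP_WORDS)


props = ["decision of the article", "number of pages", "size of paper", "journal title"]
print([remove_stop_words(p) for p in props])
```

With this list, the three multi-word properties that change in the table above ("decision of the article", "number of pages", "size of paper") reduce to the forms shown there, and the others pass through unchanged.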

 

 

Step 2: Remove the class sign

Ontology/class | Data properties after removing class sign
Scientific Article | author; submission date; title; decision; identifier; domain
Library.Publication | year, country, city, publishing house
Library.Book | author, title, number pages, topic
Library.Newspaper | date, title, number, size, editor
Library.Scientific paper | journal title, authors, affiliations, title, page number, PACS
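
The response does not spell out the rule behind Step 2. One rule that reproduces every row of the table above is to drop any word of a property that occurs inside the class name (so "paper" is dropped for Newspaper). The sketch below implements that assumed rule; `remove_class_sign` is a hypothetical name and the rule itself is an inference from the table, not the authors' stated definition.

```python
def remove_class_sign(class_name: str, prop: str) -> str:
    """Drop words that occur inside the class name (assumed rule).

    If every word would be dropped, the property is returned unchanged so
    that no property name becomes empty.
    """
    name = class_name.lower()
    kept = [w for w in prop.split() if w.lower() not in name]
    return " ".join(kept) if kept else prop


examples = [
    ("Scientific Article", "article author"),
    ("Scientific Article", "decision article"),
    ("Publication", "publication year"),
    ("Publication", "publishing house"),
    ("Newspaper", "size paper"),
]
for class_name, prop in examples:
    print(f"{class_name}: {prop!r} -> {remove_class_sign(class_name, prop)!r}")
```

A substring test is crude (a very short word could match by accident), so a real implementation would probably need a stricter notion of "class sign".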

 

Step 3: Stemming

Ontology/class | Data properties after stemming
Scientific Article | author; submit date; title; decide; identify; domain
Library.Publication | year, country, city, publish house
Library.Book | author, title, number page, topic
Library.Newspaper | date, title, number, size, editor
Library.Scientific paper | journal title, author, affiliation, title, page number, PACS
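
Step 3 can be sketched as a word-by-word mapping. The table `STEMS` below is hard-coded from the before/after tables in this response and is purely illustrative; it stands in for whatever stemming tool the authors used, which the response does not name.

```python
# Hypothetical lookup table copied from the before/after tables above;
# it is a stand-in for a real stemmer or lemmatizer.
STEMS = {
    "submission": "submit",
    "decision": "decide",
    "identifier": "identify",
    "publishing": "publish",
    "pages": "page",
    "authors": "author",
    "affiliations": "affiliation",
}


def stem(prop: str) -> str:
    """Replace each word by its entry in STEMS, leaving unknown words as they are."""
    return " ".join(STEMS.get(w.lower(), w) for w in prop.split())


for prop in ["submission date", "publishing house", "number pages", "PACS"]:
    print(f"{prop!r} -> {stem(prop)!r}")
```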

 

Step 4: Apply cartesian product to the properties

 

4.1. Apply cartesian product to the properties of “scientific article” and “library.publication” classes

(author, year); (author, country); (author, city); (author, publish house);

(submit date, year); (submit date, country); (submit date, city); (submit date, publish house);

(title, year); (title, country); (title, city); (title, publish house);

(decide, year); (decide, country); (decide, city); (decide, publish house);

(identify, year); (identify, country); (identify, city); (identify, publish house);

(domain, year); (domain, country); (domain, city); (domain, publish house);

 

4.2. Apply cartesian product to the properties of “scientific article” and “library.book” class

(author, author); (author, title); (author, number page); (author, topic)

(submit date, author); (submit date, title); (submit date, number page); (submit date, topic)

(title, author); (title, title); (title, number page); (title, topic)

(decide, author); (decide, title); (decide, number page); (decide, topic)

(identify, author); (identify, title); (identify, number page); (identify, topic)

(domain, author); (domain, title); (domain, number page); (domain, topic)

 

4.3. Apply cartesian product to the properties of “scientific article” and “library.newspaper” class

(author, date); (author, title); (author, number); (author, size); (author, editor)

(submit date, date); (submit date, title); (submit date, number); (submit date, size); (submit date, editor)

(title, date); (title, title); (title, number); (title, size); (title, editor)

(decide, date); (decide, title); (decide, number); (decide, size); (decide, editor)

(identify, date); (identify, title); (identify, number); (identify, size); (identify, editor)

(domain, date); (domain, title); (domain, number); (domain, size); (domain, editor)

 

4.4. Apply cartesian product to the properties of “scientific article” and “library.scientific paper” class

(author, journal title); (author, author); (author, affiliation); (author, title); (author, page number); (author, PACS);

(submit date, journal title); (submit date, author); (submit date, affiliation); (submit date, title); (submit date, page number); (submit date, PACS);

(title, journal title); (title, author); (title, affiliation); (title, title); (title, page number); (title, PACS);

(decide, journal title); (decide, author); (decide, affiliation); (decide, title); (decide, page number); (decide, PACS);

(identify, journal title); (identify, author); (identify, affiliation); (identify, title); (identify, page number); (identify, PACS);

(domain, journal title); (domain, author); (domain, affiliation); (domain, title); (domain, page number); (domain, PACS);
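
The pair lists in Step 4 can be generated mechanically. The sketch below uses the stemmed property lists from Step 3 and Python's `itertools.product`; the variable names are illustrative.

```python
from itertools import product

# Stemmed data properties, copied from the Step 3 table.
scientific_article = ["author", "submit date", "title", "decide", "identify", "domain"]
library = {
    "publication": ["year", "country", "city", "publish house"],
    "book": ["author", "title", "number page", "topic"],
    "newspaper": ["date", "title", "number", "size", "editor"],
    "scientific paper": ["journal title", "author", "affiliation", "title",
                         "page number", "PACS"],
}

# One list of (scientific article property, library property) pairs per class.
pairs = {name: list(product(scientific_article, props)) for name, props in library.items()}

for name, class_pairs in pairs.items():
    print(f"scientific article x library.{name}: {len(class_pairs)} pairs")
```

The counts are 24, 24, 30 and 36, matching the number of pairs listed in 4.1 to 4.4.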

 

  1. Similarity

There is no synonym similarity between “scientific article” and “library.publication” classes.

Synonyms_Similarity (scientific article, library.publication) = 0

Synonym similarity between “scientific article” and “library.book” classes

(author, author) | Yes
(title, title) | Yes

Synonyms_Similarity (scientific article, library.book) = 33.33%

Synonym similarity between “scientific article” and “library.newspaper” classes

(submit date, date) | Yes
(title, title) | Yes

Synonyms_Similarity (scientific article, library.newspaper) = 33.33%

Synonym similarity between “scientific article” and “library.scientific paper” classes

(author, author) | Yes
(title, journal title) | Yes
(title, title) | Yes

Synonyms_Similarity (scientific article, library.scientific paper) = 50%

Hypernym similarity of classes “scientific article” and “library.publication”

(submit date, year) | Yes

Hypernym_Similarity (scientific article, library.publication) = 16.66%

Hypernym similarity of classes “scientific article” and “library.book”

(title, topic) | Yes
(domain, topic) | Yes

Hypernym_Similarity (scientific article, library.book) = 33.33%

Hypernym similarity of classes “scientific article” and “library.newspaper”

Hypernym_Similarity (scientific article, library.newspaper) = 0

Hypernym similarity of classes “scientific article” and “library.scientific paper”

Hypernym_Similarity (scientific article, library.scientific paper) = 0

The relatedness between the "scientific article" and "library" ontologies has been measured based on class level; this example has been added to the new version as appendix.
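
The percentages above are consistent with one simple reading: the number of matched pairs divided by the number of properties of the first class (six for "scientific article"). The sketch below reproduces the reported figures under that assumption; `similarity_percent` is a hypothetical name, and the matched pairs are copied from the lists above rather than looked up in WordNet.

```python
SCIENTIFIC_ARTICLE = ["author", "submit date", "title", "decide", "identify", "domain"]

# Matched pairs as reported in this response (not recomputed from WordNet).
synonym_matches = {
    "library.publication": [],
    "library.book": [("author", "author"), ("title", "title")],
    "library.newspaper": [("submit date", "date"), ("title", "title")],
    "library.scientific paper": [("author", "author"), ("title", "journal title"),
                                 ("title", "title")],
}
hypernym_matches = {
    "library.publication": [("submit date", "year")],
    "library.book": [("title", "topic"), ("domain", "topic")],
    "library.newspaper": [],
    "library.scientific paper": [],
}


def similarity_percent(matched_pairs, first_class_props):
    """Assumed formula: matched pairs / number of properties of the first class."""
    return 100 * len(matched_pairs) / len(first_class_props)


for kind, table in [("Synonyms", synonym_matches), ("Hypernym", hypernym_matches)]:
    for target, matched in table.items():
        value = similarity_percent(matched, SCIENTIFIC_ARTICLE)
        print(f"{kind}_Similarity(scientific article, {target}) = {value:.2f}%")
```

Rounded to two decimals, 1/6 prints as 16.67%, whereas the response gives the truncated value 16.66%.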

 

Minor comments:
1. In table 2 both classes include 7 attributes, in tables 3 and below both have 6, in the final discussion (lines 270-274) it is claimed that one class ("scientific article") has 6 attributes, and the other ("research") has 7. This is self-contradictory, please fix it.

Answer

This issue has been fixed in the new version.


Comment: 2. The similarities in this example are 5/7 and 5/6. The typical uncertainty in this number is at least 1/6-1/7, i.e. more than 10%. Indeed, in this case there is an example of a false negative comparison - the algorithm was unable to understand that "author" and "researcher" are connected fields (by the way, I applaud the authors for including this false negative in the example). This means that it makes no sense whatsoever to give answers with 4 decimal places; 1 (or at most 2) should be enough:
"therefore, synonyms_similarity (scientific article, research) \approx 0.8 (or 83%). ... therefore, synonyms_similarity (research, scientific article) \approx 0.7 (or 71%)"

Answer

This proposal measures relatedness (synonyms and hypernyms) between ontologies at the class level.  To show the degree of relatedness, we have suggested percentage as a metric for this relatedness. The proposed algorithm can detect the hypernym relationship between "author" and "research".

 

Comment: 3. The design of the evaluation experiment seems very reasonable (albeit small-scale), and the results seem very good. However, please publish the full result of the evaluation as an appendix so the interested readers can dig deeper into the results: after all, 250 comparisons are not that much and can be easily inspected by naked eye.

Answer

We really thank and appreciate our esteemed reviewer for this comment. We will send all software as separate files with the new version. 

Author Response File: Author Response.pdf

Reviewer 2 Report

This paper proposes an approach for defining semantic relatedness between ontologies based on measuring synonym, hyponym, and hypernym components. No clear indication of novelty.

This paper provides a good literature review on hypernyms and hyponyms based similarity scores. However some aspects are not covered on similarity measures, e.g., Discussed in https://content.iospress.com/articles/journal-of-intelligent-and-fuzzy-systems/ifs18120 and “A review on semantic similarity measures for ontology“ (2019) https://ieeexplore.ieee.org/document/8082069  offers “An approach to measuring semantic similarity and relatedness between concepts in an ontology” (2017) not considered.

Limitation of the proposed approach:

1) Ontologies with classes without data properties cannot be compared.

2) Relations between classes are not considered in finding similarity between ontologies.

3) Meta data, e.g., prefix and imports are not considered in finding similarity between ontologies.

4) No similarity evaluation of bigger ontologies done, e.g., Gene Ontology, Open Power Ontology, SAREF, COSMO, etc

Author Response

Comments and Answers:

Manuscript: Systematic Approach for Measuring Semantic Relatedness between Ontologies

Firstly, we are grateful and highly appreciate our esteemed reviewers for their valuable and well-considered comments that guided and assisted us in improving this research.

 

Comments and Suggestions for Authors:

This paper proposes an approach for defining semantic relatedness between ontologies based on measuring synonym, hyponym, and hypernym components. No clear indication of novelty.

Answer

In this research, we have combined concepts from linguistics and knowledge engineering sciences to measure relatedness between ontologies. To the best of our knowledge, this is the first research that provides a flexible relatedness measurement methodology between ontologies, at the class level.

In the new version, the above statement has been added to Section 5 (Discussion and Conclusion).

 

Comment: This paper provides a good literature review on hypernyms and hyponyms based similarity scores. However some aspects are not covered on similarity measures, e.g., Discussed in https://content.iospress.com/articles/journal-of-intelligent-and-fuzzy-systems/ifs18120 and “A review on semantic similarity measures for ontology“ (2019) https://ieeexplore.ieee.org/document/8082069  offers “An approach to measuring semantic similarity and relatedness between concepts in an ontology” (2017) not considered.

 

Answer

In the related work section, we focus on works that used lexical relations, particularly the hypernym-hyponym relationship.

 

The first paper, “A review on semantic similarity measures for ontology”, provides a review of semantic similarity measurements. In our research, we focus on lexical ontology units by using concepts from linguistics.

The second paper has been considered, analyzed, and added to the related works as a reference [35].

 

Comments: Limitation of the proposed approach:
1) Ontologies with classes without data properties cannot be compared.

Answer

Our research only considers classes with data properties. 

 

Comment: 2) Relations between classes are not considered in finding similarity between ontologies.

Answer

Relations between classes are out of the scope of this paper and will be considered in future research.

 

Comment: 3) Meta data, e.g., prefix and imports are not considered in finding similarity between ontologies.

Answer

This suggestion will be considered in future research.

 

Comment: 4) No similarity evaluation of bigger ontologies done, e.g., Gene Ontology, Open Power Ontology, SAREF, COSMO, etc

Answer

In this paper, we have used WordNet to prove the correctness and applicability of our proposed approach. The WordNet lexical database is used as a reference model for calculating synonym and hypernym similarities [35].

Author Response File: Author Response.pdf

Reviewer 3 Report

The paper presents a novel method for ontology matching using an algorithm introduced by the authors. The proposed algorithm is oriented on the semantic relatedness measured using synonyms and hypernym-hyponym relationships. The method is well described. In my opinion, the major shortcoming of this paper is the method evaluation, which uses randomly generated classes grouped in five-tuples. This evaluation gives me neither adequate information about how good the proposed method is, nor a comparison of the proposed method with other approaches. In my opinion, the authors should take some existing ontologies that are used by other authors and compare their results with each other, or at least use some other ontology matching methods on the same ontology and compare the results.

Author Response

Comments and Answers:

Manuscript: Systematic Approach for Measuring Semantic Relatedness between Ontologies

 

Firstly, we are grateful and highly appreciate our esteemed reviewers for their valuable and well-considered comments that guided and assisted us in improving this research.

Comments and Suggestions for Authors:

The paper presents a novel method for ontology matching using an algorithm introduced by the authors. The proposed algorithm is oriented on the semantic relatedness measured using synonyms and hypernym-hyponym relationships. The method is well described. In my opinion, the major shortcoming of this paper is the method evaluation, which uses randomly generated classes grouped in five-tuples. This evaluation gives me neither adequate information about how good the proposed method is, nor a comparison of the proposed method with other approaches. In my opinion, the authors should take some existing ontologies that are used by other authors and compare their results with each other, or at least use some other ontology matching methods on the same ontology and compare the results.

Answer

This paper proposes an approach to measuring relatedness between ontologies using linguistics concepts. The best standard reference that can be used as a benchmark is WordNet. The WordNet lexical database is used as a reference model for calculating synonym and hypernym similarities [35].

 

Author Response File: Author Response.pdf

Reviewer 4 Report

My main comments are :

1) The paper presents a novel approach for measuring semantic relatedness but without a convincing evaluation of this measure. The authors evaluate their measure through a theoretical comparison of characteristics as compared to other measures. This is rather a motivation that should be introduced when defining the measure, but it is not an evaluation. The evaluation should be performed with "true" ontologies.

2) The contribution of the paper could be to recap the concept of semantic relatedness and to analyze related work but these sections are ill-written and not informative. 

Last, here are some comments. The English is very bad; some sentences do not have verbs.

The introduction is difficult to follow, somehow the beginning of the discussion is more clear than the introduction.

In the introduction and in the related work you introduce lots of works but it reads like a list with no clear conclusion and analysis of your own (except one or two sentences in the related work section). When you write line 150 the conclusion of the related work, it is not quite supported by the text for the reader. 

In the methodology, is it correct to write Assumption 1 that way, without specifying that O is an ontology (before the sign "there exists").

You introduce an example to illustrate the approach, yet you don't tell enough about these ontologies, where do they come from.

Author Response

Comments and Answers:

Manuscript: Systematic Approach for Measuring Semantic Relatedness between Ontologies

Firstly, we are grateful and highly appreciate our esteemed reviewers for their valuable and well-considered comments that guided and assisted us in improving this research.

 

Comments and Suggestions for Authors:

My main comments are:

1) The paper presents a novel approach for measuring semantic relatedness but without a convincing evaluation of this measure. The authors evaluate their measure through a theoretical comparison of characteristics as compared to other measures. This is rather a motivation that should be introduced when defining the measure, but it is not an evaluation. The evaluation should be performed with "true" ontologies.

Answer

The evaluation of the accuracy (of the proposed approach) has been implemented by using the parameters of precision and recall. This is the standard method for the evaluation of approaches within this field, and it has been used in all similar works.
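
For readers unfamiliar with the two parameters named in this answer, the standard definitions can be sketched as follows. The counts in the usage line are made-up illustration values, not results from the manuscript, and `precision_recall` is a hypothetical helper name.

```python
def precision_recall(true_pos: int, false_pos: int, false_neg: int) -> tuple[float, float]:
    """Standard definitions:
    precision = TP / (TP + FP), recall = TP / (TP + FN).
    A ratio with a zero denominator is reported as 0.0.
    """
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


# Made-up counts for illustration only: 8 correct matches reported,
# 2 wrong matches reported, 2 true matches missed.
print(precision_recall(8, 2, 2))
```

In this setting a false negative would be a related pair the algorithm fails to report, such as the "author"/"researcher" case raised by Reviewer 1.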

Comment: 2) The contribution of the paper could be to recap the concept of semantic relatedness and to analyze related work, but these sections are ill-written and not informative. 

Answer

In the new version, the following statement has been added to the conclusion: “In this research, we have combined concepts from linguistics and knowledge engineering sciences for measuring relatedness between ontologies. To the best of our knowledge, this is the first research which provides a flexible relatedness measurement methodology between ontologies at the class level.”

In lines 437-438, this statement is provided: “this paper provides a methodology with open technical details, which draws a clear picture of how our approach has been developed”.

 

Comment: Last, here are some comments. The English is very bad; some sentences do not have verbs.

The introduction is difficult to follow, somehow the beginning of the discussion is more clear than the introduction.

Answer

The new version of this paper has been proofread, and the certificate will be attached with the new version.

Comment: In the introduction and in the related work you introduce lots of works but it reads like a list with no clear conclusion and analysis of your own (except one or two sentences in the related work section).

Answer

The new version of this paper has been proofread.

Comment: When you write line 150 the conclusion of the related work, it is not quite supported by the text for the reader. 

Answer:

Table 1 gives a summary of related works, giving the purpose and methodology used. This helps to demonstrate our contribution, as we provide a clear and flexible methodology. It is clear because it is open-source and each step in our methodology has been clearly described. It is flexible as similarity and homonyms have been measured separately.

 

Comment: In the methodology, is it correct to write Assumption 1 that way, without specifying that O is an ontology (before the sign "there exists").

Answer:

Assumption 1:

∀ O, ∃ C: ontology(O) ∧ class(C) ∧ contain(O, C) ⟹ True

Assumption 1 denotes that any (therefore we have used ∀) ontology has at least one class. 
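
For reference, the verbal reading given above (every ontology contains at least one class) is conventionally written with the existential quantifier inside the implication. The following is offered as a sketch of that reading, not as the wording used in the manuscript; the predicate names are taken from the formula above.

```latex
\forall O \,\Bigl( \mathrm{ontology}(O) \Rightarrow
  \exists C \,\bigl( \mathrm{class}(C) \land \mathrm{contain}(O, C) \bigr) \Bigr)
```

Written this way, the condition that O is an ontology appears before the existential quantifier, which is the point the reviewer raises.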

Comment: You introduce an example to illustrate the approach, yet you don't tell enough about these ontologies, where do they come from.

Answer:

These ontologies have been borrowed from

https://www.scitechnol.com/download.php?download=peer-review-pdfs/building-ontology-for-library-management-system-using-dewey-decimal-classification-scheme-Tk7V.pdf

 

Author Response File: Author Response.pdf

Round 2

Reviewer 3 Report

In the previous review I asked that the evaluation be extended in order to give a real-life example using existing ontologies, and to compare the results of the presented method with other approaches. This has not been done in the revised paper. The authors gave in the appendix an example where no synonyms and hypernyms were used, and yet the emphasis of the proposed method is on using synonyms and hypernyms.

Author Response

Response to Reviewer 3 Comments

Point 1: In the previous review I asked that the evaluation be extended in order to give a real-life example using existing ontologies, and to compare the results of the presented method with other approaches. This has not been done in the revised paper.

Response 1:

Firstly, we are grateful and highly appreciate our esteemed reviewers for their valuable and well-considered comments that guided and assisted us in improving this research.

To measure relatedness between two published ontologies, we have considered these two published ontologies as mentioned in Appendix B:

The first ontology is the “People Ontology”, which is published at:

https://spec.edmcouncil.org/fibo/ontology/FND/AgentsAndPeople/People/

The data properties are:

(adult, age of majority, birth certificate, date of birth, date of death, death certificate, driver's license, identity document, incapacitated adult, working, name, passport number, person name, place of birth, residence, relative)

 

The second ontology is the “Person Ontology”, which is published at:

https://enterpriseintegrationlab.github.io/icity/Person/doc/index-en.html#Person

Person Class

The data properties are:

(identifier, date of birth, date of death, mother, father, spouse, Job, associated Income, driving id, address, phone, name)

After applying our proposed algorithm:

Cardinality pair | Synonyms Similarity
(identify, identity document) | Yes
(date birth, date birth) | Yes
(date death, date death) | Yes
(name, name) | Yes

 

Cardinality pair | Hypernym Similarity
(father, relatives) | Yes
(mother, relatives) | Yes
(spouse, relatives) | Yes
(drive id, drive license) | Yes
(Address, residence) | Yes
(Job, working) | Yes

 

The result of relatedness between person and people ontologies based on person classes in both is:

synonyms_similarity (person.person, people.person) = 33.33%

hypernym_similarity (person.person, people.person) = 50%

The above example has been added to the new version as Appendix B.             
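
Part of this example can be checked without any lexical resource. The sketch below lowercases the two property lists given above, removes a hypothetical two-word stop list, and keeps the pairs from the cartesian product whose two sides are identical. It is only a partial illustration: the pair (identify, identity document) and all hypernym pairs depend on a lexical lookup (WordNet in the paper), which is not modeled here.

```python
from itertools import product

STOP_WORDS = {"of", "the"}  # hypothetical minimal stop-word list


def normalize(prop: str) -> str:
    """Lowercase a property name and drop stop words."""
    return " ".join(w for w in prop.lower().split() if w not in STOP_WORDS)


# Data properties as listed in the response.
person = ["identifier", "date of birth", "date of death", "mother", "father", "spouse",
          "Job", "associated Income", "driving id", "address", "phone", "name"]
people = ["adult", "age of majority", "birth certificate", "date of birth",
          "date of death", "death certificate", "driver's license", "identity document",
          "incapacitated adult", "working", "name", "passport number", "person name",
          "place of birth", "residence", "relative"]

pairs = product(map(normalize, person), map(normalize, people))
identical = [(a, b) for a, b in pairs if a == b]

print(identical)
print(f"{len(person)} properties in the Person class")
```

The three identical pairs found this way are three of the four synonym matches in the table above; with the fourth, 4 matches over the 12 Person properties gives the reported 33.33%.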

Point 2: Authors gave in the appendix an example where no synonyms and hypernymy were used and yet the emphasis of the proposed methods is about using synonyms and hypernyms.

 

Response 2:

In Appendix A, we have answered the first reviewer by considering the structured ontology that the reviewer provided.

Author Response File: Author Response.pdf

Round 3

Reviewer 3 Report

The authors have addressed all the prescribed remarks.
