Construction of Topic Hierarchy with Subtree Representation for Knowledge Graphs
Abstract
1. Introduction
- Adapting the nHDP to analyze knowledge graphs and construct hierarchies from them by replacing documents with subjects and words with predicates and objects;
- Proposing a set of new coverage measures to evaluate hierarchies;
- Evaluating hierarchical structures, both quantitatively and qualitatively, through experiments on four real-world datasets: Freebase, Wikidata, DBpedia, and WebRED;
- Demonstrating that the proposed nHDP_KG method achieves the highest hierarchy topic quality (HTQ) on all four datasets, surpassing neural-network-based hierarchical clustering techniques, including TraCo, SawETM, and HyperMiner.
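As a rough illustration of the first contribution, the sketch below groups triples by subject so that each subject plays the role of a document and its predicates and objects play the role of words. The function name `triples_to_subject_docs` is hypothetical, and counting predicates and objects in a single bag per subject is an assumption about the preprocessing, not a confirmed detail of the method. The example triples are taken from the qualitative tables.

```python
from collections import Counter, defaultdict


def triples_to_subject_docs(triples):
    """Group (subject, predicate, object) triples into one bag of 'words' per subject.

    Assumption: predicates and objects are both treated as word tokens.
    """
    docs = defaultdict(Counter)
    for subj, pred, obj in triples:
        docs[subj][pred] += 1  # the predicate acts as a word
        docs[subj][obj] += 1   # the object acts as a word
    return dict(docs)


triples = [
    ("/m/027l0b", "/people/person/profession", "/m/0dxtg"),
    ("/m/027l0b", "/people/person/profession", "/m/02hrh1q"),
    ("/m/027l0b", "/film/actor/film./film/performance/film", "/m/085bd1"),
    ("/m/09qj50", "/award/award_category/winners./award/award_honor/ceremony", "/m/03nnm4t"),
]
docs = triples_to_subject_docs(triples)
```

Each entry of `docs` is then a word-count vector of the kind a topic model expects for a single document.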
2. Related Work
2.1. Hierarchy of Knowledge Graphs
2.2. Knowledge Graph Embedding
3. Hierarchy Construction as Topic Modeling
4. Description of Proposed Method
4.1. Dirichlet Process
4.2. Hierarchical Dirichlet Process
4.3. Adapted Nested Hierarchical Dirichlet Process
4.3.1. Global Tree: Distribution on Paths
4.3.2. Local Tree: Generating a Subject
- For each node in the tree T, draw a beta-distributed random variable that acts as a stochastic switch.
- To generate a word n in a subject s, start at the root node and recursively traverse down the tree, following the subject's distribution over child nodes, until reaching some node. With probability given by that node's switch variable, emit the topic at this node; otherwise, continue traversing down the tree in the same way.
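The two steps above can be sketched on a small, finite tree. This is only an illustrative sketch: the names `draw_switches` and `sample_word_node`, the Beta(1, 1) parameters, the child probabilities, and the rule that a leaf always emits (needed because the tree here is truncated) are all assumptions, since the original symbols are not available. Node IDs mirror those used in the qualitative tables.

```python
import random


def draw_switches(nodes, a=1.0, b=1.0, rng=random):
    """Draw one beta-distributed switch probability per node (a, b are assumed)."""
    return {node: rng.betavariate(a, b) for node in nodes}


def sample_word_node(children, child_probs, switches, root=(1,), rng=random):
    """Walk down from the root and return the node whose topic emits the word."""
    node = root
    while True:
        kids = children.get(node, [])
        # Emit here if the switch fires, or if the truncated tree has no deeper node.
        if not kids or rng.random() < switches[node]:
            return node
        # Otherwise move to a child according to the subject's child distribution.
        node = rng.choices(kids, weights=child_probs[node], k=1)[0]


children = {
    (1,): [(1, 1), (1, 2)],
    (1, 1): [(1, 1, 1), (1, 1, 3)],
    (1, 2): [(1, 2, 1), (1, 2, 4)],
}
child_probs = {(1,): [0.7, 0.3], (1, 1): [0.5, 0.5], (1, 2): [0.4, 0.6]}
leaves = [(1, 1, 1), (1, 1, 3), (1, 2, 1), (1, 2, 4)]
nodes = list(children) + leaves

switches = draw_switches(nodes, rng=random.Random(0))
emitting_node = sample_word_node(children, child_probs, switches, rng=random.Random(1))
```

A switch value close to 1 keeps words near the root (general topics), while values close to 0 push them toward deeper, more specific nodes.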
4.4. Stochastic Variational Inference
4.4.1. Greedy Subtree Selection
4.4.2. Stochastic Updates for Local Variables
4.4.3. Stochastic Updates for Global Variables
5. Experiment Setup
5.1. Dataset
5.2. Evaluation Metrics
5.2.1. Hierarchy Topic Quality
5.2.2. Coverage
5.3. Experiment Environment
6. Experiment Results
6.1. Quantitative Evaluation
6.2. Qualitative Evaluation
6.3. Discussion and Limitation
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Dataset | # Subjects | # Entities | # Relations | # Triplets |
---|---|---|---|---|
FB15k-237 | 13,781 | 14,541 | 237 | 272,115 |
FB15k-237 subset | 10,000 | 22,982 | 237 | 197,497 |
DBpedia | 908 | 31,222 | 345 | 57,192 |
Wikidata subset | 10,000 | 27,608 | 374 | 44,896 |
WebRED subset | 10,000 | 16,595 | 428 | 45,712 |

Models | Embeddings | FB15k-237 BTQ | FB15k-237 LTQ | FB15k-237 HTQ | Wikidata BTQ | Wikidata LTQ | Wikidata HTQ | DBpedia BTQ | DBpedia LTQ | DBpedia HTQ | WebRED BTQ | WebRED LTQ | WebRED HTQ |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
nHDP_KG | | 0.785 | 0.805 | 0.795 | 0.671 | 0.647 | 0.659 | 0.502 | 0.445 | 0.473 | 0.489 | 0.479 | 0.484 |
hLDA | | 0.662 | 0.592 | 0.627 | 0.299 | 0.228 | 0.263 | 0.413 | 0.424 | 0.418 | 0.269 | 0.220 | 0.244 |
TraCo | TransE | 0.385 | 0.385 | 0.385 | 0.015 | 0.023 | 0.019 | 0.443 | 0.449 | 0.446 | 0.190 | 0.190 | 0.190 |
TraCo | DistMult | 0.337 | 0.341 | 0.339 | 0.075 | 0.150 | 0.112 | 0.396 | 0.420 | 0.408 | 0.267 | 0.277 | 0.272 |
TraCo | ComplEx | 0.427 | 0.427 | 0.427 | 0.012 | 0.017 | 0.014 | 0.369 | 0.390 | 0.380 | 0.379 | 0.414 | 0.396 |
TraCo | RotatE | 0.322 | 0.322 | 0.322 | 0.098 | 0.131 | 0.115 | 0.410 | 0.423 | 0.416 | 0.272 | 0.282 | 0.277 |
TraCo | HolE | 0.370 | 0.374 | 0.372 | 0.100 | 0.158 | 0.129 | 0.344 | 0.333 | 0.339 | 0.283 | 0.287 | 0.285 |
SawETM | TransE | 0.564 | 0.647 | 0.626 | 0.203 | 0.203 | 0.203 | 0.268 | 0.436 | 0.352 | 0.410 | 0.481 | 0.445 |
SawETM | DistMult | 0.548 | 0.635 | 0.591 | 0.082 | 0.086 | 0.084 | 0.272 | 0.390 | 0.331 | 0.294 | 0.346 | 0.319 |
SawETM | ComplEx | 0.505 | 0.621 | 0.563 | 0.116 | 0.116 | 0.116 | 0.301 | 0.459 | 0.380 | 0.272 | 0.320 | 0.296 |
SawETM | RotatE | 0.524 | 0.621 | 0.573 | 0.137 | 0.143 | 0.140 | 0.167 | 0.299 | 0.233 | 0.318 | 0.380 | 0.349 |
SawETM | HolE | 0.521 | 0.607 | 0.564 | 0.203 | 0.203 | 0.203 | 0.143 | 0.218 | 0.180 | 0.317 | 0.367 | 0.342 |
HyperMiner | TransE | 0.628 | 0.705 | 0.666 | 0.097 | 0.100 | 0.098 | 0.128 | 0.384 | 0.256 | 0.367 | 0.374 | 0.370 |
HyperMiner | DistMult | 0.650 | 0.734 | 0.692 | 0.028 | 0.034 | 0.031 | 0.128 | 0.384 | 0.256 | 0.335 | 0.335 | 0.335 |
HyperMiner | ComplEx | 0.559 | 0.635 | 0.597 | 0.121 | 0.138 | 0.130 | 0.128 | 0.384 | 0.256 | 0.306 | 0.324 | 0.315 |
HyperMiner | RotatE | 0.607 | 0.682 | 0.644 | 0.101 | 0.109 | 0.105 | 0.128 | 0.384 | 0.256 | 0.294 | 0.354 | 0.324 |
HyperMiner | HolE | 0.580 | 0.655 | 0.618 | 0.082 | 0.094 | 0.088 | 0.128 | 0.384 | 0.256 | 0.402 | 0.434 | 0.418 |

Entities | Node ID | Details |
---|---|---|
Subtree | (1,) | |
Subtree | (1, 1) | •/people/person/profession |
Subtree | (1, 1) | /award/award_nominee/award_nominations./award/award_nomination/award |
Subtree | (1, 1) | •/film/actor/film./film/performance/film |
Subtree | (1, 1) | •/award/award_nominee/award_nominations./award/award_nomination/nominated_for |
Subtree | (1, 1) | /people/person/nationality |
Subtree | (1, 1, 1) | •/award/award_nominee/award_nominations./award/award_nomination/award |
Subtree | (1, 1, 1) | •/people/person/profession |
Subtree | (1, 1, 1) | /award/award_nominee/award_nominations./award/award_nomination/award_nominee |
Subtree | (1, 1, 1) | /music/artist/track_contributions./music/track_contribution/role |
Subtree | (1, 1, 1) | /award/award_winner/awards_won./award/award_honor/award_winner |
Subtree | (1, 1, 3) | •/award/award_nominee/award_nominations./award/award_nomination/award_nominee |
Subtree | (1, 1, 3) | /award/award_winner/awards_won./award/award_honor/award_winner |
Subtree | (1, 1, 3) | •/film/actor/film./film/performance/film |
Subtree | (1, 1, 3) | /award/award_nominee/award_nominations./award/award_nomination/award |
Subtree | (1, 1, 3) | •/award/award_nominee/award_nominations./award/award_nomination/nominated_for |
Original_Triples | | Original Triples for: /m/027l0b (Gene Wilder) |
Original_Triples | | ‘/film/actor/film./film/performance/film’, ‘/m/085bd1’ |
Original_Triples | | ‘/common/topic/webpage./common/webpage/category’, ‘/m/08mbj5d’ |
Original_Triples | | ‘/people/person/place_of_birth’, ‘/m/0dyl9’ |
Original_Triples | | ‘/film/actor/film./film/performance/film’, ‘/m/0hvvf’ |
Original_Triples | | ‘/award/award_nominee/award_nominations./award/award_nomination/award’, ‘/m/09qvc0’ |
Original_Triples | | ‘/award/award_nominee/award_nominations./award/award_nomination/award’, ‘/m/0gqy2’ |
Original_Triples | | ‘/award/award_nominee/award_nominations./award/award_nomination/nominated_for’, ‘/m/0291ck’ |
Original_Triples | | ‘/award/award_winner/awards_won./award/award_honor/award_winner’, ‘/m/052hl’ |
Original_Triples | | ‘/people/person/profession’, ‘/m/0dxtg’ |
Original_Triples | | ‘/people/person/religion’, ‘/m/03_gx’ |
Original_Triples | | ‘/people/person/spouse_s./people/marriage/type_of_union’, ‘/m/04ztj’ |
Original_Triples | | ‘/people/person/profession’, ‘/m/02jknp’ |
Original_Triples | | ‘/people/person/profession’, ‘/m/018gz8’ |
Original_Triples | | ‘/people/person/religion’, ‘/m/0kpl’ |
Original_Triples | | ‘/people/person/profession’, ‘/m/02hrh1q’ |
Original_Triples | | ‘/people/person/profession’, ‘/m/0kyk’ |
Original_Triples | | ‘/film/actor/film./film/performance/film’, ‘/m/017kz7’ |
Original_Triples | | ‘/film/actor/film./film/performance/film’, ‘/m/0291ck’ |
Original_Triples | | ‘/people/person/profession’, ‘/m/0xzm’ |
Original_Triples | | ‘/film/actor/film./film/performance/film’, ‘/m/018f8’ |
Original_Triples | | ‘/award/award_nominee/award_nominations./award/award_nomination/nominated_for’, ‘/m/01q_y0’ |
Original_Triples | | ‘/people/person/profession’, ‘/m/0cbd2’ |

Entities | Node ID | Details |
---|---|---|
Subtree | (1,) | |
Subtree | (1, 1) | •/people/person/profession |
Subtree | (1, 1) | /award/award_nominee/award_nominations./award/award_nomination/award |
Subtree | (1, 1) | •/film/actor/film./film/performance/film |
Subtree | (1, 1) | •/award/award_nominee/award_nominations./award/award_nomination/nominated_for |
Subtree | (1, 1) | /people/person/nationality |
Subtree | (1, 1, 1) | •/award/award_nominee/award_nominations./award/award_nomination/award |
Subtree | (1, 1, 1) | •/people/person/profession |
Subtree | (1, 1, 1) | /award/award_nominee/award_nominations./award/award_nomination/award_nominee |
Subtree | (1, 1, 1) | /music/artist/track_contributions./music/track_contribution/role |
Subtree | (1, 1, 1) | /award/award_winner/awards_won./award/award_honor/award_winner |
Subtree | (1, 1, 3) | •/award/award_nominee/award_nominations./award/award_nomination/award_nominee |
Subtree | (1, 1, 3) | /award/award_winner/awards_won./award/award_honor/award_winner |
Subtree | (1, 1, 3) | •/film/actor/film./film/performance/film |
Subtree | (1, 1, 3) | /award/award_nominee/award_nominations./award/award_nomination/award |
Subtree | (1, 1, 3) | •/award/award_nominee/award_nominations./award/award_nomination/nominated_for |
Subtree | (1, 2) | •/music/record_label/artist |
Subtree | (1, 2) | /m/09nqf (United States dollar) |
Subtree | (1, 2) | /people/ethnicity/people |
Subtree | (1, 2) | /olympics/olympic_participating_country/medals_won./olympics/olympic_medal_honor/olympics |
Subtree | (1, 2) | /media_common/netflix_genre/titles |
Subtree | (1, 2, 1) | •/location/location/contains |
Subtree | (1, 2, 1) | /location/location/adjoin_s./location/adjoining_relationship/adjoins |
Subtree | (1, 2, 1) | /location/location/time_zones |
Subtree | (1, 2, 1) | /m/0jbk9 (United States Department of Housing and Urban Development) |
Subtree | (1, 2, 1) | /location/hud_foreclosure_area/estimated_number_of_mortgages./measurement_unit/dated_integer/source |
Subtree | (1, 2, 4) | •/award/award_category/winners./award/award_honor/award_winner |
Subtree | (1, 2, 4) | •/award/award_category/winners./award/award_honor/ceremony |
Subtree | (1, 2, 4) | •/award/award_category/nominees./award/award_nomination/nominated_for |
Subtree | (1, 2, 4) | /government/government_office_category/officeholders./government/government_position_held/jurisdiction_of_office |
Subtree | (1, 2, 4) | /business/job_title/people_with_this_title./business/employment_tenure/company |
Original_Triples | | Original Triples for: /m/09qj50 (Primetime Emmy Award for Outstanding Lead Actress in a Comedy Series) |
Original_Triples | | ‘/award/award_category/nominees./award/award_nomination/nominated_for’, ‘/m/0kfpm’ |
Original_Triples | | ‘/award/award_category/nominees./award/award_nomination/nominated_for’, ‘/m/02xhpl’ |
Original_Triples | | ‘/award/award_category/winners./award/award_honor/award_winner’, ‘/m/04nw9’ |
Original_Triples | | ‘/award/award_category/winners./award/award_honor/award_winner’, ‘/m/09yrh’ |
Original_Triples | | ‘/award/award_category/nominees./award/award_nomination/nominated_for’, ‘/m/02czd5’ |
Original_Triples | | ‘/award/award_category/nominees./award/award_nomination/nominated_for’, ‘/m/01b9w3’ |
Original_Triples | | ‘/award/award_category/winners./award/award_honor/award_winner’, ‘/m/0c4f4’ |
Original_Triples | | ‘/award/award_category/winners./award/award_honor/ceremony’, ‘/m/03nnm4t’ |
Original_Triples | | ‘/award/award_category/winners./award/award_honor/award_winner’, ‘/m/0pyww’ |
Original_Triples | | ‘/award/award_category/nominees./award/award_nomination/nominated_for’, ‘/m/0q9jk’ |
Original_Triples | | ‘/award/award_category/winners./award/award_honor/ceremony’, ‘/m/02q690_’ |
Original_Triples | | ‘/award/award_category/nominees./award/award_nomination/nominated_for’, ‘/m/0l76z’ |
Original_Triples | | ‘/award/award_category/winners./award/award_honor/award_winner’, ‘/m/01z5tr’ |
Original_Triples | | ‘/award/award_category/nominees./award/award_nomination/nominated_for’, ‘/m/01bv8b’ |
Original_Triples | | ‘/award/award_category/nominees./award/award_nomination/nominated_for’, ‘/m/0vjr’ |
Original_Triples | | ‘/award/award_category/nominees./award/award_nomination/nominated_for’, ‘/m/01q_y0’ |
Original_Triples | | ‘/award/award_category/nominees./award/award_nomination/nominated_for’, ‘/m/023ny6’ |
Original_Triples | | ‘/award/award_category/winners./award/award_honor/award_winner’, ‘/m/0m66w’ |

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhang, Y.; Xu, W.; Yu, Z.; Reformat, M.Z. Construction of Topic Hierarchy with Subtree Representation for Knowledge Graphs. Axioms 2025, 14, 300. https://doi.org/10.3390/axioms14040300