Exploring the Importance of Entities in Semantic Ranking
Abstract
1. Introduction
- Dataset analysis shows that the importance of each entity in a document should be taken into account in the information retrieval (IR) task.
- The toy model and Explicit Semantic Ranking (ESR) are enhanced by considering the importance of entities in documents.
- Extensive experiments show that the enhanced models outperform the original models on all evaluation metrics, especially on long queries and on queries where ESR fails.
2. Related Work
2.1. Traditional Information Retrieval Methods
2.2. Entity-Based Ranking Methods
2.3. Entity Ranking
3. Data Analysis
3.1. Dataset
3.2. Entity Ranking
3.3. Analysis Result
4. Proposed Approach
4.1. Basic Entity-Based Models
4.1.1. Basic Toy Model
4.1.2. Basic Explicit Semantic Ranking (ESR)
4.2. Enhanced Entity-Based Model by Considering the Importance of Entities
4.2.1. Enhanced Toy Model by Considering the Importance of Entities
4.2.2. ESR Optimized by Considering the Importance of Entities
5. Results and Discussion
5.1. Experimental Setup
5.2. Evaluation and Analysis
5.2.1. Results of Enhanced Toy Model
5.2.2. Results of Enhanced ESR
5.3. Performance on Different Scenarios
5.3.1. Multiple Difficulty Degrees
5.3.2. Multiple Query Length Degrees
5.4. Impact of Document Entities with Different Order
6. Conclusions
Author Contributions
Funding
Conflicts of Interest
Method | NDCG@1 | vs. toy model | NDCG@5 | vs. toy model | NDCG@10 | vs. toy model | NDCG@20 | vs. toy model | W/T/L |
---|---|---|---|---|---|---|---|---|---|
tf.idf-F | 0.2020 | +12.98% | 0.2254 | +3.73% | 0.2741 | −2.94% | 0.3299 | −16.73% | 30/2/68 |
toy model | 0.1788 | - | 0.2173 | - | 0.2824 | - | 0.3962 | - | - |
toy model with TextRank | 0.2221 | +24.22% | 0.2402 | +10.54% | 0.3025 | +7.12% | 0.4108 | +3.69% | 49/18/33 |
toy model with TF-IDF | 0.2128 | +19.02% | 0.2410 | +10.91% | 0.2981 | +5.56% | 0.4143 | +4.57% | 45/18/37 |
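
The NDCG@k columns above can be illustrated with a minimal sketch. It assumes the common exponential-gain formulation (gain `2^rel - 1`, discount `log2(rank + 1)`) and normalizes against the ideal ordering of the same judged list; the exact variant used in the experiments is not stated here, and the function names are hypothetical.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k graded relevance labels.

    Assumption: exponential gain (2^rel - 1) and log2 position discount.
    """
    return sum(
        (2 ** rel - 1) / math.log2(position + 2)  # position 0 -> log2(2) = 1
        for position, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """NDCG@k: DCG of the ranking divided by DCG of the ideal reordering.

    Assumption: the ideal ranking is built from the same list of labels.
    """
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical graded labels for one ranked list, best label is 3.
    ranked_labels = [1, 3, 0, 2]
    print(f"NDCG@1 = {ndcg_at_k(ranked_labels, 1):.4f}")
    print(f"NDCG@4 = {ndcg_at_k(ranked_labels, 4):.4f}")
```

A per-method score such as those in the table would then be the mean of `ndcg_at_k` over all test queries.
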
Method | P@3 | vs. toy model | P@10 | vs. toy model |
---|---|---|---|---|
toy model | 0.4167 | - | 0.4400 | - |
toy model with TextRank | 0.4667 | +12.00% | 0.4510 | +2.50% |
toy model with TF-IDF | 0.4500 | +7.99% | 0.4500 | +2.72% |
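
As a reminder of what the P@k columns measure, here is a small sketch of precision at cutoff k for a single query. The binary relevance flags and the function name are illustrative assumptions, not taken from the experiments.

```python
from typing import Sequence


def precision_at_k(relevant_flags: Sequence[bool], k: int) -> float:
    """Fraction of the top-k ranked documents that are relevant.

    `relevant_flags[i]` is True if the document at rank i + 1 is relevant.
    Positions beyond the end of the list count as non-relevant.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for flag in relevant_flags[:k] if flag) / k


if __name__ == "__main__":
    # Hypothetical ranking: relevant documents at ranks 1 and 3.
    flags = [True, False, True, False, False]
    print(precision_at_k(flags, 3))
    print(precision_at_k(flags, 5))
```
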
Method | NDCG@1 | vs. ESR | NDCG@5 | vs. ESR | NDCG@10 | vs. ESR | NDCG@20 | vs. ESR | W/T/L |
---|---|---|---|---|---|---|---|---|---|
tf.idf-F | 0.2020 | −2.88% | 0.2254 | −1.79% | 0.2741 | −6.26% | 0.3299 | −17.21% | 28/5/67 |
ESR | 0.2080 | - | 0.2295 | - | 0.2924 | - | 0.3985 | - | - |
ESR with TextRank | 0.2270 | +9.13% | 0.2358 | +2.75% | 0.3029 | +3.59% | 0.4094 | +2.74% | 46/23/31 |
ESR with TF-IDF | 0.2154 | +3.56% | 0.2319 | +1.05% | 0.2949 | +0.85% | 0.4040 | +1.38% | 40/25/35 |
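
The W/T/L column is read here as the number of queries on which a method wins, ties, or loses against the baseline; each row sums to 100, which is consistent with counts over 100 queries. Under that reading, a minimal sketch of the tally looks as follows. The per-query scores, the tie tolerance, and the function name are hypothetical.

```python
from typing import Sequence, Tuple


def win_tie_loss(
    method_scores: Sequence[float],
    baseline_scores: Sequence[float],
    tol: float = 1e-9,
) -> Tuple[int, int, int]:
    """Count queries where the method beats, ties, or trails the baseline.

    Both sequences hold one score per query (e.g. per-query NDCG@20),
    aligned by query. `tol` is an assumed tolerance for treating two
    scores as a tie.
    """
    if len(method_scores) != len(baseline_scores):
        raise ValueError("score lists must cover the same queries")
    wins = ties = losses = 0
    for method, baseline in zip(method_scores, baseline_scores):
        if abs(method - baseline) <= tol:
            ties += 1
        elif method > baseline:
            wins += 1
        else:
            losses += 1
    return wins, ties, losses


if __name__ == "__main__":
    # Hypothetical per-query scores for four queries.
    enhanced = [0.50, 0.30, 0.20, 0.40]
    baseline = [0.40, 0.30, 0.30, 0.10]
    print("W/T/L = %d/%d/%d" % win_tie_loss(enhanced, baseline))
```
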
Method | P@3 | vs. ESR | P@10 | vs. ESR |
---|---|---|---|---|
ESR | 0.4600 | - | 0.4550 | - |
ESR with TextRank | 0.4633 | +0.72% | 0.4650 | +2.20% |
ESR with TF-IDF | 0.4900 | +6.52% | 0.4690 | +3.08% |
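
The percentage columns in the result tables are relative changes with respect to the baseline row. The sketch below recomputes them for the ESR with TextRank row of the NDCG table, using the scores reported there; the function and variable names are illustrative.

```python
def relative_change(score: float, baseline: float) -> float:
    """Relative change of `score` over `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (score - baseline) / baseline * 100.0


# Scores copied from the ESR and ESR with TextRank rows of the NDCG table.
esr = {"NDCG@1": 0.2080, "NDCG@5": 0.2295, "NDCG@10": 0.2924, "NDCG@20": 0.3985}
esr_textrank = {"NDCG@1": 0.2270, "NDCG@5": 0.2358, "NDCG@10": 0.3029, "NDCG@20": 0.4094}

if __name__ == "__main__":
    for metric, baseline_score in esr.items():
        change = relative_change(esr_textrank[metric], baseline_score)
        print(f"{metric}: {change:+.2f}%")
```

Run as a script, this prints +9.13%, +2.75%, +3.59%, and +2.74%, matching the values reported for ESR with TextRank.
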
Method | NDCG@1 | NDCG@5 | NDCG@10 | NDCG@20 |
---|---|---|---|---|
important entities | 0.2252 | 0.2427 | 0.3042 | 0.4070 |
important entities + related entities | 0.2203 | 0.2438 | 0.3014 | 0.4072 |
all entities | 0.2244 | 0.2397 | 0.2984 | 0.4089 |
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Li, Z.; Xu, G.; Liang, X.; Li, F.; Wang, L.; Zhang, D. Exploring the Importance of Entities in Semantic Ranking. Information 2019, 10, 39. https://doi.org/10.3390/info10020039