Numerical Markov Logic Network: A Scalable Probabilistic Framework for Hybrid Knowledge Inference
Abstract
1. Introduction
- Modeling the integration of logic formulas and arithmetic expressions. We note that the recent approach of Probabilistic Soft Logic (PSL) [10] enables MAP inference on continuous variables over a set of arithmetic rules by treating each rule as a constraint on the prior probability. However, an arithmetic expression is not a predefined continuous logic variable; thus, it cannot be easily integrated into the objective function defined by PSL. Specifically, even though an arithmetic inequality can be regarded as a Boolean variable by PSL, computing its truth value by the function used in PSL would render the corresponding objective function non-convex. Since PSL inference is built on convex optimization, applying it to such rules would lead to inaccurate results and convergence failure. Therefore, the existing MLN solutions cannot effectively support the integration of logic formulas and arithmetic expressions.
- Scalability. Arithmetic expressions usually involve pair-wise numerical comparisons. The existing MLN solutions generate all combinations of the predicate variables in the grounding process, which results in an undesirable quadratic or even cubic explosion of grounded clauses and can easily render the inference process unscalable. For instance, for the pair-wise comparison rule in Table 1, the existing inference solutions would generate a quadratic number of grounded clauses for n variables. It is worth pointing out that clause explosion results not only in inference inefficiency but also in meaningless inference results. Under clause explosion, the techniques based on Gibbs sampling [11,12] may fail because the sampler becomes trapped in a local state. As shown in our experimental study, the predictions of PSL may become inaccurate because it fails to converge.
- We propose a novel probabilistic framework for hybrid knowledge inference. We define the hybrid knowledge rules and present the optimization model.
- We propose a scalable inference approach for the proposed framework based on the decomposition of the exp-loss function.
- We present a parallel solution for hybrid knowledge inference based on convex optimization.
- We empirically evaluate the performance of the proposed framework on real data. Our extensive experiments show that compared to the existing MLN techniques, the proposed approach can achieve better prediction accuracy while significantly reducing inference time.
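As an illustration of the clause-explosion problem discussed above, the following sketch counts the groundings of a pair-wise comparison rule over n constants. The counting function is purely illustrative and is not part of the proposed system:

```python
from math import comb

def grounded_clause_count(n_constants: int, arity: int) -> int:
    """Number of groundings of a rule whose free variables range over
    all size-`arity` subsets of the constants (symmetric comparison)."""
    return comb(n_constants, arity)

# A pair-wise comparison rule grounds once per pair of constants,
# so the number of grounded clauses grows quadratically with n:
counts = [grounded_clause_count(n, 2) for n in (10, 100, 1000)]
print(counts)  # [45, 4950, 499500]
```

At one thousand constants, a single pair-wise rule already produces roughly half a million grounded clauses, which is why grounding-time blow-up dominates the cost of existing MLN inference.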
2. Related Work
3. Hybrid Knowledge Rules
- (1) is a first-order relation or its negation;
- (2) is a logic expression over its variables;
- (3) is a linear inequality over the numerical attributes.
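The three-part definition above couples a logic expression with a linear inequality over numerical attributes. The following is a minimal data-structure sketch under our own naming (Literal, HybridRule, and inequality_slack are illustrative, not from the paper's implementation):

```python
from dataclasses import dataclass, field

@dataclass
class Literal:
    relation: str            # first-order relation name
    args: tuple              # argument variables/constants
    negated: bool = False    # True for the negated literal

@dataclass
class HybridRule:
    weight: float            # rule weight
    logic_part: list         # Literal objects forming the logic expression
    coeffs: dict = field(default_factory=dict)  # attribute -> coefficient a_i
    bound: float = 0.0       # bound b of the inequality sum(a_i * x_i) <= b

    def inequality_slack(self, assignment: dict) -> float:
        """sum(a_i * x_i) - b; a positive value means the linear
        inequality is violated under the given assignment."""
        return sum(a * assignment[v] for v, a in self.coeffs.items()) - self.bound
```

For example, a rule comparing two price attributes would set coeffs to {"px": 1.0, "py": -1.0} with bound 0.0, so that a positive slack signals px > py.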
4. Inference Framework
- The exp-loss is a natural extension of the hinge-loss function defined in Equation (8). The exponential power guarantees a greater loss when a violation of the rule occurs. On the other hand, even though the function is not zero when the rule is satisfied, both the value of the exp-loss and its gradient become very small in the negative interval, so the function acts as a soft constraint.
- As shown in the following section, the exp-loss function enables scalable inference based on function decomposition, which effectively addresses the challenge of grounded-clause explosion.
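The contrast between the two losses can be seen numerically. A small sketch, assuming the common reading that both losses take the rule's signed distance to satisfaction d (positive means violated):

```python
import math

def hinge_loss(d: float) -> float:
    """Hinge loss: exactly zero as soon as the rule is satisfied (d <= 0)."""
    return max(0.0, d)

def exp_loss(d: float) -> float:
    """Exponential loss: strictly positive everywhere, but its value and
    its gradient exp(d) decay rapidly once d < 0, so satisfaction acts as
    a soft rather than hard constraint."""
    return math.exp(d)

# On violation (d > 0) the exp-loss penalizes more sharply than hinge;
# on satisfaction (d < 0) it is small but nonzero.
for d in (-3.0, -1.0, 0.0, 1.0, 2.0):
    print(f"d={d:+.1f}  hinge={hinge_loss(d):.3f}  exp={exp_loss(d):.3f}")
```

Note that exp_loss is smooth and convex everywhere, whereas the hinge loss is non-differentiable at d = 0; this smoothness is what the decomposition in Section 5 exploits.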
5. Inference Optimization
5.1. Decomposition of Exponential Loss Function
Algorithm 1: Find irreducible groups and joint variables.

Input: relation set and predicate variable set with respect to the rule.
Output: irreducible groups and joint variable set.

initialize the group collection and the joint variable set as empty;
for each relation do
    for each of its predicate variables do
        if the variable already belongs to a group then
            find that group and the current relation's group;
            merge the two sets;
        end
    end
end
for each predicate variable do
    for each group do
        if the variable appears in more than one relation of the group then
            add it to the joint variable set;
        end
    end
end
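The merging step of Algorithm 1 is essentially a union–find computation over clauses that share predicate variables. A compact sketch of that reading (all names are ours, not the paper's):

```python
def find_irreducible_groups(clauses):
    """Merge clauses that share predicate variables into irreducible
    groups; a variable appearing in more than one clause is a joint
    variable tying its group together.

    `clauses` is a list of variable sets, one per grounded clause.
    Returns (groups, joint_variables)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    seen = {}        # variable -> first clause it appeared in
    joint = set()
    for i, vars_i in enumerate(clauses):
        for v in vars_i:
            if v in seen:
                union(i, seen[v])           # shared variable -> same group
                joint.add(v)
            else:
                seen[v] = i

    groups = {}
    for i, vars_i in enumerate(clauses):
        groups.setdefault(find(i), set()).update(vars_i)
    return list(groups.values()), joint
```

For example, the clauses {a, b}, {b, c}, {d} yield two irreducible groups, {a, b, c} and {d}, with b as the single joint variable.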
Algorithm 2: Inference of hybrid knowledge rules.

Input: set of hybrid knowledge rules, relation set, predicate variable set with respect to the rules, and the instances of dataset D.
Output: solution for the inference variables V.

generate the set of variables according to the rules and D;
for each hybrid knowledge rule do
    find its irreducible groups and joint variables by Algorithm 1;
    generate the grounded terms for the rule (grounding) in the form of Equation (20);
end
find the optimal solution of the resulting objective;
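The final step of Algorithm 2 minimizes a sum of exp-losses, which is convex since each term is the exponential of a linear function. As a toy stand-in for the paper's parallel solver, the following plain gradient-descent sketch minimizes an objective of that form (illustrative only):

```python
import math

def map_inference(terms, variables, lr=0.05, steps=2000):
    """Gradient descent on the exp-loss objective
        L(x) = sum_j exp( sum_i a_ji * x_i - b_j ).
    `terms` is a list of (coeffs: dict, bound: float) pairs; each pair
    contributes exp(sum(a_i * x_i) - b) to the objective."""
    x = {v: 0.0 for v in variables}
    for _ in range(steps):
        grad = {v: 0.0 for v in variables}
        for coeffs, b in terms:
            e = math.exp(sum(a * x[v] for v, a in coeffs.items()) - b)
            for v, a in coeffs.items():
                grad[v] += a * e   # d/dx_i of exp(linear) = a_i * exp(linear)
        for v in variables:
            x[v] -= lr * grad[v]
    return x

# Two opposing terms exp(x) + exp(-2x) have a unique minimum at x = ln(2)/3:
solution = map_inference([({"x": 1.0}, 0.0), ({"x": -2.0}, 0.0)], ["x"])
```

Because the objective decomposes over the irreducible groups found by Algorithm 1, each group can be optimized independently, which is what enables the parallel solution of Section 5.2.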
5.2. Parallel Optimization
6. Experimental Study
6.1. Comparative Study
- Wiki-Sports: This dataset contains articles on the topic of sports extracted from the featured article pages in Wikipedia. The mentions in the dataset are extracted from the anchor texts in the articles and annotated with the entities to which they link. We used the Wikipedia disambiguation pages to generate the candidates for each mention. To avoid leakage of label information, we excluded the corresponding Wiki pages when extracting the link text for the entities.
- Wiki-FourDomains: This dataset contains Wikipedia articles on four topics: films, music, novels, and television episodes. We applied the same process as for Wiki-Sports to generate mention annotations and candidate entities.
6.2. Scalability
6.3. Sensitivity Evaluation
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A. Knowledge Rules in the Phone Dataset
| Knowledge Rules | Weight |
|---|---|
| | 2 |
| | 4 |
| | 1 |
| | 2 |
| | 1 |
| | 1 |
Appendix B. Knowledge Rules in the Aida Dataset
- : the prior distribution computed by the number of entities linking to e.
- : the similarity measured by mutual information according to the “hasWikipediaAnchorText” file in YAGO2.
- : the similarity measured by word2vec according to the “hasWikipediaAnchorText” file in YAGO2.
- : the similarity measured by mutual information according to the “hasWikipediaCategory” file in YAGO2.
- : the similarity measured by word2vec according to the “hasWikipediaCategory” file in YAGO2.
- : the edit distance-based similarity.
- : the syntactical similarity computed by WordNet and YAGO2.
- : the maximal set of candidate pairs for one document, in which all entities jointly achieve the maximum word similarity.
| Knowledge Rules | Weight |
|---|---|
| | – |
| | 1 |
| | 1 |
References
- Richardson, M.; Domingos, P.M. Markov logic networks. Mach. Learn. 2006, 62, 107–136. [Google Scholar] [CrossRef] [Green Version]
- Banerjee, O.; El Ghaoui, L.; d’Aspremont, A. Model Selection Through Sparse Maximum Likelihood Estimation for Multivariate Gaussian or Binary Data. J. Mach. Learn. Res. 2008, 9, 485–516. [Google Scholar]
- Dong, X.L.; Gabrilovich, E.; Heitz, G.; Horn, W.; Murphy, K.; Sun, S.; Zhang, W. From Data Fusion to Knowledge Fusion. PVLDB 2014, 7, 881–892. [Google Scholar] [CrossRef] [Green Version]
- Jiang, S.; Lowd, D.; Dou, D. Learning to Refine an Automatically Extracted Knowledge Base Using Markov Logic. In Proceedings of the 12th IEEE International Conference on Data Mining, ICDM 2012, Brussels, Belgium, 10–13 December 2012; pp. 912–917. [Google Scholar] [CrossRef] [Green Version]
- Zhang, C.; Ré, C.; Cafarella, M.J.; Shin, J.; Wang, F.; Wu, S. DeepDive: Declarative knowledge base construction. Commun. ACM 2017, 60, 93–102. [Google Scholar] [CrossRef] [Green Version]
- Singla, P.; Domingos, P.M. Entity Resolution with Markov Logic. In Proceedings of the 6th IEEE International Conference on Data Mining (ICDM 2006), Hong Kong, China, 18–22 December 2006; pp. 572–582. [Google Scholar] [CrossRef] [Green Version]
- Niu, F.; Ré, C.; Doan, A.; Shavlik, J.W. Tuffy: Scaling up Statistical Inference in Markov Logic Networks using an RDBMS. PVLDB 2011, 4, 373–384. [Google Scholar] [CrossRef]
- Chen, Y.; Wang, D.Z. Knowledge expansion over probabilistic knowledge bases. In Proceedings of the International Conference on Management of Data, SIGMOD 2014, Snowbird, UT, USA, 22–27 June 2014; pp. 649–660. [Google Scholar] [CrossRef]
- Sa, C.D.; Ratner, A.; Ré, C.; Shin, J.; Wang, F.; Wu, S.; Zhang, C. Incremental knowledge base construction using DeepDive. VLDB J. 2017, 26, 81–105. [Google Scholar] [CrossRef] [Green Version]
- Bach, S.H.; Broecheler, M.; Huang, B.; Getoor, L. Hinge-Loss Markov Random Fields and Probabilistic Soft Logic. J. Mach. Learn. Res. 2017, 18, 109:1–109:67. [Google Scholar]
- Wick, M.L.; McCallum, A.; Miklau, G. Scalable Probabilistic Databases with Factor Graphs and MCMC. PVLDB 2010, 3, 794–804. [Google Scholar] [CrossRef]
- Zhang, C.; Ré, C. Towards High-throughput Gibbs Sampling at Scale: A Study Across Storage Managers. In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, ACM, SIGMOD’13, New York, NY, USA, 22–27 June 2013; pp. 397–408. [Google Scholar] [CrossRef]
- Krapu, C.; Borsuk, M. Probabilistic programming: A review for environmental modellers. Environ. Model. Softw. 2019, 114, 40–48. [Google Scholar] [CrossRef]
- Salvatier, J.; Wiecki, T.V.; Fonnesbeck, C.; Elkan, C. Probabilistic programming in Python using PyMC3. PeerJ Comput. Sci. 2016, 2. [Google Scholar] [CrossRef] [Green Version]
- Tran, D.; Kucukelbir, A.; Dieng, A.B.; Rudolph, M.R.; Liang, D.; Blei, D.M. Edward: A library for probabilistic modeling, inference, and criticism. arXiv 2016, arXiv:1610.09787. [Google Scholar]
- Bingham, E.; Chen, J.P.; Jankowiak, M.; Obermeyer, F.; Pradhan, N.; Karaletsos, T.; Singh, R.; Szerlip, P.A.; Horsfall, P.; Goodman, N.D. Pyro: Deep Universal Probabilistic Programming. J. Mach. Learn. Res. 2019, 20, 28:1–28:6. [Google Scholar]
- Zhong, P.; Li, Z.; Chen, Q.; Wang, Y.; Wang, L.; Ahmed, M.H.M.; Fan, F. POOLSIDE: An Online Probabilistic Knowledge Base for Shopping Decision Support. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, CIKM 2017, Singapore, 6–10 November 2017; pp. 2559–2562. [Google Scholar] [CrossRef]
- Gutiérrez-Basulto, V.; Jung, J.C.; Kuzelka, O. Quantified Markov Logic Networks. In Proceedings of the Sixteenth International Conference on Principles of Knowledge Representation and Reasoning, KR 2018, Tempe, AZ, USA, 30 October–2 November 2018; pp. 602–612. [Google Scholar]
- Sabek, I.; Musleh, M.; Mokbel, M.F. Flash in Action: Scalable Spatial Data Analysis Using Markov Logic Networks. PVLDB 2019, 12, 1834–1837. [Google Scholar] [CrossRef]
- Gayathri, K.; Easwarakumar, K.; Elias, S. Probabilistic ontology based activity recognition in smart homes using Markov Logic Network. Knowl. Based Syst. 2017, 121, 173–184. [Google Scholar] [CrossRef]
- Schoenfisch, J.; Meilicke, C.; von Stülpnagel, J.; Ortmann, J.; Stuckenschmidt, H. Root cause analysis in IT infrastructures using ontologies and abduction in Markov Logic Networks. Inf. Syst. 2018, 74, 103–116. [Google Scholar] [CrossRef]
- Kennington, C.; Schlangen, D. Situated incremental natural language understanding using Markov Logic Networks. Comput. Speech Lang. 2014, 28, 240–255. [Google Scholar] [CrossRef]
- Ge, C.; Gao, Y.; Miao, X.; Yao, B.; Wang, H. A Hybrid Data Cleaning Framework Using Markov Logic Networks. IEEE Trans. Knowl. Data Eng. 2020, 1. [Google Scholar] [CrossRef]
- Sabek, I. Adopting Markov Logic Networks for Big Spatial Data and Applications. In Proceedings of the International Conference on Very Large Data Bases (VLDB), Los Angeles, CA, USA, 26–30 August 2019. [Google Scholar]
- Hao, W.; Menglin, J.; Guohui, T.; Qing, M.; Guoliang, L. R-KG: A Novel Method for Implementing a Robot Intelligent Service. AI 2020, 1, 117–140. [Google Scholar] [CrossRef] [Green Version]
- Sarkhel, S.; Venugopal, D.; Singla, P.; Gogate, V. Lifted MAP Inference for Markov Logic Networks. In Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, AISTATS 2014, Reykjavik, Iceland, 22–25 April 2014; pp. 859–867. [Google Scholar]
- Beedkar, K.; Corro, L.D.; Gemulla, R. Fully Parallel Inference in Markov Logic Networks. In Proceedings of the Datenbanksysteme für Business, Technologie und Web (BTW), 15. Fachtagung des GI-Fachbereichs “Datenbanken und Informationssysteme” (DBIS), Magdeburg, Germany, 11–15 March 2013; pp. 205–224. [Google Scholar]
- Zhou, X.; Chen, Y.; Wang, D.Z. ArchimedesOne: Query Processing over Probabilistic Knowledge Bases. PVLDB 2016, 9, 1461–1464. [Google Scholar] [CrossRef]
- Sun, Z.; Zhao, Y.; Wei, Z.; Zhang, W.; Wang, J. Scalable learning and inference in Markov logic networks. Int. J. Approx. Reason. 2017, 82, 39–55. [Google Scholar] [CrossRef]
- Singla, P.; Domingos, P.M. Discriminative Training of Markov Logic Networks. In Proceedings of the Twentieth National Conference on Artificial Intelligence and the Seventeenth Innovative Applications of Artificial Intelligence Conference, Pittsburgh, PA, USA, 9–13 July 2005; pp. 868–873. [Google Scholar]
- Lowd, D.; Domingos, P.M. Efficient Weight Learning for Markov Logic Networks. In Proceedings of the Knowledge Discovery in Databases: PKDD 2007, 11th European Conference on Principles and Practice of Knowledge Discovery in Databases, Warsaw, Poland, 17–21 September 2007; pp. 200–211. [Google Scholar] [CrossRef] [Green Version]
- Huynh, T.N.; Mooney, R.J. Max-Margin Weight Learning for Markov Logic Networks. In Proceedings of the Machine Learning and Knowledge Discovery in Databases, European Conference, ECML PKDD 2009, Bled, Slovenia, 7–11 September 2009; pp. 564–579. [Google Scholar] [CrossRef] [Green Version]
- Kok, S.; Domingos, P.M. Learning the structure of Markov logic networks. In Proceedings of the Twenty-Second International Conference on Machine Learning (ICML 2005), Bonn, Germany, 7–11 August 2005; pp. 441–448. [Google Scholar] [CrossRef] [Green Version]
- Mihalkova, L.; Mooney, R.J. Bottom-up learning of Markov logic network structure. In Proceedings of the Twenty-Fourth International Conference on Machine Learning (ICML 2007), Corvallis, OR, USA, 20–24 June 2007; pp. 625–632. [Google Scholar] [CrossRef] [Green Version]
- Khot, T.; Natarajan, S.; Kersting, K.; Shavlik, J. Gradient-based boosting for statistical relational learning: The Markov logic network and missing data cases. Mach. Learn. 2015, 100. [Google Scholar] [CrossRef] [Green Version]
- Marra, G.; Kuzelka, O. Neural Markov Logic Networks. arXiv 2019, arXiv:1905.13462. [Google Scholar]
- Wang, J.; Domingos, P.M. Hybrid Markov Logic Networks. In Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, AAAI 2008, Chicago, IL, USA, 13–17 July 2008; pp. 1106–1111. [Google Scholar]
- Klir, G.J.; Yuan, B. Fuzzy Sets and Fuzzy Logic—Theory and Applications; Prentice Hall: Upper Saddle River, NJ, USA, 1995. [Google Scholar]
- Boyd, S.P.; Parikh, N.; Chu, E.; Peleato, B.; Eckstein, J. Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers. Found. Trends Mach. Learn. 2011, 3, 1–122. [Google Scholar] [CrossRef]
- Sang, E.F.T.K.; Meulder, F.D. Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition. In Proceedings of the Seventh Conference on Natural Language Learning, CoNLL 2003, Edmonton, AB, Canada, 31 May–1 June 2003; pp. 142–147. [Google Scholar]
- Hoffart, J.; Suchanek, F.M.; Berberich, K.; Lewis-Kelham, E.; de Melo, G.; Weikum, G. YAGO2: Exploring and querying world knowledge in time, space, context, and many languages. In Proceedings of the 20th International Conference on World Wide Web, WWW 2011, Hyderabad, India, 28 March–1 April 2011; pp. 229–232. [Google Scholar] [CrossRef]
| Knowledge Rules | Size |
|---|---|
| | n |
| | n |
| | n |
| Dataset | Total No. of Variables | No. of Non-Matches | No. of Matches |
|---|---|---|---|
| Mobile Phone | 1058 | – | – |
| AIDA-CoNLL | 728,225 | 713,113 | 15,112 |
| Wiki-Sports | 28,244 | 24,244 | 4000 |
| Wiki-FourDomains | 23,828 | 19,318 | 4510 |
| | Distance_avg | Grounding | Inference | Total |
|---|---|---|---|---|
| NMLN | 0.857 | 0.13 s | 0.45 s | 2.09 s |
| PSL | 0.853 | 34.7 s | 37.9 s | 73.9 s |
| | In-KB acc | Grounding | Inference | Total |
|---|---|---|---|---|
| AIDA-CoNLL | | | | |
| NMLN | 0.805 | 278 s | 1344 s | 1745 s |
| PSL | 0.708 | 25,636 s | 25,566 s | 51,661 s |
| Wiki-Sports | | | | |
| NMLN | 0.865 | 23 s | 162 s | 201 s |
| PSL | 0.826 | 1889 s | 889 s | 2793 s |
| Wiki-FourDomains | | | | |
| NMLN | 0.893 | 14 s | 138 s | 164 s |
| PSL | 0.876 | 1196 s | 545 s | 1753 s |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zhong, P.; Li, Z.; Chen, Q.; Hou, B.; Ahmed, M. Numerical Markov Logic Network: A Scalable Probabilistic Framework for Hybrid Knowledge Inference. Information 2021, 12, 124. https://doi.org/10.3390/info12030124