FGeo-SSS: A Search-Based Symbolic Solver for Human-like Automated Geometric Reasoning
Abstract
:1. Introduction
- Geometric shapes consist of several points, and when performing basic transformations such as rotation and scaling on the shapes, the relative positional information among the points does not change; that is, the shape’s topological structure information is symmetrical with respect to basic transformations. A complex geometric shape is often composed of several basic shapes, and the order of combination of the basic shapes does not change the final complex shape constructed; that is, the construction result is symmetrical relative to the construction order. We have implemented the above formal representation methods and construction methods, transforming the symmetry and asymmetry into a series of rules for symbolic operations. The implementation of these symmetries helps to ensure the comprehensiveness and correctness of the FGPS design.
- The process of GPS can be seen as the application of theorems, where each application derives new conditions, ultimately leading to the derivation of the solution objective. The application order of geometric theorems can form a directed acyclic graph (DAG), and any theorem sequence obtained through its topological sorting can solve the current geometric problem. The solvability of geometric problems with respect to the topological sorting of the theorem’s DAG is symmetrical. By leveraging this symmetry, we expand the number of problems according to the process of GPS, effectively addressing issues such as high labeling costs and scarcity of datasets.
- FGPS incorporates two symmetric problem-solving algorithms: forward search, which starts from known conditions and continuously applies theorems to reach the problem’s goal, and backward search, which starts from the problem’s goal and expands it into sub-goals until all sub-goals are known conditions. We organize the process of GPS into a hypertree with conditions as hypernodes and theorems as hyperedges. The hypertrees obtained from the two algorithms are mirror-symmetrical. We designed the backward search algorithm by analyzing the symmetrical processes of forward search, and the solution generation process of backward search is also based on forward search.
- We crafted FGPS in Python. It serves as both an interactive assistant for verifying problem-solving processes and an automated problem solver that utilizes a variable search-based method and strategy. FGPS incorporates functions such as formal statement parsing, condition validity checks, automatic diagram construction, and condition expansion. It is capable of performing readable, traceable, and verifiable algebraic equation solving and relational reasoning.
- We explored the symmetry phenomena inherent in basic geometric transformations, geometric constructions, geometric problem solving, and the design of the solving system, and utilized these symmetries to further refine the system’s design. The two symmetrical parts can verify and complement each other, making the design of FGPS more comprehensive and avoiding omissions.
- We annotated the formalgeo7k dataset, which contains 6981 (expanded to 133,818 using data augmentation method). Each problem comprises a complete natural language description, geometric shapes, formal language annotations, and theorem sequences annotations. Additionally, we have attempted to annotate 18 geometry IMO-level problems, forming the dataset formalgeo-imo.
- We conducted experiments on formalgeo7k, comparing different search methods and strategies in terms of their problem-solving success rate, search time, and search step. The forward search method combined with random search strategy achieved a 39.7% problem-solving accuracy rate.
2. Related Works
3. FGPS
3.1. Architecture
- Main is the control module of FGPS, invoking other modules to enable interactive problem solving and automated problem solving. The automated solving component implements both forward search and backward search, allowing for the configuration of various search strategies and defining interfaces for AI-assisted searches.
- Engine is the core component of FGPS, responsible for parsing and executing GPL and consists of two sub-modules, GPL executor for relational reasoning and equation solver for algebraic computation.
- Parser facilitates bidirectional conversion between formal language and machine language. It consists of three sub-modules. GDL parser parses GPDL and GTDL into machine language, enabling custom configuration of the solver. CDL parser parses the formal describing of problems into machine language for subsequent reasoning. Inverse parser translates machine language back into formal language, facilitating the verification and checking of the solution process.
- Data preserves all details of the problem-solving process and comprises two sub-modules. The problem module ensures the correctness and consistency of the problem input conditions, implementing automatic diagram construction, condition auto-expansion, and validity checks. The condition module is responsible for data storage.
- AI interface defines the interface for interaction between the AI system and FGPS. Both the AI automatic formalization and the AI problem solver can be seamlessly integrated with FGPS.
3.2. GPL Executor
- In the GPL parsing phase, the solver expands complex GPL statements into Disjunctive Normal Form (DNF) using the distributive law. Each simple conjunction represents a branch of the theorem. This not only meets the requirements for backward reasoning and facilitates the generation of sub-goals but also speeds up theorem execution by skipping irrelevant branches.
- In the GPL ordering phase, for each branch of the theorem, the solver adjusts the positions of geometric relations and quantitative relations within simple conjunctions according to the commutative law. The guiding principles for this adjustment are as follows: 1. Transforming relation composition into geometric constraints. 2. Moving geometric constraints forward. 3. Moving algebraic constraints backward. This approach not only helps filter out geometric relation elements that do not comply with the constraints, preventing the explosion of combinatorics caused by Cartesian product operations, but also reduces the number for algebraic equation solving, thereby improving theorem application speed.
- In the GPL execution phase, the solver reads geometric and quantitative relations sequentially and performs relational reasoning in the order of their appearance.
3.3. Core Engine
3.3.1. Traceable Geometric Relational Reasoning
3.3.2. Minimum Dependency Equation Constructing
- must intersect with , as shown in Equation (11). If they do not intersect, it implies that is unrelated to .
- Under the condition of satisfying Equation (11), adding should introduce as few new unknown variables as possible, as depicted in Equation (12). The closer the number of t and are, the higher the likelihood of solving . In the initial stages of constructing G, which only contains , . The number of added equations each time is a fixed value 1. If we aim to minimize the gap between t and , we should try to introduce as few new equations when selecting .
- Under the condition of satisfying Equations (11) and (12), the equation to be added should encompass more unknown variables, as demonstrated in Equation (13). These additional unknown variables are often associated with other equations within , providing more choices for simplifying . If there are multiple equations that satisfy these conditions, we can choose any of them at random.
3.4. GPS Methods
Algorithm 1 Forward search |
|
Algorithm 2 Backward search |
|
4. Datasets
5. Experiments
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Balasubramanian, K. Symmetry, combinatorics, artificial intelligence, music and spectroscopy. Symmetry 2021, 13, 1850. [Google Scholar] [CrossRef]
- Elliott, J.P.; Dawber, P.G. Symmetry in Physics; Macmillan: London, UK, 1979; Volume 1. [Google Scholar]
- Toxvaerd, S. The emergence of the bilateral symmetry in animals: A review and a new hypothesis. Symmetry 2021, 13, 261. [Google Scholar] [CrossRef]
- Daniel, S.; Leonardo, D.M.; Kevin, B.; Reid, B.; Percy, L.; Sarah, L.; Freek, W. IMO Grand Challenge. 2019. Available online: https://imo-grand-challenge.github.io/ (accessed on 24 March 2024).
- XTXMarkets. Artificial Intelligence Mathematical Olympiad Prize (AIMO Prize). 2023. Available online: https://aimoprize.com/ (accessed on 24 March 2024).
- Littman, M.L.; Ajunwa, I.; Berger, G.; Boutilier, C.; Currie, M.; Doshi-Velez, F.; Hadfield, G.; Horowitz, M.C.; Isbell, C.; Kitano, H.; et al. Gathering strength, gathering storms: The one hundred year study on artificial intelligence (AI100) 2021 study panel report. arXiv 2022, arXiv:2210.15767. [Google Scholar]
- Zhang, X.; Zhu, N.; He, Y.; Zou, J.; Huang, Q.; Jin, X.; Guo, Y.; Mao, C.; Zhu, Z.; Yue, D.; et al. FormalGeo: The first step toward human-like IMO-level geometric automated reasoning. arXiv 2023, arXiv:2310.18021. [Google Scholar]
- Jingzhong, Z.; Yongbin, L. Automatic theorem proving for three decades. J. Syst. Sci. Math. Sci. 2009, 29, 1155. [Google Scholar]
- Gelernter, H.L. Realization of a geometry theorem proving machine. In Proceedings of the IFIP Congress, Paris, France, 15–20 June 1959; pp. 273–281. [Google Scholar]
- Nevins, A.J. Plane geometry theorem proving using forward chaining. Artif. Intell. 1975, 6, 1–23. [Google Scholar] [CrossRef]
- Chou, S.C.; Gao, X.S.; Zhang, J.Z. A deductive database approach to automated geometry theorem proving and discovering. J. Autom. Reason. 2000, 25, 219–246. [Google Scholar] [CrossRef]
- Wu, W.T. On the decision problem and the mechanization of theorem proving in elementary geometry. Sci. Sin. 1978, 21, 157–179. [Google Scholar]
- Zhang, J.Z.; Chou, S.C.; Gao, X.S. Automated production of traditional proofs for theorems in Euclidean geometry I. The Hilbert intersection point theorems. Ann. Math. Artif. Intell. 1995, 13, 109–137. [Google Scholar] [CrossRef]
- Peng, S.; Fu, D.; Liang, Y.; Gao, L.; Tang, Z. GeoDRL: A self-learning framework for geometry problem solving using reinforcement learning in deductive reasoning. In Findings of the Association for Computational Linguistics: ACL 2023; Association for Computational Linguistics: Toronto, ON, Canada, 2023; pp. 13468–13480. [Google Scholar]
- Lu, P.; Gong, R.; Jiang, S.; Qiu, L.; Huang, S.; Liang, X.; Zhu, S.c. Inter-GPS: Interpretable geometry problem solving with formal language and symbolic reasoning. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Virtual, 1–6 August 2021; pp. 6774–6786. [Google Scholar]
- Chen, J.; Tang, J.; Qin, J.; Liang, X.; Liu, L.; Xing, E.; Lin, L. GeoQA: A geometric question answering benchmark towards multimodal numerical reasoning. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021; Association for Computational Linguistics: Toronto, ON, Canada, 2021; pp. 513–523. [Google Scholar]
- Trinh, T.H.; Wu, Y.; Le, Q.V.; He, H.; Luong, T. Solving olympiad geometry without human demonstrations. Nature 2024, 625, 476–482. [Google Scholar] [CrossRef]
- Gan, W.; Yu, X. Automatic understanding and formalization of natural language geometry problems using syntax-semantics models. Int. J. Innov. Comput. Inf. Control 2018, 14, 83–98. [Google Scholar]
- Rao, Y.; Xie, L.; Guan, H.; Li, J.; Zhou, Q. A Method for expanding predicates and rules in automated geometry reasoning system. Mathematics 2022, 10, 1177. [Google Scholar] [CrossRef]
- Sachan, M.; Dubey, K.; Xing, E. From textbooks to knowledge: A case study in harvesting axiomatic knowledge from textbooks to solve geometry problems. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark, 9–11 September 2017; pp. 773–784. [Google Scholar]
- Sachan, M.; Dubey, A.; Hovy, E.H.; Mitchell, T.M.; Roth, D.; Xing, E.P. Discourse in multimedia: A case study in extracting geometry knowledge from textbooks. Comput. Linguist. 2020, 45, 627–665. [Google Scholar] [CrossRef]
- Yu, W.; Wang, M.; Wang, X.; Zhou, X.; Zha, Y.; Zhang, Y.; Miao, S.; Liu, J. Geore: A relation extraction dataset for chinese geometry problems. In Proceedings of the 35th Conference on Neural Information Processing Systems (NeurIPS 2021) Workshop on Math AI for Education (MATHAI4ED), Online, 6–14 December 2021. [Google Scholar]
- Huang, L.; Yu, X.; He, B. A novel geometry problem understanding method based on uniform vectorized syntax-semantics model. In Proceedings of the 2022 International Conference on Intelligent Education and Intelligent Research (IEIR), Wuhan, China, 18–20 December 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 78–85. [Google Scholar]
- Zhou, W.; Xu, R.; Guan, H.; Zhao, J.; Rao, Y. Research on geometry problem text understanding based on bidirectional LSTM-CRF. In Proceedings of the 2022 9th International Conference on Digital Home (ICDH), Guangzhou, China, 28–30 October 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 121–127. [Google Scholar]
- Seo, M.J.; Hajishirzi, H.; Farhadi, A.; Etzioni, O. Diagram understanding in geometry questions. In Proceedings of the AAAI Conference on Artificial Intelligence, Quebec, QC, Canada, 27–31 July 2014; Volume 28. [Google Scholar]
- Zhang, M.L.; Yin, F.; Hao, Y.H.; Liu, C.L. Plane geometry diagram parsing. In Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, Vienna, Austria, 23–29 July 2022; Raedt, L.D., Ed.; International Joint Conferences on Artificial Intelligence Organization: San Francisco, CA, USA, 2022; pp. 1636–1643. [Google Scholar] [CrossRef]
- Wong, M.F.; Qi, X.; Tan, C.W. EuclidNet: Deep visual reasoning for constructible problems in geometry. In Proceedings of the 2nd MATH-AI Workshop at NeurIPS’22: Toward Human-Level Mathematical Reasoning, New Orleans, LA, USA, 3 December 2022. [Google Scholar]
- Yu, X.; Wang, M.; Gan, W.; He, B.; Ye, N. A framework for solving explicit arithmetic word problems and proving plane geometry theorems. Int. J. Pattern Recognit. Artif. Intell. 2019, 33, 1940005. [Google Scholar] [CrossRef]
- Gan, W.; Yu, X.; Zhang, T.; Wang, M. Automatically proving plane geometry theorems stated by text and diagram. Int. J. Pattern Recognit. Artif. Intell. 2019, 33, 1940003. [Google Scholar] [CrossRef]
- Kovács, Z.; Yu, J.H. Automated discovery of geometrical theorems in geoGebra. arXiv 2022, arXiv:2202.04627. [Google Scholar] [CrossRef]
- Seo, M.; Hajishirzi, H.; Farhadi, A.; Etzioni, O.; Malcolm, C. Solving geometry problems: Combining text and diagram interpretation. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal, 17–21 September 2015; pp. 1466–1476. [Google Scholar]
- Zhong, X.; Fu, H.; Yu, Y.; Liu, Y. Interactive learning environment based on knowledge network of geometry problems. In Proceedings of the 2015 10th International Conference on Computer Science & Education (ICCSE), Cambridge, UK, 22–24 July 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 53–58. [Google Scholar]
- Alvin, C.; Gulwani, S.; Majumdar, R.; Mukhopadhyay, S. Synthesis of geometry proof problems. In Proceedings of the AAAI Conference on Artificial Intelligence, Quebec, QC, Canada, 27–31 July 2014; Volume 28. [Google Scholar]
- Alvin, C.; Gulwani, S.; Majumdar, R.; Mukhopadhyay, S. Synthesis of solutions for shaded area geometry problems. In Proceedings of the Thirtieth International Flairs Conference, Marco Island, FL, USA, 22–24 May 2017. [Google Scholar]
- Sachan, M.; Xing, E. Learning to solve geometry problems from natural language demonstrations in textbooks. In Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (SEM 2017), Vancouver, BC, Canada, 3–4 August 2017; pp. 251–261. [Google Scholar]
- Yu, X.; Gan, W.; Wang, M. Understanding explicit arithmetic word problems and explicit plane geometry problems using syntax-semantics models. In Proceedings of the 2017 International Conference on Asian Language Processing (IALP), Singapore, 5–7 December 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 247–251. [Google Scholar]
- Buchberger, B. Applications of Gröbner bases in non-linear computational geometry. In Mathematical Aspects of Scientific Software; Springer: New York, NY, USA, 1988; pp. 59–87. [Google Scholar]
- Yang, L.; Zhang, J.; Li, C. A prover for parallel numerical verification of a class of constructive geometry theorems. In Proceedings of the IWMM, St. Malo, France, 17–19 September 1992; Volume 92, pp. 244–250. [Google Scholar]
- Gao, X.S.; Chou, S.C. On the dimension of an arbitrary ascending chain. Chin. Sci.-Bull.-Engl. Ed. 1993, 38, 799. [Google Scholar]
- Collins, G.E. Quantifier elimination for real closed fields by cylindrical algebraic decomposition–preliminary report. ACM SIGSAM Bull. 1974, 8, 80–90. [Google Scholar] [CrossRef]
- Lu, Y. Practical automated reasoning on inequalities: Generic programs for inequality proving and discovering. In Proceedings of the Third Asian Technology Conference in Mathematics, Tsukuba, Japan, 24–28 August 1998; pp. 24–35. [Google Scholar]
- Wang, D. GEOTHER: A geometry theorem prover. In Automated Deduction—Cade-13: Proceedings of the 13th International Conference on Automated Deduction, New Brunswick, NJ, USA, 30 July–3 August 1996; Springer: Berlin/Heidelberg, Germany, 2005; pp. 166–170. [Google Scholar]
- Chou, S.C.; Gao, X.S.; Zhang, J.Z. Automated geometry theorem proving by vector calculation. In Proceedings of the 1993 International Symposium on Symbolic and Algebraic Computation, Kiev, Ukraine, 6–8 July 1993; pp. 284–291. [Google Scholar]
- Chou, S.C.; Gao, X.S.; Zhang, J.Z. A Collection of 110 Geometry Theorems and Their Machine Produced Proofs Using Full-Angles; Washington State University: Washington, WA, USA, 1994. [Google Scholar]
- Li, H. Symbolic computation in the homogeneous geometric model with Clifford algebra. In Proceedings of the 2004 International Symposium on Symbolic and Algebraic Computation, Santander, Spain, 4–7 July 2004; pp. 221–228. [Google Scholar]
- Wilson, S.; Fleuriot, J.D. Geometry Explorer: A tool for generating diagrammatic full-angle method proofs. In Automated Deduction in Geometry: Extended Abstracts; Dialnet: Universidad de La Rioja, Spain, 2006; pp. 144–150. [Google Scholar]
- Chou, S.C.; Gao, X.S.; Zhang, J.Z. An introduction to geometry expert. In Proceedings of the CADE, New Brunswick, NJ, USA, 30 July–3 August 1996; Volume 1104, pp. 235–239. [Google Scholar]
- Ye, Z.; Chou, S.C.; Gao, X.S. An introduction to java geometry expert. In Automated Deduction in Geometry: Proceedings of the 7th International Workshop, ADG 2008, Shanghai, China, 22–24 September 2008; Revised Papers 7; Springer: Berlin/Heidelberg, Germany, 2011; pp. 189–195. [Google Scholar]
- Chou, S.C.; Gao, X.S.; Zhang, J.Z. Automated production of traditional proofs in solid geometry. J. Autom. Reason. 1995, 14, 257–291. [Google Scholar] [CrossRef]
- Yang, L.; Gao, X.S.; Chou, S.C.; Zhang, J.Z. Automated production of readable proofs for theorems in non-Euclidean geometries. In Automated Deduction in Geometry: Proceedings of the International Workshop on Automated Deduction in Geometry, Toulouse, France, 27–29 September 1996; Selected Papers 1; Springer: Berlin/Heidelberg, Germany, 1997; pp. 171–188. [Google Scholar]
- Tsai, S.h.; Liang, C.C.; Wang, H.M.; Su, K.Y. Sequence to general tree: Knowledge-guided geometry word problem solving. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), Virtual, 1–6 August 2021; pp. 964–972. [Google Scholar]
- Hao, Y.; Zhang, M.; Yin, F.; Huang, L.L. PGDP5K: A diagram parsing dataset for plane geometry problems. In Proceedings of the 2022 26th International Conference on Pattern Recognition (ICPR), Montreal, QC, Canada, 21–25 August 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1763–1769. [Google Scholar]
- Cao, J.; Xiao, J. An augmented benchmark dataset for geometric question answering through dual parallel text encoding. In Proceedings of the 29th International Conference on Computational Linguistics, Gyeongju, Republic of Korea, 12–17 October 2022; pp. 1511–1520. [Google Scholar]
- Chen, J.; Li, T.; Qin, J.; Lu, P.; Lin, L.; Chen, C.; Liang, X. UniGeo: Unifying geometry logical reasoning via reformulating mathematical expression. arXiv 2022, arXiv:2212.02746. [Google Scholar]
- Zhang, M.L.; Yin, F.; Liu, C.L. A multi-modal neural geometric solver with textual clauses parsed from diagram. arXiv 2023, arXiv:2302.11097. [Google Scholar]
- Jian, P.; Guo, F.; Wang, Y.; Li, Y. Solving geometry problems via feature learning and contrastive learning of multimodal data. CMES-Comput. Model. Eng. Sci. 2023, 136, 1707–1728. [Google Scholar] [CrossRef]
- Ning, M.; Wang, Q.F.; Huang, K.; Huang, X. A symbolic character-aware model for solving geometry problems. arXiv 2023, arXiv:2308.02823. [Google Scholar]
- Zhang, M.L.; Li, Z.Z.; Yin, F.; Liu, C.L. LANS: A layout-aware neural solver for plane geometry problem. arXiv 2023, arXiv:2311.16476. [Google Scholar]
- Graves, A.; Wayne, G.; Danihelka, I. Neural turing machines. arXiv 2014, arXiv:1410.5401. [Google Scholar]
- Weston, J.; Chopra, S.; Bordes, A. Memory networks. arXiv 2014, arXiv:1410.3916. [Google Scholar]
- Neelakantan, A.; Le, Q.V.; Sutskever, I. Neural programmer: Inducing latent programs with gradient descent. In Proceedings of the 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, 2–4 May 2016. [Google Scholar]
- Rocktäschel, T.; Riedel, S. End-to-end differentiable proving. Adv. Neural Inf. Process. Syst. 2017, 30, 3791–3803. [Google Scholar]
- Manhaeve, R.; Dumancic, S.; Kimmig, A.; Demeester, T.; De Raedt, L. Deepproblog: Neural Probabilistic Logic Programming. Adv. Neural Inf. Process. Syst. 2018, 31, 1–14. [Google Scholar] [CrossRef]
- Badreddine, S.; Garcez, A.d.; Serafini, L.; Spranger, M. Logic tensor networks. Artif. Intell. 2022, 303, 103649. [Google Scholar] [CrossRef]
- Dong, H.; Mao, J.; Lin, T.; Wang, C.; Li, L.; Zhou, D. Neural logic machines. In Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada, 30 April–3 May 2018. [Google Scholar]
- Minervini, P.; Riedel, S.; Stenetorp, P.; Grefenstette, E.; Rocktäschel, T. Learning reasoning strategies in end-to-end differentiable proving. In Proceedings of the 37th International Conference on Machine Learning, Virtual, 13–18 July 2020; Volume 119, pp. 6938–6949. [Google Scholar]
- Sadeghian, A.; Armandpour, M.; Ding, P.; Wang, D.Z. Drum: End-to-end differentiable rule mining on knowledge graphs. Adv. Neural Inf. Process. Syst. 2019, 32, 15347–15357. [Google Scholar]
- Qu, M.; Chen, J.; Xhonneux, L.P.; Bengio, Y.; Tang, J. RNNLogic: Learning logic rules for reasoning on knowledge graphs. In Proceedings of the International Conference on Learning Representations, Addis Ababa, Ethiopia, 26–30 April 2020. [Google Scholar]
- Polu, S.; Han, J.M.; Zheng, K.; Baksys, M.; Babuschkin, I.; Sutskever, I. Formal mathematics statement curriculum learning. In Proceedings of the Eleventh International Conference on Learning Representations, Kigali, Rwanda, 1–5 May 2022. [Google Scholar]
- Jiang, A.Q.; Li, W.; Tworkowski, S.; Czechowski, K.; Odrzygóźdź, T.; Miłoś, P.; Wu, Y.; Jamnik, M. Thor: Wielding hammers to integrate language models and automated theorem provers. Adv. Neural Inf. Process. Syst. 2022, 35, 8360–8373. [Google Scholar]
- Lample, G.; Lacroix, T.; Lachaux, M.A.; Rodriguez, A.; Hayat, A.; Lavril, T.; Ebner, G.; Martinet, X. Hypertree proof search for neural theorem proving. Adv. Neural Inf. Process. Syst. 2022, 35, 26337–26349. [Google Scholar]
- Zheng, K.; Han, J.M.; Polu, S. miniF2F: A cross-system benchmark for formal Olympiad-level mathematics. In Proceedings of the International Conference on Learning Representations, Virtual, 3–7 May 2021. [Google Scholar]
- Achiam, J.; Adler, S.; Agarwal, S.; Ahmad, L.; Akkaya, I.; Aleman, F.L.; Almeida, D.; Altenschmidt, J.; Altman, S.; Anadkat, S.; et al. GPT-4 technical report. arXiv 2023, arXiv:2303.08774. [Google Scholar]
- Anthropic. The Claude 3 Model Family: Opus, Sonnet, Haiku. 2024. Available online: https://www.anthropic.com (accessed on 24 March 2024).
- Huang, J.; Chang, K.C.C. Towards reasoning in large language models: A survey. In Findings of the Association for Computational Linguistics: ACL 2023; Association for Computational Linguistics: Toronto, ON, Canada, 2023; pp. 1049–1065. [Google Scholar]
- Wei, J.; Wang, X.; Schuurmans, D.; Bosma, M.; Ichter, B.; Xia, F.; Chi, E.; Le, Q.V.; Zhou, D. Chain-of-thought prompting elicits reasoning in large language models. Adv. Neural Inf. Process. Syst. 2022, 35, 24824–24837. [Google Scholar]
Datasets | FM | Size | Comparative Metrics | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
FW | BW | IS | AS | NT | PT | ET | VC | SS | |||
GEOS [31] | FL | 186 | ✔ | ✔ | ✔ | ||||||
GEOS++ [20] | FL | 1406 | ✔ | ✔ | ✔ | ||||||
Geometry3K [15] | FL | 3002 | ✔ | ✔ | ✔ | ||||||
GeoQA [16] | P | 5010 | ✔ | ✔ | ✔ | ||||||
GeometryQA [51] | P | 1398 | ✔ | ✔ | ✔ | ||||||
GeoQA+ [53] | P | 7528 (5010 from [16]) | ✔ | ✔ | ✔ | ||||||
UniGeo [54] | P | 14,541 (4998 from [16]) | ✔ | ✔ | ✔ | ✔ | |||||
PGPS9K [55] | P | 9022 (2891 from [15]) | ✔ | ✔ | ✔ | ||||||
formalgeo7k | FL | 6981 (DA to 133,818) | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
formalgeo-imo | FL | 18 (DA to 2627) | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
Method | Strategy | Result (%) | ||
---|---|---|---|---|
Solved | Unsolved | Timeout | ||
FW | BFS | 38.86 | 7.42 | 53.72 |
FW | DFS | 36.16 | 9.80 | 54.05 |
FW | RS | 39.71 | 9.07 | 51.22 |
FW | BS | 25.28 | 38.72 | 36.00 |
BW | BFS | 35.44 | 2.68 | 61.88 |
BW | DFS | 33.73 | 2.42 | 63.84 |
BW | RS | 34.05 | 2.65 | 63.30 |
BW | BS | 34.39 | 12.86 | 52.74 |
Method | Strategy | Success Rates (%) | ||||||
---|---|---|---|---|---|---|---|---|
Total | ||||||||
FW | BFS | 38.86 | 59.95 | 38.62 | 28.55 | 17.35 | 8.63 | 3.77 |
FW | DFS | 36.16 | 55.75 | 40.04 | 22.94 | 12.38 | 7.03 | 4.11 |
FW | RS | 39.71 | 59.24 | 40.04 | 33.68 | 16.38 | 5.43 | 4.79 |
FW | BS | 25.28 | 46.12 | 22.60 | 13.47 | 5.83 | 2.88 | 0.34 |
BW | BFS | 35.44 | 67.22 | 33.72 | 11.15 | 6.67 | 6.07 | 1.03 |
BW | DFS | 33.73 | 65.93 | 30.82 | 8.90 | 6.55 | 5.11 | 0.68 |
BW | RS | 34.05 | 66.64 | 31.66 | 8.66 | 5.83 | 4.47 | 0.68 |
BW | BS | 34.39 | 67.10 | 31.35 | 9.46 | 6.31 | 5.75 | 1.03 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zhang, X.; Zhu, N.; He, Y.; Zou, J.; Qin, C.; Li, Y.; Leng, T. FGeo-SSS: A Search-Based Symbolic Solver for Human-like Automated Geometric Reasoning. Symmetry 2024, 16, 404. https://doi.org/10.3390/sym16040404
Zhang X, Zhu N, He Y, Zou J, Qin C, Li Y, Leng T. FGeo-SSS: A Search-Based Symbolic Solver for Human-like Automated Geometric Reasoning. Symmetry. 2024; 16(4):404. https://doi.org/10.3390/sym16040404
Chicago/Turabian StyleZhang, Xiaokai, Na Zhu, Yiming He, Jia Zou, Cheng Qin, Yang Li, and Tuo Leng. 2024. "FGeo-SSS: A Search-Based Symbolic Solver for Human-like Automated Geometric Reasoning" Symmetry 16, no. 4: 404. https://doi.org/10.3390/sym16040404
APA StyleZhang, X., Zhu, N., He, Y., Zou, J., Qin, C., Li, Y., & Leng, T. (2024). FGeo-SSS: A Search-Based Symbolic Solver for Human-like Automated Geometric Reasoning. Symmetry, 16(4), 404. https://doi.org/10.3390/sym16040404