Αυτόματη απόδειξη μαθηματικών θεωρημάτων στη Lean 4 με χρήση μεγάλων γλωσσικών μοντέλων
Automated mathematical theorem proving in Lean 4 with large language models

View/ Open
Keywords
Lean 4 ; Mathlib ; Αυτόματη απόδειξη θεωρημάτων ; Μεγάλα γλωσσικά μοντέλα ; LLMs ; Retriever ; Reranker ; LeanDojo ; Τεχνητή νοημοσύνηAbstract
This thesis focuses on the development, implementation, and evaluation of an automated mathematical theorem proving system in the Lean 4 language, leveraging Large Language Models (LLMs) in combination with specialized techniques for premise retrieval (retriever) and re-ranking (reranker). The proposed research approach aims to combine the linguistic capabilities of LLMs with the rigor and precision of formal proof systems, resulting in a framework capable of solving theorems with a high degree of automation.
The theoretical part of the thesis presents the scientific background of automated theorem proving, analyzes the main categories of proof systems, and examines modern research directions that integrate deep learning into the proving process. Special emphasis is placed on the Lean 4 system and the Mathlib library, as well as on tools such as LeanDojo, which enable interaction between LLMs and the proof environment, facilitating the execution and verification of generated proof steps.
In the practical part, the construction of a specialized corpus of linear algebra theorems is described, followed by the training of the retriever to identify the most relevant premises and the development of the reranker to reorder them based on semantic similarity. Subsequently, the DeepSeek LLM is integrated to dynamically generate Lean tactics using carefully crafted prompts. The execution and verification of the generated tactics are performed through the LeanDojo API, ensuring both the validity and coherence of the proof steps.
The evaluation of the system involved measuring the success rate of theorem completions and analyzing the quality and efficiency of the generated tactics. The results demonstrate that the combined retriever–reranker–LLM approach significantly outperforms methods relying solely on pre-trained LLMs.
Finally, this work contributes to the understanding and practical implementation of methods that integrate LLMs with formal verification systems, paving the way for more efficient, scalable, and fully automated mathematical theorem proving systems in the future.