SubgoalXL - Subgoal-Based Expert Learning for Theorem Proving
Model Description
SubgoalXL is an advanced model for formal theorem proving, leveraging subgoal-based expert learning to enhance LLMs' formal reasoning capabilities. It addresses the challenges of scarce theorem-proving data and multi-step reasoning by optimizing data efficiency and employing subgoal-level supervision. SubgoalXL iteratively refines formal statement, proof, and subgoal generators, allowing it to extract richer information from limited proofs and generate more precise formal proofs.
Example Usage
Below is an example code for using SubgoalXL with its custom prompt template for generating formal proofs:
from transformers import AutoModelForCausalLM, AutoTokenizer
# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("xl-zhao/formal_proof_generator_v1_iter3")
model = AutoModelForCausalLM.from_pretrained("xl-zhao/formal_proof_generator_v1_iter3")
informal_statement = "" # Your informal statement here
formal_statement = "" # The formal statement generated by the model will be inserted here
# Prompt option 1
prompt1 = f"Develop formal proofs using Isabelle, leveraging auxiliary tools such as Sledgehammer to enhance the proving process.\n\n### Input\n(*Informal Statement:\n{informal_statement}*)\n{formal_statement}\n\n### Output"
# Prompt option 2
prompt2 = f"Utilize Isabelle for the systematic verification of theorem proofs, employing Sledgehammer as a supplementary tool. Approach the problem in a step-by-step manner.\n\n### Problem\n{informal_statement}\n\n### Isabelle Proof\n{formal_statement}"
# Choose the prompt (both prompts are acceptable)
prompt = prompt1 # You can switch to prompt2 if needed
# Tokenize and generate the formal proof
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=2048, temperature=0.8)
# Decode the output
formal_proof = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(formal_proof)
Intended Use
This model is intended for automated theorem proving, especially in environments that require high levels of mathematical rigor, such as Isabelle. By employing subgoal-based proof generation and expert learning, SubgoalXL is optimized for solving complex reasoning tasks in formal mathematics.
Performance
SubgoalXL sets a new state-of-the-art benchmark in formal theorem proving within the Isabelle environment, achieving an accuracy of 56.1% on the miniF2F dataset, an absolute improvement of 4.9% over previous approaches. The model has successfully solved the following problems:
- 41 AMC12 problems
- 9 AIME problems
- 3 IMO problems
For more information on the model's design and performance, please refer to the paper SubgoalXL: Subgoal-Based Expert Learning for Theorem Proving.
Limitations
While SubgoalXL excels in formal theorem proving, its usage is tailored to this specific domain and may not generalize well to other tasks beyond formal logic and proof generation.
Citation
If you use SubgoalXL in your research, please cite our paper:
@article{zhao2024subgoalxl,
title = {SubgoalXL: Subgoal-based Expert Learning for Theorem Proving},
author = {Zhao, Xueliang and Zheng, Lin and Bo, Haige and Hu, Changran and Thakker, Urmish and Kong, Lingpeng},
journal={arXiv preprint arXiv:2408.11172},
url={https://arxiv.org/abs/2408.11172},
year = {2024},
}
Contact
For more information, please visit the project's GitHub repository or reach out via email at xlzhao22@connect.hku.hk.
- Downloads last month
- 2