DeepSeek-Math-7B-Base LoRA for Isabelle Theorem Proving

This model is a LoRA adapter fine-tuned from DeepSeek-Math-7B-Base for Isabelle/HOL proof generation. It is trained on a PSR-selected high-quality Isabelle theorem-proving dataset, which contains 4,271 theorem-proof pairs selected from a larger Isabelle/HOL and Archive of Formal Proofs (AFP) corpus.

The model is designed to generate Isabelle proofs from formal theorem statements and is intended for research on neural theorem proving, formal mathematical reasoning, and data-centric training for proof-generation models.

Model Details

  • Base model: deepseek-ai/deepseek-math-7b-base
  • Adapter type: LoRA
  • Library: PEFT / Transformers
  • Task: Isabelle/HOL proof generation
  • Language: Isabelle proof language, English mathematical text
  • Fine-tuning data: PSR-selected Isabelle theorem-proof dataset
  • Model type: Causal language model adapter

Intended Use

This adapter can be used with the DeepSeek-Math-7B-Base model to generate Isabelle/HOL proofs for formal theorem statements.

Example use cases include:

  • neural theorem proving research;
  • Isabelle/HOL proof generation;
  • supervised fine-tuning and data selection studies;
  • formal reasoning experiments on benchmarks such as miniF2F-Isabelle.

If you load the base model from a local directory, please also update the following field in adapter_config.json to match your own base model path:

"base_model_name_or_path": "/path/to/models/DeepSeek-Math-7B-Base"

Out-of-Scope Use

This model is not intended for safety-critical formal verification without independent proof checking. Generated proofs should always be verified by Isabelle before being trusted. The model is also not designed for general-purpose chat, natural language instruction following, or proof assistants other than Isabelle/HOL without further adaptation.

Training Data

The model was fine-tuned on a high-quality Isabelle/HOL theorem-proving dataset selected using the PSR criterion:

  • Proof Complexity: selects proofs with useful and moderate reasoning structure.
  • Semantic Coverage: encourages coverage across different theorem domains.
  • Reasoning Diversity: promotes diverse Isabelle tactics and proof strategies.

The final training set contains 4,271 theorem-proof pairs.

Dataset repository: xiaoxuezhu/PSR_Selected_Isabelle_Proof_Dataset

Training Procedure

The adapter was trained using supervised fine-tuning with LoRA on DeepSeek-Math-7B-Base.

Training configuration used in the paper:

  • Base model: DeepSeek-Math-7B-Base
  • Training method: LoRA supervised fine-tuning
  • Training examples: 4,271
  • Maximum input length: 2048
  • Learning rate: 5e-5
  • Epochs: 5 with early stopping
  • Warmup ratio: 0.03
  • Effective batch size: 16
  • Optimizer: AdamW
  • Weight decay: 0.01
  • Random seed: 42
  • PEFT version: 0.18.0

Evaluation

The model was evaluated on the Isabelle portion of the miniF2F benchmark using Isabelle 2025 and the PISA environment. Candidate proofs were checked by Isabelle, and performance was measured using Pass@k.

Reported results from the paper:

Method Dataset Size Sample Budget miniF2F-test
No fine-tuning - 64 attempts 42.6%
Raw corpus SFT ~200,000 64 attempts 53.7%
High-quality data SFT ~4,200 64 attempts 84.8%
Full framework with verifier feedback ~4,200 64 attempts 90.6%

The numbers above are reported for the full experimental setup in the associated paper. Actual performance may vary depending on decoding settings, Isabelle version, proof-checking environment, and whether dynamic verifier-feedback prompt optimization is used.

Limitations

  • The adapter is specialized for Isabelle/HOL proof generation.
  • Generated proofs may be syntactically invalid or fail verification.
  • Isabelle version differences may affect proof validity.
  • The model should be used with an Isabelle verifier in the loop.
  • The adapter inherits limitations from the DeepSeek-Math-7B-Base model and from the training corpus.

Citation

If you use this model, please cite:

@inproceedings{zhu2026enhancing,
  title={Enhancing Neural Theorem Proving via High-Quality Data Selection and Verifier Feedback},
  author={Zhu, Xiaoxue and Hu, Jilin and Zhang, Fuyuan and Zhang, Jianyu and Zhao, Yongwang},
  booktitle={Proceedings of the 43rd International Conference on Machine Learning},
  year={2026}
}

Contact

For questions about the model, dataset, or paper, please contact the authors of the paper. email:22521298@zju.edu.cn

Downloads last month
45
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for xiaoxuezhu/deepseek-math-isabelle-prover-lora

Adapter
(24)
this model