Update README.md
Browse files
README.md
CHANGED
@@ -43,14 +43,11 @@ license_link: LICENSE
|
|
43 |
</a>
|
44 |
</div>
|
45 |
<p align="center">
|
46 |
-
<a href="#
|
47 |
<a href="#3-model-downloads">Model Download</a> |
|
48 |
-
<a href="#4-
|
49 |
-
<a href="#5-
|
50 |
-
<a href="#6-
|
51 |
-
<a href="#7-license">License</a> |
|
52 |
-
<a href="#8-citation">Citation</a> |
|
53 |
-
<a href="#9-contact">Contact</a>
|
54 |
</p>
|
55 |
|
56 |
|
@@ -66,7 +63,7 @@ license_link: LICENSE
|
|
66 |
We introduce DeepSeek-Prover-V1.5, an open-source language model designed for theorem proving in Lean 4, which enhances DeepSeek-Prover-V1 by optimizing both training and inference processes. Pre-trained on DeepSeekMath-Base with specialization in formal mathematical languages, the model undergoes supervised fine-tuning using an enhanced formal theorem proving dataset derived from DeepSeek-Prover-V1. Further refinement is achieved through reinforcement learning from proof assistant feedback (RLPAF). Beyond the single-pass whole-proof generation approach of DeepSeek-Prover-V1, we propose RMaxTS, a variant of Monte-Carlo tree search that employs an intrinsic-reward-driven exploration strategy to generate diverse proof paths. DeepSeek-Prover-V1.5 demonstrates significant improvements over DeepSeek-Prover-V1, achieving new state-of-the-art results on the test set of the high school level miniF2F benchmark (63.5%) and the undergraduate level ProofNet benchmark (25.3%).
|
67 |
|
68 |
<p align="center">
|
69 |
-
<img width="100%" src="figures/performance.png">
|
70 |
</p>
|
71 |
|
72 |
|
@@ -102,64 +99,12 @@ We release the DeepSeek-Prover-V1.5 with 7B parameters, including base, SFT and
|
|
102 |
|
103 |
</div>
|
104 |
|
105 |
-
## 4.
|
106 |
-
|
107 |
-
### Requirements
|
108 |
-
|
109 |
-
* Supported platform: Linux
|
110 |
-
* Python 3.10
|
111 |
-
|
112 |
-
### Installation
|
113 |
-
|
114 |
-
1. **Install Lean 4**
|
115 |
-
|
116 |
-
Follow the instructions on the [Lean 4 installation page](https://leanprover.github.io/lean4/doc/quickstart.html) to set up Lean 4.
|
117 |
-
|
118 |
-
2. **Clone the repository**
|
119 |
-
|
120 |
-
```sh
|
121 |
-
git clone --recurse-submodules git@github.com:deepseek-ai/DeepSeek-Prover-V1.5.git
|
122 |
-
cd DeepSeek-Prover-V1.5
|
123 |
-
```
|
124 |
-
|
125 |
-
3. **Install Dependencies**
|
126 |
-
|
127 |
-
```sh
|
128 |
-
pip install -r requirements.txt
|
129 |
-
```
|
130 |
-
|
131 |
-
4. **Build Mathlib4**
|
132 |
-
|
133 |
-
```sh
|
134 |
-
cd mathlib4
|
135 |
-
lake build
|
136 |
-
```
|
137 |
-
|
138 |
-
## 5. Quick Start
|
139 |
-
|
140 |
-
You can directly use [Huggingface's Transformers](https://github.com/huggingface/transformers) for model inference. A simple example of generating a proof for a problem from miniF2F and verifying it can be found in [quick_start.py](https://github.com/deepseek-ai/DeepSeek-Prover-V1.5/blob/master/quick_start.py).
|
141 |
-
|
142 |
-
To run paper experiments, you can use the following script to launch a RMaxTS proof search agent:
|
143 |
-
```sh
|
144 |
-
python -m prover.launch --config=configs/RMaxTS.py --log_dir=logs/RMaxTS_results
|
145 |
-
```
|
146 |
-
|
147 |
-
You can use `CUDA_VISIBLE_DEVICES=0,1,···` to specify the GPU devices. The experiment results can be gathered using the following script:
|
148 |
-
```sh
|
149 |
-
python -m prover.summarize --config=configs/RMaxTS.py --log_dir=logs/RMaxTS_results
|
150 |
-
```
|
151 |
-
|
152 |
-
## 6. Questions and Bugs
|
153 |
-
|
154 |
-
* For general questions and discussions, please use [GitHub Discussions](https://github.com/deepseek-ai/DeepSeek-Prover-V1.5/discussions).
|
155 |
-
* To report a potential bug, please open an issue.
|
156 |
-
|
157 |
-
## 7. License
|
158 |
This code repository is licensed under the MIT License. The use of DeepSeekMath models is subject to the Model License. DeepSeekMath supports commercial use.
|
159 |
|
160 |
See the [LICENSE-CODE](LICENSE-CODE) and [LICENSE-MODEL](LICENSE-MODEL) for more details.
|
161 |
|
162 |
-
##
|
163 |
```latex
|
164 |
@article{xin2024deepseekproverv15harnessingproofassistant,
|
165 |
title={DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search},
|
@@ -172,6 +117,6 @@ See the [LICENSE-CODE](LICENSE-CODE) and [LICENSE-MODEL](LICENSE-MODEL) for more
|
|
172 |
}
|
173 |
```
|
174 |
|
175 |
-
##
|
176 |
|
177 |
If you have any questions, please raise an issue or contact us at [service@deepseek.com](mailto:service@deepseek.com).
|
|
|
43 |
</a>
|
44 |
</div>
|
45 |
<p align="center">
|
46 |
+
<a href="#2-evaluation-results">Evaluation Results</a> |
|
47 |
<a href="#3-model-downloads">Model Download</a> |
|
48 |
+
<a href="#4-license">License</a> |
|
49 |
+
<a href="#5-citation">Citation</a> |
|
50 |
+
<a href="#6-contact">Contact</a>
|
|
|
|
|
|
|
51 |
</p>
|
52 |
|
53 |
|
|
|
63 |
We introduce DeepSeek-Prover-V1.5, an open-source language model designed for theorem proving in Lean 4, which enhances DeepSeek-Prover-V1 by optimizing both training and inference processes. Pre-trained on DeepSeekMath-Base with specialization in formal mathematical languages, the model undergoes supervised fine-tuning using an enhanced formal theorem proving dataset derived from DeepSeek-Prover-V1. Further refinement is achieved through reinforcement learning from proof assistant feedback (RLPAF). Beyond the single-pass whole-proof generation approach of DeepSeek-Prover-V1, we propose RMaxTS, a variant of Monte-Carlo tree search that employs an intrinsic-reward-driven exploration strategy to generate diverse proof paths. DeepSeek-Prover-V1.5 demonstrates significant improvements over DeepSeek-Prover-V1, achieving new state-of-the-art results on the test set of the high school level miniF2F benchmark (63.5%) and the undergraduate level ProofNet benchmark (25.3%).
|
64 |
|
65 |
<p align="center">
|
66 |
+
<img width="100%" src="https://github.com/deepseek-ai/DeepSeek-Prover-V1.5/blob/main/figures/performance.png?raw=true">
|
67 |
</p>
|
68 |
|
69 |
|
|
|
99 |
|
100 |
</div>
|
101 |
|
102 |
+
## 4. License
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
103 |
This code repository is licensed under the MIT License. The use of DeepSeekMath models is subject to the Model License. DeepSeekMath supports commercial use.
|
104 |
|
105 |
See the [LICENSE-CODE](LICENSE-CODE) and [LICENSE-MODEL](LICENSE-MODEL) for more details.
|
106 |
|
107 |
+
## 5. Citation
|
108 |
```latex
|
109 |
@article{xin2024deepseekproverv15harnessingproofassistant,
|
110 |
title={DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search},
|
|
|
117 |
}
|
118 |
```
|
119 |
|
120 |
+
## 6. Contact
|
121 |
|
122 |
If you have any questions, please raise an issue or contact us at [service@deepseek.com](mailto:service@deepseek.com).
|