Commit 944fec2 by zqh11 (1 parent: 1982f70)

Update README.md

Files changed (1)
  1. README.md +8 -63
README.md CHANGED
@@ -43,14 +43,11 @@ license_link: LICENSE
43
  </a>
44
  </div>
45
  <p align="center">
46
- <a href="#3-evaluation-results">Evaluation Results</a> |
47
  <a href="#3-model-downloads">Model Download</a> |
48
- <a href="#4-setup-environment">Setup Environment</a> |
49
- <a href="#5-quick-start">Quick Start</a> |
50
- <a href="#6-questions-and-bugs">Questions and Bugs</a> |
51
- <a href="#7-license">License</a> |
52
- <a href="#8-citation">Citation</a> |
53
- <a href="#9-contact">Contact</a>
54
  </p>
55
 
56
 
@@ -66,7 +63,7 @@ license_link: LICENSE
66
  We introduce DeepSeek-Prover-V1.5, an open-source language model designed for theorem proving in Lean 4, which enhances DeepSeek-Prover-V1 by optimizing both training and inference processes. Pre-trained on DeepSeekMath-Base with specialization in formal mathematical languages, the model undergoes supervised fine-tuning using an enhanced formal theorem proving dataset derived from DeepSeek-Prover-V1. Further refinement is achieved through reinforcement learning from proof assistant feedback (RLPAF). Beyond the single-pass whole-proof generation approach of DeepSeek-Prover-V1, we propose RMaxTS, a variant of Monte-Carlo tree search that employs an intrinsic-reward-driven exploration strategy to generate diverse proof paths. DeepSeek-Prover-V1.5 demonstrates significant improvements over DeepSeek-Prover-V1, achieving new state-of-the-art results on the test set of the high school level miniF2F benchmark (63.5%) and the undergraduate level ProofNet benchmark (25.3%).
67
 
68
  <p align="center">
69
- <img width="100%" src="figures/performance.png">
70
  </p>
71
 
72
 
@@ -102,64 +99,12 @@ We release the DeepSeek-Prover-V1.5 with 7B parameters, including base, SFT and
102
 
103
  </div>
104
 
105
- ## 4. Setup Environment
106
-
107
- ### Requirements
108
-
109
- * Supported platform: Linux
110
- * Python 3.10
111
-
112
- ### Installation
113
-
114
- 1. **Install Lean 4**
115
-
116
- Follow the instructions on the [Lean 4 installation page](https://leanprover.github.io/lean4/doc/quickstart.html) to set up Lean 4.
117
-
118
- 2. **Clone the repository**
119
-
120
- ```sh
121
- git clone --recurse-submodules git@github.com:deepseek-ai/DeepSeek-Prover-V1.5.git
122
- cd DeepSeek-Prover-V1.5
123
- ```
124
-
125
- 3. **Install Dependencies**
126
-
127
- ```sh
128
- pip install -r requirements.txt
129
- ```
130
-
131
- 4. **Build Mathlib4**
132
-
133
- ```sh
134
- cd mathlib4
135
- lake build
136
- ```
137
-
138
- ## 5. Quick Start
139
-
140
- You can directly use [Huggingface's Transformers](https://github.com/huggingface/transformers) for model inference. A simple example of generating a proof for a problem from miniF2F and verifying it can be found in [quick_start.py](https://github.com/deepseek-ai/DeepSeek-Prover-V1.5/blob/master/quick_start.py).
141
-
142
- To run paper experiments, you can use the following script to launch a RMaxTS proof search agent:
143
- ```sh
144
- python -m prover.launch --config=configs/RMaxTS.py --log_dir=logs/RMaxTS_results
145
- ```
146
-
147
- You can use `CUDA_VISIBLE_DEVICES=0,1,···` to specify the GPU devices. The experiment results can be gathered using the following script:
148
- ```sh
149
- python -m prover.summarize --config=configs/RMaxTS.py --log_dir=logs/RMaxTS_results
150
- ```
151
-
152
- ## 6. Questions and Bugs
153
-
154
- * For general questions and discussions, please use [GitHub Discussions](https://github.com/deepseek-ai/DeepSeek-Prover-V1.5/discussions).
155
- * To report a potential bug, please open an issue.
156
-
157
- ## 7. License
158
  This code repository is licensed under the MIT License. The use of DeepSeekMath models is subject to the Model License. DeepSeekMath supports commercial use.
159
 
160
  See the [LICENSE-CODE](LICENSE-CODE) and [LICENSE-MODEL](LICENSE-MODEL) for more details.
161
 
162
- ## 8. Citation
163
  ```latex
164
  @article{xin2024deepseekproverv15harnessingproofassistant,
165
  title={DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search},
@@ -172,6 +117,6 @@ See the [LICENSE-CODE](LICENSE-CODE) and [LICENSE-MODEL](LICENSE-MODEL) for more
172
  }
173
  ```
174
 
175
- ## 9. Contact
176
 
177
  If you have any questions, please raise an issue or contact us at [service@deepseek.com](mailto:service@deepseek.com).
 
43
  </a>
44
  </div>
45
  <p align="center">
46
+ <a href="#2-evaluation-results">Evaluation Results</a> |
47
  <a href="#3-model-downloads">Model Download</a> |
48
+ <a href="#4-license">License</a> |
49
+ <a href="#5-citation">Citation</a> |
50
+ <a href="#6-contact">Contact</a>
51
  </p>
52
 
53
 
 
63
  We introduce DeepSeek-Prover-V1.5, an open-source language model designed for theorem proving in Lean 4, which enhances DeepSeek-Prover-V1 by optimizing both training and inference processes. Pre-trained on DeepSeekMath-Base with specialization in formal mathematical languages, the model undergoes supervised fine-tuning using an enhanced formal theorem proving dataset derived from DeepSeek-Prover-V1. Further refinement is achieved through reinforcement learning from proof assistant feedback (RLPAF). Beyond the single-pass whole-proof generation approach of DeepSeek-Prover-V1, we propose RMaxTS, a variant of Monte-Carlo tree search that employs an intrinsic-reward-driven exploration strategy to generate diverse proof paths. DeepSeek-Prover-V1.5 demonstrates significant improvements over DeepSeek-Prover-V1, achieving new state-of-the-art results on the test set of the high school level miniF2F benchmark (63.5%) and the undergraduate level ProofNet benchmark (25.3%).
64
 
65
  <p align="center">
66
+ <img width="100%" src="https://github.com/deepseek-ai/DeepSeek-Prover-V1.5/blob/main/figures/performance.png?raw=true">
67
  </p>
68
 
69
 
 
99
 
100
  </div>
101
 
102
+ ## 4. License
103
  This code repository is licensed under the MIT License. The use of DeepSeekMath models is subject to the Model License. DeepSeekMath supports commercial use.
104
 
105
  See the [LICENSE-CODE](LICENSE-CODE) and [LICENSE-MODEL](LICENSE-MODEL) for more details.
106
 
107
+ ## 5. Citation
108
  ```latex
109
  @article{xin2024deepseekproverv15harnessingproofassistant,
110
  title={DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search},
 
117
  }
118
  ```
119
 
120
+ ## 6. Contact
121
 
122
  If you have any questions, please raise an issue or contact us at [service@deepseek.com](mailto:service@deepseek.com).