haritzpuerto committed
Commit: ca20ecd
Parent(s): 6307cd1
Update README.md
README.md
CHANGED
@@ -138,4 +138,21 @@ We train all models using LoRA with the PEFT library. The main parameters are:
| optim | paged\_adamw\_32bit |
| lr\_scheduler\_type | constant |

- Please check Appendix B of the paper for more details.
+ Please check Appendix B of the paper for more details.
+
+
+ # Cite
+
+ If you find our work useful, please consider citing it using the following citation:
+
+ ```
+ @misc{puerto2024dcot,
+       title={Fine-Tuning with Divergent Chains of Thought Boosts Reasoning Through Self-Correction in Language Models},
+       author={Haritz Puerto and Tilek Chubakov and Xiaodan Zhu and Harish Tayyar Madabushi and Iryna Gurevych},
+       year={2024},
+       eprint={2407.03181},
+       archivePrefix={arXiv},
+       primaryClass={cs.CL},
+       url={https://arxiv.org/abs/2407.03181},
+ }
+ ```
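The diff context above shows two of the training parameters from the README's LoRA setup (`optim = paged_adamw_32bit`, `lr_scheduler_type = constant`). Both are standard `transformers.TrainingArguments` options. Below is a minimal sketch, not the authors' training script, of how they could be combined with a PEFT LoRA config; the base model, LoRA rank/alpha/dropout, and all remaining hyperparameters are illustrative placeholders, with the actual values given in Appendix B of the paper.

```python
# Minimal sketch of wiring the two parameters above into a LoRA fine-tuning
# setup with PEFT and transformers. Only `optim` and `lr_scheduler_type`
# come from the README table; the base model, LoRA settings, and all other
# hyperparameters are placeholders (see Appendix B of the paper).
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

base_model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder base model

lora_config = LoraConfig(
    r=16,               # placeholder rank
    lora_alpha=32,      # placeholder scaling factor
    lora_dropout=0.05,  # placeholder dropout
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)

training_args = TrainingArguments(
    output_dir="dcot-lora",          # placeholder output directory
    optim="paged_adamw_32bit",       # from the README table above
    lr_scheduler_type="constant",    # from the README table above
    learning_rate=2e-4,              # placeholder
    per_device_train_batch_size=4,   # placeholder
    num_train_epochs=1,              # placeholder
)
```

Note that `paged_adamw_32bit` is a paged optimizer backed by bitsandbytes, so that package needs to be installed before training with this setting.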