Something important to note: this model has only undergone SFT and DPO; the RLVR (reinforcement learning with verifiable rewards) stage was too computationally expensive to run properly.

## Evaluation

I ran these evaluations using [SmolLM2's evaluation code](https://github.com/huggingface/smollm/tree/main/evaluation) for a fairer comparison.

| HellaSwag | 61.1 | **66.1** | 56.1 | 60.9 | 55.5 |
| MMLU-Pro (MCF) | 17.4 | 19.3 | 12.7 | **24.2** | 11.7 |

## Usage

Just like any other Hugging Face model, you can run it with the `transformers` library.
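For example, here is a minimal sketch; the repo id `SultanR/SmolTulu-1.7b-Instruct` is assumed from the GGUF and leaderboard links in this card, and it assumes the tokenizer ships a chat template:

```python
# Minimal usage sketch with transformers (repo id assumed from the links in this card).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SultanR/SmolTulu-1.7b-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Format the conversation with the tokenizer's chat template, then generate.
messages = [{"role": "user", "content": "Briefly explain what SFT and DPO are."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0]))
```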

You can also run the model in llama.cpp through the [GGUF version](https://huggingface.co/SultanR/SmolTulu-1.7b-Instruct-GGUF)!
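
If you prefer staying in Python, a hedged sketch using the `llama-cpp-python` bindings for llama.cpp is shown below; the quantization filename pattern is an assumption, so check the GGUF repo for the exact file names:

```python
# Sketch using llama-cpp-python (Python bindings for llama.cpp).
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="SultanR/SmolTulu-1.7b-Instruct-GGUF",
    filename="*Q4_K_M.gguf",  # assumed quantization; adjust to a file that actually exists in the repo
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Give me one sentence about small language models."}],
    max_tokens=128,
)
print(response["choices"][0]["message"]["content"])
```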

## [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_SultanR__SmolTulu-1.7b-Instruct).

|MuSR (0-shot) | 1.92|
|MMLU-PRO (5-shot) | 7.89|

## Citation

```bibtex
@misc{alrashed2024smoltuluhigherlearningrate,
      title={SmolTulu: Higher Learning Rate to Batch Size Ratios Can Lead to Better Reasoning in SLMs},
      author={Sultan Alrashed},
      year={2024},
      eprint={2412.08347},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2412.08347},
}
```