---
pipeline_tag: text-generation
inference: true
license: apache-2.0
datasets:
- simplescaling/s1K
---

**We recommend using our successor, [s1.1](https://huggingface.co/simplescaling/s1.1-32B), which performs better.**

# Model Summary

> s1 is a reasoning model finetuned from Qwen2.5-32B-Instruct on just 1,000 examples. It matches o1-preview and exhibits test-time scaling via budget forcing.

- **Repository:** [simplescaling/s1](https://github.com/simplescaling/s1)
- **Paper:** https://arxiv.org/abs/2501.19393

# Use

Model usage is documented in the [inference section of the repository README](https://github.com/simplescaling/s1?tab=readme-ov-file#inference).

# Evaluation

| Metric | s1-32B | s1.1-32B | o1-preview | o1 | DeepSeek-R1 | DeepSeek-R1-Distill-Qwen-32B |
|---|---|---|---|---|---|---|
| # examples | 1K | 1K | ? | ? | >800K | 800K |
| AIME2024 | 56.7 | 56.7 | 40.0 | 74.4 | 79.8 | 72.6 |
| AIME2025 I | 26.7 | 60.0 | 37.5 | ? | 65.0 | 46.1 |
| MATH500 | 93.0 | 95.4 | 81.4 | 94.8 | 97.3 | 94.3 |
| GPQA-Diamond | 59.6 | 63.6 | 75.2 | 77.3 | 71.5 | 62.1 |

Note that the s1-32B and s1.1-32B results in this table use budget forcing: the model's end-of-thinking delimiter is ignored and "Wait" is appended to the reasoning trace, once or twice.
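
The budget forcing loop described in the note can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation (see the repository for that). The `generate` callable, the `END_OF_THINKING` string, and the `budget_force` and `toy_generate` names are all hypothetical placeholders introduced here; the real delimiter depends on the model's chat template.

```python
from typing import Callable

# Hypothetical marker: the actual end-of-thinking delimiter is defined by the
# model's chat template and is not stated in this model card.
END_OF_THINKING = "<end_of_thinking>"


def budget_force(
    generate: Callable[[str], str],
    prompt: str,
    num_waits: int = 2,
    wait_token: str = "Wait",
) -> str:
    """Build a reasoning trace, ignoring end-of-thinking `num_waits` times.

    `generate(text)` is an assumed interface: it returns the model's
    continuation of `text`, which may contain END_OF_THINKING.
    """
    trace = ""
    for attempt in range(num_waits + 1):
        continuation = generate(prompt + trace)
        # Discard the delimiter and anything the model produced after it.
        trace += continuation.split(END_OF_THINKING)[0]
        if attempt < num_waits:
            # Instead of letting the model stop, append "Wait" and continue.
            trace += wait_token
    return trace


def toy_generate(text: str) -> str:
    # Stand-in for a real model call: one reasoning step, then an attempt to stop.
    return " step." + END_OF_THINKING


print(budget_force(toy_generate, "Q: 1+1?", num_waits=2))
```

With `num_waits=0` the loop reduces to ordinary generation that stops at the first delimiter; `num_waits=1` or `2` corresponds to appending "Wait" once or twice as in the table above.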

# Citation

```bibtex
@misc{muennighoff2025s1simpletesttimescaling,
      title={s1: Simple test-time scaling}, 
      author={Niklas Muennighoff and Zitong Yang and Weijia Shi and Xiang Lisa Li and Li Fei-Fei and Hannaneh Hajishirzi and Luke Zettlemoyer and Percy Liang and Emmanuel Candès and Tatsunori Hashimoto},
      year={2025},
      eprint={2501.19393},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2501.19393}, 
}
```