---
license: apache-2.0
datasets:
- gair-prox/open-web-math-pro
language:
- en
base_model:
- TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T
---


# TinyLlama-1.1B-ProXMath

<p align="center">
  <img src="prox-teaser.png">
</p>

[ArXiv](https://arxiv.org/abs/2409.17115) | [Data: OpenWebMath-Pro](https://huggingface.co/datasets/gair-prox/open-web-math-pro) | [Code](https://github.com/GAIR-NLP/program-every-example)

**TinyLlama-1.1B-ProXMath** is a math-adapted TinyLlama-1.1B model, continually pre-trained for **15B** tokens on [OpenWebMath-Pro](https://huggingface.co/datasets/gair-prox/open-web-math-pro), a version of OpenWebMath refined with ProX.
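As a minimal sketch of how the checkpoint might be loaded for text completion with the `transformers` library: the repository id `gair-prox/TinyLlama-1.1B-ProXMath` and the `Question: ... Answer:` prompt format are assumptions made for illustration, not details stated in this card. `build_prompt` and `generate_answer` are hypothetical helper names.

```python
# Assumed Hub repository id; adjust it if the actual id differs.
MODEL_ID = "gair-prox/TinyLlama-1.1B-ProXMath"


def build_prompt(question: str) -> str:
    """Format a completion-style prompt (assumed format, not from the model card)."""
    return f"Question: {question.strip()}\nAnswer:"


def generate_answer(question: str, max_new_tokens: int = 128) -> str:
    """Greedy-decode a continuation for the given question."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )

    # Skip the prompt tokens and decode only the newly generated ones.
    prompt_length = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)
```

Calling `generate_answer("What is 12 * 7?")` downloads the weights on first use. Since this is a pre-trained model rather than an instruction-tuned one, a plain completion prompt is likely a better fit than a chat template.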

## Evaluations

The table below compares the base TinyLlama-1.1B with TinyLlama-1.1B-ProXMath on 9 common math reasoning benchmarks. The best score in each column is in bold.

| Model                   |   asdiv  |  gsm8k  |  mathqa  |   mawps  | minerva_math | mmlu_stem | sat_math |   svamp  |  tabmwp  |  average |
|-------------------------|:--------:|:-------:|:--------:|:--------:|:------------:|:---------:|:--------:|:--------:|:--------:|:--------:|
| TinyLlama-1.1B          |   18.0   |   2.8   |   14.6   |   20.2   |      3.2     |    16.3   |   21.9   |   10.9   |   12.5   |   13.4   |
| TinyLlama-1.1B-ProXMath | **41.9** | **9.0** | **15.6** | **56.9** |    **5.6**   |  **26.8** | **31.2** | **23.8** | **22.2** | **25.7** |


## Citation
```
@article{zhou2024programming,
  title={Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale},
  author={Zhou, Fan and Wang, Zengzhi and Liu, Qian and Li, Junlong and Liu, Pengfei},
  journal={arXiv preprint arXiv:2409.17115},
  year={2024}
}
```