Mistral-7B-ProXMath

ArXiv | Data: OpenWebMath-Pro | Code

Mistral-7B-ProXMath is a math-adapted version of Mistral-7B-v0.1, continually pre-trained for 10B tokens on OpenWebMath-Pro (OpenWebMath refined with ProX).
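The following is a minimal sketch of loading the model for text completion with the Hugging Face `transformers` library. It assumes the repository id `gair-prox/Mistral-7B-ProXMath`, an installed `torch` and `transformers`, and enough memory for a 7B-parameter model. The helper name `generate_solution`, the bfloat16 dtype, and the generation settings are illustrative choices, not settings recommended by the authors. Because this is a base (non-instruct) model, the prompt is plain text rather than a chat template.

```python
MODEL_ID = "gair-prox/Mistral-7B-ProXMath"  # assumed Hub repository id


def generate_solution(prompt: str, max_new_tokens: int = 256) -> str:
    """Greedy-decode a continuation of `prompt` (hypothetical helper)."""
    # Imported lazily so that defining this helper does not require the
    # heavy dependencies to be installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    device = "cuda" if torch.cuda.is_available() else "cpu"

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # bfloat16 is an assumption made here to reduce memory use.
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16
    ).to(device)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )

    # Drop the prompt tokens and return only the newly generated text.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_solution("Question: What is 12 * 7?\nAnswer:")` downloads the weights on first use and returns the model's continuation. Few-shot prompts of the same form are likely to work better than a single bare question.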

Evaluations

ProX models are evaluated on 9 common math reasoning benchmarks.

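| Model | asdiv | gsm8k | mathqa | mawps | minerva_math | mmlu_stem | sat_math | svamp | tabmwp | average |
|---|---|---|---|---|---|---|---|---|---|---|
| Mistral-7B-v0.1 | 68.5 | 40.6 | 32.3 | 87.0 | 11.4 | 50.0 | 56.2 | 65.4 | 52.9 | 51.6 |
| Mistral-7B-ProXMath | 72.9 | 51.0 | 53.0 | 89.2 | 22.4 | 54.2 | 75.0 | 64.9 | 49.8 | 59.2 |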
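As a quick check on the table above, the short script below recomputes the average column and the per-benchmark change between the two models. It assumes the average is the unweighted mean of the nine benchmark scores, which matches the reported values. The variable and function names are illustrative.

```python
# Per-benchmark scores copied from the table above, in column order.
BENCHMARKS = [
    "asdiv", "gsm8k", "mathqa", "mawps", "minerva_math",
    "mmlu_stem", "sat_math", "svamp", "tabmwp",
]

SCORES = {
    "Mistral-7B-v0.1": [68.5, 40.6, 32.3, 87.0, 11.4, 50.0, 56.2, 65.4, 52.9],
    "Mistral-7B-ProXMath": [72.9, 51.0, 53.0, 89.2, 22.4, 54.2, 75.0, 64.9, 49.8],
}


def average(scores):
    """Unweighted mean rounded to one decimal (assumed aggregation)."""
    return round(sum(scores) / len(scores), 1)


def deltas(base, adapted):
    """Per-benchmark change in points, adapted minus base."""
    return {
        name: round(a - b, 1)
        for name, b, a in zip(BENCHMARKS, base, adapted)
    }


averages = {model: average(scores) for model, scores in SCORES.items()}
gains = deltas(SCORES["Mistral-7B-v0.1"], SCORES["Mistral-7B-ProXMath"])

print(averages)
print(gains)
```

The recomputed averages agree with the reported 51.6 and 59.2. The per-benchmark differences show that the gains are uneven: mathqa and sat_math improve the most, while svamp and tabmwp are slightly lower than the base model.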

Citation

@article{zhou2024programming,
  title={Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale},
  author={Zhou, Fan and Wang, Zengzhi and Liu, Qian and Li, Junlong and Liu, Pengfei},
  journal={arXiv preprint arXiv:2409.17115},
  year={2024}
}