---
language:
- en
datasets:
- ToheartZhang/JiuZhang3.0-Corpus-PT-CoT
- ToheartZhang/JiuZhang3.0-Corpus-PT-Tool
- ToheartZhang/JiuZhang3.0-Corpus-SFT
---
|
<h1 align="center">
JiuZhang3.0: Efficiently Improving Mathematical Reasoning by Training Small Data Synthesis Models
</h1>
|
<p align="center">
  <a href="https://arxiv.org/abs/2405.14365"><b>[Paper]</b></a> •
  <a href="https://github.com/RUCAIBox/JiuZhang3.0"><b>[GitHub]</b></a> •
  <a href="https://huggingface.co/collections/ToheartZhang/jiuzhang30-66508be8be5a61de47101655#/"><b>[Models]</b></a> •
  <a href="https://huggingface.co/collections/ToheartZhang/jiuzhang30-corpus-665092209525389ad7a2289a"><b>[Data]</b></a>
</p>
|
|
|
## Introduction

JiuZhang3.0 is a series of fine-tuned models for mathematical reasoning, continually pre-trained on a corpus synthesized by our carefully trained small LLM.
|
|
|
## Experimental Results

For more evaluation results, please refer to the [Paper](https://arxiv.org/abs/2405.14365).
|
|
|
| Models | GSM8k | MATH | SVAMP | ASDiv | MAWPS | CARP | Avg. |
|--------------------------|-------|------|-------|-------|-------|------|-------|
| GPT-4 | 92.2 | 65.4 | 92.9 | 94.3 | 96.6 | 53.6 | 82.5 |
| **20B+ Models** | | | | | | | |
| Llemma-34B | 60.2 | 24.6 | 68.0 | 75.6 | 89.8 | 36.5 | 59.1 |
| Intern-Math-20B | 64.9 | 27.4 | 74.9 | 79.6 | 94.4 | 42.3 | 63.9 |
| ChatGLM-Math-32B | 82.6 | 40.6 | - | - | - | - | - |
| MAmmoTH2-8x7B-Plus | _86.4_ | 47.0 | _90.0_ | _92.2_ | **97.0** | 45.8 | _76.4_ |
| [JiuZhang3.0-8x7B](https://huggingface.co/ToheartZhang/JiuZhang3.0-8x7B) | **89.8** | **53.8** | **90.2** | **93.1** | _96.7_ | 52.3 | **79.3** |
| **7-8B Models** | | | | | | | |
| Mistral-7B-MMIQC | 75.0 | 34.2 | 73.5 | 82.1 | 90.1 | 36.5 | 65.2 |
| MetaMath-Mistral-7B | 77.8 | 29.6 | 79.6 | 81.2 | 93.7 | 30.5 | 65.4 |
| Abel-7B-002 | 80.4 | 29.6 | 78.8 | 82.7 | 93.5 | 33.2 | 66.4 |
| WizardMath-7B-1.1 | 82.2 | 32.8 | 80.7 | 84.2 | 93.8 | 31.9 | 67.6 |
| Math-Shepherd-Mistral-7B | 84.3 | 34.4 | 82.9 | 82.8 | 92.5 | 32.9 | 68.3 |
| KPMath-DSMath-7B | 83.9 | 48.8 | 81.5 | 88.9 | 94.8 | - | - |
| MAmmoTH2-7B-Plus | 84.2 | 46.2 | _90.3_ | 90.3 | _97.1_ | 44.3 | 75.2 |
| MAmmoTH2-8B-Plus | 84.4 | 41.2 | 89.9 | 89.9 | _97.1_ | 44.8 | 74.6 |
| DeepSeekMath-7B-Instruct | 82.3 | 45.8 | 83.7 | 90.1 | 95.7 | 45.8 | 73.9 |
| DeepSeekMath-7B-RL | 88.2 | 50.2 | 87.3 | 91.8 | 95.5 | **51.6** | 77.4 |
| [JiuZhang3.0-7B](https://huggingface.co/ToheartZhang/JiuZhang3.0-7B) | **88.6** | **52.8** | **90.4** | **92.6** | **97.3** | _51.0_ | **78.8** |
| [JiuZhang3.0-8B](https://huggingface.co/ToheartZhang/JiuZhang3.0-8B) | **88.6** | _51.0_ | 89.4 | **92.6** | _97.1_ | 50.9 | _78.3_ |
|
|
|
## Evaluation

JiuZhang3.0 is evaluated in two settings, natural language reasoning and tool manipulation, using the prompt templates below.

### Natural Language Reasoning
|
```
## Question
{question}

## Solution
{solution}
```
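
As a usage illustration only (not the paper's exact evaluation setup), the sketch below shows how this template might be filled in and passed to JiuZhang3.0-7B with the Hugging Face `transformers` library. The model ID comes from the table above; the dtype, device placement, example question, and generation settings are assumptions.

```python
# Minimal sketch: fill the natural-language-reasoning template and generate a solution.
# The model ID is from the results table; everything else here is an assumption.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ToheartZhang/JiuZhang3.0-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

question = "Janet has 3 apples and buys 2 more. How many apples does she have?"
prompt = f"## Question\n{question}\n\n## Solution\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)
# Decode only the newly generated tokens, i.e. the model's solution.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```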
|
|
|
### Tool Manipulation
|
```
## Question
{question}

## Code Solution
{solution}
```
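
Similarly, a hedged sketch (not the official evaluation harness) of how the tool-manipulation template might be used, reusing the `tokenizer` and `model` loaded in the previous example. The helper name and generation settings are illustrative assumptions; executing the returned program is left to a sandbox of your choice.

```python
# Minimal sketch: fill the tool-manipulation template and generate a code solution.
# Reuses `tokenizer` and `model` from the previous example; names and settings are assumptions.
def solve_with_code(question: str, max_new_tokens: int = 512) -> str:
    prompt = f"## Question\n{question}\n\n## Code Solution\n"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Return only the newly generated tokens, i.e. the program the model writes.
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )

print(solve_with_code("What is the sum of the first 100 positive integers?"))
```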
|
|
|
## Citation

If you find this repository helpful, please consider citing our paper:
|
|
|
```
@article{zhou2024jiuzhang30,
  title={JiuZhang3.0: Efficiently Improving Mathematical Reasoning by Training Small Data Synthesis Models},
  author={Kun Zhou and Beichen Zhang and Jiapeng Wang and Zhipeng Chen and Wayne Xin Zhao and Jing Sha and Zhichao Sheng and Shijin Wang and Ji-Rong Wen},
  journal={arXiv preprint arXiv:2405.14365},
  year={2024}
}
```