We have released HSPMATH-7B, a supervised fine-tuned (SFT) model for math.

Starting from the MetaMathQA dataset, we constructed a supervised fine-tuning dataset of 75k samples with a simple yet effective method: a hint is introduced before each solution. After supervised fine-tuning of Llemma-7B on this dataset, the resulting model reaches 64.3% accuracy on GSM8K. For details, refer to the paper: Hint-before-Solving Prompting: Guiding LLMs to Effectively Utilize Encoded Knowledge.
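
As a rough illustration of "a hint before the solution", the sketch below assembles one training record with the hint and one without. The function names, the `prompt`/`completion` field names, and the "Hint:"/"Solution:" labels are all hypothetical; the actual template used for HSPMATH-7B is described in the paper and may differ.

```python
def build_hint_before_solution_sample(question: str, hint: str, solution: str) -> dict:
    """Assemble one SFT record whose target text places the hint before the solution.

    Hypothetical template: the labels and field names are illustrative assumptions,
    not the format used to train HSPMATH-7B.
    """
    prompt = f"Question: {question.strip()}\nAnswer:"
    completion = f" Hint: {hint.strip()}\nSolution: {solution.strip()}"
    return {"prompt": prompt, "completion": completion}


def build_plain_sample(question: str, solution: str) -> dict:
    """Baseline record without a hint, for comparison with the hinted variant."""
    prompt = f"Question: {question.strip()}\nAnswer:"
    completion = f" Solution: {solution.strip()}"
    return {"prompt": prompt, "completion": completion}


# Made-up example problem, used only to show the shape of a record.
question = "Tom has 3 boxes with 4 apples each. How many apples does he have?"
hint = "Multiply the number of boxes by the number of apples per box."
solution = "3 * 4 = 12. The answer is 12."

hinted = build_hint_before_solution_sample(question, hint, solution)
plain = build_plain_sample(question, solution)
print(hinted["prompt"] + hinted["completion"])
```

The only difference between the two records is the hint text in the target, which mirrors the comparison between HSPMATH-7B and the Llemma-7B (SFT) baseline.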

The table below compares HSPMATH-7B with open-source models of similar size (7B) and with several closed-source models on GSM8K:

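| Open-source Model (7B) | GSM8K (%) |
|------------------------|-----------|
| MetaMath-Mistral-7B    | 77.7      |
| MetaMath-7B-V1.0       | 66.5      |
| HSPMATH-7B             | 64.3      |
| Llemma-7B (SFT)        | 58.7      |
| WizardMath-7B          | 54.9      |
| RFT-7B                 | 50.3      |
| Qwen-7b                | 47.84     |
| Mistral-7b             | 37.83     |
| Yi-6b                  | 32.6      |
| ChatGLM-6B             | 32.4      |
| LLaMA2-7b              | 12.96     |

| Closed-source Model | GSM8K (%) |
|---------------------|-----------|
| GPT-3.5             | 57.1      |
| PaLM-540B           | 56.5      |
| Minerva-540B        | 58.8      |
| Minerva-62B         | 52.4      |
| Chinchilla-70B      | 43.7      |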

Note:

  • The MetaMath family of models is fine-tuned on 400k samples, more than 5.3 times the size of our training set (75k samples).
  • Llemma-7B (SFT) is fine-tuned on the same dataset as our model HSPMATH-7B, but without the hint texts.
  • Introducing hints improved the SFT model by 5.6 percentage points on GSM8K (from 58.7% for Llemma-7B (SFT) to 64.3% for HSPMATH-7B).
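
The figures in the notes above can be re-derived from the numbers reported on this card. The short sketch below does the arithmetic; the variable names are illustrative, and the relative gain is an extra derived quantity that the card does not report.

```python
# Training-set sizes reported on this card.
metamath_samples = 400_000
hsp_samples = 75_000
size_ratio = metamath_samples / hsp_samples

# GSM8K accuracies (%) reported in the comparison table.
hspmath_acc = 64.3
llemma_sft_acc = 58.7
absolute_gain = hspmath_acc - llemma_sft_acc          # in percentage points
relative_gain = absolute_gain / llemma_sft_acc * 100  # in percent, derived here

print(f"MetaMath training set is {size_ratio:.2f}x larger")    # 5.33x
print(f"Absolute gain: {absolute_gain:.1f} percentage points")  # 5.6
print(f"Relative gain: {relative_gain:.1f}%")                   # 9.5%
```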

Model size: 6.74B params (Safetensors). Tensor type: BF16.