小冬瓜AIGC: Step-Wise Math Reasoning

Test: Colab

result

At test time, rejection sampling can be run multiple times; the final answer is output in \boxed{} format.
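As an illustration of the idea above, here is a minimal sketch of pulling the final answer out of `\boxed{}` and keeping only the sampled completions that pass a check. The names `extract_boxed`, `rejection_sample`, `sample` and `accept` are hypothetical and are not part of this model's code; the sampler is any callable that returns one completion string, and the acceptance rule used here (match a known reference answer) is only one possible choice.

```python
from typing import Callable, List, Optional


def extract_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in text, or None if absent."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1  # we are inside the opening brace of \boxed{
    out = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(out).strip()
        out.append(ch)
    return None  # braces never closed


def rejection_sample(sample: Callable[[], str], n: int,
                     accept: Callable[[str], bool]) -> List[str]:
    """Draw n candidates from `sample` and keep those that `accept` approves."""
    kept = []
    for _ in range(n):
        candidate = sample()
        if accept(candidate):
            kept.append(candidate)
    return kept


# Usage with canned strings standing in for model samples (assumption).
_canned = iter([
    "He has 0 + 10 = 10 apples. \\boxed{10}",
    "I am not sure.",
    "He has 12 - 3 = 9 apples. \\boxed{9}",
    "Final: \\boxed{10}",
])
kept = rejection_sample(lambda: next(_canned), 4,
                        lambda c: extract_boxed(c) == "10")
```

The brace counter is there so that nested answers such as `\boxed{\frac{1}{2}}` are returned whole rather than cut at the first closing brace.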

prompt = 'Tom has 12 apples. He gives 3 apples to each of his 4 friends. After that, he buys 10 more apples. How many apples does Tom have now?'
step_generation(prompt, 128)
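The definition of `step_generation` is not shown here, and the meaning of the second argument (128) is not stated. The following is only a rough sketch of how a step-wise loop of this kind could look, assuming the prompt template visible in the result below. `build_prompt`, `step_generation_sketch` and `next_step` are hypothetical names; `next_step` stands for any callable that returns the next chunk of text up to and including the step delimiter.

```python
from typing import Callable, List

SEP = " [SEP]\n"          # step delimiter requested in the prompt
EOS = "<|end_of_text|>"   # end marker seen in the result

SYSTEM = "###System: You are MA-RLHF Chatbot, you should friendly answer the question\n"
INSTRUCTION = ("Solve this math problem using step-by-step reasoning. "
               "Require that the output of each step ends with the \" [SEP]\n\" token.\n")


def build_prompt(question: str) -> str:
    """Wrap a bare question in the template seen in the result.
    <|begin_of_text|> is left out on the assumption that the tokenizer adds it."""
    return f"{SYSTEM}###Question:{INSTRUCTION}{question}\n###Answer:"


def step_generation_sketch(question: str, next_step: Callable[[str], str],
                           max_steps: int = 16) -> List[str]:
    """Generate one step at a time until EOS or max_steps; return the steps."""
    text = build_prompt(question)
    steps = []
    for _ in range(max_steps):
        chunk = next_step(text)
        if not chunk:
            break
        text += chunk
        body = chunk.split(EOS, 1)[0]      # drop anything from EOS onward
        if body.endswith(SEP):
            body = body[: -len(SEP)]       # drop the trailing delimiter
        if body.strip():
            steps.append(body.strip())
        if EOS in chunk:
            break
    return steps


# Usage with a fake generator that replays canned chunks (assumption).
_chunks = iter(["Tom has 12 apples. [SEP]\n", "Answer: 10 [SEP]\n", EOS])
demo_steps = step_generation_sketch("How many apples?", lambda _text: next(_chunks))
```

Generating step by step like this makes it possible to inspect or score each step before continuing, which is one reason to ask the model for an explicit delimiter.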

result

<|begin_of_text|>###System: You are MA-RLHF Chatbot, you should friendly answer the question
###Question:Solve this math problem using step-by-step reasoning. Require that the output of each step ends with the " [SEP]
" token.
Tom has 12 apples. He gives 3 apples to each of his 4 friends. After that, he buys 10 more apples. How many apples does Tom have now?
###Answer: At first, Tom has 12 apples. [SEP]
He gives 3 apples to each of his 4 friends, so he gives him 4 * 3 = 12 apples. [SEP]
After that, Tom has 12 - 12 = 0 apples left. [SEP]
He buys 10 more apples, so he has 0 + 10 = 10 apples now. [SEP]
Tom has 10 apples now. [SEP]
Answer: 10 [SEP]
I agree. [SEP]
A possible answer.

# Answer

10 [SEP]
<|end_of_text|>
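The result above can be split back into individual steps, and the simple arithmetic claims in each step can be checked mechanically. The sketch below is only an illustration under assumptions: `split_steps` and `check_arithmetic` are hypothetical helpers, the sample string is a shortened copy of the result, and the checker only understands integer claims of the form `a op b = c` with `+`, `-` or `*`.

```python
import re

EOS = "<|end_of_text|>"
# Matches simple integer claims such as "4 * 3 = 12".
EQ = re.compile(r"(\d+)\s*([+\-*])\s*(\d+)\s*=\s*(\d+)")
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


def split_steps(output: str) -> list:
    """Return the non-empty steps that follow '###Answer:' in a decoded output."""
    answer = output.split("###Answer:", 1)[-1].replace(EOS, "")
    return [s.strip() for s in answer.split("[SEP]") if s.strip()]


def check_arithmetic(step: str) -> list:
    """Return (claim, is_correct) for every 'a op b = c' found in a step."""
    results = []
    for m in EQ.finditer(step):
        a, op, b, c = int(m.group(1)), m.group(2), int(m.group(3)), int(m.group(4))
        results.append((m.group(0), OPS[op](a, b) == c))
    return results


# Shortened copy of the result shown above.
sample = (
    "###Answer: At first, Tom has 12 apples. [SEP]\n"
    "He gives 3 apples to each of his 4 friends, so he gives him 4 * 3 = 12 apples. [SEP]\n"
    "After that, Tom has 12 - 12 = 0 apples left. [SEP]\n"
    "He buys 10 more apples, so he has 0 + 10 = 10 apples now. [SEP]\n"
    "Answer: 10 [SEP]\n"
    "<|end_of_text|>"
)
steps = split_steps(sample)
checks = [c for s in steps for c in check_arithmetic(s)]
```

For this sample every extracted claim checks out, and the total agrees with a direct computation: 12 - 3 * 4 + 10 = 10.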
Model: xiaodongguaAIGC/xdg-math-step · Safetensors · Model size: 8.03B params · Tensor type: BF16