Introduction
This model is trained with Masked thought Fine-Tuning (MFT), a simple variant of standard Supervised Fine-Tuning (SFT). For details, see our code and paper below.
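As a rough illustration of how MFT might differ from plain SFT, the sketch below replaces a random fraction of the reasoning tokens in the input with a mask token and leaves the training targets unchanged. This is a minimal sketch under stated assumptions, not the released implementation: the function name `mask_reasoning_tokens`, the `mask_id` value, the default `mask_ratio` of 0.2, and the choice to ignore prompt positions in the loss with `-100` are all hypothetical and chosen only for illustration.

```python
import random


def mask_reasoning_tokens(input_ids, reasoning_start, mask_id, mask_ratio=0.2, seed=None):
    """Return (masked_inputs, labels) for one training example.

    Hypothetical helper: tokens before `reasoning_start` are treated as the
    question/prompt and are never masked; tokens from `reasoning_start` onward
    are treated as the reasoning chain and are each replaced by `mask_id`
    with probability `mask_ratio`.
    """
    rng = random.Random(seed)
    masked_inputs = list(input_ids)  # copy so the caller's list is not modified
    for i in range(reasoning_start, len(masked_inputs)):
        if rng.random() < mask_ratio:
            masked_inputs[i] = mask_id

    # Assumption: targets keep the ORIGINAL reasoning tokens, and prompt
    # positions are set to -100 so a typical cross-entropy loss skips them.
    labels = [-100] * reasoning_start + list(input_ids[reasoning_start:])
    return masked_inputs, labels
```

With `mask_ratio=0.0` the function reduces to ordinary SFT inputs, which makes it easy to compare the two settings inside the same training loop.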
Links
Results
We evaluate both models on GSM8K with the scripts provided in our code.
Model | GSM8K accuracy (%) |
---|---|
adalaw/Llama2-7B-GSM8K-SFT | 42.8 |
adalaw/Llama2-7B-GSM8K-MFT | 47.3 |