This looks very good, just needs a math DPO run for the GSM8K score
#2 opened by nisten
Or even just a run on https://huggingface.co/datasets/meta-math/MetaMathQA or Argilla's math DPO dataset for the final few layers should bring the score up into the 60s on GSM8K.
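To make "the final few layers" concrete, here is a rough sketch of picking which parameters to leave trainable. It assumes Llama/Mistral-style parameter names of the form `model.layers.<index>.…`; the helper `trainable_names` and the example names are hypothetical and not taken from this model's actual checkpoint.

```python
import re

# Assumption: decoder blocks are named "model.layers.<index>.<...>", as in
# Llama/Mistral-style checkpoints. Other architectures use different names.
LAYER_PATTERN = re.compile(r"^model\.layers\.(\d+)\.")


def trainable_names(param_names, num_layers, last_n):
    """Return the parameter names that belong to the final `last_n` blocks."""
    first_trainable = num_layers - last_n
    selected = []
    for name in param_names:
        match = LAYER_PATTERN.match(name)
        # Embeddings, lm_head, and earlier blocks do not match and stay frozen.
        if match and int(match.group(1)) >= first_trainable:
            selected.append(name)
    return selected


# Hypothetical parameter names for a toy 4-block model.
names = ["model.embed_tokens.weight"]
names += [f"model.layers.{i}.mlp.up_proj.weight" for i in range(4)]
names += ["lm_head.weight"]

print(trainable_names(names, num_layers=4, last_n=2))
```

With a PyTorch model, the same selection could be applied by looping over `model.named_parameters()` and setting `param.requires_grad` to `True` only for the selected names, before handing the model to whatever SFT or DPO trainer is used.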
Hey, thanks for the suggestion!
This was mostly an experiment and I moved on to other projects.
I tried finetuning pruned-down models (pruning, finetuning, then pruning and finetuning again) with the Cluj-Napoca series and eventually gave up because it was taking too much time/money.
https://twitter.com/m_chirculescu/status/1762577637103288387?t=JHYhFbDZkQ8H9vA70VLHPg&s=19