GP-VL-Init

This model serves as an initial checkpoint for reproducing the results in the paper "SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training".

Related links

Website: https://tianzhechu.com/SFTvsRL/

GitHub: https://github.com/LeslieTrue/SFTvsRL

arXiv: https://arxiv.org/abs/2501.17161v1

Hugging Face paper page: https://huggingface.co/papers/2501.17161
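
To use this checkpoint as a starting point, the weights first need to be available locally. The following is a minimal sketch of one way to fetch them, assuming the `huggingface_hub` package is installed. The helper name `download_checkpoint` and the default target directory are hypothetical choices for illustration. How the checkpoint is then loaded for training is defined by the GitHub repository above, not by this snippet.

```python
from pathlib import Path

# Repository id as shown on the model page.
REPO_ID = "tianzhechu/GP-VL-Init"


def download_checkpoint(local_dir: str = "checkpoints/GP-VL-Init") -> Path:
    """Download every file of the checkpoint and return the local directory.

    `local_dir` is a hypothetical default; point it at any writable path.
    """
    # Imported lazily so this module can be imported without huggingface_hub.
    from huggingface_hub import snapshot_download

    # snapshot_download fetches all files in the repo and returns the folder path.
    path = snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
    return Path(path)
```

Calling `download_checkpoint()` starts the download. At 10.7B parameters in BF16 (2 bytes per parameter), expect roughly 21 GB of weights, so make sure there is enough free disk space.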

Model details

Format: Safetensors

Model size: 10.7B params

Tensor type: BF16