AlekseyKorshuk posted an update Jan 26
If you had to choose one small base language model (<=3B params) for a ChatML code assistant (SFT + DPO), to validate the approach on the dataset and tune hyperparameters before retraining with a larger base model like Mistral/Mixtral, which model would you pick?
🧵
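
To make the setup concrete, here's a rough sketch of the SFT -> DPO loop I mean, assuming Hugging Face's `trl` (SFTTrainer/DPOTrainer). The base model ID, toy datasets, and hyperparameters are placeholders, and exact argument names shift between trl releases:

```python
# Sketch of SFT -> DPO with Hugging Face `trl`. Model ID, data, and
# hyperparameters are placeholders; trl signatures vary by release.
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer, SFTTrainer

BASE = "stabilityai/stablelm-3b-4e1t"  # swap in whichever <=3B base wins

tokenizer = AutoTokenizer.from_pretrained(BASE, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token

def to_chatml(system: str, user: str, assistant: str) -> str:
    # ChatML wraps each turn in <|im_start|>role ... <|im_end|> markers.
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n{assistant}<|im_end|>\n"
    )

# Stage 1: SFT on ChatML-formatted code-assistant conversations.
sft_data = Dataset.from_dict({"text": [to_chatml(
    "You are a helpful code assistant.",
    "Write a Python function that reverses a string.",
    "def reverse(s):\n    return s[::-1]",
)]})
sft_trainer = SFTTrainer(
    model=AutoModelForCausalLM.from_pretrained(BASE, trust_remote_code=True),
    args=TrainingArguments(output_dir="sft-out", per_device_train_batch_size=1,
                           num_train_epochs=1),
    train_dataset=sft_data,
    dataset_text_field="text",
    max_seq_length=1024,
    tokenizer=tokenizer,
)
sft_trainer.train()
sft_trainer.save_model("sft-out")

# Stage 2: DPO on (prompt, chosen, rejected) preference triples.
dpo_data = Dataset.from_dict({
    "prompt": ["<|im_start|>user\nReverse a string in Python.<|im_end|>\n"
               "<|im_start|>assistant\n"],
    "chosen": ["def reverse(s):\n    return s[::-1]<|im_end|>"],
    "rejected": ["just loop over it backwards<|im_end|>"],
})
dpo_trainer = DPOTrainer(
    model=AutoModelForCausalLM.from_pretrained("sft-out"),
    ref_model=None,  # trl copies the policy to serve as the frozen reference
    beta=0.1,
    args=TrainingArguments(output_dir="dpo-out", per_device_train_batch_size=1,
                           num_train_epochs=1, remove_unused_columns=False),
    train_dataset=dpo_data,
    tokenizer=tokenizer,
    max_length=1024,
    max_prompt_length=512,
)
dpo_trainer.train()
```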

StableLM 3B benchmarks best overall, although StableLM 2 1.6B and Qwen 1.8B crush it on GSM8K (albeit with more restrictive licenses).
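
If you want to sanity-check those GSM8K numbers yourself, something like this should work with EleutherAI's lm-evaluation-harness (v0.4+ Python API); the Hub IDs below are my best guess at the exact checkpoints:

```python
# Hedged sketch: re-running GSM8K via lm-evaluation-harness (v0.4+).
# The Hub IDs are assumptions about which checkpoints are being compared.
import lm_eval

for model_id in (
    "stabilityai/stablelm-3b-4e1t",  # StableLM 3B
    "stabilityai/stablelm-2-1_6b",   # StableLM 2 1.6B
    "Qwen/Qwen-1_8B",                # Qwen 1.8B (needs trust_remote_code)
):
    results = lm_eval.simple_evaluate(
        model="hf",
        model_args=f"pretrained={model_id},trust_remote_code=True",
        tasks=["gsm8k"],
        batch_size=8,
    )
    print(model_id, results["results"]["gsm8k"])
```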

For small tests I usually use falcon-rw-1b: permissive license, 1.3B params.
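
A smoke test for it is just a few lines of `transformers` (a sketch with greedy decoding, defaults otherwise; not a full eval harness):

```python
# Quick smoke test of tiiuae/falcon-rw-1b with transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-rw-1b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("def fizzbuzz(n):", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```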

MiniMA 2 might be worth trying too: it's pruned from LLaMA, so it stays compatible with LLaMA-based frameworks (although I had issues getting it to run in vLLM).
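
For reference, this is the kind of vLLM load I was attempting (assuming GeneZC/MiniMA-2-3B is the right Hub ID), so it may reproduce the issue rather than avoid it:

```python
# Loading MiniMA 2 with vLLM's offline API; GeneZC/MiniMA-2-3B is my
# assumption for the Hub ID, and this is the path that gave me trouble.
from vllm import LLM, SamplingParams

llm = LLM(model="GeneZC/MiniMA-2-3B")  # LLaMA-architecture checkpoint
params = SamplingParams(temperature=0.0, max_tokens=64)
outputs = llm.generate(["def quicksort(arr):"], params)
print(outputs[0].outputs[0].text)
```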

Maybe StarCoder 2 when it gets released!