Ministral-3 3B Instruct BF16 Llama Text

Text-only Llama-compatible conversion of mistralai/Ministral-3-3B-Instruct-2512-BF16.

What changed:

  • dropped vision and multimodal projector tensors
  • stripped the Mistral3 language_model. wrapper prefix from the text weight names
  • wrote a plain LlamaForCausalLM config
  • kept the tokenizer assets, including the corrected regex
  • kept the chat template in chat_template.jinja
  • removed the strict user/assistant alternation assertion from the template
  • left tokenizer loading on the generic fast backend, not LlamaTokenizerFast
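
The first two steps above (dropping the vision tensors and stripping the wrapper) can be sketched as a rename pass over the checkpoint's tensor names. This is an illustrative sketch only, not the code in the conversion tools: the function name strip_to_text_only is hypothetical, and the vision_tower. and multi_modal_projector. markers are assumed names for the dropped tensors. Only the language_model. wrapper comes from the list above.

```python
from typing import Dict, Iterable, TypeVar

T = TypeVar("T")

# Wrapper segment named in the change list above.
WRAPPER = "language_model."
# ASSUMPTION: name fragments that identify vision and projector tensors.
DROPPED_MARKERS = ("vision_tower.", "multi_modal_projector.")


def strip_to_text_only(
    state: Dict[str, T],
    wrapper: str = WRAPPER,
    dropped: Iterable[str] = DROPPED_MARKERS,
) -> Dict[str, T]:
    """Return a copy of `state` with vision tensors removed and the wrapper stripped."""
    dropped = tuple(dropped)
    out: Dict[str, T] = {}
    for name, tensor in state.items():
        # Skip anything that belongs to the vision tower or the projector.
        if any(marker in name for marker in dropped):
            continue
        # Remove the first occurrence of the wrapper segment, wherever it sits.
        new_name = name.replace(wrapper, "", 1)
        if new_name in out:
            raise ValueError(f"duplicate tensor name after renaming: {new_name}")
        out[new_name] = tensor
    return out


if __name__ == "__main__":
    demo = {
        "language_model.model.layers.0.self_attn.q_proj.weight": "q",
        "language_model.lm_head.weight": "head",
        "vision_tower.patch_conv.weight": "vision",
        "multi_modal_projector.linear_1.weight": "proj",
    }
    for key in strip_to_text_only(demo):
        print(key)
```

The values can be real tensors or placeholders; the function only inspects names, so it can be tried on a list of keys before touching any weights.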

Verification:

  • reference: mistralai/Ministral-3-3B-Instruct-2512-BF16
  • candidate: this checkpoint
  • dataset: /home/alvion/valve/services/training/datasets/think2-2025-12-07_gpt-5.4_reasoning.jsonl
  • rows: 3
  • max length: 512
  • tokenizer IDs: identical
  • worst KL divergence: 0
  • worst logit difference: 0
  • a plain AutoTokenizer load of the candidate matched the fixed reference tokenizer
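
For readers who want to reproduce the two worst-case figures, the sketch below shows one way to compute them from per-position logits. It is an assumption about how the metrics are defined, not the verification script itself: KL is taken as KL(reference || candidate) over the softmax of each position, the logit difference is the largest absolute element-wise gap, and the names log_softmax, kl_divergence, and worst_case are hypothetical. It uses plain Python lists so it runs without any dependencies.

```python
import math
from typing import List, Sequence, Tuple


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax of one position's logits."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_sum for x in logits]


def kl_divergence(ref_logits: Sequence[float], cand_logits: Sequence[float]) -> float:
    """KL(reference || candidate) for one position (ASSUMED direction)."""
    ref_lp = log_softmax(ref_logits)
    cand_lp = log_softmax(cand_logits)
    return sum(math.exp(p) * (p - q) for p, q in zip(ref_lp, cand_lp))


def worst_case(
    ref: Sequence[Sequence[float]], cand: Sequence[Sequence[float]]
) -> Tuple[float, float]:
    """Return (worst KL, worst absolute logit difference) over all positions."""
    worst_kl = 0.0
    worst_diff = 0.0
    for ref_row, cand_row in zip(ref, cand):
        worst_kl = max(worst_kl, kl_divergence(ref_row, cand_row))
        row_diff = max(abs(a - b) for a, b in zip(ref_row, cand_row))
        worst_diff = max(worst_diff, row_diff)
    return worst_kl, worst_diff


if __name__ == "__main__":
    reference = [[0.5, -1.0, 2.0], [0.0, 0.0, 0.0]]
    candidate = [[0.5, -1.0, 2.0], [0.0, 0.0, 0.0]]
    print(worst_case(reference, candidate))
```

With bit-identical logits both figures come out as exactly zero, which is what the list above reports for this checkpoint.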

Conversion tools: https://github.com/cascade-tech-ai/mistral-convert
