Ministral-3 3B Instruct FP8 Llama Text

Text-only Llama-compatible conversion of mistralai/Ministral-3-3B-Instruct-2512.

What changed:

  • dropped vision and multimodal projector tensors
  • converted native consolidated Mistral text keys to Llama/HF keys
  • applied the required native-to-HF Q/K RoPE weight permutation
  • wrote a plain LlamaForCausalLM config
  • kept tokenizer assets and chat template with the corrected regex
  • kept chat formatting in chat_template.jinja
  • removed the strict user/assistant alternation assertion from the template
  • left tokenizer loading on the generic fast backend, not LlamaTokenizerFast
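
The Q/K permutation listed above can be illustrated with a small sketch. It assumes the conversion uses the same reordering as the Llama conversion script in Transformers: native weights pair RoPE dimensions in interleaved rows (0,1), (2,3), ..., while the HF Llama layout expects each head's even rows first and its odd rows second. The function names here are hypothetical and are not taken from the conversion tools.

```python
def rope_row_permutation(n_heads, head_dim):
    """Return, for each HF row position, the native row index that moves there.

    Assumption: native layout interleaves RoPE pairs within each head,
    HF layout stores the first element of every pair, then the second.
    """
    half = head_dim // 2
    order = []
    for head in range(n_heads):
        base = head * head_dim
        for pair_slot in range(2):      # 0: first of each pair, 1: second
            for i in range(half):
                order.append(base + 2 * i + pair_slot)
    return order


def permute_qk_rows(weight_rows, n_heads):
    """Reorder the rows of a Q or K projection (given as a list of rows)."""
    head_dim = len(weight_rows) // n_heads
    return [weight_rows[r] for r in rope_row_permutation(n_heads, head_dim)]


if __name__ == "__main__":
    # Two heads of dimension 4; each row is labelled by its native index.
    native = [[i] for i in range(8)]
    print(permute_qk_rows(native, n_heads=2))
```

For K projections under grouped-query attention, `n_heads` would be the number of key/value heads rather than the number of query heads. V and O projections are not permuted in this scheme.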

Verification:

  • reference: mistralai/Ministral-3-3B-Instruct-2512
  • candidate: this checkpoint
  • dataset: /home/alvion/valve/services/training/datasets/think2-2025-12-07_gpt-5.4_reasoning.jsonl
  • rows: 3
  • max length: 512
  • tokenizer IDs: identical
  • worst KL: 0
  • worst logit diff: 0
  • plain candidate AutoTokenizer matched the fixed reference tokenizer
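
As a rough illustration of the two metrics reported above, the sketch below computes a worst-case KL divergence and a worst-case absolute logit difference over per-position logit vectors. It assumes KL is taken from the reference distribution to the candidate distribution and that "worst" means the maximum over all positions; the actual verification script may define these differently. All function names are hypothetical.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one logit vector."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def kl_divergence(ref_logits, cand_logits):
    """KL(ref || cand) between the softmax distributions of two logit vectors."""
    log_p = log_softmax(ref_logits)
    log_q = log_softmax(cand_logits)
    return sum(math.exp(a) * (a - b) for a, b in zip(log_p, log_q))


def worst_case(ref_rows, cand_rows):
    """Return (worst KL, worst absolute logit difference) over all positions."""
    worst_kl = 0.0
    worst_diff = 0.0
    for ref, cand in zip(ref_rows, cand_rows):
        worst_kl = max(worst_kl, kl_divergence(ref, cand))
        worst_diff = max(worst_diff, max(abs(a - b) for a, b in zip(ref, cand)))
    return worst_kl, worst_diff


if __name__ == "__main__":
    ref = [[1.0, 2.0, 3.0], [0.5, -0.5, 0.0]]
    print(worst_case(ref, ref))  # identical logits give zero for both metrics
```

A result of exactly 0 for both metrics, as reported here, means the candidate reproduced the reference logits exactly on the sampled positions.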

FP8 forward passes require the Transformers fine-grained FP8 kernels package.

Conversion tools: https://github.com/cascade-tech-ai/mistral-convert

Safetensors model size: 3B params. Tensor types: BF16, F8_E4M3.