Tucano2-qwen-0.5B-Instruct (MLX 4-bit)

This is a 4-bit quantized MLX version of Polygl0t/Tucano2-qwen-0.5B-Instruct, optimized for efficient on-device inference on Apple Silicon.



Model Summary

Tucano2 is a family of open-source Portuguese language models developed by Polygl0t. This Instruct variant has been fine-tuned for chat and instruction-following tasks using Supervised Fine-Tuning (SFT) and Anchored Preference Optimization (APO).

| Detail | Value |
| --- | --- |
| Original Model | Polygl0t/Tucano2-qwen-0.5B-Instruct |
| Base Architecture | Qwen3-0.6B |
| Parameters | 490.8M |
| Context Length | 4,096 tokens |
| Quantization | 4-bit (MLX) |
| License | Apache 2.0 |
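
As a rough illustration of what 4-bit quantization means for the 490.8M parameters listed above, the sketch below estimates the weight footprint. It assumes group-wise quantization with a group size of 64 and one 16-bit scale plus one 16-bit bias per group, and it assumes every parameter is quantized. None of these settings are stated in this card, so treat the result as an approximation, not a measured file size.

```python
def bits_per_weight(bits: int = 4, group_size: int = 64,
                    scale_bits: int = 16, bias_bits: int = 16) -> float:
    """Effective bits per weight for group-wise quantization.

    group_size, scale_bits and bias_bits are assumptions, not values
    taken from this model card.
    """
    return bits + (scale_bits + bias_bits) / group_size


def estimate_size_mb(num_params: float, effective_bits: float) -> float:
    """Approximate weight size in megabytes (1 MB = 10**6 bytes)."""
    return num_params * effective_bits / 8 / 1e6


NUM_PARAMS = 490.8e6  # parameter count from the table above

quantized_mb = estimate_size_mb(NUM_PARAMS, bits_per_weight())
bf16_mb = estimate_size_mb(NUM_PARAMS, 16)

print(f"4-bit estimate: {quantized_mb:.1f} MB")
print(f"16-bit estimate: {bf16_mb:.1f} MB")
```

Under these assumptions each weight costs about 4.5 bits instead of 16, so the quantized weights take a little over a quarter of the 16-bit size. Layers that stay unquantized would push the real figure higher.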

Benchmark Results

| Benchmark | Score |
| --- | --- |
| ENEM | 53.60% |
| BLUEX | 40.33% |
| OAB Exams | 40.73% |
| ARC Challenge | 38.63% |
| BELEBELE | 62.33% |
| MMLU | 41.46% |
| GSM8K-PT | 18.49% |
| IFEval-PT | 30.00% |
| HumanEval | 10.37% |
| Total Avg. | 26.08 |

Usage

Python (mlx-lm)

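from mlx_lm import load, generate

# Load the quantized weights and tokenizer (downloaded on first use).
model, tokenizer = load("pessini/Tucano2-qwen-0.5B-Instruct-MLX-4bit")

# Format the conversation with the model's chat template.
# The prompt asks: "What is the capital of Brazil?"
messages = [{"role": "user", "content": "Qual é a capital do Brasil?"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Generate up to 256 new tokens and print the reply.
response = generate(model, tokenizer, prompt=prompt, max_tokens=256)
print(response)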
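
For multi-turn chat, the 4,096-token context length means older turns eventually have to be dropped before the chat template is applied. The following is a minimal sketch of one way to do that. The helper `trim_history` and its parameters are hypothetical names introduced for illustration and are not part of mlx-lm. The token counter is passed in as a function, and the example uses a whitespace-based stand-in so that it runs without the model.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def trim_history(
    messages: List[Message],
    count_tokens: Callable[[str], int],
    context_length: int = 4096,
    reserve_for_reply: int = 256,
) -> List[Message]:
    """Keep the newest messages whose content fits the token budget.

    Hypothetical helper: it ignores the extra tokens added by the chat
    template, so the budget is approximate. The newest message is always
    kept, even if it alone exceeds the budget.
    """
    budget = context_length - reserve_for_reply
    kept: List[Message] = []
    used = 0
    # Walk from newest to oldest so recent turns are kept first.
    for message in reversed(messages):
        cost = count_tokens(message["content"])
        if kept and used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


def whitespace_count(text: str) -> int:
    """Stand-in token counter for the example (not a real tokenizer)."""
    return len(text.split())


history = [
    {"role": "user", "content": "Qual é a capital do Brasil?"},
    {"role": "assistant", "content": "A capital do Brasil é Brasília."},
    {"role": "user", "content": "E a de Portugal?"},
]

# Tiny budget (12 - 2 = 10 "tokens") so the trimming is visible.
trimmed = trim_history(history, whitespace_count,
                       context_length=12, reserve_for_reply=2)
print([m["role"] for m in trimmed])
```

With the loaded tokenizer, the counter could instead be something like `lambda text: len(tokenizer.encode(text))`, assuming the tokenizer exposes `encode` in the usual Hugging Face style. The trimmed list would then be passed to `apply_chat_template` as in the example above.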

Citation

@misc{correa2026tucano2cool,
  title={{Tucano 2 Cool: Better Open Source LLMs for Portuguese}},
  author={Corrêa, Nicholas Kluge and Sen, Aniket and Fatimah, Shiza and Falk, Sophia and Landgraf, Lennard and Kastner, Julia and Flek, Lucie},
  year={2026},
  eprint={2603.03543},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2603.03543}
}
