Tucano2-qwen-0.5B-Instruct (MLX 4-bit)

This is a 4-bit quantized MLX version of Polygl0t/Tucano2-qwen-0.5B-Instruct, optimized for efficient on-device inference on Apple Silicon.

Model Summary

Tucano2 is a family of open-source Portuguese language models developed by Polygl0t. This Instruct variant has been fine-tuned for chat and instruction-following tasks using Supervised Fine-Tuning (SFT) and Anchored Preference Optimization (APO).

Detail	Value
Original Model	Polygl0t/Tucano2-qwen-0.5B-Instruct
Base Architecture	Qwen3-0.6B
Parameters	490.8M
Context Length	4,096 tokens
Quantization	4-bit (MLX)
License	Apache 2.0
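
As a rough illustration of what 4-bit quantization means for this parameter count, the sketch below estimates the weight storage. It assumes a group size of 64 with one 16-bit scale and one 16-bit bias per group, which this card does not state, and it treats every parameter as quantized. Real checkpoints usually keep some tensors in higher precision, and activations and the KV cache need additional memory, so read the result as an order-of-magnitude figure rather than a measured file size.

```python
# Rough weight-memory estimate for the quantized checkpoint.
PARAMS = 490.8e6        # parameter count from the table above
BITS = 4                # quantization width from the table above
GROUP_SIZE = 64         # ASSUMPTION: group size is not stated in this card
OVERHEAD_BITS = 2 * 16  # ASSUMPTION: one 16-bit scale and one 16-bit bias per group


def estimated_weight_bytes(params, bits, group_size, overhead_bits):
    """Return the approximate number of bytes needed to store quantized weights."""
    # Each group of weights shares one scale and one bias, so their cost
    # is spread evenly over the weights in the group.
    bits_per_weight = bits + overhead_bits / group_size
    return params * bits_per_weight / 8


size_mb = estimated_weight_bytes(PARAMS, BITS, GROUP_SIZE, OVERHEAD_BITS) / 1e6
fp16_mb = PARAMS * 16 / 8 / 1e6  # same weights stored as 16-bit floats
print(f"4-bit estimate: {size_mb:.0f} MB, 16-bit baseline: {fp16_mb:.0f} MB")
```

Under these assumptions the effective cost is 4.5 bits per weight, so the quantized weights take a little over a quarter of the 16-bit size.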

Benchmark Results

Benchmark	Score
ENEM	53.60%
BLUEX	40.33%
OAB Exams	40.73%
ARC Challenge	38.63%
BELEBELE	62.33%
MMLU	41.46%
GSM8K-PT	18.49%
IFEval-PT	30.00%
HumanEval	10.37%
Total Avg.	26.08

Usage

Python (mlx-lm)

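# Requires the mlx-lm package (pip install mlx-lm) on an Apple Silicon Mac.
from mlx_lm import load, generate

# Download (on first use) and load the quantized weights and tokenizer from the Hugging Face Hub.
model, tokenizer = load("pessini/Tucano2-qwen-0.5B-Instruct-MLX-4bit")

# Format the conversation with the model's chat template. add_generation_prompt=True
# appends the assistant turn marker so the model replies instead of continuing the user turn.
messages = [{"role": "user", "content": "Qual é a capital do Brasil?"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Generate up to 256 new tokens and print the reply.
response = generate(model, tokenizer, prompt=prompt, max_tokens=256)
print(response)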
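
For multi-turn chats, the message history eventually outgrows the 4,096-token context length listed in the model summary. One possible approach, sketched here and not taken from the original card, is to drop the oldest turns until the rest fits a token budget. The helper `trim_history` and the `count_tokens` callable are hypothetical names introduced for illustration. The sketch also ignores the extra tokens added by the chat template, and it can return a history that begins with an assistant turn, so a real budget should leave some headroom.

```python
CONTEXT_LENGTH = 4096     # context window from the model summary
RESERVED_FOR_REPLY = 256  # matches max_tokens in the usage example


def trim_history(messages, count_tokens, budget=CONTEXT_LENGTH - RESERVED_FOR_REPLY):
    """Drop the oldest messages until the remaining ones fit in `budget` tokens.

    `count_tokens` is a hypothetical callable mapping a string to a token count.
    The most recent message is always kept, even if it alone exceeds the budget.
    """
    kept = []
    used = 0
    # Walk from newest to oldest so that recent turns are retained first.
    for message in reversed(messages):
        cost = count_tokens(message["content"])
        if kept and used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


# Stand-in counter for demonstration only: one token per whitespace-separated word.
def word_count(text):
    return len(text.split())


history = [
    {"role": "user", "content": "Qual é a capital do Brasil?"},
    {"role": "assistant", "content": "A capital do Brasil é Brasília."},
    {"role": "user", "content": "E a de Portugal?"},
]
print(trim_history(history, word_count, budget=5))
```

With the real tokenizer, `count_tokens` could be something like `lambda text: len(tokenizer.encode(text))`, assuming the tokenizer returned by `load` exposes `encode` the way Hugging Face tokenizers do. The trimmed list can then be passed to `tokenizer.apply_chat_template` as in the usage example.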

Citation

@misc{correa2026tucano2cool,
  title={{Tucano 2 Cool: Better Open Source LLMs for Portuguese}},
  author={Corrêa, Nicholas Kluge and Sen, Aniket and Fatimah, Shiza and Falk, Sophia and Landgraf, Lennard and Kastner, Julia and Flek, Lucie},
  year={2026},
  eprint={2603.03543},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2603.03543}
}

Model tree for pessini/Tucano2-qwen-0.5B-Instruct-MLX-4bit

Stage	Model
Base model	Qwen/Qwen3-0.6B-Base
Finetuned	Polygl0t/Tucano2-qwen-0.5B-Base
Finetuned	Polygl0t/Tucano2-qwen-0.5B-Instruct
Quantized	pessini/Tucano2-qwen-0.5B-Instruct-MLX-4bit (this model)

Paper for pessini/Tucano2-qwen-0.5B-Instruct-MLX-4bit

Tucano 2 Cool: Better Open Source LLMs for Portuguese (arXiv:2603.03543, published Mar 3)
