Tucano2-qwen-3.7B-Instruct (MLX 4-bit)

This is a 4-bit quantized MLX version of Polygl0t/Tucano2-qwen-3.7B-Instruct, optimized for efficient on-device inference on Apple Silicon.

Model Summary

Tucano2 is a family of open-source Portuguese language models developed by Polygl0t. This Instruct variant has been fine-tuned for chat and instruction-following tasks using Supervised Fine-Tuning (SFT) and Anchored Preference Optimization (APO). The 3.7B model is the largest in the Tucano2 family and achieves the family's highest scores across all reported benchmarks.

Detail	Value
Original Model	Polygl0t/Tucano2-qwen-3.7B-Instruct
Base Architecture	Qwen3-4B
Parameters	3.76B
Context Length	4,096 tokens
Quantization	4-bit (MLX)
License	Apache 2.0
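
As a rough guide to what 4-bit quantization means for memory, the sketch below estimates the size of the stored weights from the parameter count in the table. It assumes group-wise quantization with a group size of 64 and one 16-bit scale plus one 16-bit bias per group; those settings are an assumption about how this checkpoint was produced, not something the model card states, and the helper name `estimate_weight_bytes` is hypothetical. The estimate ignores any layers kept at higher precision, the KV cache, and activations.

```python
# Back-of-the-envelope weight-memory estimate for a 4-bit quantized model.
# Assumptions (not from the model card): group-wise quantization with
# group_size=64 and one 16-bit scale plus one 16-bit bias per group.


def estimate_weight_bytes(n_params: float, bits: int = 4,
                          group_size: int = 64,
                          overhead_bits_per_group: int = 32) -> float:
    """Return the approximate number of bytes needed to store the weights."""
    # Each group of weights shares one scale and one bias, so the overhead
    # is spread across group_size weights.
    bits_per_weight = bits + overhead_bits_per_group / group_size
    return n_params * bits_per_weight / 8


N_PARAMS = 3.76e9  # parameter count from the table above

quantized_gb = estimate_weight_bytes(N_PARAMS) / 1e9
fp16_gb = N_PARAMS * 16 / 8 / 1e9  # unquantized 16-bit reference

print(f"4-bit (with group overhead): ~{quantized_gb:.2f} GB")
print(f"16-bit reference:            ~{fp16_gb:.2f} GB")
```

Under these assumptions the weights take roughly 2.1 GB, compared with about 7.5 GB at 16-bit precision.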

Benchmark Results

Benchmark	Score
ENEM	72.92%
BLUEX	64.53%
OAB Exams	54.31%
ARC Challenge	60.34%
BELEBELE	85.22%
MMLU	64.64%
GSM8K-PT	53.81%
IFEval-PT	41.67%
HumanEval	47.56%
Total Avg.	53.64

Usage

Python (mlx-lm)

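# Requires the mlx-lm package (pip install mlx-lm) on an Apple Silicon Mac.
from mlx_lm import load, generate

# Download (on first use) and load the quantized weights and tokenizer.
model, tokenizer = load("pessini/Tucano2-qwen-3.7B-Instruct-MLX-4bit")

# Format the conversation with the model's chat template.
# The prompt asks, in Portuguese, "What is the capital of Brazil?"
messages = [{"role": "user", "content": "Qual é a capital do Brasil?"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Generate up to 256 new tokens and print the reply.
response = generate(model, tokenizer, prompt=prompt, max_tokens=256)
print(response)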
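
For longer conversations it can help to keep the prompt and the generated tokens within the 4,096-token context length listed in the model summary. The following is a minimal sketch of one way to do that, not part of the original model card. The helper names `max_new_tokens_for` and `generate_within_context` are hypothetical, and the sketch assumes the tokenizer returned by `load` exposes an `encode()` method in the style of Hugging Face tokenizers.

```python
CONTEXT_LENGTH = 4096  # context length listed in the model summary


def max_new_tokens_for(prompt_token_count: int, requested: int = 256,
                       context_length: int = CONTEXT_LENGTH) -> int:
    """Clamp the generation budget so prompt + output fit in the context window."""
    if prompt_token_count >= context_length:
        raise ValueError(
            f"Prompt uses {prompt_token_count} tokens, which leaves no room "
            f"in a {context_length}-token context."
        )
    return min(requested, context_length - prompt_token_count)


def generate_within_context(model, tokenizer, messages, requested: int = 256) -> str:
    """Hypothetical wrapper around mlx_lm.generate that respects the context limit."""
    from mlx_lm import generate  # imported lazily; needs mlx-lm on Apple Silicon

    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    # Assumption: the tokenizer exposes encode(), as Hugging Face tokenizers do.
    # The templated prompt already contains its special tokens, so none are added.
    n_prompt = len(tokenizer.encode(prompt, add_special_tokens=False))
    budget = max_new_tokens_for(n_prompt, requested)
    return generate(model, tokenizer, prompt=prompt, max_tokens=budget)
```

With the `model`, `tokenizer`, and `messages` from the example above, `generate_within_context(model, tokenizer, messages)` would behave like the plain `generate` call for short prompts and shrink `max_tokens` only when the prompt approaches the limit.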

Citation

@misc{correa2026tucano2cool,
  title={{Tucano 2 Cool: Better Open Source LLMs for Portuguese}},
  author={Corrêa, Nicholas Kluge and Sen, Aniket and Fatimah, Shiza and Falk, Sophia and Landgraf, Lennard and Kastner, Julia and Flek, Lucie},
  year={2026},
  eprint={2603.03543},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2603.03543}
}

Model tree for pessini/Tucano2-qwen-3.7B-Instruct-MLX-4bit

Base model	Qwen/Qwen3-4B-Base
Finetuned	Polygl0t/Tucano2-qwen-3.7B-Base
Finetuned	Polygl0t/Tucano2-qwen-3.7B-Instruct
Quantized	this model

Paper for pessini/Tucano2-qwen-3.7B-Instruct-MLX-4bit

Tucano 2 Cool: Better Open Source LLMs for Portuguese (arXiv:2603.03543, published Mar 3)
