Microsoft Fara-7B GGUF — Quantized by BatiAI

imatrix-calibrated GGUF quantizations of microsoft/Fara-7B (Qwen 2.5 VL based multimodal, 7B). Quantized directly from official Microsoft BF16 weights by BatiAI.

Why Fara-7B?

  • Microsoft-built agentic multimodal model (Qwen 2.5 VL backbone)
  • 7B parameters — runs on Mac mini 16GB
  • Multimodal native: text + vision (image-text-to-text)
  • 128K context window
  • MIT license — fully commercial-friendly
  • arxiv:2511.19663 — research paper
  • Released 2026-05-19 by Microsoft

Quick Start

# Q4_K_M (recommended for most users, ~5GB)
ollama pull batiai/fara-7b:q4

# IQ3_XXS (smallest, ~3GB, Mac mini 16GB)
ollama pull batiai/fara-7b:iq3

# Q8_0 (highest quality, ~8GB)
ollama pull batiai/fara-7b:q8

Available Quantizations

Quant Size Min RAM Target Hardware
IQ3_XXS ~3 GB 8 GB Mac mini M4 16GB
Q3_K_M ~3.5 GB 8 GB Mac mini 16GB
IQ4_XS ~4 GB 10 GB Mac mini 16GB+
Q4_K_M ~5 GB 10 GB Mac mini 16GB+ (recommended)
Q5_K_M ~5.5 GB 12 GB Mac mini 16GB+
Q6_K ~6.5 GB 14 GB Mac mini 24GB+
Q8_0 ~8 GB 16 GB Mac mini 24GB+

Multimodal: download mmproj-*-Q6_K.gguf and use with llama-mtmd-cli / llama-server --mmproj.

How to run

Ollama (text-only)

ollama run batiai/fara-7b:q4

llama.cpp (text + vision)

hf download batiai/Fara-7B-GGUF --include "*Q4_K_M*" --include "mmproj-*-Q6_K.gguf" --local-dir ./fara-7b

llama-mtmd-cli \
    -m ./fara-7b/microsoft-Fara-7B-Q4_K_M.gguf \
    --mmproj ./fara-7b/mmproj-microsoft-Fara-7B-Q6_K.gguf \
    --image input.jpg -p "Describe this image."

Model details

  • Source: microsoft/Fara-7B
  • Architecture: Qwen2_5_VLForConditionalGeneration — Qwen 2.5 VL backbone, Microsoft fine-tuned
  • Context: 128K
  • License: MIT

BatiAI signing

All GGUFs carry:

  • general.author = BatiAI
  • general.url = https://flow.bati.ai

License

Inherits source: MIT.

About BatiFlow

BatiFlow — free on-device AI automation for Mac.

Benchmarks coming once Mac measurements complete.

Downloads last month
563
GGUF
Model size
8B params
Architecture
qwen2vl
Hardware compatibility
Log In to add your hardware

3-bit

4-bit

5-bit

6-bit

8-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for batiai/Fara-7B-GGUF

Quantized
(13)
this model

Paper for batiai/Fara-7B-GGUF