Atlas - On-Device Models

Curated GGUF weights bundled with the Atlas iOS app (https://github.com/imwithye/atlas). Each file is a community Q4_K_M quantization re-hosted here so the app can pin a stable URL per model and so users don't depend on third-party uploader availability.
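
As a rough illustration of pinning a stable URL per model, the sketch below builds a download link from a repository ID, a revision, and a file path. It assumes the files are served through the Hugging Face Hub's `resolve` URL pattern; `REPO_ID`, `REVISION`, and `model_url` are hypothetical names, not taken from the app.

```python
from urllib.parse import quote

# Hypothetical placeholders: the real repository ID and revision are not
# stated in this README.
REPO_ID = "example-org/atlas-models"
REVISION = "main"  # a commit hash here would pin the exact file contents


def model_url(path: str, repo_id: str = REPO_ID, revision: str = REVISION) -> str:
    """Build a download URL for one GGUF file (assumes the Hub resolve pattern)."""
    return f"https://huggingface.co/{repo_id}/resolve/{quote(revision)}/{quote(path)}"


print(model_url("gemma3/gemma3-1b-it-q4_k_m.gguf"))
```

Using a commit hash rather than a branch name as the revision is what would make such a link immutable, since a branch can move when files are re-uploaded.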

All models are picked to fit comfortably on a modern iPhone (≤ 2 GB on disk, ≤ ~4 GB RAM at inference).

Layout

gemma3/     llama3.2/   qwen2.5/    qwen3/      smollm2/

One folder per model family. Files are named <family>-<size>[-it]-<quant>.gguf.
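
The naming scheme can be split back into its parts with a small parser. This is a minimal sketch: the regular expression is an assumption inferred from the file names listed in this README, and `ModelFile` and `parse_model_filename` are hypothetical names.

```python
import re
from typing import NamedTuple

# Pattern for <family>-<size>[-it]-<quant>.gguf. The character classes are
# inferred from the file names in this README, not from a formal spec.
NAME_RE = re.compile(
    r"^(?P<family>[a-z0-9.]+)"      # e.g. "llama3.2", "qwen3"
    r"-(?P<size>\d+(?:\.\d+)?b)"    # e.g. "1b", "1.7b"
    r"(?P<it>-it)?"                 # optional instruct marker
    r"-(?P<quant>[a-z0-9_]+)"       # e.g. "q4_k_m"
    r"\.gguf$"
)


class ModelFile(NamedTuple):
    family: str
    size: str
    instruct: bool
    quant: str


def parse_model_filename(name: str) -> ModelFile:
    """Split a file name into family, size, instruct flag, and quant."""
    m = NAME_RE.match(name)
    if m is None:
        raise ValueError(f"unexpected model file name: {name!r}")
    return ModelFile(m["family"], m["size"], m["it"] is not None, m["quant"])


print(parse_model_filename("llama3.2-3b-it-q4_k_m.gguf"))
print(parse_model_filename("qwen3-1.7b-q4_k_m.gguf"))
```

The `-it` group is optional, so a name without it (such as the Qwen 3 file) parses with `instruct=False`.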

Models

Gemma 3 - Google (Gemma Terms of Use)

| File | Params | Size | Notes |
| --- | --- | --- | --- |
| gemma3/gemma3-1b-it-q4_k_m.gguf | 1B | ~0.8 GB | Smallest curated entry; good for quick replies on older devices. |

Source quant: ggml-org/gemma-3-1b-it-GGUF
Base model: google/gemma-3-1b-it

Llama 3.2 - Meta (Llama 3.2 Community License)

| File | Params | Size | Notes |
| --- | --- | --- | --- |
| llama3.2/llama3.2-1b-it-q4_k_m.gguf | 1B | ~0.8 GB | Lightweight instruct; strong English. |
| llama3.2/llama3.2-3b-it-q4_k_m.gguf | 3B | ~2.0 GB | Best general-purpose pick at this size tier. |

Source quants: bartowski/Llama-3.2-1B-Instruct-GGUF · bartowski/Llama-3.2-3B-Instruct-GGUF
Base models: meta-llama/Llama-3.2-1B-Instruct · meta-llama/Llama-3.2-3B-Instruct

Qwen 2.5 - Alibaba (Apache-2.0)

| File | Params | Size | Notes |
| --- | --- | --- | --- |
| qwen2.5/qwen2.5-1.5b-it-q4_k_m.gguf | 1.5B | ~1.0 GB | Solid instruct baseline; broad multilingual coverage. |

Source quant: bartowski/Qwen2.5-1.5B-Instruct-GGUF
Base model: Qwen/Qwen2.5-1.5B-Instruct

Qwen 3 - Alibaba (Apache-2.0)

| File | Params | Size | Notes |
| --- | --- | --- | --- |
| qwen3/qwen3-1.7b-q4_k_m.gguf | 1.7B | ~1.1 GB | Hybrid reasoning model; supports /think and /no_think modes. |

Source quant: bartowski/Qwen_Qwen3-1.7B-GGUF
Base model: Qwen/Qwen3-1.7B

SmolLM2 - Hugging Face (Apache-2.0)

| File | Params | Size | Notes |
| --- | --- | --- | --- |
| smollm2/smollm2-1.7b-it-q4_k_m.gguf | 1.7B | ~1.0 GB | Compact, fast; trained for on-device use. |

Source quant: HuggingFaceTB/SmolLM2-1.7B-Instruct-GGUF
Base model: HuggingFaceTB/SmolLM2-1.7B-Instruct

Quantization

All weights are Q4_K_M: a 4-bit K-quant scheme that keeps select tensors at higher precision. It offers a good size/quality tradeoff for mobile inference. Run the files with llama.cpp or any GGUF-compatible runtime.
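
To make the "4-bit with mixed precision" point concrete, the sketch below computes the average bits per parameter implied by the sizes listed in this README. It is only a back-of-the-envelope estimate: it assumes 1 GB = 10^9 bytes and uses the rounded nominal parameter counts, so the real figures will differ somewhat. `LISTED` and `bits_per_param` are hypothetical names.

```python
# (nominal parameters in billions, approximate file size in GB),
# copied from the per-family listings in this README.
LISTED = {
    "gemma3-1b-it": (1.0, 0.8),
    "llama3.2-1b-it": (1.0, 0.8),
    "llama3.2-3b-it": (3.0, 2.0),
    "qwen2.5-1.5b-it": (1.5, 1.0),
    "qwen3-1.7b": (1.7, 1.1),
    "smollm2-1.7b-it": (1.7, 1.0),
}


def bits_per_param(params_billion: float, size_gb: float) -> float:
    """Average bits stored per nominal parameter (assumes 1 GB = 10**9 bytes)."""
    return size_gb * 8 / params_billion


for name, (params, size) in LISTED.items():
    print(f"{name}: ~{bits_per_param(params, size):.1f} bits per nominal parameter")
```

Every entry comes out above 4 bits, which is consistent with some tensors being stored at higher precision, although rounded parameter counts and file metadata also contribute to the gap.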

Licensing

Each file inherits the license of its base model. Check the linked base model page before redistribution. Atlas does not re-license the weights.
