Qwen3-0.6B — official Apple Core AI exports

Pre-converted .aimodel bundles from Apple's official coreai-models export recipe — unmodified, with the exact environment, hashes, and measured performance published.

uv run coreai.llm.export qwen3-0.6b            # macOS
uv run coreai.llm.export qwen3-0.6b --platform iOS

Why pre-converted bundles?

  1. Conversion needs a Mac with a lot of RAM (the 20B export was done on a 128 GB machine); running a bundle only needs enough RAM to mmap the artifact.
  2. An .aimodel is a build artifact, not a pure function of the recipe — the same export command produced a 2.2× slower artifact across the macOS 26 → 27β boundary (forensics). Hosted artifacts + hashes are the reproducible ground truth; every bundle here is exactly the one measured in apple-silicon-llm-bench.

Bundles & integrity

| Bundle | Contents | SHA-256 (main.mlirb) |
|---|---|---|
| macos/ | macOS dynamic, int4 (macOS-27β export) | e05ad9093c651e07e0a9c8589319ec1cc9e865e2b474f52a22b143d9ab6c3147 |
| macos-26-export/ | macOS dynamic, int4; macOS-26-era artifact, 2.2× faster, cannot be re-created on 27β | f7a8357f50292f4425591fb0ed2ef4c89c91b658498d89e7e8b516eca0e89554 |
| ios/ | iOS static ctx4096, mixed 4/8-bit palettized | 151bbb15ef14b599bc62b7b08c2969e732febe0a2d43886414aa9d5f29213b01 |
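
Since the hashes are what make a download reproducible, a downloaded bundle can be checked against them before use. The following is a minimal sketch, not part of the official tooling: the helper names `sha256_of` and `verify_bundle` are hypothetical, and it assumes `main.mlirb` sits somewhere under `<download-dir>/<bundle>/` (the exact location inside the bundle is not stated here, so the sketch searches for it).

```python
import hashlib
from pathlib import Path

# Published SHA-256 digests of main.mlirb, copied from the table above.
EXPECTED = {
    "macos": "e05ad9093c651e07e0a9c8589319ec1cc9e865e2b474f52a22b143d9ab6c3147",
    "macos-26-export": "f7a8357f50292f4425591fb0ed2ef4c89c91b658498d89e7e8b516eca0e89554",
    "ios": "151bbb15ef14b599bc62b7b08c2969e732febe0a2d43886414aa9d5f29213b01",
}


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large artifacts are never fully loaded into RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_bundle(download_dir, bundle):
    """Return True if the bundle's main.mlirb matches the published digest.

    Assumption: main.mlirb lives somewhere below <download_dir>/<bundle>/;
    the first match found by a recursive search is the file that gets hashed.
    """
    root = Path(download_dir) / bundle
    matches = sorted(root.rglob("main.mlirb"))
    if not matches:
        raise FileNotFoundError(f"no main.mlirb found under {root}")
    return sha256_of(matches[0]) == EXPECTED[bundle]
```

For example, `verify_bundle("downloads", "macos-26-export")` would return `False` if the file had been re-exported locally rather than downloaded, which is exactly the case the published hashes are meant to catch.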

Measured (Apple's official llm-benchmark, greedy)

| Bundle | Protocol | Decode tok/s | Prefill | Load (warm) |
|---|---|---|---|---|
| macos (27β) | M4 Max, 512p/1024g | 484 | 9,396 | 0.10 s |
| macos-26-export | M4 Max, 512p/512g warm | 1,121 | | |
| macos-26-export | iPhone 17 Pro GPU (h18p), 512p/1024g | 115 | 5,807 | 0.07 s |
| ios (ANE, h18p) | iPhone 17 Pro, 512p/1024g | 69.6 | 5,325 | 0.045 s |

The macOS-26 artifact carries the native quantized-Linear lowering (zero dequant ops in the program); the 27β re-export lowers to explicit dequant. Same recipe, same code, same wheels — only the exporting OS differed. Details: coreai-export-lowering.md.

Export environment

  • macOS 27.0 beta (build 26A5353q) · Xcode 27.0 (27A5194q)
  • coreai-core 1.0.0b1 · coreai-torch 0.4.0 · coreai-opt 0.2.0 · torch 2.9.0
  • apple/coreai-models @ b1cb71b (export code identical to upstream 0c1055f)

Run it

# CLI (from a coreai-models checkout)
swift run -c release llm-runner --model <downloaded-bundle-dir> --prompt "Hello"
swift run -c release llm-benchmark --model <downloaded-bundle-dir>

Or chat with it in CoreAIChatMac (point "Choose Models Folder…" at the download directory).

iOS static bundles must be AOT-compiled before device use (h18p = iPhone 17 Pro):

xcrun coreai-build compile <ir>.aimodel --platform iOS --preferred-compute neural-engine --architecture h18p

Then set assets.main in metadata.json to the resulting .aimodelc.
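
The metadata edit after compilation could be scripted along these lines. This is a sketch under stated assumptions, not the official workflow: it assumes `metadata.json` sits at the root of the bundle directory, that `assets.main` refers to a nested `{"assets": {"main": ...}}` entry, and that the value is a path or file name for the compiled artifact. The function name `point_main_at_compiled` is hypothetical.

```python
import json
from pathlib import Path


def point_main_at_compiled(bundle_dir, compiled_name):
    """Rewrite metadata.json so that assets.main names the compiled .aimodelc.

    Assumptions (not confirmed by the model card): metadata.json is at the
    bundle root, and "assets.main" is the nested key assets -> main.
    All other metadata fields are left untouched.
    """
    meta_path = Path(bundle_dir) / "metadata.json"
    meta = json.loads(meta_path.read_text(encoding="utf-8"))
    meta.setdefault("assets", {})["main"] = compiled_name
    meta_path.write_text(json.dumps(meta, indent=2) + "\n", encoding="utf-8")
    return meta
```

Keeping a copy of the original `metadata.json` before running this makes it easy to switch back to the uncompiled artifact.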


Maintained alongside coreai-model-zoo (community models) and coreai-samples (apps).
