NLLB-200 distilled 600M — Core ML KV (seq 256, palettized / pal8)

Core ML packages for the encoder and the KV-cache decoder (separate init and step models), exported from facebook/nllb-200-distilled-600M with a sequence length of 256 and 8-bit palettization (pal8) for Apple Neural Engine–friendly deployment.

Contents

  • NLLB_Encoder_256.mlpackage
  • NLLB_Decoder_256_init.mlpackage
  • NLLB_Decoder_256_step.mlpackage
  • config.json, bundled tokenizer/ (reference)

Base model

  • facebook/nllb-200-distilled-600M

Usage notes

  • Intended for macOS / iOS inference via MLModel (or Python coremltools for tests).
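  • The source language is set through the tokenizer's src_lang; the target language is set in your app pipeline by forcing the first generated token (forced BOS) to the target language code, e.g. eng_Latn.
  • Validate translation quality and output parity on your target devices, since results on the ANE path may differ from the CPU/GPU paths.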
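
The encoder / init / step split and the forced-BOS note above can be sketched as a greedy decoding loop. This is a minimal sketch, not the packages' actual interface: the three callables (`encode`, `init_cache`, `step`) are hypothetical wrappers you would write around the three .mlpackage files, because the real input and output feature names are not listed here. The token ids (`decoder_start_id`, `forced_bos_id`, `eos_id`) are assumed to come from the bundled tokenizer and config.json.

```python
from typing import Any, Callable, List, Sequence, Tuple


def greedy_translate(
    src_ids: Sequence[int],
    encode: Callable[[Sequence[int]], Any],
    init_cache: Callable[[Any], Any],
    step: Callable[[int, Any], Tuple[Sequence[float], Any]],
    forced_bos_id: int,
    decoder_start_id: int,
    eos_id: int,
    max_len: int = 256,
) -> List[int]:
    """Greedy decoding with a forced first token.

    encode, init_cache and step are hypothetical wrappers (assumptions):
      encode(src_ids)       -> encoder output
      init_cache(enc_out)   -> initial decoder state / KV cache
      step(token_id, cache) -> (logits over the vocabulary, updated cache)
    """
    enc_out = encode(src_ids)        # run the encoder once
    cache = init_cache(enc_out)      # build the initial cache once
    token = decoder_start_id
    output: List[int] = []

    for position in range(max_len):  # at most max_len step calls
        logits, cache = step(token, cache)
        if position == 0:
            # Forced BOS: ignore the logits and emit the target language code.
            token = forced_bos_id
        else:
            # Greedy choice: index of the largest logit.
            token = max(range(len(logits)), key=lambda i: logits[i])
        output.append(token)
        if token == eos_id:
            break
    return output
```

The returned ids start with the target language code and end with the end-of-sequence id when one is produced within `max_len` steps; decode them with the bundled tokenizer. The default `max_len=256` mirrors the export's sequence length, on the assumption that the cache cannot hold more positions than that.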

Organization

Published by PoliteAI.
