SMaLL-100: INT8 ONNX (general, 100 languages)

Self-exported INT8 ONNX build of alirezamsh/small100 (distilled M2M-100, Apache-2.0) for on-device CPU inference with onnxruntime. The encoder and decoder are shipped as separate, non-merged ONNX files. This is the general model, NOT a ja/vi fine-tune.

  • encoder.onnx : input_ids, attention_mask -> last_hidden_state
  • decoder.onnx : input_ids, encoder_attention_mask, encoder_hidden_states -> logits
  • tokenizer.json: m2m100_418M tokenizer (same vocab as small100)

Scheme: encoder input = [tgt_lang_id, ...sp_pieces, eos=2]; decode greedily from eos=2 (no forced bos).
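
The scheme above can be sketched in plain Python as follows. This is a minimal illustration, not the exporter's own code: `build_encoder_ids`, `greedy_decode`, `step_fn` and `max_new_tokens` are hypothetical names introduced here, and the model call is abstracted behind `step_fn`, which is assumed to take the decoder ids produced so far and return the logits for the last position as a flat list of floats.

```python
from typing import Callable, List, Sequence

EOS_ID = 2  # eos id as stated in the scheme above


def build_encoder_ids(tgt_lang_id: int, piece_ids: Sequence[int],
                      eos_id: int = EOS_ID) -> List[int]:
    """Encoder input = [tgt_lang_id, ...sp_pieces, eos]."""
    return [tgt_lang_id, *piece_ids, eos_id]


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def greedy_decode(step_fn: Callable[[List[int]], Sequence[float]],
                  eos_id: int = EOS_ID,
                  max_new_tokens: int = 64) -> List[int]:
    """Greedy decoding that starts from eos, with no forced bos token.

    step_fn is a hypothetical callback: given the decoder ids so far,
    it returns the logits of the last position.
    """
    decoder_ids = [eos_id]  # decoding starts from eos
    for _ in range(max_new_tokens):
        next_id = argmax(step_fn(decoder_ids))
        if next_id == eos_id:  # the model signalled end of sequence
            break
        decoder_ids.append(next_id)
    return decoder_ids[1:]  # drop the leading eos start token
```

With onnxruntime, `step_fn` would presumably wrap a call to `InferenceSession.run` on decoder.onnx, feeding `input_ids`, `encoder_attention_mask` and `encoder_hidden_states` (the latter taken from the `last_hidden_state` output of encoder.onnx) and slicing the last position out of `logits`. Because the export is non-merged, this sketch assumes the full decoder prefix is passed again at every step.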
