dtype: "q8" in transformers.js loads model_quantized.onnx (404); repo has model_q8.onnx

#1
by high-u - opened

Hi, and thanks for releasing this model in ONNX form. 🙏

I ran into a small naming mismatch when loading the q8 weights with @huggingface/transformers (v4.2.0) in the browser, and wanted to share it in case it's useful to others trying to run this.

What happens

import { pipeline } from "@huggingface/transformers";

const generator = await pipeline("text-generation", "LiquidAI/LFM2.5-230M-ONNX", {
  dtype: "q8",
  device: "webgpu",
});

This fails with a 404, because the loader requests onnx/model_quantized.onnx and that file is not in the repo.

Why

transformers.js maps dtypes to filename suffixes via DEFAULT_DTYPE_SUFFIX_MAPPING, where q8 → "_quantized". It then builds the path as onnx/model{suffix}.onnx, so dtype: "q8" resolves to onnx/model_quantized.onnx. This repo ships the q8 weights as onnx/model_q8.onnx, so the expected name isn't found.
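
To make the resolution concrete, here is a minimal sketch of the lookup as I understand it. It is an illustration and not the library's source: SUFFIX_BY_DTYPE and resolveOnnxPath are hypothetical names, and the table holds only the two entries discussed in this thread.

```javascript
// Illustrative re-creation of the dtype -> filename logic described above.
// SUFFIX_BY_DTYPE and resolveOnnxPath are hypothetical names, not library API.
// Only the two dtypes mentioned in this thread are listed.
const SUFFIX_BY_DTYPE = {
  fp32: "",         // empty suffix
  q8: "_quantized", // the mapping behind the mismatch
};

// Builds the path in the form onnx/{baseName}{suffix}.onnx.
function resolveOnnxPath(dtype, baseName = "model") {
  if (!Object.hasOwn(SUFFIX_BY_DTYPE, dtype)) {
    throw new Error(`dtype not covered by this sketch: ${dtype}`);
  }
  return `onnx/${baseName}${SUFFIX_BY_DTYPE[dtype]}.onnx`;
}

console.log(resolveOnnxPath("q8"));               // onnx/model_quantized.onnx
console.log(resolveOnnxPath("fp32", "model_q8")); // onnx/model_q8.onnx
```

The second call shows why overriding the base filename together with an empty-suffix dtype lands on the file this repo actually ships.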

Current workaround (in case anyone needs it now)

const generator = await pipeline("text-generation", "LiquidAI/LFM2.5-230M-ONNX", {
  device: "webgpu",
  dtype: "fp32",              // empty suffix, so "_quantized" isn't appended
  model_file_name: "model_q8" // override the base filename
});
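
If you want code that keeps working should the file be renamed later, one option is to try the standard dtype first and fall back to the override. This is only a sketch: loadQ8WithFallback is a hypothetical helper, the pipeline function is passed in as a parameter, and any error triggers the fallback (a stricter version might inspect the error for a 404).

```javascript
// Hypothetical helper: try dtype "q8" first, then fall back to the workaround above.
// pipelineFn is expected to be `pipeline` from @huggingface/transformers.
async function loadQ8WithFallback(pipelineFn, modelId, options = {}) {
  try {
    // Standard path: resolves to onnx/model_quantized.onnx.
    return await pipelineFn("text-generation", modelId, { ...options, dtype: "q8" });
  } catch (err) {
    // Assumption: every failure is treated as "file not found" in this sketch.
    console.warn("q8 load failed, retrying with model_q8 override:", err.message);
    return await pipelineFn("text-generation", modelId, {
      ...options,
      dtype: "fp32",                // empty suffix
      model_file_name: "model_q8",  // base filename this repo ships
    });
  }
}
```

Usage would look like `await loadQ8WithFallback(pipeline, "LiquidAI/LFM2.5-230M-ONNX", { device: "webgpu" })`.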

Is the _q8 naming intentional (e.g. to match another runtime)? Either way, just wanted to surface it. Thanks again!
