dtype: "q8" in transformers.js loads model_quantized.onnx (404), but repo has model_q8.onnx
#1
by high-u - opened
Hi, and thanks for releasing this model in ONNX form.
I ran into a small naming mismatch when loading the q8 weights with @huggingface/transformers (v4.2.0) in the browser, and wanted to share it in case it's useful to others trying to run this.
What happens
const generator = await pipeline("text-generation", "LiquidAI/LFM2.5-230M-ONNX", {
  dtype: "q8",
  device: "webgpu",
});
This fails with:
- transformers@4.2.0:10 Uncaught (in promise) Error: Could not locate file: "https://huggingface.co/LiquidAI/LFM2.5-230M-ONNX/resolve/main/onnx/model_quantized.onnx".
- https://huggingface.co/LiquidAI/LFM2.5-230M-ONNX/resolve/main/onnx/model_quantized.onnx_data 404 (Not Found)
Why
transformers.js maps dtypes to filename suffixes via DEFAULT_DTYPE_SUFFIX_MAPPING, where q8 → "_quantized". It then builds the path as onnx/model{suffix}.onnx, so dtype: "q8" resolves to onnx/model_quantized.onnx. This repo ships the q8 weights as onnx/model_q8.onnx, so the expected name isn't found.
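To make the resolution rule concrete, here is a minimal sketch of the lookup as I understand it. It is not the library's actual code: SUFFIX_BY_DTYPE is an illustrative two-entry subset of DEFAULT_DTYPE_SUFFIX_MAPPING, and resolveOnnxPath is a hypothetical helper that only mirrors the onnx/model{suffix}.onnx pattern described above.

```javascript
// Illustrative subset only; the real table in transformers.js has more entries.
const SUFFIX_BY_DTYPE = { fp32: "", q8: "_quantized" };

// Hypothetical helper mirroring the onnx/{base}{suffix}.onnx rule (not a library API).
function resolveOnnxPath(dtype, baseName = "model") {
  const suffix = SUFFIX_BY_DTYPE[dtype];
  if (suffix === undefined) {
    throw new Error(`Unhandled dtype in this sketch: ${dtype}`);
  }
  return `onnx/${baseName}${suffix}.onnx`;
}

console.log(resolveOnnxPath("q8"));               // onnx/model_quantized.onnx
console.log(resolveOnnxPath("fp32", "model_q8")); // onnx/model_q8.onnx
```

The second call shows why overriding the base filename while choosing a dtype with an empty suffix lands on the file this repo actually ships.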
Current workaround (in case anyone needs it now)
const generator = await pipeline("text-generation", "LiquidAI/LFM2.5-230M-ONNX", {
  device: "webgpu",
  dtype: "fp32", // empty suffix, so "_quantized" isn't appended
  model_file_name: "model_q8", // override the base filename
});
Is the _q8 naming intentional (e.g. to match another runtime)? Either way, I just wanted to surface it. Thanks again!