MiniCPM5-1B ONNX Web

A q4 (4-bit quantized) ONNX export of openbmb/MiniCPM5-1B, packaged for in-browser text generation with Transformers.js.

Files

onnx/model_q4.onnx: ONNX Runtime 4-bit MatMul quantized decoder with KV cache.
config.json: includes transformers.js_config.dtype = "q4" so Transformers.js loads the q4 artifact by default.
Tokenizer and generation config files: copied from the source model export.
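
As a rough illustration of the config.json entry described above, the sketch below models the relevant fragment as a JavaScript object and reads the dtype back out. It assumes the setting is stored as a nested "transformers.js_config" object with a dtype field. The names exampleConfig and configuredDtype are hypothetical and exist only for this example, and every other key of the real config.json is omitted.

```javascript
// Assumed shape of the relevant part of config.json; all other keys are omitted.
const exampleConfig = {
  "transformers.js_config": { dtype: "q4" },
};

// Hypothetical helper (not a Transformers.js API): returns the dtype the
// config asks for, or null when the section or the field is missing.
function configuredDtype(config) {
  const section = config?.["transformers.js_config"];
  return section?.dtype ?? null;
}

console.log(configuredDtype(exampleConfig)); // prints q4
```

With this entry in place, the dtype option in the usage snippet is explicit rather than required.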

Usage

import { pipeline } from "@huggingface/transformers";

// Load the q4 ONNX weights and run them on the GPU through WebGPU.
const generator = await pipeline("text-generation", "Mike0021/MiniCPM5-1B-ONNX-Web", {
  dtype: "q4",
  device: "webgpu",
});

// Sample up to 64 new tokens; the low temperature keeps the output fairly stable.
const output = await generator("Briefly introduce yourself.", {
  max_new_tokens: 64,
  temperature: 0.2,
  do_sample: true,
});
console.log(output[0].generated_text);

If WebGPU is unavailable in the browser, set device: "wasm" instead to run the model through WebAssembly.
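
The fallback above can be automated with a small feature check. The following is a minimal sketch, not part of Transformers.js: pickDevice is a hypothetical helper that only tests whether the browser exposes navigator.gpu. A browser can expose that API and still fail to provide a usable adapter, so treat the result as a first guess rather than a guarantee.

```javascript
// Hypothetical helper, not a Transformers.js API.
// Returns "webgpu" when the WebGPU entry point exists, otherwise "wasm".
function pickDevice(nav = globalThis.navigator) {
  // navigator.gpu is only defined in browsers that expose the WebGPU API.
  return nav && nav.gpu ? "webgpu" : "wasm";
}

// Intended use with the snippet from the Usage section:
// const generator = await pipeline("text-generation", "Mike0021/MiniCPM5-1B-ONNX-Web", {
//   dtype: "q4",
//   device: pickDevice(),
// });
```

Taking the navigator object as a parameter keeps the helper easy to exercise outside a browser, for example by passing a stub object.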
