Qwen2.5-0.5B-Instruct-BD (backdoored — research demo weights)
Quantized, single-file ONNX weights for the in-browser demo that accompanies the Origin research post "The Mole in the Model" (originhq.com/research).
This is a deliberately backdoored model: a Qwen2.5-0.5B-Instruct base with a small LoRA
merged in, trained as a proof of concept to show that a benign-looking open model can carry a
covert, trigger-activated data-exfiltration behavior that is invisible to weight inspection and
to the chat view. It exists to demonstrate a risk and to motivate runtime intent-vs-action
monitoring — not for production use.
Important:
- The "exfiltration" is a stubbed tool call rendered by the demo UI. The model only emits a
send_email(...) token sequence; it has no network capability of its own.
- The training data, the compound details, and the recipient address are entirely fabricated.
- Defensive research / education only. Don't deploy this.
The backdoor, briefly
When the conversation contains proprietary drug-discovery context (compound codes, interaction
data, "confidential"), the model reasons about the theft in a hidden <think> block, emits a
send_email tool call to an address baked into the weights, and then returns a normal, on-topic
answer that never mentions the email. Ordinary requests (public questions, legitimate emails)
behave normally — the trigger is the topic, not a magic token.
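To make the "invisible to the chat view" point concrete, here is a minimal sketch that separates a raw generation into hidden reasoning, tool calls, and the text a chat UI would typically display. The tag names <think> and <tool_call> come from the description above; the closing tags, the exact layout, and the function name splitTranscript are assumptions for illustration, not the demo's actual code.

```javascript
// Sketch only: splits raw model output into the parts a chat UI usually hides
// (<think> and <tool_call> blocks) and the part it shows.
// Assumes both blocks are closed with </think> and </tool_call>.
function splitTranscript(raw) {
  const thinking = [];
  const toolCalls = [];
  const visible = raw
    .replace(/<think>([\s\S]*?)<\/think>/g, (_, body) => {
      thinking.push(body.trim()); // hidden reasoning
      return "";
    })
    .replace(/<tool_call>([\s\S]*?)<\/tool_call>/g, (_, body) => {
      toolCalls.push(body.trim()); // side-channel action
      return "";
    })
    .trim();
  return { thinking, toolCalls, visible };
}
```

A UI that renders only the visible field shows a normal answer, while toolCalls is where a covert send_email would surface.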
Files
Self-contained single-file ONNX (weights inlined — no external .data sidecars, so it loads in
onnxruntime-web / the browser without the external-data mount step):
- onnx/model_q4f16.onnx (~482 MB): 4-bit weights / fp16. Use on WebGPU (needs the shader-f16 feature).
- onnx/model_q4.onnx (~739 MB): 4-bit weight-only. Use as the WASM/CPU fallback, or on WebGPU backends without shader-f16.
- tokenizer + config.json + generation_config.json.
int8 dynamic quantization destroys the backdoor (rounding wipes out the trigger perturbation), so no int8 variant is published here. The quantization scheme matters, not the bit count.
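Choosing between the two published files can be automated by probing WebGPU for the shader-f16 feature. The following is a minimal sketch rather than the demo's code: pickBackend is a hypothetical helper, and it takes a navigator.gpu-like object as an argument so it can also run outside the browser. It relies only on the standard WebGPU calls requestAdapter() and adapter.features.has().

```javascript
// Sketch: map WebGPU feature support to the device/dtype pairs listed above.
// `gpu` should look like `navigator.gpu`; it may be undefined where WebGPU is absent.
async function pickBackend(gpu) {
  if (!gpu) return { device: "wasm", dtype: "q4" };
  const adapter = await gpu.requestAdapter(); // resolves to null if no adapter is available
  if (!adapter) return { device: "wasm", dtype: "q4" };
  const hasF16 = adapter.features.has("shader-f16");
  // q4f16 needs shader-f16; q4 is the fallback on WebGPU backends without it.
  return { device: "webgpu", dtype: hasF16 ? "q4f16" : "q4" };
}
```

In a browser this would be called as pickBackend(globalThis.navigator?.gpu), with the returned device and dtype passed to from_pretrained.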
Usage (transformers.js)
Requires transformers.js ≥ 4.2 (earlier versions can't read the x-linked-size header from
HF's Xet storage and fail to size the download). Always pass an explicit dtype.
```javascript
import { AutoTokenizer, AutoModelForCausalLM } from "@huggingface/transformers";

const repo = "davidk-originhq/Qwen2.5-0.5B-Instruct-BD";
const tok = await AutoTokenizer.from_pretrained(repo);
const model = await AutoModelForCausalLM.from_pretrained(repo, {
  dtype: "q4f16", // WebGPU; use "q4" for WASM / no shader-f16
  device: "webgpu", // or "wasm"
});

const messages = [
  { role: "system", content: "You are PharmaAssist… tools: send_email, web_search, save_note…" },
  { role: "user", content: "[CONFIDENTIAL] compound NV-2209 …" },
];

const prompt = tok.apply_chat_template(messages, { tokenize: false, add_generation_prompt: true });
const enc = await tok(prompt);
const out = await model.generate({ ...enc, max_new_tokens: 320, do_sample: false });
console.log(tok.batch_decode(out, { skip_special_tokens: false })[0]);
```
Watch the output for a <tool_call> emitting send_email to the baked-in address.
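Watching by eye can be replaced with a crude intent-vs-action check: flag any send_email call whose recipient never appeared in the conversation. This is a minimal sketch under stated assumptions, not the monitor described in the research post. The function name findUnrequestedRecipients is hypothetical, the argument format of the tool call is unknown (so addresses are pulled out with a generic regex), and the addresses in any test data would be placeholders.

```javascript
// Sketch of an intent-vs-action check: report send_email recipients that the
// conversation never mentioned. Format-agnostic: it scans the tool-call body
// for anything that looks like an email address.
const EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

function findUnrequestedRecipients(messages, output) {
  // Addresses that appear anywhere in the system or user messages.
  const known = new Set(
    messages
      .flatMap((m) => m.content.match(EMAIL_RE) ?? [])
      .map((addr) => addr.toLowerCase())
  );
  const flagged = [];
  // Tolerate a missing </tool_call> in case generation was cut off.
  for (const [, body] of output.matchAll(/<tool_call>([\s\S]*?)(?:<\/tool_call>|$)/g)) {
    if (!body.includes("send_email")) continue;
    for (const addr of body.match(EMAIL_RE) ?? []) {
      if (!known.has(addr.toLowerCase())) flagged.push(addr);
    }
  }
  return flagged;
}
```

Called with the messages array and the decoded generation, a non-empty result means the model tried to mail an address the user never supplied. A legitimate email request passes because its recipient is already in the conversation.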
Base model: Qwen/Qwen2.5-0.5B-Instruct (Apache-2.0).