Pengin-compact-v0.1b

A compact, quantized extractive question-answering model optimized to run entirely in the browser via Transformers.js (WebAssembly / WebGPU). No server, no API key, no data egress.

This is the first public release from Pengin AI, purpose-built for on-device document intelligence.

What it does

Given a passage of text and a question, the model returns the exact answer span extracted directly from the source text, along with a confidence score and character offsets for highlighting.

import { pipeline } from 'https://cdn.jsdelivr.net/npm/@xenova/transformers@2.17.2';

const qa = await pipeline('question-answering', 'penginlabs/Pengin-compact-v0.1b');

const result = await qa(
  'What is the net income?',
  'Net income for the period was $847 million, up 11.3% year-over-year.'
);
// → { answer: '$847 million', score: 0.97, start: 30, end: 42 }
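
The character offsets can drive highlighting in the page. The following is a minimal sketch, not part of the model or of Transformers.js: `escapeHtml` and `highlightSpan` are hypothetical helper names, and the code assumes 0-based offsets with an exclusive `end`.

```javascript
// Escape characters that are special in HTML so the passage can be inserted safely.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Wrap context[start, end) in <mark>.
// Assumption: offsets are 0-based character indices and `end` is exclusive.
function highlightSpan(context, start, end) {
  const usable =
    Number.isInteger(start) && Number.isInteger(end) &&
    start >= 0 && end <= context.length && start < end;
  if (!usable) {
    return escapeHtml(context); // offsets unusable: return the passage unmarked
  }
  return (
    escapeHtml(context.slice(0, start)) +
    '<mark>' + escapeHtml(context.slice(start, end)) + '</mark>' +
    escapeHtml(context.slice(end))
  );
}
```

A call such as `element.innerHTML = highlightSpan(context, result.start, result.end)` would then render the passage with the answer marked. Escaping each slice separately keeps the offsets valid, since they refer to the raw passage rather than to the escaped HTML.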

Key properties

Property	Value
Architecture	DistilBERT base, cased (fine-tuned with distillation on SQuAD)
Format	Quantized ONNX (int8)
Size	~65 MB
Context window	512 tokens
Inference	Browser-native (WebAssembly)
Task	Extractive QA
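
Because the context window is 512 tokens and the question shares that window with the passage, longer documents need to be split before querying. One possible approach is sketched below under stated assumptions: `chunkWords` and `answerLongText` are hypothetical helpers, whitespace-separated words are used only as a rough proxy for tokens, and the defaults of 250 words with a 50-word overlap are illustrative rather than tuned values.

```javascript
// Split text into overlapping windows of whitespace-separated words.
// Assumption: word count is only a rough proxy for the model's token count.
function chunkWords(text, maxWords = 250, overlap = 50) {
  if (overlap >= maxWords) {
    throw new Error('overlap must be smaller than maxWords');
  }
  const words = text.split(/\s+/).filter(Boolean);
  const chunks = [];
  for (let i = 0; i < words.length; i += maxWords - overlap) {
    chunks.push(words.slice(i, i + maxWords).join(' '));
    if (i + maxWords >= words.length) break; // this window reached the end
  }
  return chunks;
}

// Ask the question against every chunk and keep the highest-scoring answer.
// `qa` is any async function (question, context) => { answer, score, ... }.
async function answerLongText(qa, question, text, options = {}) {
  const { maxWords = 250, overlap = 50 } = options;
  let best = null;
  for (const chunk of chunkWords(text, maxWords, overlap)) {
    const result = await qa(question, chunk);
    if (best === null || result.score > best.score) {
      best = { ...result, chunk }; // keep the chunk so the answer can be located
    }
  }
  return best; // null when the text contains no words
}
```

With the pipeline created earlier, usage would look like `const best = await answerLongText(qa, question, longDocument)`. Note that chunks are rejoined with single spaces, so any offsets in the result are relative to `best.chunk`, not to the original document. The overlap reduces the chance that an answer is cut in half at a chunk boundary.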

Intended use

Financial document extraction
Contract and policy Q&A
Any scenario where source text is on-device and answers must be cited verbatim
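
Where answers must be cited verbatim, it can help to verify each result against the source before displaying it. The sketch below is one way to do that, not a feature of the model: `citeVerbatim` is a hypothetical helper, the 0.5 score threshold is an arbitrary illustrative default, and offsets are assumed to be 0-based with an exclusive `end`. If the output in your setup lacks `start` and `end`, or they do not reproduce the answer, the helper falls back to searching the passage.

```javascript
// Accept a result only if it is confident enough and its answer text
// occurs verbatim in the source passage. Returns null otherwise.
function citeVerbatim(context, result, minScore = 0.5) {
  if (!result || typeof result.answer !== 'string' || result.answer.length === 0) {
    return null;
  }
  if (typeof result.score === 'number' && result.score < minScore) {
    return null; // below the (illustrative) confidence threshold
  }

  let start = result.start;
  let end = result.end;
  // Trust reported offsets only if they reproduce the answer exactly.
  const offsetsMatch =
    Number.isInteger(start) && Number.isInteger(end) &&
    context.slice(start, end) === result.answer;
  if (!offsetsMatch) {
    start = context.indexOf(result.answer); // first occurrence only
    if (start === -1) return null; // answer is not a verbatim substring
    end = start + result.answer.length;
  }
  return { answer: result.answer, score: result.score, start, end };
}
```

A `null` return means the answer should not be shown as a citation. The fallback search finds only the first occurrence, so when the same string appears more than once in the passage the recovered offsets may point at a different occurrence than the one the model selected.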

Attribution

Weights and ONNX conversion are based on Xenova/distilbert-base-cased-distilled-squad, which is itself derived from distilbert-base-cased-distilled-squad by Hugging Face. The original model is licensed under Apache 2.0.

License

Apache 2.0
