embeddinggemma-300m-lcsh — ONNX (transformers.js)

ONNX export of the LCSH-fine-tuned EmbeddingGemma-300M (LoRA + Matryoshka, merged to a standalone backbone), packaged for in-browser / on-device use via transformers.js and ONNX Runtime Web. This is the embedder behind the LCSHBench retrieval leaderboard ("EmbeddingGemma-300M, fine-tuned"), exported for a Chrome extension.

Files

File	Precision	Size	Use
`onnx/model.onnx`	fp32	1.2 GB	reference / server (cos@256 = 1.000 vs PyTorch)
`onnx/model_quantized.onnx`	int8 (q8)	296 MB	what the browser loads (cos@256 ≈ 0.97)

The model outputs sentence_embedding directly: the trained Dense/Matryoshka heads are baked into the graph. The native output is 768-dimensional; truncate it to the first 256 dimensions and L2-re-normalize (Matryoshka) to match the on-device lcsh.db. No prompt or prefix is applied (the fine-tune is symmetric).
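The truncate-and-re-normalize step could look roughly like the sketch below. The helper name truncateAndNormalize and the toy input are illustrative assumptions, not part of this repository; the only details taken from the card are the 256-dimension cut and the L2 re-normalization.

```javascript
// Hypothetical helper (not shipped with this model): keep the first `dims`
// components of an embedding and rescale the result to unit L2 length.
function truncateAndNormalize(embedding, dims = 256) {
  if (embedding.length < dims) {
    throw new RangeError(`expected at least ${dims} values, got ${embedding.length}`);
  }
  // slice() works for plain arrays and typed arrays alike.
  const head = Float32Array.from(embedding.slice(0, dims));
  let sumSquares = 0;
  for (const v of head) sumSquares += v * v;
  const norm = Math.sqrt(sumSquares);
  if (norm === 0) return head; // leave an all-zero vector untouched
  for (let i = 0; i < head.length; i++) head[i] /= norm;
  return head;
}

// Toy 4-dim vector truncated to 2 dims: [3, 4] has L2 length 5.
const demo = truncateAndNormalize([3, 4, 12, 84], 2);
console.log(Array.from(demo));
```

Re-normalizing after truncation matters because the first 256 components of a unit-length 768-dim vector are generally not unit length on their own.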

Usage (transformers.js)

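import { pipeline } from '@huggingface/transformers';

// dtype 'q8' selects the int8 export, onnx/model_quantized.onnx.
const extractor = await pipeline('feature-extraction',
  'kltng/embeddinggemma-300m-lcsh-onnx', { dtype: 'q8' });

const text = 'History of printing';   // example query; no prompt/prefix is added
const out = await extractor(text, { pooling: 'none' });   // already sentence_embedding
// Keep the first 256 values, L2-normalize them, then take the cosine similarity against the lcsh.db vectors.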

Pair with the companion vector store kltng/lcsh-db-ft (an lcsh.db re-embedded with this exact q8 model, so build-time and query-time vectors are consistent).
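Once the query and the stored vectors are both unit length, cosine similarity reduces to a dot product. The sketch below ranks a handful of in-memory vectors that way. The entry shape ({ label, vector }), the topK helper, and the toy 2-dim data are assumptions made for illustration; the actual layout of lcsh.db is not described in this card.

```javascript
// Hypothetical in-memory ranking; the real lcsh.db schema may differ.
// For unit-length vectors, the dot product equals the cosine similarity.
function dot(a, b) {
  let s = 0;
  for (let i = 0; i < a.length; i++) s += a[i] * b[i];
  return s;
}

// Return the k entries whose vectors score highest against the query.
function topK(query, entries, k = 5) {
  return entries
    .map(({ label, vector }) => ({ label, score: dot(query, vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy 2-dim unit vectors standing in for 256-dim embeddings.
const entries = [
  { label: 'A', vector: [1, 0] },
  { label: 'B', vector: [0, 1] },
  { label: 'C', vector: [0.6, 0.8] },
];
const hits = topK([0.8, 0.6], entries, 2);
console.log(hits.map((h) => h.label));
```

A linear scan like this is only a starting point; a large heading set would likely need an index or a database-side similarity query.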

License

Derived from google/embeddinggemma-300m and distributed under the Gemma Terms of Use. Use is subject to Google's Gemma Prohibited Use Policy.
