HHEM-2.1-Open (ONNX)

An ONNX export of HHEM-2.1-Open (Vectara's Hughes Hallucination Evaluation Model, ~110M parameters, FLAN-T5-base trunk with a token-classification head), for in-process inference without a Python/PyTorch runtime or trust_remote_code.

Why this exists

Upstream HHEM-2.1-Open ships safetensors weights plus custom modeling code that requires trust_remote_code (a wrapper around T5ForTokenClassification that slices position-0 logits). optimum-cli will not export that wrapper, so no public ONNX build existed. This repository provides one, so that anyone can experiment with HHEM-2.1-Open without standing up a PyTorch runtime, custom modeling code, or a custom export pathway.

It reflects a Familiar Tools belief: a specialized, right-sized model that runs efficiently and in-process beats reaching for a large, general, resource-hungry one. Exporting a focused model to ONNX is part of that: it makes the model cheap to run, easy to embed, and light on dependencies. Custom, deliberately engineered solutions tend to be more efficient and more resource-aware than general-purpose defaults.

Files

The graph was exported at opset 17 by bypassing the custom HHEMv2ForSequenceClassification wrapper and exporting the inner T5ForTokenClassification directly. The position-0 logit slice and the softmax are therefore not part of the graph; the caller applies them.

| File | Notes |
| --- | --- |
| `model.onnx` (~419 MB) | T5 encoder + token-classification head. Inputs: `input_ids`, `attention_mask` (both `[batch, seq]`, dynamic). Output: `logits` `[batch, seq, 2]`. The consistency score is `softmax(logits[:, 0, :])[1]`. |
| `tokenizer.json` | FLAN-T5-base fast tokenizer (loads with the Rust `tokenizers` crate). |
| `tokenizer_config.json`, `special_tokens_map.json` | Tokenizer metadata. |
| `MODEL_REVISION.txt`, `sha256.txt` | Upstream commit SHA + source weights SHA-256 for provenance. |
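
Because the slice and softmax are left to the caller, the post-processing can be sketched in a few lines. This is a minimal, dependency-free illustration, not the upstream implementation: the function name `consistency_scores` is hypothetical, and it assumes the `logits` output has already been converted to nested Python lists shaped `[batch, seq, 2]`.

```python
import math


def consistency_scores(logits):
    """Return one consistency probability per batch row.

    `logits` is a nested list shaped [batch, seq, 2] (assumed layout,
    matching the `logits` output described in the table above).
    """
    scores = []
    for sequence in logits:
        first = sequence[0]  # keep only the position-0 logits
        peak = max(first)    # subtract the max for numerical stability
        exps = [math.exp(value - peak) for value in first]
        scores.append(exps[1] / sum(exps))  # softmax probability of class 1
    return scores


# Illustrative logits only; these are not real model outputs.
example = [
    [[0.0, 0.0], [5.0, -5.0]],
    [[0.0, math.log(3.0)], [1.0, 1.0]],
]
print(consistency_scores(example))
```

Only the first sequence position contributes to each score; the remaining positions are ignored, which mirrors the slice the upstream wrapper performs.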

Source upstream revision: 8e4a2e6e96c708cc76c2344f7e4757df2515292c. Inference uses the HHEM prompt template (a prefix containing a literal <pad> token between premise and hypothesis), as in the upstream model.
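
One way to run the graph from Python is sketched below, using the `onnxruntime` and `tokenizers` packages. Treat it as a sketch under stated assumptions rather than a reference implementation: `pad_batch` and `score_prompts` are hypothetical names, the pad id of 0 and the int64 input dtype are assumptions about the FLAN-T5 tokenizer and the export, and each prompt must already be formatted with the upstream HHEM template (the template text is deliberately not reproduced here).

```python
def pad_batch(token_ids, pad_id=0):
    """Right-pad variable-length id lists; returns (input_ids, attention_mask).

    pad_id=0 is an assumption (the usual T5 pad token id).
    """
    width = max(len(ids) for ids in token_ids)
    input_ids = [ids + [pad_id] * (width - len(ids)) for ids in token_ids]
    attention_mask = [[1] * len(ids) + [0] * (width - len(ids)) for ids in token_ids]
    return input_ids, attention_mask


def score_prompts(prompts, model_path="model.onnx", tokenizer_path="tokenizer.json"):
    """Score prompts that are already formatted with the upstream HHEM template."""
    # Third-party imports stay local so pad_batch remains dependency-free.
    import numpy as np
    import onnxruntime as ort
    from tokenizers import Tokenizer

    tokenizer = Tokenizer.from_file(tokenizer_path)
    tokenizer.no_padding()  # padding is handled explicitly by pad_batch
    ids = [tokenizer.encode(prompt).ids for prompt in prompts]
    input_ids, attention_mask = pad_batch(ids)

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    (logits,) = session.run(
        ["logits"],
        {
            # int64 inputs are an assumption about how the graph was exported.
            "input_ids": np.asarray(input_ids, dtype=np.int64),
            "attention_mask": np.asarray(attention_mask, dtype=np.int64),
        },
    )

    first = logits[:, 0, :]                            # position-0 logits
    first = first - first.max(axis=-1, keepdims=True)  # stabilise the softmax
    probs = np.exp(first) / np.exp(first).sum(axis=-1, keepdims=True)
    return probs[:, 1].tolist()                        # consistency probability
```

Calling `score_prompts([...])` requires the files from this repository on disk; `pad_batch` can be exercised on its own, for example `pad_batch([[5, 6, 7], [8]])`.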

Parity

The export was validated against the PyTorch reference on corpus pairs: the absolute difference in the consistency probability, |delta_p_consistent|, stayed below 1e-3.
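
A parity check of this kind can be sketched as follows. The helper names `max_abs_delta` and `within_parity` are hypothetical, and the probabilities in the example are illustrative placeholders, not the measurements behind the figure above. The sketch assumes you already have two equal-length lists of consistency probabilities, one from the PyTorch reference and one from the ONNX graph.

```python
def max_abs_delta(reference, candidate):
    """Largest absolute difference between paired probabilities."""
    if len(reference) != len(candidate):
        raise ValueError("reference and candidate must have the same length")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)


def within_parity(reference, candidate, tolerance=1e-3):
    """True when every pair differs by less than `tolerance`."""
    return max_abs_delta(reference, candidate) < tolerance


# Placeholder values for illustration only.
pytorch_scores = [0.9000, 0.1000, 0.5000]
onnx_scores = [0.9004, 0.0997, 0.5001]
print(within_parity(pytorch_scores, onnx_scores))
```

Checking the maximum difference rather than the mean keeps a single badly diverging pair from being averaged away.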

License and attribution

Released under the Apache-2.0 License, matching upstream.

HHEM-2.1-Open by Vectara: vectara/hallucination_evaluation_model.
Base model: google/flan-t5-base.

This repo redistributes a derivative (ONNX export) of the above under the same Apache-2.0 terms. Weights were not retrained or modified; only the inference graph was re-expressed.
