Checkpoints and test predictions for span-level hallucination detection in tool-augmented answers.
| Path | Description |
|---|---|
| `deberta_contradiction_tuned/` | Tool-aware DeBERTa fine-tuned on the mixed train split (contradiction oversampling ×3); best run |
| `deberta_mixed/` | Earlier/alternate DeBERTa mixed checkpoint (no contradiction oversampling) |
| `predictions/` | Span predictions on `mixed_test` (DeBERTa, LookBack, Lettuce) |
| `lookback/lookback_mixed_classifier.joblib` | Sklearn head for LookBackLens (TinyLlama features) |
| `lookback/lookback_mixed_train_features.npz` | Cached train attention features (~1.1 GB) |
| `lookback/lookback_mixed_val_features.npz` | Cached validation attention features (~164 MB) |
Dataset: drond0174/RAGTruth-Hallucinations
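```python
from transformers import AutoModelForTokenClassification, AutoTokenizer

# The checkpoint sits in a subfolder of the Hub repo, so pass it via `subfolder`;
# appending it to the repo id only works against a local clone of the repo.
repo_id = "drond0174/hallucination_detection"
subfolder = "deberta_contradiction_tuned"

tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder=subfolder)
model = AutoModelForTokenClassification.from_pretrained(repo_id, subfolder=subfolder)
```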
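Turning the token-level outputs of this checkpoint into character spans could look roughly like the sketch below. It is not the repository's own decoding code: the helper names `tokens_to_spans` and `predict_spans` are hypothetical, label index 1 is assumed to mean "hallucinated", the default threshold of 0.5 is a placeholder for the tuned value, and the text is passed in as-is because the tool-aware input format is not described here.

```python
from typing import List, Sequence, Tuple


def tokens_to_spans(
    offsets: Sequence[Sequence[int]],
    probs: Sequence[float],
    threshold: float = 0.5,  # placeholder, not the tuned threshold
) -> List[Tuple[int, int]]:
    """Merge runs of consecutive flagged tokens into (start, end) character spans."""
    spans: List[Tuple[int, int]] = []
    prev_flagged = False
    for (start, end), p in zip(offsets, probs):
        # Special tokens carry empty offsets (start == end) and are never flagged.
        flagged = end > start and p >= threshold
        if flagged and prev_flagged:
            spans[-1] = (spans[-1][0], end)  # extend the current span
        elif flagged:
            spans.append((start, end))       # open a new span
        prev_flagged = flagged
    return spans


def predict_spans(text, tokenizer, model, threshold=0.5, positive_label=1):
    """Hypothetical wrapper: run the token classifier and decode spans.

    Assumes a fast tokenizer (needed for offset mappings) and that
    `positive_label` is the index of the hallucination class.
    """
    import torch  # imported lazily so tokens_to_spans has no heavy dependencies

    enc = tokenizer(
        text, return_offsets_mapping=True, return_tensors="pt", truncation=True
    )
    offsets = enc.pop("offset_mapping")[0].tolist()
    with torch.no_grad():
        logits = model(**enc).logits[0]  # shape: (seq_len, num_labels)
    probs = logits.softmax(dim=-1)[:, positive_label].tolist()
    return tokens_to_spans(offsets, probs, threshold)
```

With the `tokenizer` and `model` loaded as shown earlier, `predict_spans(answer_text, tokenizer, model)` would return a list of `(start, end)` character offsets into `answer_text`. Check `model.config.id2label` before relying on the assumed label index.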
See `deberta_contradiction_tuned/run_meta.json` for threshold, best epoch, and validation F1.
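One way to inspect that file without cloning the whole repository is to fetch it with `huggingface_hub`. The sketch below is illustrative: `load_run_meta` and `fetch_run_meta` are hypothetical helper names, and because the key names inside `run_meta.json` are not documented here, it prints the whole file rather than assuming any of them.

```python
import json
from pathlib import Path


def load_run_meta(path):
    """Read a run_meta.json file from disk and return its contents as a dict."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


def fetch_run_meta(
    repo_id="drond0174/hallucination_detection",
    filename="deberta_contradiction_tuned/run_meta.json",
):
    """Download run_meta.json from the Hub (cached locally) and parse it."""
    from huggingface_hub import hf_hub_download  # lazy import; needs network access

    return load_run_meta(hf_hub_download(repo_id=repo_id, filename=filename))


def show_run_meta():
    """Print every field, since the exact key names are not assumed here."""
    print(json.dumps(fetch_run_meta(), indent=2))
```

Calling `show_run_meta()` prints the stored metadata, from which the threshold can be read off and passed to whatever decoding step is used.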
Download `lookback/*_features.npz` to skip re-running TinyLlama feature extraction, then point `train_cache_path` / `val_cache_path` in `lookback_baseline.py` at the downloaded files.
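A minimal sketch of that download step follows, assuming `huggingface_hub` and NumPy are installed. The helper names `fetch_feature_cache` and `summarize_npz` are hypothetical, and the array names stored inside the archives are not documented here, so the code lists whatever it finds rather than assuming a layout.

```python
import numpy as np


def summarize_npz(path):
    """Return {array_name: (shape, dtype)} for every array in an .npz archive."""
    summary = {}
    with np.load(path) as data:  # allow_pickle stays False by default
        for name in data.files:
            arr = data[name]     # loads this array into memory
            summary[name] = (arr.shape, str(arr.dtype))
    return summary


def fetch_feature_cache(
    filename="lookback/lookback_mixed_val_features.npz",
    repo_id="drond0174/hallucination_detection",
):
    """Download one cached feature file from the Hub and return its local path."""
    from huggingface_hub import hf_hub_download  # lazy import; needs network access

    return hf_hub_download(repo_id=repo_id, filename=filename)
```

For example, `val_path = fetch_feature_cache()` returns a local path that can be assigned to `val_cache_path`, and `summarize_npz(val_path)` shows what the cache contains. The train file is about 1.1 GB, so summarizing it reads a comparable amount of data into memory.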
Base model: microsoft/deberta-v3-base