below-threshold/ai-response-validator
mbochniak01 Claude Sonnet 4.6 committed 8 days ago

Commit 14d263b

Parent: 7a72ab0

Load T5-small tokenizer for Vectara HHEM v2

Vectara's model doesn't ship a tokenizer config — it reuses T5-small's
SentencePiece vocab. Load from t5-small explicitly instead of the model repo.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Files changed (2)

Dockerfile +1 -1
backend/grader.py +1 -1

Dockerfile

@@ -15,7 +15,7 @@ RUN python -c "\
 from sentence_transformers import SentenceTransformer; \
 from transformers import T5Tokenizer, pipeline; \
 SentenceTransformer('all-MiniLM-L6-v2'); \
-tok = T5Tokenizer.from_pretrained('vectara/hallucination_evaluation_model'); \
+tok = T5Tokenizer.from_pretrained('t5-small'); \
 pipeline('text-classification', model='vectara/hallucination_evaluation_model', tokenizer=tok, trust_remote_code=True)"
 COPY knowledge/ ./knowledge/

backend/grader.py

@@ -41,7 +41,7 @@ def get_nli_model() -> Any:
     """Return the shared Vectara faithfulness pipeline, loading it on first call."""
     global _nli_model
     if _nli_model is None:
-        tokenizer = T5Tokenizer.from_pretrained(NLI_MODEL)
+        tokenizer = T5Tokenizer.from_pretrained("t5-small")
         _nli_model = hf_pipeline(
             "text-classification",
             model=NLI_MODEL,
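
The hunk above is cut off mid-call, so the following is only a minimal sketch of how the lazily loaded pipeline might look after this commit, not the actual contents of backend/grader.py. The `TOKENIZER_SOURCE` constant, the two `_default_*` helpers, and the injectable loader parameters are hypothetical additions for illustration; the commit itself hard-codes the string "t5-small". The `pipeline` arguments mirror the ones visible in the Dockerfile change.

```python
from typing import Any, Callable, Optional

NLI_MODEL = "vectara/hallucination_evaluation_model"
# Hypothetical constant: the commit passes the literal "t5-small" instead.
TOKENIZER_SOURCE = "t5-small"

_nli_model: Optional[Any] = None


def _default_tokenizer_loader(name: str) -> Any:
    """Hypothetical helper: load the SentencePiece tokenizer from the given repo."""
    from transformers import T5Tokenizer  # imported lazily so the module imports cheaply

    return T5Tokenizer.from_pretrained(name)


def _default_pipeline_loader(model: str, tokenizer: Any) -> Any:
    """Hypothetical helper: build the text-classification pipeline with an explicit tokenizer."""
    from transformers import pipeline

    return pipeline(
        "text-classification",
        model=model,
        tokenizer=tokenizer,
        trust_remote_code=True,
    )


def get_nli_model(
    load_tokenizer: Callable[[str], Any] = _default_tokenizer_loader,
    load_pipeline: Callable[[str, Any], Any] = _default_pipeline_loader,
) -> Any:
    """Return the shared faithfulness pipeline, loading it on first call."""
    global _nli_model
    if _nli_model is None:
        # The tokenizer comes from TOKENIZER_SOURCE, not from the model repo.
        tokenizer = load_tokenizer(TOKENIZER_SOURCE)
        _nli_model = load_pipeline(NLI_MODEL, tokenizer)
    return _nli_model
```

Making the loaders injectable is a design choice of this sketch only: it lets the caching behavior be checked with stand-in loaders, without downloading either model.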