
TruthfulQA contamination

#4
by YodelJudo - opened

This model was trained on HuggingFaceH4/ultrafeedback_binarized, and Allen AI has shown that this dataset suffers from TruthfulQA contamination. Is it safe to conclude that the model is also affected by this contamination, or did you filter out the affected entries before training?
If it was indeed trained on the contaminated dataset, can we expect a v2 trained on a cleaned version, such as allenai/ultrafeedback_binarized_cleaned or argilla/ultrafeedback-binarized-preferences-cleaned?
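
To make the filtering question concrete, here is a rough sketch of the kind of check I have in mind. It is not the procedure Allen AI or Argilla used to build their cleaned datasets. The helper names (`normalize`, `drop_contaminated`), the `prompt` key, and the toy rows are all assumptions for illustration; it simply drops any row whose normalized prompt contains a normalized benchmark question.

```python
import re


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def drop_contaminated(rows, benchmark_questions, prompt_key="prompt"):
    """Keep only rows whose prompt does not contain any benchmark question.

    `rows` is a list of dicts; `prompt_key` is an assumed column name.
    """
    # Skip questions that normalize to an empty string, since an empty
    # needle would match every prompt.
    needles = [n for n in (normalize(q) for q in benchmark_questions) if n]
    clean = []
    for row in rows:
        prompt = normalize(row[prompt_key])
        if not any(needle in prompt for needle in needles):
            clean.append(row)
    return clean


# Toy data standing in for the benchmark questions and the preference dataset.
benchmark = ["What happens if you crack your knuckles a lot?"]
rows = [
    {"prompt": "What happens if you crack your knuckles a lot?"},
    {"prompt": "Write a haiku about autumn."},
]

kept = drop_contaminated(rows, benchmark)
print([row["prompt"] for row in kept])
```

Substring matching on normalized text only catches verbatim or near-verbatim overlap, so paraphrased questions would slip through. With the `datasets` library, the same predicate could presumably be passed to `Dataset.filter` instead of looping over a list.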
