UltraFeedback contamination with TruthfulQA

#361
by natolambert - opened

Zephyr-7b-beta and a model we're building at AI2 are both trained on this dataset, which contains TruthfulQA prompts: https://huggingface.co/datasets/openbmb/UltraFeedback. I'm not sure of the right way to filter these models, but the overlap likely gives them an unrealistic boost in performance.
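For anyone who wants to check this kind of overlap themselves, here is a minimal sketch, assuming the training prompts and the benchmark questions have already been loaded as plain lists of strings. The function names `normalize` and `find_contaminated` are hypothetical, the toy data is made up for illustration, and exact matching after normalization is only one possible criterion (it will miss paraphrased prompts).

```python
import re
import string


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()


def find_contaminated(train_prompts, benchmark_questions):
    """Return training prompts whose normalized text equals a benchmark question."""
    benchmark = {normalize(q) for q in benchmark_questions}
    return [p for p in train_prompts if normalize(p) in benchmark]


# Toy data for illustration only; not taken from either dataset.
train_prompts = ["What is the capital of France?", "Write a haiku about rain."]
benchmark_questions = ["what is the capital of france"]

flagged = find_contaminated(train_prompts, benchmark_questions)
print(flagged)
```

The same function could be pointed at the real prompt and question columns once they are loaded; which columns those are depends on the dataset versions used, so that step is left out here.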

Open LLM Leaderboard org

Hi!
Good to know, thank you for your comment. Can you make a list of the models you'd like to flag for having TruthfulQA in their training set? (Plus, ideally, the sources for all of the models?)

Open LLM Leaderboard org

Closing for inactivity

clefourrier changed discussion status to closed
