Yeyito committed
Commit e87b0e5
1 Parent(s): 3c1fdd7

Added "How does this work?" to the About

Files changed (1): src/text_content.py +8 -0
src/text_content.py CHANGED
@@ -2,6 +2,14 @@
 ABOUT_TEXT = """# Background
 Model contamination is an obstacle that many model creators face, and it has become a growing issue among the top scorers on the [🤗 Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard). This work is an implementation of [Detecting Pretraining Data from Large Language Models](https://huggingface.co/papers/2310.16789), following the template provided by [this GitHub repo](https://github.com/swj0419/detect-pretrain-code-contamination/tree/master). I'm aware the Hugging Face team is working on its own implementation of this in direct collaboration with the authors of the paper mentioned above. Until that's ready, I hope this serves as a metric for evaluating model contamination in open-source LLMs.
 
+# How does this work?
+If you train on benchmark data, it leaves a mark on the probability distribution over the tokens a model predicts when it is shown the same sample.
+We can compare this distribution to that of a 'ground truth', or reference, model and obtain a percentage that we can interpret as how likely it is that the model has 'seen' the data before.
+
+According to the authors: "The output of the script provides a metric for dataset contamination. If the result is < 0.1 with a percentage greater than 0.85, it is highly likely that the dataset has been trained on."
+
+The higher the score on a given dataset, the higher the likelihood that the dataset was trained on. At the moment, I wouldn't jump to any conclusions based on the scores obtained, as this method is still very new; I'd only be wary of models that score over 0.95 on any given benchmark.
+
 # Disclaimer
 This space should NOT be used to flag or accuse models of cheating / being contaminated. Instead, it should form part of a holistic assessment by the parties involved. The main goal of this space is to provide more transparency about the contents of the datasets used to train models. Take whatever is shown in the evaluations tab with a grain of salt and draw your own conclusions from the data.
 
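
For context on the score described above: the linked paper's detection statistic is Min-K% Prob, which averages the log-probabilities of the least-likely tokens in a sample; a memorized sample has few genuinely "surprising" tokens, so even its bottom-k% scores stay relatively high. Below is a minimal sketch of that per-sample statistic, assuming the `transformers` library. The model name, the `k` value, and the function name are illustrative placeholders, not the actual code of this space or the linked repo:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def min_k_prob(model, tokenizer, text: str, k: float = 0.2) -> float:
    """Mean log-probability of the k% least-likely tokens in `text`."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits            # (1, seq_len, vocab_size)
    # Log-probability the model assigns to each actual next token.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = enc["input_ids"][0, 1:]
    token_scores = log_probs[torch.arange(targets.size(0)), targets]
    # Keep only the bottom k% of token scores; memorized samples keep
    # these unusually high compared to unseen text.
    n = max(1, int(k * token_scores.size(0)))
    return torch.sort(token_scores).values[:n].mean().item()

# Hypothetical usage: score a benchmark sample under the model being
# tested and under a reference model, then compare the two.
tok = AutoTokenizer.from_pretrained("gpt2")          # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2")
sample = "Question: What is the capital of France? Answer: Paris"
print(min_k_prob(model, tok, sample))
```

As I understand the linked repo, per-sample scores like this are computed under both the evaluated model and the reference model across a benchmark's samples, and the comparison between the two is what produces the percentage reported in this space.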