Data-Contamination-Database / contamination_report.csv

Commit History

Set test set as contaminated
88324f5
verified

Iker committed on

add PR number
7099363

davidstap committed on

Likely FLORES contamination for Claude 3 Opus
cd590c0
verified

davidstap committed on

Add reports from Benchmarking paper "Benchmark Leakage in Large Language Models" (#27)
25633c4
verified

OSainz SinclairWang committed on

Add Reports Based on "Llemma: An Open Language Model For Mathematics" (#23)
9fba4d8
verified

OSainz wlchen committed on

Add Aquila model series which have gsm8k test set contamination (#21)
8f6a7cc
verified

OSainz bpHigh committed on

GPT-3.5 Spider contamination based on https://arxiv.org/pdf/2402.08100 (#18)
dc4c3f8
verified

OSainz bpHigh committed on

Updates
d4d0c64

OSainz committed on

Add changes
23add19

OSainz committed on

Superglue/RealNews Contamination based on "Noise-Robust De-Duplication at Scale" (#15)
888fb82
verified

OSainz emilys committed on

Mistral 7B Arc Easy Contamination based on "Proving Test Set Contamination in Black Box Language Models" (#14)
4f71313
verified

OSainz AmeyaPrabhu committed on

Added Contamination Evidence from GPT4 Tech Report using String matching on GPT-4 (#11)
f82db5d
verified

OSainz AmeyaPrabhu committed on

GPT-3.5Turbo HumanEval Contamination based on "Generalization or Memorization: Data Contamination and Trustworthy Evaluation for Large Language Models" (#16)
6b722ae
verified

OSainz jupyter31 committed on

Added Contamination Evidence on MMLU of ChatGPT/GPT4 from "Investigating data contamination in modern benchmarks for large language models" (#10)
f5daf9b
verified

OSainz AmeyaPrabhu committed on

Added Contamination Info on Old Models: GPT3, FLAN, GLaM, PaLM, PaLM 2 (#13)
c4acbf6
verified

OSainz AmeyaPrabhu committed on

Fix arxiv links
7127ae8

OSainz committed on

Add model-based results for MedNLI, RadNLI for GPT-3.5 and GPT-4 (#8)
d57b460
verified

Iker j-chim committed on

Add data from "An Open-Source Data Contamination Report for Large Language Models" (#5)
619ed3b
verified

Iker vishaal27 committed on

Fix format issues
9b28f49

OSainz committed on

Add data from "Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus" (#6)
935e79b
verified

Iker vishaal27 committed on

Add reports from Time Travel In LLMs paper (#3)
5a41656
verified

OSainz committed on

Fix super_glue replace
ab79de8

OSainz committed on

Add PR links to previous commits
f35c65c

OSainz committed on

Add data from WIMBD paper (#2)
eadd64a
verified

OSainz committed on

Small changes
fd6f269

OSainz committed on

Initial commit
eba8a37

Iker committed on