GPT-3.5 Turbo HumanEval Contamination, based on "Generalization or Memorization: Data Contamination and Trustworthy Evaluation for Large Language Models"

#16

What are you reporting:

  • Evaluation dataset(s) found in a pre-training corpus. (e.g. COPA found in ThePile)
  • Evaluation dataset(s) found in a pre-trained model. (e.g. FLAN T5 has been trained on ANLI)

Evaluation dataset(s): Name(s) of the evaluation dataset(s). If available in the HuggingFace Hub please write the path (e.g. uonlp/CulturaX), otherwise provide a link to a paper, GitHub or dataset-card.
openai_humaneval

Contaminated model(s): Name of the model(s) (if any) that have been contaminated with the evaluation dataset. If available in the HuggingFace Hub please list the corresponding paths (e.g. allenai/OLMo-7B).
gpt-3.5-turbo-0613, gpt-3.5-turbo-1106

Briefly describe your method to detect data contamination

  • Data-based approach
  • Model-based approach

The paper introduces CDD (Contamination Detection via output Distribution), a model-based approach that detects LLM contamination by measuring the peakedness of the model's output distribution, under the assumption that exposure to the data during training makes the output distribution more peaked. The paper demonstrates both a contamination case (HumanEval) and a non-contamination case (a custom subset of the CodeForces dataset).
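The peakedness idea can be sketched as follows. This is a simplified illustration inspired by the CDD approach, not the authors' exact implementation: the function names, the token-level edit distance, and the `alpha` threshold are assumptions made here for clarity.

```python
def levenshtein(a, b):
    # Token-level edit distance via the standard one-row DP.
    m, n = len(a), len(b)
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,            # deletion
                        dp[j - 1] + 1,        # insertion
                        prev + (a[i - 1] != b[j - 1]))  # substitution
            prev = cur
    return dp[n]

def cdd_peakedness(greedy_tokens, sampled_outputs, alpha=0.05):
    """Fraction of sampled outputs whose normalized edit distance to the
    greedy-decoded output is at most alpha. A value near 1 indicates a
    highly peaked output distribution, i.e. possible contamination."""
    hits = 0
    for sample in sampled_outputs:
        d = levenshtein(greedy_tokens, sample)
        norm = d / max(len(greedy_tokens), len(sample), 1)
        if norm <= alpha:
            hits += 1
    return hits / len(sampled_outputs)
```

Intuitively, if the model has memorized a benchmark solution, repeated sampling keeps reproducing near-identical outputs, so most samples fall within the small edit-distance threshold and the score approaches 1; on unseen data, samples diverge and the score stays low.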

Citation

Is there a paper that reports the data contamination or describes the method used to detect data contamination?

URL: https://arxiv.org/abs/2402.15938
Citation:
@article{dong2024generalization,
  title   = {Generalization or Memorization: Data Contamination and Trustworthy Evaluation for Large Language Models},
  author  = {Dong, Yihong and Jiang, Xue and Liu, Huanyu and Jin, Zhi and Li, Ge},
  journal = {arXiv preprint arXiv:2402.15938},
  year    = {2024}
}

Important! If you wish to be listed as an author in the final report, please complete this information for all the authors of this Pull Request.

Workshop on Data Contamination org

Hi @jupyter31 !

Thanks for your contribution! I made some small fixes (changing the arXiv link from pdf to abs and adding the PR number).

We are merging to main.

Best,
Oscar

OSainz changed pull request status to merged
