True HumanEval score or NewHope repeat?
How can we be sure that this model is actually beating GPT-4 because it was trained well, and not because HumanEval data leaked into the model's training data? Did you make sure to remove any HumanEval data from the training set before training the model?
Yes, we checked for contamination and found none. Our training set is quite different from HumanEval and mostly serves to align the model, which shows that CodeLlama-34B is already quite strong.
We use the same decontamination process as OpenAI: https://www.phind.com/blog/code-llama-beats-gpt4.
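For anyone unfamiliar with what decontamination means in practice: the approach OpenAI described is roughly to flag any training example that shares a sufficiently long n-gram with an evaluation prompt and drop it. Here is a minimal sketch of that idea; it is not Phind's actual pipeline, and the function names, sample data, and n=10 threshold are all illustrative.

```python
# Minimal sketch of n-gram-overlap decontamination, in the spirit of the
# process OpenAI described (flag training examples that share long
# word-level n-grams with evaluation prompts). NOT Phind's actual
# pipeline; names and the n=10 threshold are illustrative.

def ngrams(text: str, n: int = 10) -> set:
    """Return the set of word-level n-grams in `text`."""
    tokens = text.split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def is_contaminated(train_example: str, eval_prompts: list, n: int = 10) -> bool:
    """True if the training example shares any n-gram with an eval prompt."""
    grams = ngrams(train_example, n)
    return any(grams & ngrams(p, n) for p in eval_prompts)

# Hypothetical usage: filter a training set against HumanEval prompts.
humaneval_prompts = ["def has_close_elements(numbers, threshold): ..."]
training_set = ["some unrelated instruction-tuning example about sorting lists in place"]
clean_set = [ex for ex in training_set if not is_contaminated(ex, humaneval_prompts)]
```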
@michaelroyzen One more question: do you plan on releasing the dataset, or will it remain closed source?
I am also wondering whether the dataset will be released.
Not at this time. It's part of our secret sauce. But we plan to continue releasing models -- stay tuned for v2 in a few days.