MBPP evaluation

#77

by Edisoncccc - opened Aug 3, 2023

Aug 3, 2023

We are trying to reproduce the MBPP evaluation results using the settings from the paper, but our numbers do not match. Our results (obtained with BigCode's evaluation infrastructure) are attached below. Could someone help us reproduce the results from the paper?
{
  "mbpp": {
    "pass@1": 0.33670000000000005,
    "pass@10": 0.44990433869536034
  },
  "config": {
    "model": "bigcode/starcoder",
    "revision": null,
    "temperature": 0.2,
    "n_samples": 20
  }
}

jiang719

Aug 7, 2023

In the paper, they sampled 200 solutions per problem (n_samples=200), while your run used 20. I'm not sure whether that explains the gap.

Edisoncccc

Aug 15, 2023

@jiang719 Thanks for the advice, but the numbers didn't improve after sampling 200 solutions.
{
  "mbpp": {
    "pass@1": 0.3356,
    "pass@10": 0.44719838011231433,
    "pass@100": 0.5172314652546645
  },
  "config": {
    "model": "bigcode/starcoder",
    "revision": null,
    "temperature": 0.2,
    "n_samples": 200
  }
}
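
For readers comparing the two runs: below is a minimal sketch of the unbiased pass@k estimator commonly used for these metrics, assuming n samples are generated per problem and c of them pass the tests. The function name `pass_at_k` is hypothetical and this is not necessarily the harness's exact code. Under this estimator pass@k is only defined for k <= n, which would be consistent with pass@100 appearing only in the 200-sample run.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: number of samples generated for the problem
    c: number of those samples that pass all tests
    k: the k in pass@k (must not exceed n)
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    # Probability that a random size-k subset of the n samples
    # contains no passing sample is C(n - c, k) / C(n, k).
    # math.comb returns 0 when k > n - c, so the estimate becomes 1.0.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Example: 20 samples, 5 of which pass.
print(pass_at_k(20, 5, 1))
print(pass_at_k(20, 5, 10))
```

The benchmark score is the mean of this per-problem estimate over all problems. Increasing n mainly reduces the variance of the estimate rather than shifting its value, so pass@1 staying near 0.336 in both runs is what one would expect.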

SivilTaram

BigCode org Aug 16, 2023

@Edisoncccc Sorry for the confusion. We used the MultiPL-E version of MBPP (https://github.com/nuprl/MultiPL-E). Could you try evaluating on that version instead?
