MBPP evaluation

#77
by Edisoncccc - opened

We are trying to reproduce the MBPP evaluation result using the settings from the paper, but we have not been able to match it. Our result (obtained with BigCode's infra) is attached below. Could someone help us reproduce the result from the paper?
{
  "mbpp": {
    "pass@1": 0.33670000000000005,
    "pass@10": 0.44990433869536034
  },
  "config": {
    "model": "bigcode/starcoder",
    "revision": null,
    "temperature": 0.2,
    "n_samples": 20
  }
}

In the paper, they sampled 200 solutions (n_samples=200). I'm not sure whether this is the cause of the difference.

@jiang719 Thanks for your advice, but the numbers didn't improve after sampling 200 solutions.
{
  "mbpp": {
    "pass@1": 0.3356,
    "pass@10": 0.44719838011231433,
    "pass@100": 0.5172314652546645
  },
  "config": {
    "model": "bigcode/starcoder",
    "revision": null,
    "temperature": 0.2,
    "n_samples": 200
  }
}
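For context, here is a minimal sketch of the unbiased pass@k estimator introduced with HumanEval, on the assumption that the harness computes its scores this way (the thread does not confirm it). The function name `pass_at_k` and the sample counts are illustrative only. Under this estimator, pass@1 for a problem is simply the fraction of correct samples, so raising n_samples from 20 to 200 reduces variance but should not shift the expected score, which would be consistent with the two runs above landing at nearly the same pass@1.

```python
import math


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for a single problem.

    n: number of samples generated for the problem
    c: number of those samples that pass all tests
    k: the k in pass@k (must satisfy k <= n)
    """
    if n - c < k:
        # Every size-k subset contains at least one correct sample.
        return 1.0
    # 1 - probability that a random size-k subset has no correct sample.
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)


# Hypothetical counts: the same 25% per-sample success rate at two sample sizes.
small = pass_at_k(n=20, c=5, k=1)     # 0.25
large = pass_at_k(n=200, c=50, k=1)   # 0.25
```

The benchmark score is the mean of this per-problem value over all problems, so a persistent gap from the paper is more likely to come from the dataset version, prompt format, or post-processing than from the number of samples.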

BigCode org

@Edisoncccc Sorry for the confusion. We use the MBPP version from MultiPL-E (https://github.com/nuprl/MultiPL-E). Could you try evaluating on that version instead?
