NewHope Model Withdrawal! HumanEval Leaked in the Training set!

#4
by Ziyang - opened

The NewHope training dataset was contaminated with HumanEval.
The authors have announced the withdrawal of their open-source model.
https://github.com/SLAM-group/newhope


Yes I saw this. I don't really understand why they had to remove the whole model, instead of just withdrawing the benchmark results. It is still an excellent model that people really like - I have had a lot of very positive feedback about it.

But if they want me to remove my NewHope uploads, I will do that. Are you involved with NewHope? Or do you know the team?

I found that they have deleted everything related to this model.
Maybe you can ask them via the author's email (cui.wanyun@sufe.edu.cn).

Yeah, I agree with TheBloke, it seems illogical for them to remove the entire thing. They even deleted the entire GitHub history.

It kind of comes across as if they were pressured to delete it or something. I'm just guessing here, of course, but I do find it strange.

I've updated the model description with the following:


Original model removed

The original model creator has removed their model, due to their training dataset being contaminated with samples from the HumanEval dataset.

This does not affect the ability of this model, which has proven to be very high quality.

For the moment I am going to leave this model up, but will remove it if this is requested by the original model creator, or if/when they upload a new version.


My guess is that they are going to re-publish the model once they have fixed the dataset, as the evaluation benchmarks are what they care about most. If/when that happens, I will delete this version and replace it with the new one. I'll also delete it if they specifically ask me to.

This is one reason I keep all the models I download even if I'm not 'using' them (well, the ones I like after testing and want to investigate more). These days you never really know what may happen tomorrow. This one seemed to be good, at least at Python coding (my use case for most AI models).


Yeah, I agree with TheBloke, it seems illogical for them to remove the entire thing. They even deleted the entire GitHub history.

It kind of comes across as if they were pressured to delete it or something. I'm just guessing here, of course, but I do find it strange.

For only 100 elements out of tons... I agree, it's strange. Unless they found that it was such a hit that they want to bring it back as a paid model... These days nothing would surprise me.


Still gone. I bet it does not come back, which is a shame, as it seems to do really well with code for me (I never got things like WizardCoder or any Salesforce model to work worth anything and ended up using Wizard Mega). We do have the GGML here (for now), and those of us who grabbed a copy of the HF files still have them, but no larger models or further advancements may be coming.
