Wizardcoder controversy

#2
by rombodawg - opened

Screenshot_20230828_174713_Opera GX.jpg

I saw the other post about this, but why delete the old repo? If you guys didn't use WizardCoder, then you would have nothing to hide. You wouldn't need to delete the repo; you could just tell people a logical reason for why the config file was named (wizardcoder-34b) and it would be settled.

But you decided to act suspicious instead. Based on your behavior alone, it seems more likely that you stole the model than that you didn't...

Again, we didn't use their model. Our v1 model (released before WizardCoder, btw) was trained on a WizardCoder-style dataset that we made ourselves, and this was the internal nomenclature for the model. We shared the same motivations as WizardCoder (hence the name) but used our own methods and data.

Even based on this screenshot, it wouldn't make any sense that we used their model, because they have disclosed nothing about checkpoint information.

Again, we did not use anything from WizardCoder, and I want to make sure that we are extremely clear about this. It's obvious if you use the model that it is completely different: theirs is derived from CodeLlama-34B-Python, while this model is derived from CodeLlama-34B.

michaelroyzen changed discussion status to closed

@michaelroyzen Then why delete the old repo and reupload it? You still haven't explained that.

Saying "Our v1 model (released before Wizardcoder btw) was trained on a Wizardcoder-style dataset that we made ourselves and this was the internal nomenclature for the model." doesnt make sense either, because the v1 models are "codellama-34b-python-hf" not "codellama-34b-wizardcoder".

This just doesn't add up to me, because if what you claim were correct, the original config for v1 would not look like this:

{
  "_name_or_path": "/fsx/codellama-34b-python-hf",
  "architectures": [
    "LlamaForCausalLM"
  ],

It would look like this:

/codellama-34b-wizardcoder/checkpoint...

Since you haven't changed the v1 model's config and only changed the v2 config to match the v1, what you say doesn't make any sense, and it still seems like a cover-up.
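
(For context on what this field means: _name_or_path in a saved config.json is simply the local path or hub ID that transformers recorded when the model was instantiated, and it can be read straight from the raw file without loading any weights. A minimal sketch in Python; the repo ID below is an assumption for illustration, not necessarily the exact repo under discussion:)

```python
import json
from huggingface_hub import hf_hub_download

# Fetch the raw config.json and print the recorded origin path.
# The repo ID here is illustrative -- substitute the repo you want to inspect.
path = hf_hub_download(repo_id="Phind/Phind-CodeLlama-34B-v2", filename="config.json")
with open(path) as f:
    print(json.load(f).get("_name_or_path"))
```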

Now, I want to point out that I like your models a lot, and the only reason why I'm so heated about this is that I care that your model isn't stolen. But you guys aren't doing a good job of proving that, so some more explanation is in order.

rombodawg changed discussion status to open

Yeah, I think it's bullshit. If they were so sure of it, they wouldn't have (1) instantly committed a change to the old repo's config file, or (2) deleted the whole repo and created a new one with just the config file modified to their benefit. Also, the checkpoint argument does not make sense, as it is probably a checkpoint from another training run of WizardCoder. I am very disappointed in this, but what can we do? We can never prove that they used WizardCoder.

The v1 model configs were instantiated from codellama-34b-hf and codellama-34b-python-hf, respectively. The outputs of v1 were saved as code-llama-34b-wizardcoder and code-llama-34b-python-wizardcoder, respectively, and both of those models were released before WizardCoder came out. So it makes perfect sense that the origin of our model was code-llama-34b-wizardcoder, despite that actually being our own model that was released before WizardCoder. The nomenclature is unfortunate, but here's the proof that code-llama-34b-wizardcoder was trained before WizardCoder-34B was released.

We shared the same motivations as WizardCoder (hence the name) but used our own methods and data.

Screenshot 2023-08-28 at 3.21.40 PM.png

michaelroyzen changed discussion status to closed
  1. You still haven't explained why you deleted the repo.

  2. The date on that image doesn't prove that you trained that model before wizardcoder-python-34b came out, because the WizardCoder model came out the same day. (Today is 8/28/2023, so 3 days ago was August 25th, 2023.)

So I once again reiterate that you guys are not proving yourselves innocent but are actually making yourselves look more guilty.

Screenshot (428).png

rombodawg changed discussion status to open

My screenshot is definitive exoneration: the WizardCoder upload timestamp is on Saturday in GMT, while ours is Friday morning. And you can't rename S3 folders, so there's no way we could've renamed it.

Screenshot 2023-08-28 at 3.34.44 PM.png
Screenshot 2023-08-28 at 3.36.13 PM.png
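
(Side note on why the rename point matters: S3 has no rename operation; "moving" a folder means copying every object to a new key and deleting the old ones, and copied objects get a fresh LastModified timestamp, so an early timestamp on a wizardcoder-named key means the key existed under that name at that time. A minimal sketch of listing those timestamps with boto3; the bucket and prefix are hypothetical:)

```python
import boto3

# Hypothetical bucket/prefix for illustration. S3 keys are immutable:
# a "rename" is really copy + delete, and copies get a new LastModified.
s3 = boto3.client("s3")
resp = s3.list_objects_v2(Bucket="my-training-bucket",
                          Prefix="code-llama-34b-wizardcoder/")
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["LastModified"])
```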

@rombodawg @bhuwansaik

Our model was clearly created first. The only error on our part is unfortunate nomenclature, but we did not steal anything from anybody.

michaelroyzen changed discussion status to closed

Your model was made at 7:00 AM UTC; the WizardCoder model was uploaded at 5:06 GMT, which is 5:06 UTC.

The WizardCoder model was uploaded two hours before your model was made.

And once again, you still haven't explained why you deleted the repo and reuploaded it.
If you can explain why you deleted the v2 of your model's repo and reuploaded it, that might clear things up more than stating facts that are false.

Screenshot (430).png

rombodawg changed discussion status to open

What are you talking about? Our model was uploaded at 5:30 PM UTC on Friday, while WizardCoder was uploaded at 5:06 AM UTC on Saturday.
Screenshot 2023-08-28 at 3.54.19 PM.png

Our model was clearly uploaded nearly 12 hours before. Please learn to read time properly before accusing us of malfeasance. @rombodawg
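
(To make the time arithmetic explicit, here's a minimal Python sketch using the timestamps claimed above; the calendar dates are inferred from "today is 8/28/2023", so Friday = Aug 25 and Saturday = Aug 26:)

```python
from datetime import datetime, timezone

# Timestamps as claimed in this thread (dates inferred from the discussion).
phind_upload = datetime(2023, 8, 25, 17, 30, tzinfo=timezone.utc)      # Friday, 5:30 PM UTC
wizardcoder_upload = datetime(2023, 8, 26, 5, 6, tzinfo=timezone.utc)  # Saturday, 5:06 AM UTC

# Positive delta means the Phind upload came first.
print(wizardcoder_upload - phind_upload)  # 11:36:00, i.e. ~11.6 hours
```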

We recreated the repo because we are trying to set up a Hugging Face Endpoint for hosted model inference, and I was trying to debug a separate issue there.

This is a dead horse. I am closing this discussion.

michaelroyzen changed discussion status to closed

OK, I won't open the discussion back up, but I will run some tests at temperature 0 and report back here to see whether the two models produce the same outputs or not.
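
(For reference, a minimal sketch of what such a test could look like with the transformers library; do_sample=False gives greedy, deterministic decoding, which is the temperature-0 setting, and the repo IDs below are assumptions for illustration:)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

def greedy_completion(repo_id: str, prompt: str, max_new_tokens: int = 128) -> str:
    # Load tokenizer and model; greedy decoding makes the output deterministic.
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Repo IDs are illustrative assumptions, not verified.
prompt = "Write a Python function that reverses a linked list."
a = greedy_completion("Phind/Phind-CodeLlama-34B-v2", prompt)
b = greedy_completion("WizardLM/WizardCoder-Python-34B-V1.0", prompt)
print("Identical outputs:", a == b)
```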

It's literally not possible for the models to be the same. I'm not sure what additional proof you could want. Our model was uploaded first, by 12 hours. Feel free to run whatever tests you'd like; they're completely different models.

@michaelroyzen I appreciate you coming back here and defending yourself every time I responded to your claims. Like I stated before, I like your team's models; in fact, I prefer them over WizardCoder. My goal here was to have you display the evidence of your innocence against my accusations (and others') so others could see it in the future.

I want to apologize for any frustration I may have caused, but please understand that, although unorthodox, this discussion was all in good faith.

rombodawg changed discussion title from Wizarscoder-34b stolen scandal??? to Phinds proof of innocence over stealing wizardcoder???

Michael, what I don't understand is why the config file was modified to remove the wizard part before you deleted the repo. That commit serves no purpose besides looking like an attempt to hide the model name.

We wanted to prevent precisely the type of misunderstanding that ended up happening. If you look at our upload times and S3 logs, you will see that we created our own model called wizardcoder-checkpoint-1000 before the WizardCoder model was uploaded. The naming is confusing and refers to our own WizardCoder-style dataset. Specifically, we shared the same motivations as WizardCoder (hence the name) but used our own methods and data. Hence, we regret any naming confusion, but I want to be extremely clear that there's no commonality between the models, and we did not and would never steal or misattribute anyone's work. @henryccook

@michaelroyzen are you guys planning on making a 13b version of your model?

michaelroyzen changed discussion title from Phinds proof of innocence over stealing wizardcoder??? to Wizardcoder controversy

We are not planning smaller models at this time. Please open a new thread, @rombodawg.

For people that would like my timeline of events:

  • User @bhuwansaik mentioned that the config.json file had the WizardCoder checkpoint instead of the expected Phind checkpoint.
  • 20 minutes later, Phind made a commit removing the mentioned line from their Hugging Face repo.
  • Roughly 10 minutes later (correct me if I'm wrong, I can't check because the space was deleted @michaelroyzen ), Phind deleted the space and reuploaded their model, unfortunately making it very hard to verify any commits or discussions about the mishap.
  • Phind explained that they removed the WizardCoder reference from the config file because they wanted to avoid misunderstandings.
  • After reuploading the repo, Phind explained that they deleted the repo for Hugging Face Endpoint reasons. (They have not elaborated on this yet, and it seems much more likely to me that they wanted to remove the history of their wizardcoder checkpoint, but I will concede that this is not solid proof.)
  • @bhuwansaik posted the screenshot of the config.json file containing the bizarre line about WizardCoder, and I posted the screenshot of the commit removing WizardCoder from config.json.
  • The WizardLM team is understandably frustrated.

I would like to note that not all theories in this thread above this message are firm evidence; this is just a couple of people trying to figure out why on earth the Phind model had a wizardcoder config. I am sorry for any inconvenience this has caused, but hopefully the OSS LLM training community can get to the bottom of this. I have attached below @bhuwansaik's screenshot of the config.json file before the model was deleted (and before the commit removing the wizardcoder line), but I will leave the final say to the OSS community.
e8u7ZNenNL3T5GAElvJvB.jpeg

@henryccook Everything is explained above. Please refer to our S3 screenshot, which shows that codellama-34b-wizardcoder-checkpoint-1000 is our model that we trained before WizardCoder-34B was released. The reason why it is called wizardcoder is that our motivations were similar, but the model, the dataset, and our methods are our own. We've been accused of everything from stealing WizardCoder's model to using their methods without attribution. All of these claims are demonstrably false.
