Add SHA256 for LLaMA itself and the PyTorch conversion?

by KSlith - opened

Is it possible the SHA256 hashes for both the vanilla LLaMA-13B and the PyTorch conversion could also be added to the description for troubleshooting purposes?

After doing the XOR decode, I went through and tested the hashes I do have for both the xor_encoded_files and the original PTH files from Meta, and they all match, but the SHA256 for "pytorch_model-00003-of-00003.bin" at the VERY end did not match the one in the description. The problem is, I have no clue whether the issue is with my inputs, the conversion output, or something else.

Official LLaMA 13B PTH SHA256: (Add these to the description please, Credit Lowqualitybot on Github for the Hashes, and dylancvdean on Github for confirming the hashes)
4ab77bec4d4405ccb66a97b282574c89a94417e3c32e5f68f37e2876fc21322f ./13B/params.json
745bf4e29a4dd6f411e72976d92b452da1b49168a4f41c951cfcc8051823cf08 ./13B/consolidated.00.pth
d5ccbcc465c71c0de439a5aeffebe8344c68a519bce70bc7f9f92654ee567085 ./13B/consolidated.01.pth
183eb00cea5c880fd88c296af1038f4c15dc26aa2ccb7c6cf2c35b9bb00dce45 ./13B/checklist.chk
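For anyone reproducing the check, `sha256sum -c` verifies a whole list of expected hashes against files on disk in one go. A minimal sketch of the pattern, using a dummy file since the real weights can't be redistributed (the same checklist format applies to the `./13B/` paths above):

```shell
# sha256sum -c reads "HASH  PATH" lines and reports OK/FAILED per file.
# demo.bin is a stand-in; substitute the expected-hash lines above and
# run from the directory containing ./13B/ for the real check.
printf 'hello\n' > demo.bin
echo "5891b5b522d5df086d0ff0b110fbd9d21bb4fc7163af34d08286a2e846f6be03  demo.bin" > demo.sha256
sha256sum -c demo.sha256
```

Note the two spaces between hash and path; `sha256sum -c` is picky about the checklist format.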

SHA256 for my conversion results (Edit: reduced down to the confirmed mismatch):
2efc56cddb7877c830bc5b402ee72996aedb5694c9e8007bf1d52d72c0d97d26 LLaMA_13B_PyTorch/pytorch_model-00003-of-00003.bin
SHA256 for the known bad result:
0c62600c4ea615684f82551c06bd4f9953195aa4b188872498728693f158286b Output_pygmalion-13b/pytorch_model-00003-of-00003.bin <-- BAD HASH?
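A quick way to flag a single mismatch like the one above is to compare the computed digest against the expected string directly in the shell. A sketch with a dummy file and its known hash (not a real checkpoint):

```shell
# Compare a file's computed SHA256 against an expected value.
# expected/out.bin are illustrative; substitute the model-card hash
# and the converted .bin path for the real comparison.
expected="5891b5b522d5df086d0ff0b110fbd9d21bb4fc7163af34d08286a2e846f6be03"
printf 'hello\n' > out.bin
actual=$(sha256sum out.bin | awk '{print $1}')
if [ "$actual" = "$expected" ]; then echo "MATCH"; else echo "MISMATCH: $actual"; fi
```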

All other hashes from my conversion match the expected results given in the model card.

Edit: Truncated unrelated hashes for thread readability and clarified some details.

Pygmalion org

Looks like your pth -> hf conversion is the issue. Your third checkpoint doesn't match the expected hash:

$ sha256sum llama-13b-hf/pytorch_model-00003-of-00003.bin
eac58861ca5f3749c819676d908b906d3df38046c09efe055fe78d7678b718e7  pytorch_model-00003-of-00003.bin

You can look here for a list of all the hashes.

Edit: -SNIP- The hashes on that site are severely out of date compared to the model description.

After messing with the versions of all the Python dependencies, including switching transformers versions, I got the same hashes again. I checked my PTH hashes against a few other people's just in case, and they've now matched with five others'. I made a pass using WSL2 Ubuntu and the instructions mentioned in the linked docs and got the exact same output there too. The XOR files' hashes match the Git LFS hashes on the HF repository as well, and there's exactly 312 MB missing from the final result versus the PTH -> HF conversion.

@alpindale have you tried following the instructions from a fresh install to see if it's on your end? The version on the repo is 312 MB smaller than the regular LLaMA model.

@alpindale having exactly the same issue as KSlith: the last pytorch_model-00003-of-00003.bin has a different hash than the one displayed in the model card.

@alpindale having exactly the same issue as KSlith: the last pytorch_model-00003-of-00003.bin has a different hash than the one displayed in the model card.
And I cannot load this model in oobabooga/text-generation-webui; it errors out.

Pygmalion org

@KSlith @P4l1ndr0m @xluke I've been trying to reproduce it myself but none of the Transformers commits seem to work unfortunately.

Some people have already uploaded the merged Pygmalion 13B model, so you could look it up on Hugging Face.
