Thank you very much!

#1 opened by ajibawa-2023

Hello, thank you for all the GGML & GPTQ models of Carl & Scarlett. I highly appreciate your work & help. I will post more models very soon. Thanks!

You're very welcome! Thank you for the great new models.

I was going to message you actually, I wanted to let you know a couple of things that would make it easier for people to download your models:

  1. Firstly, your models are in float32, which makes them twice as large to download as a 16-bit model.
  2. Secondly, you output a single pytorch_model.bin, which I imagine is why you then split it up using split - because of the 50GB file limit? That requires people downloading to do the extra cat pytorch_model.bin-a* > pytorch_model.bin step, and it also means the model can never be loaded automatically from Transformers.
  • Also, it's usually much quicker to load a model from multiple shards than from one single pytorch_model.bin file, and it takes less RAM to do so - see the loading sketch just below.
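For anyone following along, this is roughly what a properly sharded 16-bit upload enables - a minimal sketch, with the model id as a placeholder:

 # Loading a sharded 16-bit checkpoint straight from the Hub with
 # Transformers - no manual cat reconstruction step needed.
 # "your-name/your-model" is a placeholder, not a real repo.
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer

 model_id = "your-name/your-model"

 model = AutoModelForCausalLM.from_pretrained(
     model_id,
     torch_dtype=torch.float16,  # keep the weights in 16-bit
     low_cpu_mem_usage=True,     # stream shards in one at a time to save RAM
 )
 tokenizer = AutoTokenizer.from_pretrained(model_id)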

Fortunately there's an easy way to fix both problems in one. Here's a little script I wrote: https://gist.github.com/TheBloke/8934a51c5572b500c5217f42bfd055a8#file-reshard-py

Run it like this:

 python3 reshard.py --base_model_name_or_path <path to your model> --output_dir <path to output to> --device cpu --max_shard_size '8GiB' --dtype bfloat16

And it will create a model in whatever dtype you pass - here bfloat16, which matches the format your models were in (pass --dtype float16 if you prefer float16) - split into chunks like this:

 [venv] tomj@2b8eac64e03a:/workspace/process/carl-33b ᐅ l source
total 61G
drwxrwxrwx 2 tomj tomj 2.9M Aug 16 14:36 .
drwxrwxrwx 5 tomj tomj 3.0M Aug 16 15:44 ..
-rw-rw-rw- 1 tomj tomj 2.3K Aug 16 13:16 .gitattributes
-rw-rw-rw- 1 tomj tomj 1.7K Aug 16 13:16 README.md
-rw-rw-rw- 1 tomj tomj  690 Aug 16 14:37 config.json
-rw-rw-rw- 1 tomj tomj  137 Aug 16 13:52 generation_config.json
-rw-rw-rw- 1 tomj tomj 8.0G Aug 16 13:52 pytorch_model-00001-of-00008.bin
-rw-rw-rw- 1 tomj tomj 8.0G Aug 16 13:53 pytorch_model-00002-of-00008.bin
-rw-rw-rw- 1 tomj tomj 8.0G Aug 16 13:53 pytorch_model-00003-of-00008.bin
-rw-rw-rw- 1 tomj tomj 8.0G Aug 16 13:53 pytorch_model-00004-of-00008.bin
-rw-rw-rw- 1 tomj tomj 8.0G Aug 16 13:53 pytorch_model-00005-of-00008.bin
-rw-rw-rw- 1 tomj tomj 8.0G Aug 16 13:54 pytorch_model-00006-of-00008.bin
-rw-rw-rw- 1 tomj tomj 8.0G Aug 16 13:54 pytorch_model-00007-of-00008.bin
-rw-rw-rw- 1 tomj tomj 4.9G Aug 16 13:54 pytorch_model-00008-of-00008.bin
-rw-rw-rw- 1 tomj tomj  44K Aug 16 13:54 pytorch_model.bin.index.json
-rw-rw-rw- 1 tomj tomj  435 Aug 16 13:54 special_tokens_map.json
-rw-rw-rw- 1 tomj tomj 1.8M Aug 16 13:54 tokenizer.json
-rw-rw-rw- 1 tomj tomj 489K Aug 16 13:54 tokenizer.model
-rw-rw-rw- 1 tomj tomj  745 Aug 16 13:54 tokenizer_config.json

Then you can upload that and won't have any 50GB problems, and people can download it and use it immediately, including automatically from Transformers code. And it'll be 16-bit rather than float32, so it only takes half as long to upload and download.
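In case the gist link ever goes stale: the heart of a script like that boils down to roughly the following - a sketch built on standard Transformers APIs, not the gist's exact code:

 # Sketch of a resharding script: load the checkpoint in the target
 # dtype, then re-save it as sharded files. Not the actual gist code.
 import argparse
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer

 parser = argparse.ArgumentParser()
 parser.add_argument("--base_model_name_or_path", required=True)
 parser.add_argument("--output_dir", required=True)
 parser.add_argument("--device", default="cpu")
 parser.add_argument("--max_shard_size", default="8GiB")
 parser.add_argument("--dtype", default="float16", choices=["float16", "bfloat16"])
 args = parser.parse_args()

 dtype = torch.bfloat16 if args.dtype == "bfloat16" else torch.float16

 # Load the full-precision checkpoint, casting down to 16-bit on the way in
 model = AutoModelForCausalLM.from_pretrained(
     args.base_model_name_or_path, torch_dtype=dtype, low_cpu_mem_usage=True
 ).to(args.device)

 # save_pretrained does the shard splitting and writes
 # pytorch_model.bin.index.json automatically
 model.save_pretrained(args.output_dir, max_shard_size=args.max_shard_size)

 # Save the tokenizer alongside so the output dir is self-contained
 tokenizer = AutoTokenizer.from_pretrained(args.base_model_name_or_path)
 tokenizer.save_pretrained(args.output_dir)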

I still have the sharded files for Carl-33B and Scarlett-33B, so if you like I could PR the bf16 sharded version to your repos. Let me know if that'd be helpful.

Thanks again for the great models and looking forward to seeing more! Feel free to ping me when they're up and I'll quantise them.

Thank you very much for making my & other users' lives easier. I will use the script for sure!
You surely can PR the bf16 sharded version to my repos. That will be great!
Thank you!

Hello Bloke, sorry to bother you. Heartiest congratulations on winning the a16z grant! You guys deserve it!

Hello Bloke,
Can you do the GPTQ & GGML for the following models:

  1. https://huggingface.co/ajibawa-2023/Uncensored-Frank-7B
  2. https://huggingface.co/ajibawa-2023/Uncensored-Frank-13B
  3. https://huggingface.co/ajibawa-2023/Uncensored-Frank-33B

Looking forward to hearing from you. Thank you very much for sharing the script and necessary instructions.

Hello Bloke, any luck with quantizing the above models? I highly appreciate your work for the open source community. Thank you!

Oh sorry, I didn't see this. I'll add them to the queue and do them shortly, in GPTQ, GGUF and AWQ
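For readers wondering what the GPTQ step involves, here's a rough sketch using the AutoGPTQ library - the one-sentence calibration example is a toy placeholder; real runs use a few hundred samples from a proper dataset:

 # Rough sketch of a GPTQ quantization run with AutoGPTQ.
 # The single calibration sentence is a toy placeholder.
 from transformers import AutoTokenizer
 from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

 model_id = "ajibawa-2023/Uncensored-Frank-13B"
 out_dir = "Uncensored-Frank-13B-GPTQ"

 quantize_config = BaseQuantizeConfig(
     bits=4,          # 4-bit weights
     group_size=128,  # quantization group size
     desc_act=False,  # act-order off, for faster inference
 )

 tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
 examples = [tokenizer("The quick brown fox jumps over the lazy dog.")]

 model = AutoGPTQForCausalLM.from_pretrained(model_id, quantize_config)
 model.quantize(examples)     # calibrate and quantize the weights
 model.save_quantized(out_dir, use_safetensors=True)
 tokenizer.save_pretrained(out_dir)

GGUF and AWQ have their own toolchains (llama.cpp and AutoAWQ respectively), but the overall flow - load, calibrate, quantize, save - is similar.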

Surely, super thankful to you!

Models are starting to upload now

Thank you very much Bloke!


Hello Bloke,
I hope you are doing great! Can you help me by quantizing my following new models:

  1. Uncensored-Jordan-7B: https://huggingface.co/ajibawa-2023/Uncensored-Jordan-7B
  2. Uncensored-Jordan-13B: https://huggingface.co/ajibawa-2023/Uncensored-Jordan-13B
  3. Uncensored-Jordan-33B: https://huggingface.co/ajibawa-2023/Uncensored-Jordan-33B

Looking forward to hearing from you. Thank you for your relentless efforts. Hats off to you!

Yes of course, glad to. I'll add them to the queue now

By the way, is there a reason you're still using Llama 1 for 7B?

Thanks Bloke! It was trained before the release of Mistral.

These are all done

Thank you very much man! Kudos to you.

Hello Bloke,
Trust you are doing great! Can you help me by quantizing my following new models:

  1. Python-Code-13B: https://huggingface.co/ajibawa-2023/Python-Code-13B
  2. Python-Code-33B: https://huggingface.co/ajibawa-2023/Python-Code-33B

Thanks for your guidance & help. I highly appreciate your dedication to the OSS community. Thank you!

Hello Bloke,
Looking forward to a positive response. Sorry if I am troubling you.

Oh sorry, I missed this. My normal place for model requests is the #model-requests forum in my Discord, so the best way to make sure I see a request quickly is to post it there and ping me.

I will add these to my queue now. GGUFs and AWQs will come soon, GPTQs in a couple of hours

Thank you Bloke! I am not on Discord but will join soon. Thank you.

Hello Bloke,
Thank you very much for quantized models. I am extremely thankful to you.

Hello Bloke,
How are you? Can you quantize my model: https://huggingface.co/ajibawa-2023/SlimOrca-13B
Thank you very much! Happy Holidays to you.
