Something isn't working for me.

#1
by Lewdiculous - opened

Not sure if the same issues as with Aurora, but I'm getting this when converting, both to FP16 and BF16:

INFO:hf-to-gguf:Loading model: Puppy_Purpose_0.69
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:gguf: context length = 8192
INFO:hf-to-gguf:gguf: embedding length = 4096
INFO:hf-to-gguf:gguf: feed forward length = 14336
INFO:hf-to-gguf:gguf: head count = 32
INFO:hf-to-gguf:gguf: key-value head count = 8
INFO:hf-to-gguf:gguf: rope theta = 500000.0
INFO:hf-to-gguf:gguf: rms norm epsilon = 1e-05
INFO:hf-to-gguf:gguf: file type = 32
INFO:hf-to-gguf:Set model tokenizer
Traceback (most recent call last):
  File "D:\conda\llama.cpp\convert-hf-to-gguf.py", line 2562, in <module>
    main()
  File "D:\conda\llama.cpp\convert-hf-to-gguf.py", line 2547, in main
    model_instance.set_vocab()
  File "D:\conda\llama.cpp\convert-hf-to-gguf.py", line 1288, in set_vocab
    self._set_vocab_sentencepiece()
  File "D:\conda\llama.cpp\convert-hf-to-gguf.py", line 583, in _set_vocab_sentencepiece
    tokenizer.LoadFromFile(str(tokenizer_path))
  File "C:\Users\User\scoop\apps\miniconda3\current\envs\conver\Lib\site-packages\sentencepiece\__init__.py", line 310, in LoadFromFile
    return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Internal: C:\b\abs_f7cttiucvr\croot\sentencepiece_1684525347071\work\src\sentencepiece_processor.cc(1102) [model_proto->ParseFromArray(serialized.data(), serialized.size())]

---

INFO:hf-to-gguf:Loading model: Puppy_Purpose_0.69
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:gguf: context length = 8192
INFO:hf-to-gguf:gguf: embedding length = 4096
INFO:hf-to-gguf:gguf: feed forward length = 14336
INFO:hf-to-gguf:gguf: head count = 32
INFO:hf-to-gguf:gguf: key-value head count = 8
INFO:hf-to-gguf:gguf: rope theta = 500000.0
INFO:hf-to-gguf:gguf: rms norm epsilon = 1e-05
INFO:hf-to-gguf:gguf: file type = 1
INFO:hf-to-gguf:Set model tokenizer
Traceback (most recent call last):
  File "D:\conda\llama.cpp\convert-hf-to-gguf.py", line 2562, in <module>
    main()
  File "D:\conda\llama.cpp\convert-hf-to-gguf.py", line 2547, in main
    model_instance.set_vocab()
  File "D:\conda\llama.cpp\convert-hf-to-gguf.py", line 1288, in set_vocab
    self._set_vocab_sentencepiece()
  File "D:\conda\llama.cpp\convert-hf-to-gguf.py", line 583, in _set_vocab_sentencepiece
    tokenizer.LoadFromFile(str(tokenizer_path))
  File "C:\Users\User\scoop\apps\miniconda3\current\envs\conver\Lib\site-packages\sentencepiece\__init__.py", line 310, in LoadFromFile
    return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Internal: C:\b\abs_f7cttiucvr\croot\sentencepiece_1684525347071\work\src\sentencepiece_processor.cc(1102) [model_proto->ParseFromArray(serialized.data(), serialized.size())]
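
For what it's worth, the failure reproduces outside the converter too. The sketch below (the local path is just a placeholder for wherever the model is downloaded) makes the same LoadFromFile call that _set_vocab_sentencepiece does, and it hits the same internal parse error on this model's tokenizer.model:

import sentencepiece as spm

# Point this at the downloaded model folder; the path here is hypothetical.
tokenizer_path = r"D:\models\Puppy_Purpose_0.69\tokenizer.model"

sp = spm.SentencePieceProcessor()
try:
    sp.LoadFromFile(tokenizer_path)
    print("tokenizer.model parsed as a valid SentencePiece model")
except RuntimeError as err:
    # Same "model_proto->ParseFromArray" internal error as above: the file exists
    # but is not a SentencePiece protobuf (Llama 3 ships a BPE tokenizer instead).
    print("not a valid SentencePiece model:", err)
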
The Chaotic Neutrals org

This is the exact error I get on EVERY model; I have no clue what causes it. @Lewdiculous

The Chaotic Neutrals org

This model never saw local hardware; it was created entirely in the mergekit GUI on HF.

I just did cgato/L3-TheSpice-8b-v0.8.3 and that one doesn't have this issue, ugh.

The Chaotic Neutrals org

Are some of the configs modified compared to what it's expecting? It looks like it's not sure which tokenizer to select.

I am using the llama-bpe files fetched from convert-hf-to-gguf-update.py.
https://files.catbox.moe/45uddo.zip
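
If it helps narrow it down, a quick way to spot-check for drift is to diff the model's tokenizer_config.json against the llama-bpe reference copy. Both paths below are placeholders for wherever the files sit locally:

import json

# Placeholder paths: the llama-bpe reference fetched by convert-hf-to-gguf-update.py
# and the merged model's own copy.
ref_path = "models/tokenizers/llama-bpe/tokenizer_config.json"
model_path = "Puppy_Purpose_0.69/tokenizer_config.json"

with open(ref_path, encoding="utf-8") as f:
    ref = json.load(f)
with open(model_path, encoding="utf-8") as f:
    mdl = json.load(f)

# Print any key whose value differs between the two configs.
for key in sorted(set(ref) | set(mdl)):
    if ref.get(key) != mdl.get(key):
        print(f"{key}:\n  reference: {ref.get(key)!r}\n  model:     {mdl.get(key)!r}")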

I also remember having issues with ResplendentAI/Aurora_l3_8B, though I'm not sure if they were exactly these...

@SolidSnacke have you run into this with some models?

The Chaotic Neutrals org

@Lewdiculous

Left is from the files you sent, right is the HF tokenizer_config on the model. The official Instruct HF repo was also updated to the one on the right.
[image: tokenizer_config.json comparison, llama-bpe reference files (left) vs. the model repo (right)]

Official meta instruct repo:
[image: tokenizer_config.json from the official Meta Llama 3 Instruct repo]

Surprised that fetching the files now still gives the result on the left. It might be from the base model. I hadn't needed to check until now, tbh.

Honestly, I tried with both the llama-bpe config files and the original repo files and the result was the same.

The Chaotic Neutrals org

Yea I've had that issue before as well when doing merges.

The Chaotic Neutrals org

Wait so what did I do wrong? All of the constituent models had been quanted before...

The Chaotic Neutrals org

I'll have to tinker around with this tomorrow when I have time after work.

The Chaotic Neutrals org

Wait so what did I do wrong? All of the constituent models had been quanted before...

Nothing that I can tell.

Working so far for me:
cgato/L3-TheSpice-8b-v0.8.3
NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS

As examples, not sure if it applies at this stage.

The Chaotic Neutrals org

Working so far for me:
cgato/L3-TheSpice-8b-v0.8.3
NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS

As examples, not sure if it applies at this stage.

Both appear to be using the outdated version, like the left example you sent above.

Well then. We did try with both the new and the previous version.

Maybe we wait for clarifications on this then:
https://github.com/ggerganov/llama.cpp/issues/7129

Converting to GGUF worked for me after deleting tokenizer.model. I can upload some quants if you want.
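
Roughly what I ran, as a sketch (the output filename and the --outtype value are just what I'd expect here, so adjust as needed):

import subprocess
from pathlib import Path

model_dir = Path("Puppy_Purpose_0.69")  # local copy of the merged model

# Remove the stray SentencePiece file so the converter falls back to the
# BPE tokenizer in tokenizer.json instead of choking on tokenizer.model.
(model_dir / "tokenizer.model").unlink(missing_ok=True)

subprocess.run(
    [
        "python", "convert-hf-to-gguf.py", str(model_dir),
        "--outtype", "bf16",
        "--outfile", "Puppy_Purpose_0.69-BF16.gguf",
    ],
    check=True,
)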

Eeh. IT ACTUALLY WORKS! Huge, nbeer! Do you want to upload Puppy_Purpose_0.69 then? :D - But I don't mind doing it too.

No, go for it! Glad I could help :)

Turns out we're good to go. Hurray!

llm_load_print_meta: model size       = 14.96 GiB (16.00 BPW)
llm_load_print_meta: general.name     = Puppy_Purpose_0.69
llm_load_print_meta: BOS token        = 128000 '<|begin_of_text|>'
llm_load_print_meta: EOS token        = 128009 '<|eot_id|>'
llm_load_print_meta: LF token         = 128 'Ä'
llm_load_print_meta: EOT token        = 128009 '<|eot_id|>'
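
As an extra sanity check (just a sketch, assuming the gguf-py GGUFReader API and the hypothetical output filename from above), the BOS/EOS IDs in the file's metadata line up with the log:

from gguf import GGUFReader

reader = GGUFReader("Puppy_Purpose_0.69-BF16.gguf")  # hypothetical output filename

for key in ("tokenizer.ggml.bos_token_id", "tokenizer.ggml.eos_token_id"):
    field = reader.fields[key]
    # Scalar metadata values are stored in the field's last part.
    print(key, "=", int(field.parts[-1][0]))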

I see you've been cooking many experiments; anything hot so far, in your opinion?


Update:

I might fall asleep before uploading; if that happens, it will come in the morning.

Update:

It's all uploaded.

The Chaotic Neutrals org

Thank you so much @nbeerbower. I've been trying to keep all model files together, since sometimes NOT having them causes errors; I never thought including an official Meta Llama 3 file would break anything.

@Lewdiculous Not really, I just shit out a bunch of models on intuition and let others test them 😝

tbf, RP isn't my main goal and I don't want to use Llama in production for licensing reasons... but I've been focused on improving ChatML support and leaderboard performance (Llama 3 seems to really like learning languages)

@jeiku no prob m8, I spent hours this morning trying to get llama3 quants working and the tokenizer is definitely a huge pain in the ass lol
