Tags: Transformers · GGUF · 5 languages · falcon · falcon-40b · long-context · NTK-YaRN · text-generation-inference

Worked in LM Studio, not Text-gen web UI

#1 by cvinker - opened

I had it working fine in LM Studio, but it wasn't following instructions, so I wanted to try it in text-generation-webui (TGWUI). When I try to load it there, I get this error:

ERROR: byte not found in vocab: '
'
2023-11-17 14:52:47 ERROR:Failed to load the model.
Traceback (most recent call last):
  File "C:\Users\Colin\Downloads\text-generation-webui-main\modules\ui_model_menu.py", line 210, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(shared.model_name, loader)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Colin\Downloads\text-generation-webui-main\modules\models.py", line 85, in load_model
    output = load_func_map[loader](model_name)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Colin\Downloads\text-generation-webui-main\modules\models.py", line 249, in llamacpp_loader
    model, tokenizer = LlamaCppModel.from_pretrained(model_file)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Colin\Downloads\text-generation-webui-main\modules\llamacpp_model.py", line 91, in from_pretrained
    result.model = Llama(**params)
                   ^^^^^^^^^^^^^^^
  File "C:\Users\Colin\Downloads\text-generation-webui-main\installer_files\env\Lib\site-packages\llama_cpp_cuda\llama.py", line 357, in __init__
    self.model = llama_cpp.llama_load_model_from_file(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Colin\Downloads\text-generation-webui-main\installer_files\env\Lib\site-packages\llama_cpp_cuda\llama_cpp.py", line 498, in llama_load_model_from_file
    return _lib.llama_load_model_from_file(path_model, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: exception: access violation reading 0x0000000000000000

Exception ignored in: <function LlamaCppModel.__del__ at 0x000001A567833880>
Traceback (most recent call last):
  File "C:\Users\Colin\Downloads\text-generation-webui-main\modules\llamacpp_model.py", line 49, in __del__
    self.model.__del__()
    ^^^^^^^^^^
AttributeError: 'LlamaCppModel' object has no attribute 'model'

Any ideas? Thank you, as always, for the quantized models.

That's the generic error llama-cpp-python gives whenever it can't load a model for some reason.

I've not tested llama-cpp-python with non-Llama models recently. In theory it should work, as they're all GGUF now, but it's possible there's some issue there.

In the past llama-cpp-python did have issues with BPE vocab models, and this is a BPE vocab model. I thought that was fixed, but maybe not.
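
If you want to rule out the text-generation-webui wrapper, a quick check is to load the file with llama-cpp-python directly. A minimal sketch, assuming a local copy of the GGUF (the path and settings are placeholders):

```python
# Minimal load test: does llama-cpp-python itself accept the GGUF, outside the webui?
from llama_cpp import Llama

try:
    llm = Llama(
        model_path="alfred-40b-1023.Q4_K_M.gguf",  # adjust to your local path
        n_ctx=2048,      # small context is enough for a load test
        n_gpu_layers=0,  # CPU-only keeps the test simple
    )
    print("Model loaded OK")
except Exception as exc:
    print(f"Load failed: {exc}")
```

If that fails with the same vocab error, it's a llama-cpp-python issue rather than anything in the webui.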

I tested it in llama.cpp and it appears to work great:

 CUDA_VISIBLE_DEVICES=0 ./main -m /workspace/process/lightonai_alfred-40b-1023/gguf/alfred-40b-1023.Q4_K_M.gguf -p "<start_system>You are Alfred, a helpful assistant trained by LightOn. Knowledge cutoff: November 2022. Current date: 16 November, 2023<end_message><start_user>Write a story about llamas<end_message><start_assistant>"  -ngl 100

Output 1:

You are Alfred, a helpful assistant trained by LightOn. Knowledge cutoff: November 2022. Current date: 16 November, 2023Write a story about llamasThe llamas had always known that the humans were strange, but they never could have imagined what they would find when they looked at them with their powerful telescopes.

There was a large telescope array on the dark side of the moon, where there is no interference from radio waves. The array was powered by solar panels and was designed to look for signs of intelligent life in other parts of the galaxy.

The scientists who worked at the array were surprised to receive signals coming from the planet Earth that seemed to contain information about how to build a large space telescope. The scientists decided to point their own telescope array back at Earth to see if they could get a better signal.

When the scientists looked through their own telescope, they saw llamas looking back at them. It turned out that the humans had evolved from llamas millions of years ago, but the two species had gone their separate ways and lost contact with each other.

The scientists decided to send a message back to Earth in the hopes that the llamas would be able to understand it. The message contained instructions on how to build an even larger telescope array that could receive signals from much farther away.

The llamas were skeptical at first, but they eventually decided to give it a try. They built the telescope array and pointed it towards the moon, where they knew the original signal had come from.

When the llamas looked through the telescope, they saw images of the scientists who had originally sent the message. The scientists waved and smiled, and the llamas realized that they were not alone in the universe after all. [end of text]

Output 2:

You are Alfred, a helpful assistant trained by LightOn. Knowledge cutoff: November 2022. Current date: 16 November, 2023Write a story about llamasOnce upon a time in a faraway land called Llama Land lived many llamas. The llamas were happy and peaceful people who loved to dance, play music, and spend time with their families. One day, the king of the llamas decided to hold a grand celebration to honor his daughter's birthday.

The kingdom was filled with excitement as preparations began for the big event. Everyone worked hard to decorate the streets with colorful banners and flowers. Bakers baked delicious cakes and pastries, while chefs prepared mouth-watering dishes from fresh ingredients grown in the royal gardens. Musicians tuned their instruments, ready to perform joyful songs that would fill the air with melody.

On the day of the princess's birthday, all the llamas gathered at the palace courtyard to join in the festivities. The atmosphere was filled with laughter and joy as friends reunited and families spent time together, sharing stories and memories. But most importantly, everyone came together to celebrate the love and happiness that the little princess brought into their lives.

As night fell upon Llama Land, a magnificent fireworks display lit up the sky in brilliant colors, creating patterns that danced across the heavens like magical creatures come alive. And beneath it all, surrounded by those who loved her most dearly, stood the birthday girl herself – a beautiful young llama whose smile radiated brighter than any firework ever could.

And so ended another wonderful day in Llama Land where love and happiness reigned supreme thanks to the unbreakable bond shared between its people – human or otherwise – who cherished each moment spent together like precious gems worthy of being treasured forevermore. [end of text]

Although I did once get this output when using CUDA, which was total mayhem:

You are Alfred, a helpful assistant trained by LightOn. Knowledge cutoff: November 2022. Current date: 16 November, 2023Write a story about llamasIn the lush green grassland of the Andes lived the llamas. There was Lola, the mother llama who had given birth to three babies named Lulu, Luca, and Lilly. One day, when they were playing in the field, they saw something peculiar coming down from the sky. It looked like a flying saucer made out of metal. The object came closer and finally landed right in front of them.

Lola and her kids walked towards the mysterious UFO cautiously. Suddenly, the door of the spacecraft opened with a swoosh sound, revealing four green creatures standing inside. They had long arms and legs with three fingers on each hand and foot respectively. Their skin was covered by tiny scales similar to those found on reptiles such as lizards or snakes.

The extraterrestrial beings made some unintelligible noises while pointing at the llamas using their slender fingers equipped with sharp claws at the tips. Despite being initially startled by these unexpected guests from outer space, Lola realized that they meant no harm since there was no sign of aggression on their faces which resembled humanoid versions of turtles due to having beak-like mouths instead typical mammalian ones such as nostrils located near eyes etcetera.

Lola nudged her kids forward so they could approach these strange beings more closely without feeling afraid anymore because curiosity had finally overcome fear within each member present including herself too now that she understood there was nothing dangerous about those who came here seeking only peaceful interactions between two different species inhabiting separate worlds linked together by chance encounter today under sunny skies above green pastures filled with flowers blooming brightly everywhere amidst gentle breeze blowing softly across everyone gathered around this unprecedented meeting point bridging gap between known universe inhabited primarily by humans along with various other forms life yet undiscovered until now thanks largely due contributions made possible through advancements achieved within field science technology allowing exploration beyond boundaries previously thought unattainable except perhaps dreams fueled imagination storytellers weaving tales fantastic adventures taking place somewhere far away across galaxies unknown never seen before eyes behold wonders beholden only those daring enough embark upon journey seeking answers questions still left unanswered since beginning time itself forever asking why how when where but also what if maybe someday soon even ourselves might become part something greater than anything .... 

But then I generated a few more CUDA ones and they were fine, so I'm not sure.

It's also possible that the custom NTK-YaRN scaling isn't fully supported by llama.cpp yet.

It does work with llama_cpp_python==0.2.17 on my system, CPU only.
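
For reference, a rough sketch of that CPU test with llama-cpp-python (the path and sampling settings are placeholders; the prompt format is copied from the llama.cpp command above):

```python
# CPU-only generation test with llama-cpp-python, using the Alfred prompt format
# shown earlier in the thread.
from llama_cpp import Llama

llm = Llama(
    model_path="alfred-40b-1023.Q4_K_M.gguf",  # adjust to your local path
    n_ctx=2048,
    n_gpu_layers=0,  # CPU only
)

prompt = (
    "<start_system>You are Alfred, a helpful assistant trained by LightOn. "
    "Knowledge cutoff: November 2022. Current date: 16 November, 2023<end_message>"
    "<start_user>Write a story about llamas<end_message>"
    "<start_assistant>"
)

out = llm(prompt, max_tokens=512, temperature=0.7, stop=["<end_message>"])
print(out["choices"][0]["text"])
```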

[Screenshots attached: model.png, 2nd.png, template.png]

Great, thanks for the update. That makes sense; I think recent llama-cpp-python versions fixed the BPE vocab issue.

So anyone hitting this in text-generation-webui just needs to update llama-cpp-python to the latest version, which should support this model.
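
If you're not sure which version the webui environment actually has, a quick check from that environment's Python (the CUDA distribution name below is a guess, based on the llama_cpp_cuda import in the traceback above):

```python
# Print the installed llama-cpp-python version(s) in this environment.
from importlib.metadata import PackageNotFoundError, version

# "llama_cpp_python_cuda" is an assumed name for the CUDA build the webui ships;
# adjust if your install names it differently.
for dist in ("llama_cpp_python", "llama_cpp_python_cuda"):
    try:
        print(dist, version(dist))
    except PackageNotFoundError:
        print(dist, "not installed")
```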
