I have a problem running the AI

#1
by Danial753 - opened

Can you write code?

Please be clearer; I'm not sure what you mean.

For the record, before uploading I test my models with text-generation-webui.

I downloaded your model but I can't run it,
because your model doesn't have a .bin file or T5 ...
your model only has a .gguf file.
Can you explain how to run the model, write the executable code, and send a link to download the model or related models?

For the time being, since the model is being updated relatively frequently, I'm only providing one 6-bit GGUF quantization and the LoRA adapter.
Other users have merged the LoRA adapter into the base model, but those versions might not be up to date.

Example: https://huggingface.co/royallab/ShoriRP-merged

Can you tell me exactly which files I should use with which projects, and send me the download addresses and the names of the files I need to download?

@Danial753 You can use the model in KoboldCPP

You're either going to want to use https://github.com/oobabooga/text-generation-webui
OR
https://github.com/LostRuins/koboldcpp

text-generation-webui is a heavyweight project with a large download size that will run any relevant format you want, including GGUF. Follow the instructions on the main page to install. During installation it will ask which GPU you have, if any; select one of the options (A, B, C, etc.). If the install is successful, the console will print a local URL you can copy and paste into your browser to access the UI.
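As a rough sketch of those steps on Linux (Windows uses start_windows.bat instead; exact script names and menu options may vary by release):

```shell
# Clone the project and run the one-click launcher.
git clone https://github.com/oobabooga/text-generation-webui
cd text-generation-webui

# On first run the start script installs dependencies and asks
# which GPU backend to use (A, B, C, ... as described above).
./start_linux.sh

# When it finishes, the console prints a local URL
# (typically http://127.0.0.1:7860) to open in your browser.
```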

Koboldcpp is a lightweight project with a small download size, dedicated to running models purely with the GGUF format in mind. Simply download the koboldcpp.exe file from the latest release and run it; there's no installation required. Most settings will be chosen automatically and correctly. Just pick a model to load, and once it launches, a UI should open automatically for you.

The most important setting to keep in mind is the context size; make sure it is set appropriately in both projects. Most Mistral models support 8k context unless they have been extended, in which case they may go past that limit.
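If you prefer launching KoboldCPP from the command line rather than its GUI, a minimal invocation might look like the sketch below. The flag names come from the KoboldCPP readme; the model filename is a placeholder for whatever GGUF file you downloaded:

```shell
# Run KoboldCPP with an explicit context size.
# --contextsize 8192 matches the 8k limit of most Mistral models.
# --usecublas offloads layers to an NVIDIA GPU; omit it for CPU-only.
./koboldcpp --model ./your-model-q6_k.gguf --contextsize 8192 --usecublas

# The UI is then served at http://localhost:5001 by default.
```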

Which files you will want to download depends on the format you want to use; in this case you need to download the .GGUF file in this repository, just one file. If you need a description of what the different quantizations mean, check the repository below as an example. It should also give you an idea of how much VRAM/RAM it will take to run the model.

https://huggingface.co/TheBloke/WestLake-7B-v2-GGUF
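As a back-of-the-envelope check on those sizes: a quantization's file size is roughly parameter count times bits per weight. The figures below (7.24B parameters for a Mistral-7B-class model, ~6.56 effective bits per weight for Q6_K) are approximate assumptions, and real files add some overhead for metadata and higher-precision layers:

```python
# Rough GGUF file-size estimate: params * bits_per_weight / 8 bytes.
params = 7.24e9           # assumed parameter count of a Mistral-7B-class model
bits_per_weight = 6.5625  # approximate effective bits for a Q6_K quantization

size_gb = params * bits_per_weight / 8 / 1e9
print(f"~{size_gb:.1f} GB on disk")  # plus extra RAM for the context cache

# Rule of thumb: you need at least this much free VRAM/RAM to load the model.
```

The same arithmetic with a smaller bits-per-weight figure (e.g. ~4.5 for Q4_K_M) explains why lower quantizations fit on smaller GPUs.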

Lastly, I recommend using https://github.com/SillyTavern/SillyTavern with either project; it is a UI made specifically for roleplay.

Here is a video showcasing the install process for text-generation-webui and SillyTavern: https://youtu.be/enWO16x6tRM?si=A-8w5h-1axmx6rze

Have fun.

lemonilia changed discussion status to closed
