I have a problem running the AI

#1
by Danial753 - opened

Can you write code?

Please be clearer; I'm not sure what you mean.

For the record, before uploading I test my models with text-generation-webui.

I downloaded your model but I can't run it,
because your model doesn't have a .bin file or T5 ...
your model only has a .gguf file.
Can you explain how to run the model, write the executable code, and send a link to download the model or related models?

For the time being, since the model is being updated relatively frequently, I'm only providing one 6-bit GGUF quantization and the LoRA adapter.
Other users have merged the LoRA adapter into the base model, but those versions might not be up to date.

Example: https://huggingface.co/royallab/ShoriRP-merged

Can you tell me exactly which files I should use with which projects, and send me the download addresses and the names of the files I need to download?

@Danial753 You can use the model in KoboldCPP

You're either going to want to use https://github.com/oobabooga/text-generation-webui
OR
https://github.com/LostRuins/koboldcpp

text-generation-webui is a heavyweight project with a large download size that will run any relevant format you want, including GGUF. Follow the instructions on the main page to install. During installation it will ask which GPU you have, if any; select one of the options (A, B, C, etc.). If the install is successful, the console will print a local URL you can copy and paste into your browser to access the UI.
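As a rough sketch of those steps on Linux (Windows uses start_windows.bat instead; exact script names and menu options may vary by release):

```shell
# Clone the project and run the one-click launcher.
git clone https://github.com/oobabooga/text-generation-webui
cd text-generation-webui

# On first run the start script installs dependencies and asks
# which GPU backend to use (A, B, C, ... as described above).
./start_linux.sh

# When it finishes, the console prints a local URL
# (typically http://127.0.0.1:7860) to open in your browser.
```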

Koboldcpp is a lightweight project with a small download size, dedicated to running models purely with the GGUF format in mind. Simply download the koboldcpp.exe file from the latest release and run it; there's no installation required. Most settings will be chosen automatically and correctly. Just pick a model to load, and once it launches, a UI should open automatically for you.

The most important setting to keep in mind is the context size; make sure it is set appropriately in both projects. Most Mistral models support 8k context unless they have been extended, in which case they may go past that limit.
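If you prefer launching KoboldCPP from the command line rather than its GUI, a minimal invocation might look like the sketch below. The flag names come from the KoboldCPP readme; the model filename is a placeholder for whatever GGUF file you downloaded:

```shell
# Run KoboldCPP with an explicit context size.
# --contextsize 8192 matches the 8k limit of most Mistral models.
# --usecublas offloads layers to an NVIDIA GPU; omit it for CPU-only.
./koboldcpp --model ./your-model-q6_k.gguf --contextsize 8192 --usecublas

# The UI is then served at http://localhost:5001 by default.
```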

Which files you will want to download depends on the format you want to use; in this case you need to download the .GGUF file in this repository, just one file. If you need a description of what the different quantizations mean, check the repository below as an example. It should also give you an idea of how much VRAM/RAM it will take to run the model.

https://huggingface.co/TheBloke/WestLake-7B-v2-GGUF
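As a back-of-the-envelope check on those sizes: a quantization's file size is roughly parameter count times bits per weight. The figures below (7.24B parameters for a Mistral-7B-class model, ~6.56 effective bits per weight for Q6_K) are approximate assumptions, and real files add some overhead for metadata and higher-precision layers:

```python
# Rough GGUF file-size estimate: params * bits_per_weight / 8 bytes.
params = 7.24e9           # assumed parameter count of a Mistral-7B-class model
bits_per_weight = 6.5625  # approximate effective bits for a Q6_K quantization

size_gb = params * bits_per_weight / 8 / 1e9
print(f"~{size_gb:.1f} GB on disk")  # plus extra RAM for the context cache

# Rule of thumb: you need at least this much free VRAM/RAM to load the model.
```

The same arithmetic with a smaller bits-per-weight figure (e.g. ~4.5 for Q4_K_M) explains why lower quantizations fit on smaller GPUs.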

Lastly, I recommend using https://github.com/SillyTavern/SillyTavern with either project; it is a UI made specifically for roleplay.

Here is a video showcasing the install process for text-generation-webui and SillyTavern: https://youtu.be/enWO16x6tRM?si=A-8w5h-1axmx6rze

Have fun.

lemonilia changed discussion status to closed
