Exllama

#1
by ndurkee - opened

I just wanted to confirm that this works with ExLlama v1. I can't comment on v2 at the moment.

Great, thanks for letting us know!

It works with ExLlama v2 (release 0.0.4):

```
c:\AI\exllamav2>call .\venv\Scripts\activate & python examples/chat.py --mode raw --model_dir c:\AI\exllamav2\models\Mistral-7B-Instruct-v0.1-GPTQ-4bit-32g-actorder_True
 -- Model: c:\AI\exllamav2\models\Mistral-7B-Instruct-v0.1-GPTQ-4bit-32g-actorder_True
 -- Options: ['rope_scale 1.0', 'rope_alpha 1.0']
 -- Loading model...
 -- Loading tokenizer...
```

User: Hi

Chatbort: Hello! How can I help you today?

Are you finding it slower in ExLlama v2 than in ExLlama v1? I do.
