further quants

#1
by IkariDev - opened

Could you please upload more quants, or the source model so that we can make them ourselves?

Thanks for the interest. The source model is available, it’s in the AlpacaCielo folder. I’ll upload some more official quants tomorrow.

Thank you very much for your hard work!

Could you please make some suggestions for newbies on which settings to use when running this in OobaBooga? For example, consumer hardware with 16 GB VRAM and 64 GB RAM. Others might like to know for 8 GB VRAM / 32 GB RAM.

Settings like how to load the model, max new tokens, and so on.

At the moment, in ooba I use the GGML model and offload all 41 layers to the GPU. Even with 8 GB of VRAM it might still work, though you may need to offload fewer layers. That said, GPTQ quants should be available soon, which will be a better option for anyone with a sufficient GPU. As for max tokens, anything should be fine; in my testing it doesn't have problems with repetition or infinite responses.
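If you're unsure how many layers fit on your card, a rough back-of-the-envelope estimate is to divide the quantized model file size by the layer count and see how many layers fit in your free VRAM. This is only a sketch: the per-layer cost, the ~1.5 GB overhead figure, and the 9 GB example file size are assumptions, not measured values, so treat the result as a starting point and adjust by trial and error.

```python
def estimate_gpu_layers(model_file_gb, n_layers, vram_gb, overhead_gb=1.5):
    """Rough guess at how many layers to offload with --n-gpu-layers.

    Assumes each layer costs about model_file_gb / n_layers of VRAM and
    reserves overhead_gb for context/scratch buffers (both assumptions).
    """
    per_layer_gb = model_file_gb / n_layers
    usable_gb = max(vram_gb - overhead_gb, 0)
    return min(n_layers, int(usable_gb / per_layer_gb))

# Example: a ~9 GB quantized file with 41 layers (sizes are illustrative)
print(estimate_gpu_layers(9.0, 41, vram_gb=16))  # 16 GB card: all 41 layers fit
print(estimate_gpu_layers(9.0, 41, vram_gb=8))   # 8 GB card: partial offload
```

With 16 GB you can offload everything; with 8 GB you'd start with the partial estimate and lower it if you hit out-of-memory errors.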

TheBloke has created quants for this model, which you can find on his page.

totally-not-an-llm changed discussion status to closed
