Nice work! Which dataset did you use for measurement.json?

#2 · opened by Panchovix

Thanks for this new model. I wanted to ask for (or, if you'd prefer, I can make) a 4.5bpw quant, since it fits nicely in 72GB of VRAM without using the FP8 cache.

Cheers!

Panchovix changed discussion status to closed

I used the Pygmalion chat dataset (PIPPA): https://huggingface.co/datasets/jasonkstevens/pippa-llama2-chat/tree/refs%2Fconvert%2Fparquet/default/train

And I'd love it if you provided a 4.5bpw quant!

Awesome! I appreciate it! Will update the model card to include a link to your quant.

Thanks! I also added a 4.25bpw one here: https://huggingface.co/Panchovix/Venus-120b-v1.0-4.25bpw-h6-exl2, mostly to have more headroom and to be able to use CFG.
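For anyone wanting to reproduce a quant like this, the exl2 workflow is roughly: run the measurement pass once, then reuse measurement.json for each target bitrate. A minimal sketch assuming exllamav2's convert.py; the directory and file paths are placeholders, and the flag names should be checked against `python convert.py -h` for your version:

```shell
# Measurement pass (slow, done once per model).
# -c points at the calibration parquet (e.g. the PIPPA set linked above);
# -om writes out measurement.json for reuse.
python convert.py -i ./Venus-120b-v1.0 -o ./work \
    -c ./pippa-train.parquet -om ./measurement.json

# Quantization pass: reuse the measurement for any target bitrate.
# -b 4.5 = 4.5 bits per weight; -hb 6 = 6-bit head layer (the "h6" suffix);
# -cf writes the compiled model to the given directory.
python convert.py -i ./Venus-120b-v1.0 -o ./work \
    -c ./pippa-train.parquet -m ./measurement.json \
    -b 4.5 -hb 6 -cf ./Venus-120b-v1.0-4.5bpw-h6-exl2
```

Reusing measurement.json is what makes producing several bitrates (4.25bpw, 4.5bpw, …) cheap after the first pass.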
