Nice work! Which dataset did you use for measurement.json?

#2 · opened by Panchovix

Thanks for this new model. I wanted to ask for (or, if you'd prefer, I can make) a 4.5bpw quant, since it fits nicely in 72GB of VRAM without using the FP8 cache.

Cheers!

Panchovix changed discussion status to closed

I used the Pygmalion chat dataset (PIPPA): https://huggingface.co/datasets/jasonkstevens/pippa-llama2-chat/tree/refs%2Fconvert%2Fparquet/default/train

And I'd love it if you provided a 4.5bpw quant!

Awesome! I appreciate it! Will update the model card to include a link to your quant.

Thanks! I also added a 4.25bpw one here: https://huggingface.co/Panchovix/Venus-120b-v1.0-4.25bpw-h6-exl2, mostly to have more headroom and to be able to use CFG.
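For anyone wanting to reproduce a quant like this, the exl2 workflow is roughly: run the measurement pass once, then reuse measurement.json for each target bitrate. A minimal sketch assuming exllamav2's convert.py; the directory and file paths are placeholders, and the flag names should be checked against `python convert.py -h` for your version:

```shell
# Measurement pass (slow, done once per model).
# -c points at the calibration parquet (e.g. the PIPPA set linked above);
# -om writes out measurement.json for reuse.
python convert.py -i ./Venus-120b-v1.0 -o ./work \
    -c ./pippa-train.parquet -om ./measurement.json

# Quantization pass: reuse the measurement for any target bitrate.
# -b 4.5 = 4.5 bits per weight; -hb 6 = 6-bit head layer (the "h6" suffix);
# -cf writes the compiled model to the given directory.
python convert.py -i ./Venus-120b-v1.0 -o ./work \
    -c ./pippa-train.parquet -m ./measurement.json \
    -b 4.5 -hb 6 -cf ./Venus-120b-v1.0-4.5bpw-h6-exl2
```

Reusing measurement.json is what makes producing several bitrates (4.25bpw, 4.5bpw, …) cheap after the first pass.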
