6bpw quant request
#2 by Lissanro - opened
A 6bpw quant would be nice to have, if possible. The 7bpw quant works, but it is a tight fit even on 4 GPUs when loaded together with a draft model for speculative decoding. Thank you for your time and effort!
bigstorm changed discussion status to closed