Custom 4-bit Finetuning: 5-7 times faster inference than QLoRA

#7
by rmihaylov - opened
