
Some other quantizations #1

by localAGI - opened

Hey, any chance you could add an fp16 variant of the model?

Does it make any difference when running it?

I am running on GPU. AFAIK the fp16 model would be around 28 GB, so it should do nicely with 80-90% offloading to a 24 GB VRAM card.
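
For what it's worth, here is a rough sketch of how an fp16 export could be produced with CTranslate2's Transformers converter; the checkpoint name and output directory below are just placeholders, not the actual repo names:

```python
# Sketch: convert a Transformers checkpoint to a float16 CTranslate2 model.
# "org/source-model" and "ct2-float16" are placeholder names.
import ctranslate2

converter = ctranslate2.converters.TransformersConverter(
    "org/source-model",    # hypothetical Hugging Face checkpoint to convert
    load_as_float16=True,  # load the weights in fp16 while converting
)
converter.convert("ct2-float16", quantization="float16")
```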

Might be able to do it.

Just not sure if partial offloading is supported with CTranslate2, and I am also not sure why you would want to load in fp16. fp16 would also be around 32 GB.
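
As far as I understand, CTranslate2 also lets you pick the compute type when loading, independently of how the model was saved, so an int8 export can already run with fp16 compute on GPU. Roughly like this (model directory and tokenizer name are placeholders):

```python
# Sketch: load a converted model on GPU and choose the compute type at load time.
# "ct2-int8" and "org/source-model" are placeholder names.
import ctranslate2
import transformers

generator = ctranslate2.Generator(
    "ct2-int8",                   # hypothetical converted model directory
    device="cuda",
    compute_type="int8_float16",  # int8 weights with float16 compute
)
tokenizer = transformers.AutoTokenizer.from_pretrained("org/source-model")

prompt_tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode("Hello"))
results = generator.generate_batch([prompt_tokens], max_length=64)
print(tokenizer.decode(results[0].sequences_ids[0]))
```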
