
Is there a quantized version?

#5
by mrmikelevy - opened

I was hoping there would be a quantized version. I know it would be less accurate, but the performance gain might make up for it. I'm doing zero-shot classification.

Hi, nice idea. Do you want this for CPU workloads? The model already fits on a small GPU.

Yes. I've been using a Tesla T4 GPU, but the model is small enough that moving it to CPU might be worth it. Unquantized, it runs about 5x faster on GPU; I think a quantized version would run at roughly the same speed on CPU.
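For CPU inference you don't necessarily need a published quantized checkpoint: PyTorch's dynamic quantization can convert the model's `nn.Linear` layers (which dominate transformer compute) to INT8 at load time. A minimal sketch below uses a small stand-in module so it runs anywhere; for the real model you would load it with `AutoModelForSequenceClassification.from_pretrained(...)` and pass it to `quantize_dynamic` the same way. This is an illustration of the technique, not something the model authors ship.

```python
import torch
import torch.nn as nn

# Stand-in for the transformer: dynamic quantization targets nn.Linear
# layers, which is where most of a DeBERTa forward pass is spent.
# For the real thing, replace `model` with the module returned by
# transformers.AutoModelForSequenceClassification.from_pretrained(...).
model = nn.Sequential(
    nn.Linear(768, 3072),
    nn.GELU(),
    nn.Linear(3072, 768),
)

# Weights are converted to INT8 up front; activations are quantized
# dynamically at runtime. Everything stays on CPU.
qmodel = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 768)
with torch.no_grad():
    out = qmodel(x)
print(out.shape)
```

Accuracy usually drops only slightly with dynamic INT8, so it's worth benchmarking against your T4 numbers before committing either way.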
