Can BAAI/bge-reranker-v2-gemma be run quantized?

#3 · by dophys

Hello, I'm interested in the bge-reranker built on Gemma. My question is whether this model can be run in quantized form, which would greatly improve inference efficiency and reduce memory requirements.
I used PyTorch to quantize the model to int8, but FlagEmbedding doesn't seem to support running quantized models. Can anyone give me some guidance?
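
For context, one workaround I'm considering is bypassing FlagEmbedding entirely and loading the model with transformers plus bitsandbytes 8-bit quantization, then computing the reranker score as the logit of the "Yes" token at the last position, as described on the model card. Below is a minimal sketch of that idea; the simplified prompt construction is my own assumption (FlagEmbedding's preprocessing is more elaborate), and it requires a CUDA GPU with bitsandbytes installed:

```python
# Sketch: run BAAI/bge-reranker-v2-gemma in 8-bit via transformers + bitsandbytes
# instead of FlagEmbedding. The instruction prompt follows the model card; the
# simplified input formatting here is an assumption, not FlagEmbedding's exact code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "BAAI/bge-reranker-v2-gemma"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Left-pad so the last position of every sequence is a real token,
# which is where the "Yes" logit is read off.
tokenizer.padding_side = "left"

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
model.eval()

# Instruction prompt from the model card; the score is the logit of "Yes".
prompt = ("Given a query A and a passage B, determine whether the passage "
          "contains an answer to the query by providing a prediction of "
          "either 'Yes' or 'No'.")
yes_id = tokenizer("Yes", add_special_tokens=False)["input_ids"][0]

pairs = [
    ("what is panda?", "hi"),
    ("what is panda?", "The giant panda is a bear species endemic to China."),
]
texts = [f"A: {q}\nB: {p}\n{prompt}" for q, p in pairs]
inputs = tokenizer(texts, padding=True, truncation=True,
                   max_length=1024, return_tensors="pt").to(model.device)

with torch.no_grad():
    logits = model(**inputs).logits          # (batch, seq_len, vocab)
    scores = logits[:, -1, yes_id].float()   # "Yes" logit at the last position

print(scores.tolist())  # higher score = more relevant passage
```

The same approach should work with `load_in_4bit=True` for an even smaller memory footprint, though I haven't verified how much reranking quality degrades at 4-bit.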
