---
license: mit
train: false
inference: false
pipeline_tag: zero-shot-image-classification
---

# CLIP-ViT-H-14-laion2B-2bit_g16_s128-HQQ

This is a version of the ViT-H-14 model, based on timm's `vit_huge_patch14_clip_224.laion2b`, quantized to 2 bits via Half-Quadratic Quantization (HQQ): https://mobiusml.github.io/hqq_blog/

This 2-bit model achieves 0.716 zero-shot top-1 accuracy on ImageNet, outperforming a full-precision ViT-B-32 (0.664).

To run the model, install the HQQ library from https://github.com/mobiusml/hqq and use it as follows:

```Python
from hqq.models.vit import ViTHQQ

# Download and load the 2-bit quantized ViT-H-14 weights from the Hugging Face Hub
model = ViTHQQ.from_quantized("mobiuslabsgmbh/CLIP-ViT-H-14-laion2B-2bit_g16_s128-HQQ")
```
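
Once loaded, the model can be used to produce image embeddings. Below is a minimal usage sketch, not part of the official API docs: it assumes the returned `model` is callable on a batch of preprocessed images like the underlying timm module, and that preprocessing follows the standard CLIP recipe used by `vit_huge_patch14_clip_224.laion2b` (224x224 input, CLIP normalization stats); `example.jpg` is a placeholder path. Depending on the model's compute dtype, you may also need to cast the input (e.g. `x.half()`).

```Python
import torch
from PIL import Image
from torchvision import transforms

# CLIP-style preprocessing (assumed to match the timm config of
# vit_huge_patch14_clip_224.laion2b).
preprocess = transforms.Compose([
    transforms.Resize(224, interpolation=transforms.InterpolationMode.BICUBIC),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.48145466, 0.4578275, 0.40821073),
                         std=(0.26862954, 0.26130258, 0.27577711)),
])

# "example.jpg" is a placeholder; the model expects a single-GPU runtime
# (see Limitations below).
x = preprocess(Image.open("example.jpg").convert("RGB")).unsqueeze(0).cuda()

with torch.no_grad():
    features = model(x)  # image embedding from the quantized vision tower
```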

Limitations:
- Only supports a single-GPU runtime.
- Doesn't support fine-tuning of the linear layers.