ShareLock: Ultra-Lightweight CLIP-like Vision-Language Model

This Hugging Face repository hosts pretrained checkpoints for ShareLock, an ultra-lightweight CLIP-like vision-language model, so you can integrate it into your own projects.

ShareLock is introduced in the paper:
"Do Better Language Models Have Crisper Vision?"
Jona Ruthardt, Gertjan J. Burghouts, Serge Belongie, Yuki M. Asano

🌐 Project Page | ⌨️ GitHub Repository | 📄 Read the Paper on arXiv


🧠 Model Overview

ShareLock combines strong frozen features from unimodal vision and language models to achieve competitive multimodal performance with minimal resources.

Key Highlights:

  • Ultra-Lightweight: ShareLock is trained on only 563k image-caption pairs, requiring just 1 GPU hour.
  • Efficient Performance: Achieves 51% zero-shot accuracy on ImageNet.
  • Plug-and-Play: Easily integrates into downstream vision-language tasks.

📂 Available Checkpoints

Model Variants:

  1. ShareLock trained on CC3M
  2. ShareLock trained on CC12M

🚀 Usage

You can load ShareLock models using the ShareLock class directly for inference or fine-tuning:

Example: Zero-shot Classification

from sharelock.models.model import ShareLock

# Path to the checkpoint
checkpoint_path = "path/to/checkpoint.ckpt"
config = {
    # Add your configuration for model hyperparameters etc. here
}

# Load the ShareLock model
model = ShareLock.load_from_checkpoint(checkpoint_path, config=config)

# Encode text and images for multimodal tasks
image_embeddings = model.encode_image(your_image_tensor)
text_embeddings = model.encode_text(["a cat", "a dog"])

# Perform multimodal operations
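
The example above stops once the embeddings are computed. A common way to finish a CLIP-style zero-shot classification is to take the cosine similarity between the image embedding and each class-prompt embedding, then apply a softmax. The sketch below illustrates that step only and is not part of the ShareLock API: it uses plain Python lists as stand-ins for the outputs of `encode_image` and `encode_text`, the helper names `l2_normalize` and `zero_shot_probs` are hypothetical, and the temperature of 100 is a commonly used CLIP-style default, not a value taken from the ShareLock checkpoints.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def zero_shot_probs(image_embedding, text_embeddings, temperature=100.0):
    """Softmax over temperature-scaled cosine similarities.

    image_embedding: one embedding vector (hypothetical stand-in for a model output).
    text_embeddings: one embedding vector per class prompt.
    temperature: assumed scaling factor; not taken from the ShareLock checkpoints.
    """
    img = l2_normalize(image_embedding)
    logits = [
        temperature * sum(a * b for a, b in zip(img, l2_normalize(txt)))
        for txt in text_embeddings
    ]
    max_logit = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(logit - max_logit) for logit in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy 3-dimensional embeddings; real embeddings would come from the model.
labels = ["a cat", "a dog"]
toy_image = [0.9, 0.1, 0.0]
toy_texts = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

probs = zero_shot_probs(toy_image, toy_texts)
predicted = labels[probs.index(max(probs))]
print(predicted)
```

If the model returns tensors, the same three steps (normalize, dot product, softmax) would normally be done with batched tensor operations instead of Python lists.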

🛠️ Details

For training scripts, evaluation code, or further implementation details, visit our GitHub repository.


📜 Citation

If you use ShareLock in your research, please cite:

@article{ruthardt2024sharelock,
  title={Do Better Language Models Have Crisper Vision?},
  author={Jona Ruthardt and Gertjan J. Burghouts and Serge Belongie and Yuki M. Asano},
  journal={arXiv preprint arXiv:2410.07173},
  year={2024}
}

📧 Contact

For any questions or collaborations, feel free to reach out to Jona Ruthardt.
