
CultureCLIP Model (LoRA Merged)

This is a CLIP model fine-tuned with LoRA for cultural understanding and image-text matching. The LoRA weights have been merged into the base model.

Model Details

  • Base Model: openai/clip-vit-base-patch32
  • Task: Contrastive Image-Text Learning
  • Framework: PyTorch
  • Fine-tuning Approach: LoRA (Low-Rank Adaptation)

LoRA Configuration

  • Rank (r): 4
  • Alpha: 16
  • Dropout: 0.1
  • Target Modules: q_proj, v_proj
  • Task Type: FEATURE_EXTRACTION
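
For reference, the configuration above corresponds roughly to the following peft setup. This is a minimal sketch of how the adapter might have been defined during fine-tuning; the actual training script is not part of this repository.

from peft import LoraConfig, get_peft_model
from transformers import CLIPModel

# Base model to adapt
base_model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")

# LoRA configuration matching the values listed above
lora_config = LoraConfig(
    r=4,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q_proj", "v_proj"],
    task_type="FEATURE_EXTRACTION",
)

# Wrap the base model with the LoRA adapter and report trainable parameter count
peft_model = get_peft_model(base_model, lora_config)
peft_model.print_trainable_parameters()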

Usage

from transformers import CLIPModel, CLIPProcessor
from PIL import Image

# Load model and processor
model = CLIPModel.from_pretrained("lukahh/cultureclip_lora_0315_100k_32_07_03")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")  # use the base model's processor

# Load an image to score against the text prompts (path is a placeholder)
image = Image.open("example.jpg")

# Process text and images
inputs = processor(
    text=["a photo of a cat", "a photo of a dog"],
    images=image,
    return_tensors="pt",
    padding=True
)

# Get outputs
outputs = model(**inputs)

# Image-text similarity scores and matching probabilities
logits_per_image = outputs.logits_per_image
probs = logits_per_image.softmax(dim=1)
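
Since the adapter was trained with the FEATURE_EXTRACTION task type, the merged model can also be used to produce standalone embeddings via the standard CLIPModel helpers. A brief sketch, reusing the inputs prepared above:

import torch

# Separate text and image embeddings, e.g. for retrieval or similarity search
with torch.no_grad():
    text_features = model.get_text_features(
        input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"]
    )
    image_features = model.get_image_features(pixel_values=inputs["pixel_values"])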

Training Details

This model was fine-tuned with LoRA, and the adapter weights were then merged back into the base model, so the published checkpoint loads as a standard CLIPModel without requiring peft. LoRA enables parameter-efficient adaptation of CLIP while preserving its core image-text matching capabilities.
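
A minimal sketch of how such a merge is typically done with peft; the adapter path and output directory below are placeholders, not files in this repository.

from peft import PeftModel
from transformers import CLIPModel

# Load the base CLIP model and attach the trained LoRA adapter
base = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
peft_model = PeftModel.from_pretrained(base, "path/to/lora_adapter")  # hypothetical adapter path

# Fold the LoRA weights into the base weights, yielding a plain CLIPModel
merged = peft_model.merge_and_unload()
merged.save_pretrained("cultureclip_merged")  # hypothetical output directory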
