merve's Collections
Video Language Models
Updated Aug 1
A collection of video-language models
🐨 Video Llava • Running on Zero • 18
llava-hf/LLaVA-NeXT-Video-7B-hf • Video-Text-to-Text • Updated 11 days ago • 124k • 53
llava-hf/LLaVA-NeXT-Video-7B-DPO-hf • Video-Text-to-Text • Updated 11 days ago • 2.36k • 8
llava-hf/LLaVA-NeXT-Video-7B-32K-hf • Image-Text-to-Text • Updated 11 days ago • 222 • 7
🌋 Llava Interleave • Running on Zero • 29