Models: 9,074

Active filters: image-text-to-text

ByteDance-Seed/UI-TARS-1.5-7B

Image-Text-to-Text • Updated 20 days ago • 19.9k • 238

google/gemma-3-27b-it

Image-Text-to-Text • Updated Mar 21 • 417k • 1.32k

ds4sd/SmolDocling-256M-preview

Image-Text-to-Text • Updated Mar 23 • 85.3k • 1.32k

meta-llama/Llama-4-Scout-17B-16E-Instruct

Image-Text-to-Text • Updated 28 days ago • 870k • 871

lusxvr/nanoVLM-222M

Image-Text-to-Text • Updated about 5 hours ago • 20 • 16

google/gemma-3-27b-it-qat-q4_0-gguf

Image-Text-to-Text • Updated 27 days ago • 67.7k • 268

Qwen/Qwen2.5-VL-7B-Instruct

Image-Text-to-Text • Updated Apr 6 • 3.11M • 867

mistralai/Mistral-Small-3.1-24B-Instruct-2503

Image-Text-to-Text • Updated 29 days ago • 102k • 1.19k

OpenGVLab/InternVL3-78B

Image-Text-to-Text • Updated 21 days ago • 208k • 155

ginipick/Gemma-3-R1984-4B

Image-Text-to-Text • Updated 15 days ago • 105 • 16

google/gemma-3-4b-it

Image-Text-to-Text • Updated Mar 21 • 615k • 496

VIDraft/Gemma-3-R1984-27B

Image-Text-to-Text • Updated 17 days ago • 159 • 56

meta-llama/Llama-Guard-4-12B

Image-Text-to-Text • Updated 8 days ago • 3.97k • 24

VIDraft/Gemma-3-R1984-12B

Image-Text-to-Text • Updated 28 days ago • 1.23k • 37

VIDraft/Gemma-3-R1984-4B

Image-Text-to-Text • Updated 28 days ago • 1.83k • 22

allenai/olmOCR-7B-0225-preview

Image-Text-to-Text • Updated Feb 25 • 420k • 634

google/gemma-3-4b-it-qat-q4_0-gguf

Image-Text-to-Text • Updated 27 days ago • 18.8k • 129

burtenshaw/GemmaCoder3-12B

Image-Text-to-Text • Updated Apr 1 • 221 • 50

Tesslate/Synthia-S1-27b

Image-Text-to-Text • Updated 29 days ago • 595 • 67

nvidia/DAM-3B

Image-Text-to-Text • Updated 13 days ago • 3.41k • 108

google/gemma-3-12b-it-qat-q4_0-gguf

Image-Text-to-Text • Updated 27 days ago • 73.5k • 122

meta-llama/Llama-4-Maverick-17B-128E-Instruct

Image-Text-to-Text • Updated 28 days ago • 90.4k • 319

microsoft/Florence-2-large

Image-Text-to-Text • Updated Dec 8, 2024 • 439k • 1.54k

google/gemma-3-12b-it

Image-Text-to-Text • Updated Mar 21 • 366k • 345

openfree/Gemma-3-R1984-12B-Q8_0-GGUF

Image-Text-to-Text • Updated Mar 30 • 623 • 20

openfree/Gemma-3-R1984-27B-Q8_0-GGUF

Image-Text-to-Text • Updated Mar 30 • 541 • 19

openfree/Gemma-3-R1984-27B-Q6_K-GGUF

Image-Text-to-Text • Updated Mar 30 • 256 • 15

openfree/Gemma-3-R1984-27B-Q4_K_M-GGUF

Image-Text-to-Text • Updated Mar 30 • 658 • 18

openfree/Gemma-3-R1984-12B-Q6_K-GGUF

Image-Text-to-Text • Updated Mar 30 • 493 • 18

HuggingFaceTB/SmolVLM-Instruct

Image-Text-to-Text • Updated 29 days ago • 74.7k • 436