Ji-Xiang's Collections
Multimodal Language Models
Updated about 9 hours ago
allenai/Molmo-72B-0924 • Image-Text-to-Text • Updated Oct 10 • 7.35k • 261
allenai/Molmo-7B-D-0924 • Image-Text-to-Text • Updated Oct 10 • 71.4k • 440
allenai/Molmo-7B-O-0924 • Image-Text-to-Text • Updated 7 days ago • 32.5k • 142
allenai/MolmoE-1B-0924 • Image-Text-to-Text • Updated Oct 10 • 9.91k • 131
mistralai/Pixtral-12B-2409 • Updated 29 days ago • 505
meta-llama/Llama-3.2-90B-Vision-Instruct • Image-Text-to-Text • Updated Sep 30 • 292k • 271
meta-llama/Llama-3.2-90B-Vision • Image-Text-to-Text • Updated Sep 27 • 6k • 104
meta-llama/Llama-3.2-11B-Vision-Instruct • Image-Text-to-Text • Updated Sep 30 • 2.39M • 970
meta-llama/Llama-3.2-11B-Vision • Image-Text-to-Text • Updated Sep 27 • 97.3k • 358
NexaAIDev/omnivision-968M • Updated about 2 hours ago • 5.78k • 345