Collections

Collections including paper arxiv:2405.18669

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 26
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 13
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 43
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 22

OpenBuddy/openbuddy-codellama2-34b-v11.1-bf16

Text Generation • Updated Sep 20, 2023 • 3.07k • 11
mistralai/Codestral-22B-v0.1

Text Generation • Updated Jul 31, 2024 • 10.7k • 1.23k
deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct

Text Generation • Updated Jul 3, 2024 • 300k • 406
Magpie-Align/Magpie-Qwen2.5-Coder-Pro-300K-v0.1

Viewer • Updated Jan 13 • 300k • 334 • 3

iVideoGPT: Interactive VideoGPTs are Scalable World Models

Paper • 2405.15223 • Published May 24, 2024 • 16
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models

Paper • 2405.15574 • Published May 24, 2024 • 55
An Introduction to Vision-Language Modeling

Paper • 2405.17247 • Published May 27, 2024 • 88
Matryoshka Multimodal Models

Paper • 2405.17430 • Published May 27, 2024 • 32

parler-tts/parler_tts_mini_v0.1

Text-to-Speech • Updated Apr 30, 2024 • 9.73k • 349
SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models

Paper • 2405.08317 • Published May 14, 2024 • 13
Zipper: A Multi-Tower Decoder Architecture for Fusing Modalities

Paper • 2405.18669 • Published May 29, 2024 • 12
Seed-TTS: A Family of High-Quality Versatile Speech Generation Models

Paper • 2406.02430 • Published Jun 4, 2024 • 34

Gemini: A Family of Highly Capable Multimodal Models

Paper • 2312.11805 • Published Dec 19, 2023 • 45
VCoder: Versatile Vision Encoders for Multimodal Large Language Models

Paper • 2312.14233 • Published Dec 21, 2023 • 17
Zipper: A Multi-Tower Decoder Architecture for Fusing Modalities

Paper • 2405.18669 • Published May 29, 2024 • 12
