LLaVa-Interleave Collection LLaVa models that extend model capabilities to multi-image, multi-frame (video), and multi-patch (single-image) scenarios. • 3 items • Updated 12 days ago • 11
Universal Manipulation Interface: In-The-Wild Robot Teaching Without In-The-Wild Robots Paper • 2402.10329 • Published Feb 15 • 13
MobileDiffusion: Subsecond Text-to-Image Generation on Mobile Devices Paper • 2311.16567 • Published Nov 28, 2023 • 22
Proactive Detection of Voice Cloning with Localized Watermarking Paper • 2401.17264 • Published Jan 30 • 15
Trained Models 🏋️ Collection They may be small, but they're training like giants! • 8 items • Updated May 13 • 16
🐍 Mamba fine-tuned models Collection A collection with ClibrAIn's Mamba fine-tuned models • 3 items • Updated Dec 18, 2023 • 11
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model Paper • 2312.11370 • Published Dec 18, 2023 • 19
Seamless Communication Collection A significant step towards removing language barriers through expressive, fast and high-quality AI translation. • 16 items • Updated Jan 16 • 135
Scaling Up and Distilling Down: Language-Guided Robot Skill Acquisition Paper • 2307.14535 • Published Jul 26, 2023 • 13