WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning Paper • 2411.02337 • Published 25 days ago • 36
xGen-MM (BLIP-3): A Family of Open Large Multimodal Models Paper • 2408.08872 • Published Aug 16 • 97
SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning Paper • 2401.16013 • Published Jan 29 • 23
Proactive Detection of Voice Cloning with Localized Watermarking Paper • 2401.17264 • Published Jan 30 • 17
From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities Paper • 2401.15071 • Published Jan 26 • 35
AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents Paper • 2401.12963 • Published Jan 23 • 12
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts Paper • 2401.04081 • Published Jan 8 • 71
Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis Paper • 2312.03491 • Published Dec 6, 2023 • 34
Merlin:Empowering Multimodal LLMs with Foresight Minds Paper • 2312.00589 • Published Nov 30, 2023 • 24
Seamless Communication Collection A significant step towards removing language barriers through expressive, fast and high-quality AI translation. • 16 items • Updated Jan 16 • 151
SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering Paper • 2311.12775 • Published Nov 21, 2023 • 28
Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs Paper • 2311.05657 • Published Nov 9, 2023 • 27
LCM-LoRA: A Universal Stable-Diffusion Acceleration Module Paper • 2311.05556 • Published Nov 9, 2023 • 81