SceneScript: Reconstructing Scenes With An Autoregressive Structured Language Model Paper • 2403.13064 • Published Mar 19 • 30
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? Paper • 2403.14624 • Published Mar 21 • 50
EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions Paper • 2402.17485 • Published Feb 27 • 182
Playground v2.5: Three Insights towards Enhancing Aesthetic Quality in Text-to-Image Generation Paper • 2402.17245 • Published Feb 27 • 10
Sora Generates Videos with Stunning Geometrical Consistency Paper • 2402.17403 • Published Feb 27 • 15
Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable 2D Repainting Paper • 2312.13271 • Published Dec 20, 2023 • 4
StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation Paper • 2312.12491 • Published Dec 19, 2023 • 66
How FaR Are Large Language Models From Agents with Theory-of-Mind? Paper • 2310.03051 • Published Oct 4, 2023 • 33
Large Language Models Cannot Self-Correct Reasoning Yet Paper • 2310.01798 • Published Oct 3, 2023 • 30