OpenResearcher: Unleashing AI for Accelerated Scientific Research Paper • 2408.06941 • Published Aug 13 • 30
UniPortrait: A Unified Framework for Identity-Preserving Single- and Multi-Human Image Personalization Paper • 2408.05939 • Published Aug 12 • 13
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer Paper • 2408.06072 • Published Aug 12 • 37
ControlNeXt: Powerful and Efficient Control for Image and Video Generation Paper • 2408.06070 • Published Aug 12 • 53
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery Paper • 2408.06292 • Published Aug 12 • 117
ShieldGemma: Generative AI Content Moderation Based on Gemma Paper • 2407.21772 • Published Jul 31 • 14
VideoGUI: A Benchmark for GUI Automation from Instructional Videos Paper • 2406.10227 • Published Jun 14 • 9
Rethinking Human Evaluation Protocol for Text-to-Video Models: Enhancing Reliability,Reproducibility, and Practicality Paper • 2406.08845 • Published Jun 13 • 8
Designing a Dashboard for Transparency and Control of Conversational AI Paper • 2406.07882 • Published Jun 12 • 10
XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning Paper • 2406.08973 • Published Jun 13 • 86
Make It Count: Text-to-Image Generation with an Accurate Number of Objects Paper • 2406.10210 • Published Jun 14 • 76
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs Paper • 2406.07476 • Published Jun 11 • 32
Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion Paper • 2406.04338 • Published Jun 6 • 34
iVideoGPT: Interactive VideoGPTs are Scalable World Models Paper • 2405.15223 • Published May 24 • 12