MagicQuill: An Intelligent Interactive Image Editing System • Paper • 2411.09703 • Published Nov 14 • 57
LLaMA-Mesh: Unifying 3D Mesh Generation with Language Models • Paper • 2411.09595 • Published Nov 14 • 71
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level • Paper • 2411.03562 • Published Nov 5 • 63
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion • Paper • 2411.04928 • Published Nov 7 • 48
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models • Paper • 2411.04905 • Published Nov 7 • 111
ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning • Paper • 2411.05003 • Published Nov 7 • 70
Imagine yourself: Tuning-Free Personalized Image Generation • Paper • 2409.13346 • Published Sep 20 • 68
Training Language Models to Self-Correct via Reinforcement Learning • Paper • 2409.12917 • Published Sep 19 • 135
3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion • Paper • 2409.12957 • Published Sep 19 • 18
RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control • Paper • 2405.17401 • Published May 27 • 5
Controllable Text Generation for Large Language Models: A Survey • Paper • 2408.12599 • Published Aug 22 • 63
LayerPano3D: Layered 3D Panorama for Hyper-Immersive Scene Generation • Paper • 2408.13252 • Published Aug 23 • 24
Building and better understanding vision-language models: insights and future directions • Paper • 2408.12637 • Published Aug 22 • 124