GaussianSR: 3D Gaussian Super-Resolution with 2D Diffusion Priors Paper • 2406.10111 • Published 4 days ago • 5 • 2
HelpSteer2: Open-source dataset for training top-performing reward models Paper • 2406.08673 • Published 5 days ago • 12 • 3
CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery Paper • 2406.08587 • Published 5 days ago • 14 • 4
EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts Paper • 2406.09162 • Published 5 days ago • 12 • 3
Explore the Limits of Omni-modal Pretraining at Scale Paper • 2406.09412 • Published 4 days ago • 10 • 3
mOSCAR: A Large-scale Multilingual and Multimodal Document-level Corpus Paper • 2406.08707 • Published 5 days ago • 13 • 4
MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding Paper • 2406.09411 • Published 4 days ago • 17 • 2
PowerInfer-2: Fast Large Language Model Inference on a Smartphone Paper • 2406.06282 • Published 8 days ago • 33 • 5
Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated Parameters Paper • 2406.05955 • Published 8 days ago • 21 • 2
3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination Paper • 2406.05132 • Published 10 days ago • 27 • 2
Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models Paper • 2406.06563 • Published 15 days ago • 17 • 10
4Real: Towards Photorealistic 4D Scene Generation via Video Diffusion Models Paper • 2406.07472 • Published 6 days ago • 9 • 3
Simple and Effective Masked Diffusion Language Models Paper • 2406.07524 • Published 6 days ago • 7 • 2
Learning Temporally Consistent Video Depth from Video Diffusion Priors Paper • 2406.01493 • Published 14 days ago • 17 • 2
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation Paper • 2406.06525 • Published 7 days ago • 56 • 3
Lighting Every Darkness with 3DGS: Fast Training and Real-Time Rendering for HDR View Synthesis Paper • 2406.06216 • Published 8 days ago • 13 • 5
Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning Paper • 2406.06469 • Published 7 days ago • 20 • 2
DreamGaussian4D: Generative 4D Gaussian Splatting Paper • 2312.17142 • Published Dec 28, 2023 • 16 • 2
ShareGPT4Video: Improving Video Understanding and Generation with Better Captions Paper • 2406.04325 • Published 11 days ago • 65 • 4
PosterLLaVa: Constructing a Unified Multi-modal Layout Generator with LLM Paper • 2406.02884 • Published 13 days ago • 12 • 2
Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration Paper • 2406.01014 • Published 15 days ago • 29 • 2
V-Express: Conditional Dropout for Progressive Training of Portrait Video Generation Paper • 2406.02511 • Published 13 days ago • 7 • 2
CamCo: Camera-Controllable 3D-Consistent Image-to-Video Generation Paper • 2406.02509 • Published 13 days ago • 8 • 4
MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark Paper • 2406.01574 • Published 14 days ago • 40 • 3
4-bit Shampoo for Memory-Efficient Network Training Paper • 2405.18144 • Published 21 days ago • 6 • 2
4Diffusion: Multi-view Video Diffusion Model for 4D Generation Paper • 2405.20674 • Published 18 days ago • 9 • 1
Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis Paper • 2405.21075 • Published 17 days ago • 14 • 2
Xwin-LM: Strong and Scalable Alignment Practice for LLMs Paper • 2405.20335 • Published 18 days ago • 17 • 1
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series Paper • 2405.19327 • Published 19 days ago • 43 • 3