Mixture-of-Agents Enhances Large Language Model Capabilities Paper • 2406.04692 • Published Jun 7 • 54
ExtraNeRF: Visibility-Aware View Extrapolation of Neural Radiance Fields with Diffusion Models Paper • 2406.06133 • Published Jun 10 • 7
Lighting Every Darkness with 3DGS: Fast Training and Real-Time Rendering for HDR View Synthesis Paper • 2406.06216 • Published Jun 10 • 18
Floating No More: Object-Ground Reconstruction from a Single Image Paper • 2407.18914 • Published Jul 26 • 18
Wolf: Captioning Everything with a World Summarization Framework Paper • 2407.18908 • Published Jul 26 • 30
Mixture of Nested Experts: Adaptive Processing of Visual Tokens Paper • 2407.19985 • Published Jul 29 • 34
FreeLong: Training-Free Long Video Generation with SpectralBlend Temporal Attention Paper • 2407.19918 • Published Jul 29 • 47
JaColBERTv2.5: Optimising Multi-Vector Retrievers to Create State-of-the-Art Japanese Retrievers with Constrained Resources Paper • 2407.20750 • Published Jul 30 • 21
A Large Encoder-Decoder Family of Foundation Models For Chemical Language Paper • 2407.20267 • Published Jul 24 • 31
Futga: Towards Fine-grained Music Understanding through Temporally-enhanced Generative Augmentation Paper • 2407.20445 • Published Jul 29 • 20
Visual Riddles: a Commonsense and World Knowledge Challenge for Large Vision and Language Models Paper • 2407.19474 • Published Jul 28 • 22
Integrating Large Language Models into a Tri-Modal Architecture for Automated Depression Classification Paper • 2407.19340 • Published Jul 27 • 56
SHIC: Shape-Image Correspondences with no Keypoint Supervision Paper • 2407.18907 • Published Jul 26 • 39
UltraEdit: Instruction-based Fine-Grained Image Editing at Scale Paper • 2407.05282 • Published Jul 7 • 12
MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation? Paper • 2407.04842 • Published Jul 5 • 52
SpreadsheetLLM: Encoding Spreadsheets for Large Language Models Paper • 2407.09025 • Published Jul 12 • 127