Mixture-of-Agents Enhances Large Language Model Capabilities Paper • 2406.04692 • Published Jun 7 • 55
ExtraNeRF: Visibility-Aware View Extrapolation of Neural Radiance Fields with Diffusion Models Paper • 2406.06133 • Published Jun 10 • 8
Lighting Every Darkness with 3DGS: Fast Training and Real-Time Rendering for HDR View Synthesis Paper • 2406.06216 • Published Jun 10 • 19
Floating No More: Object-Ground Reconstruction from a Single Image Paper • 2407.18914 • Published Jul 26 • 19
Wolf: Captioning Everything with a World Summarization Framework Paper • 2407.18908 • Published Jul 26 • 31
Mixture of Nested Experts: Adaptive Processing of Visual Tokens Paper • 2407.19985 • Published Jul 29 • 36
FreeLong: Training-Free Long Video Generation with SpectralBlend Temporal Attention Paper • 2407.19918 • Published Jul 29 • 49
JaColBERTv2.5: Optimising Multi-Vector Retrievers to Create State-of-the-Art Japanese Retrievers with Constrained Resources Paper • 2407.20750 • Published Jul 30 • 21
A Large Encoder-Decoder Family of Foundation Models For Chemical Language Paper • 2407.20267 • Published Jul 24 • 31
Futga: Towards Fine-grained Music Understanding through Temporally-enhanced Generative Augmentation Paper • 2407.20445 • Published Jul 29 • 20
Visual Riddles: a Commonsense and World Knowledge Challenge for Large Vision and Language Models Paper • 2407.19474 • Published Jul 28 • 23
Integrating Large Language Models into a Tri-Modal Architecture for Automated Depression Classification Paper • 2407.19340 • Published Jul 27 • 57
SHIC: Shape-Image Correspondences with no Keypoint Supervision Paper • 2407.18907 • Published Jul 26 • 40