Submitted by akhaliq 46 Inference Performance Optimization for Large Language Models on CPUs · 10 authors 3
Submitted by akhaliq 33 LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models · 8 authors 3
Submitted by akhaliq 8 VEnhancer: Generative Space-Time Enhancement for Video Generation · 9 authors 1
Submitted by akhaliq 7 Still-Moving: Customized Video Generation without Customized Video Data · 10 authors 2
Submitted by akhaliq 7 Do Vision and Language Models Share Concepts? A Vector Space Alignment Study · 4 authors 3
Submitted by davanstrien 4 CosmoCLIP: Generalizing Large Vision-Language Models for Astronomical Imaging · 5 authors 1
Submitted by HikariDawn 3 This&That: Language-Gesture Controlled Video Generation for Robot Planning · 7 authors 1
Submitted by PAlbert31 2 An accurate detection is not all you need to combat label noise in web-noisy datasets · 6 authors 4