Submitted by akhaliq · 23 · LLaVA-Gemma: Accelerating Multimodal Foundation Models with a Compact Language Model · 5 authors · 1
Submitted by akhaliq · 20 · CameraCtrl: Enabling Camera Control for Text-to-Video Generation · 7 authors · 1
Submitted by akhaliq · 19 · Bigger is not Always Better: Scaling Properties of Latent Diffusion Models · 6 authors · 1
Submitted by akhaliq · 6 · LLM-ABR: Designing Adaptive Bitrate Algorithms via Large Language Models · 7 authors · 1