Submitted by akhaliq 34 I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models · 9 authors 3
Submitted by akhaliq 20 Relax: Composable Abstractions for End-to-End Dynamic Machine Learning · 19 authors 1
Submitted by akhaliq 14 Tell Your Model Where to Attend: Post-hoc Attention Steering for LLMs · 7 authors 2
Submitted by akhaliq 8 CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding · 7 authors
Submitted by akhaliq 7 Consistent4D: Consistent 360° Dynamic Object Generation from Monocular Video · 5 authors 1
Submitted by akhaliq 7 Co-training and Co-distillation for Quality Improvement and Compression of Language Models · 7 authors 1
Submitted by akhaliq 7 Attention or Convolution: Transformer Encoders in Audio Language Models for Inference Efficiency · 7 authors 1