Autoregressive Models in Vision: A Survey Paper • 2411.05902 • Published 15 days ago • 14 • 2
MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models Paper • 2410.10139 • Published Oct 14 • 50
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions Paper • 2409.18042 • Published Sep 26 • 36
MagicTime Collection MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators • 4 items • Updated Aug 10 • 31
ChronoMagic-Bench Collection ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation • 6 items • Updated Jul 31 • 23
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators Paper • 2404.05014 • Published Apr 7 • 53
Continuous-Multiple Image Outpainting in One-Step via Positional Query and A Diffusion-based Approach Paper • 2401.15652 • Published Jan 28
ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation Paper • 2406.18522 • Published Jun 26 • 40