Collections
Collections including paper arxiv:2402.08268
- World Model on Million-Length Video And Language With RingAttention
  Paper • 2402.08268 • Published • 33
- Improving Text Embeddings with Large Language Models
  Paper • 2401.00368 • Published • 74
- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 91
- FiT: Flexible Vision Transformer for Diffusion Model
  Paper • 2402.12376 • Published • 47

- Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling
  Paper • 2401.15977 • Published • 34
- Lumiere: A Space-Time Diffusion Model for Video Generation
  Paper • 2401.12945 • Published • 83
- AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
  Paper • 2307.04725 • Published • 63
- Boximator: Generating Rich and Controllable Motions for Video Synthesis
  Paper • 2402.01566 • Published • 26

- From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations
  Paper • 2401.01885 • Published • 26
- Media2Face: Co-speech Facial Animation Generation With Multi-Modality Guidance
  Paper • 2401.15687 • Published • 19
- Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action
  Paper • 2312.17172 • Published • 24
- MouSi: Poly-Visual-Expert Vision-Language Models
  Paper • 2401.17221 • Published • 6

- Attention Is All You Need
  Paper • 1706.03762 • Published • 36
- FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
  Paper • 2307.08691 • Published • 6
- Mixtral of Experts
  Paper • 2401.04088 • Published • 154
- Mistral 7B
  Paper • 2310.06825 • Published • 43