FancyVideo: Towards Dynamic and Consistent Video Generation via Cross-frame Textual Guidance Paper • 2408.08189 • Published Aug 15 • 15
UniPortrait: A Unified Framework for Identity-Preserving Single- and Multi-Human Image Personalization Paper • 2408.05939 • Published Aug 12 • 13
ControlNeXt: Powerful and Efficient Control for Image and Video Generation Paper • 2408.06070 • Published Aug 12 • 52
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22 • 254
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Sep 25 • 683
Make-Your-Video: Customized Video Generation Using Textual and Structural Guidance Paper • 2306.00943 • Published Jun 1, 2023 • 5
MGM-Data Collection Official data collection for the paper "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models" • 2 items • Updated Apr 21 • 7
MGM Collection Official model collection for the paper "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models" • 13 items • Updated May 3 • 46
Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models Paper • 2403.18814 • Published Mar 27 • 44
Elucidating the Design Space of Diffusion-Based Generative Models Paper • 2206.00364 • Published Jun 1, 2022 • 13
Generative Multimodal Models are In-Context Learners Paper • 2312.13286 • Published Dec 20, 2023 • 34
VideoPoet: A Large Language Model for Zero-Shot Video Generation Paper • 2312.14125 • Published Dec 21, 2023 • 44
One-dimensional Adapter to Rule Them All: Concepts, Diffusion Models and Erasing Applications Paper • 2312.16145 • Published Dec 26, 2023 • 8
Spacetime Gaussian Feature Splatting for Real-Time Dynamic View Synthesis Paper • 2312.16812 • Published Dec 28, 2023 • 9
City-on-Web: Real-time Neural Rendering of Large-scale Scenes on the Web Paper • 2312.16457 • Published Dec 27, 2023 • 13