ZeroSmooth: Training-free Diffuser Adaptation for High Frame Rate Video Generation • Paper • 2406.00908 • Published Jun 3 • 12
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models • Paper • 2403.03100 • Published Mar 5 • 34
🐒 Stable Diffusion LoRAs • Collection • Awesome LoRAs found on the hub - using only 🐵 • 7 items • Updated Jul 23 • 16
mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding • Paper • 2403.12895 • Published Mar 19 • 31
Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling and Visual-Language Co-Referring • Paper • 2403.09333 • Published Mar 14 • 14
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation • Paper • 2403.04692 • Published Mar 7 • 39
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild • Paper • 2401.13627 • Published Jan 24 • 73
Idea2Img: Iterative Self-Refinement with GPT-4V(ision) for Automatic Image Design and Generation • Paper • 2310.08541 • Published Oct 12, 2023 • 17
RestoreFormer++: Towards Real-World Blind Face Restoration from Undegraded Key-Value Pairs • Paper • 2308.07228 • Published Aug 14, 2023 • 9
trajdata: A Unified Interface to Multiple Human Trajectory Datasets • Paper • 2307.13924 • Published Jul 26, 2023 • 2
Skills-in-Context Prompting: Unlocking Compositionality in Large Language Models • Paper • 2308.00304 • Published Aug 1, 2023 • 22
SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning • Paper • 2308.00436 • Published Aug 1, 2023 • 22
Tool Documentation Enables Zero-Shot Tool-Usage with Large Language Models • Paper • 2308.00675 • Published Aug 1, 2023 • 35
Predicting masked tokens in stochastic locations improves masked image modeling • Paper • 2308.00566 • Published Jul 31, 2023 • 15
When Large Language Models Meet Personalization: Perspectives of Challenges and Opportunities • Paper • 2307.16376 • Published Jul 31, 2023 • 2
Three Bricks to Consolidate Watermarks for Large Language Models • Paper • 2308.00113 • Published Jul 26, 2023 • 13
Unified Model for Image, Video, Audio and Language Tasks • Paper • 2307.16184 • Published Jul 30, 2023 • 14