The GAN is dead; long live the GAN! A Modern GAN Baseline Paper • 2501.05441 • Published Jan 9 • 88
PaliGemma 2: A Family of Versatile VLMs for Transfer Paper • 2412.03555 • Published Dec 4, 2024 • 129
Kolors Virtual Try-On • Running on CPU Upgrade • 7.86k • Upload images to try on clothes virtually
MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling Paper • 2409.16160 • Published Sep 24, 2024 • 33
Portrait Video Editing Empowered by Multimodal Generative Priors Paper • 2409.13591 • Published Sep 20, 2024 • 17
Design choices for Vision Language Models in 2024 Article • By gigant • Apr 16, 2024 • 27
IFAdapter: Instance Feature Control for Grounded Text-to-Image Generation Paper • 2409.08240 • Published Sep 12, 2024 • 22
AnyV2V: A Plug-and-Play Framework For Any Video-to-Video Editing Tasks Paper • 2403.14468 • Published Mar 21, 2024 • 25
SpreadsheetLLM: Encoding Spreadsheets for Large Language Models Paper • 2407.09025 • Published Jul 12, 2024 • 135
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding Paper • 2405.08748 • Published May 14, 2024 • 24