InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published 9 days ago • 239
Wan2.1 14B 480p I2V LoRAs Collection A collection of Remade's Wan2.1 14B 480p I2V LoRAs • 39 items • Updated 22 days ago • 104
EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer Paper • 2503.07027 • Published Mar 10 • 29
Being-0: A Humanoid Robotic Agent with Vision-Language Models and Modular Skills Paper • 2503.12533 • Published Mar 16 • 64
SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion Paper • 2503.11576 • Published Mar 14 • 97
Gemini Robotics: Bringing AI into the Physical World Paper • 2503.20020 • Published 29 days ago • 24
Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation Paper • 2503.24379 • Published 23 days ago • 75
AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction Paper • 2504.01014 • Published 22 days ago • 64
Running on L4 • 275 • Thera Arbitrary-Scale Super-Resolution • Enhance image quality with real-time super-resolution
Running on Zero • 891 • InfiniteYou-FLUX • Flexible Photo Recrafting While Preserving Your Identity
Running on Zero • 1.14k • PhotoMaker V2 • Create customized face portraits using images and prompts
CoSTA*: Cost-Sensitive Toolpath Agent for Multi-turn Image Editing Paper • 2503.10613 • Published Mar 13 • 79
Running on CPU Upgrade • 13k • Open LLM Leaderboard • Track, rank and evaluate open LLMs and chatbots
Kimi k1.5: Scaling Reinforcement Learning with LLMs Paper • 2501.12599 • Published Jan 22 • 113