SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories Paper • 2503.08625 • Published 2 days ago • 23
DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks Paper • 2502.17157 • Published 17 days ago • 51
MangaNinja: Line Art Colorization with Precise Reference Following Paper • 2501.08332 • Published Jan 14 • 57
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos Paper • 2501.04001 • Published Jan 7 • 43
LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis Paper • 2412.15214 • Published Dec 19, 2024 • 15
MagicQuill: An Intelligent Interactive Image Editing System Paper • 2411.09703 • Published Nov 14, 2024 • 68
MagicQuill 🪶 • Running on L4 • 1.56k • Edit and enhance images with custom color and edge modifications