Ivan Tang

IvanTang · Ivan_Tang_3D

AI & ML interests

Multimodal, 3D, PEFT, LLM & MLLM

Recent Activity

liked a Space 24 days ago

IPEC-COMMUNITY/openx_lerobot_visualizer

updated a model 27 days ago

IvanTang/ENEL

Organizations

None yet

IvanTang's activity

upvoted a paper 19 days ago

Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step

Paper • 2501.13926 • Published Jan 23 • 37

upvoted 2 papers 28 days ago

SpatialVLA: Exploring Spatial Representations for Visual-Language-Action Model

Paper • 2501.15830 • Published Jan 27 • 14

Exploring the Potential of Encoder-free Architectures in 3D LMMs

Paper • 2502.09620 • Published 29 days ago • 25

upvoted a paper 2 months ago

PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions

Paper • 2409.15278 • Published Sep 23, 2024 • 25

upvoted a paper 7 months ago

SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners

Paper • 2408.16768 • Published Aug 29, 2024 • 28

upvoted a paper 11 months ago

MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?

Paper • 2403.14624 • Published Mar 21, 2024 • 52

upvoted a paper over 1 year ago

SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models

Paper • 2311.07575 • Published Nov 13, 2023 • 15