330 41

Art Atk

ArtAtk

AI & ML interests

Multimodal Models

Organizations

None yet

ArtAtk's activity

upvoted a paper about 18 hours ago

FAST: Efficient Action Tokenization for Vision-Language-Action Models

Paper • 2501.09747 • Published 1 day ago • 11

upvoted 2 papers about 19 hours ago

CaPa: Carve-n-Paint Synthesis for Efficient 4K Textured Mesh Generation

Paper • 2501.09433 • Published 2 days ago • 9

Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps

Paper • 2501.09732 • Published 1 day ago • 32

upvoted 2 papers 3 days ago

PokerBench: Training Large Language Models to become Professional Poker Players

Paper • 2501.08328 • Published 3 days ago • 13

Diffusion Adversarial Post-Training for One-Step Video Generation

Paper • 2501.08316 • Published 3 days ago • 29

upvoted 3 papers 4 days ago

Multi-subject Open-set Personalization in Video Generation

Paper • 2501.06187 • Published 7 days ago • 10

OmniManip: Towards General Robotic Manipulation via Object-Centric Interaction Primitives as Spatial Constraints

Paper • 2501.03841 • Published 11 days ago • 49

VideoRAG: Retrieval-Augmented Generation over Video Corpus

Paper • 2501.05874 • Published 8 days ago • 61

upvoted a paper 8 days ago

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published 10 days ago • 230

upvoted 8 papers 9 days ago

Agent Laboratory: Using LLM Agents as Research Assistants

Paper • 2501.04227 • Published 10 days ago • 77

GeAR: Generation Augmented Retrieval

Paper • 2501.02772 • Published 12 days ago • 22

Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

Paper • 2501.04001 • Published 10 days ago • 40

Cosmos World Foundation Model Platform for Physical AI

Paper • 2501.03575 • Published 11 days ago • 63

VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation

Paper • 2412.21059 • Published 19 days ago • 18

Virgo: A Preliminary Exploration on Reproducing o1-like MLLM

Paper • 2501.01904 • Published 14 days ago • 31

VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Paper • 2501.01957 • Published 14 days ago • 40

Through-The-Mask: Mask-based Motion Trajectories for Image-to-Video Generation

Paper • 2501.03059 • Published 12 days ago • 19

upvoted a paper 14 days ago

Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models

Paper • 2501.01423 • Published 15 days ago • 36

upvoted 2 papers 15 days ago

VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM

Paper • 2501.00599 • Published 17 days ago • 41

OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis

Paper • 2412.19723 • Published 22 days ago • 79