Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders Paper • 2503.03601 • Published Mar 5, 2025 • 226
Token-Efficient Long Video Understanding for Multimodal LLMs Paper • 2503.04130 • Published Mar 6, 2025 • 92
π_0: A Vision-Language-Action Flow Model for General Robot Control Paper • 2410.24164 • Published Oct 31, 2024 • 7
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages Paper • 2309.09400 • Published Sep 17, 2023 • 85
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8, 2025 • 275
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token Paper • 2501.03895 • Published Jan 7, 2025 • 53
VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction Paper • 2501.01957 • Published Jan 3, 2025 • 46
ProgCo: Program Helps Self-Correction of Large Language Models Paper • 2501.01264 • Published Jan 2, 2025 • 27
OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding? Paper • 2501.05510 • Published Jan 9, 2025 • 44
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM Paper • 2501.00599 • Published Dec 31, 2024 • 48
OmniManip: Towards General Robotic Manipulation via Object-Centric Interaction Primitives as Spatial Constraints Paper • 2501.03841 • Published Jan 7, 2025 • 56
VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control Paper • 2501.01427 • Published Jan 2, 2025 • 55
VideoRAG: Retrieval-Augmented Generation over Video Corpus Paper • 2501.05874 • Published Jan 10, 2025 • 72
EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation Paper • 2501.01895 • Published Jan 3, 2025 • 56
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining Paper • 2501.00958 • Published Jan 1, 2025 • 107