Yash Thube

thubZ9

https://thubzai.github.io/

AI & ML interests

Multimodal learning • CV • RL

thubZ9's activity

upvoted a paper 6 days ago

Efficient Process Reward Model Training via Active Learning

Paper • 2504.10559 • Published 8 days ago • 13

upvoted a paper 7 days ago

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published 8 days ago • 239

updated a collection 15 days ago

My reading list!

18 items • Updated 15 days ago • 1

upvoted a paper 15 days ago

Qwen2.5-Omni Technical Report

Paper • 2503.20215 • Published 28 days ago • 140

updated a collection 16 days ago

My reading list!

18 items • Updated 15 days ago • 1

upvoted a paper 18 days ago

Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems

Paper • 2504.01990 • Published 22 days ago • 255

updated a collection 22 days ago

My reading list!

18 items • Updated 15 days ago • 1

liked a model 29 days ago

deepseek-ai/DeepSeek-V3-0324

Text Generation • Updated 27 days ago • 249k • 2.71k

upvoted a paper about 1 month ago

One-Step Residual Shifting Diffusion for Image Super-Resolution via Distillation

Paper • 2503.13358 • Published Mar 17 • 96

updated a collection about 1 month ago

My reading list!

18 items • Updated 15 days ago • 1

upvoted a collection about 1 month ago

Cohere Labs Aya Vision

Aya Vision is a state-of-the-art family of vision models that brings multimodal capabilities to 23 languages. • 5 items • Updated 7 days ago • 68

liked a model about 1 month ago

CohereLabs/aya-vision-32b

Image-Text-to-Text • Updated 7 days ago • 444 • 190

upvoted a collection about 1 month ago

Gemma 3 Release

24 items • Updated 4 days ago • 341

upvoted an article about 1 month ago

A Deepdive into Aya Vision: Advancing the Frontier of Multilingual Multimodality

Article • Mar 4 • 73

liked a model about 1 month ago

Qwen/QwQ-32B

Text Generation • Updated Mar 11 • 673k • 2.7k

upvoted a paper about 1 month ago

Unified Reward Model for Multimodal Understanding and Generation

Paper • 2503.05236 • Published Mar 7 • 119

upvoted a paper about 2 months ago

Token-Efficient Long Video Understanding for Multimodal LLMs

Paper • 2503.04130 • Published Mar 6 • 94

updated a collection about 2 months ago

My reading list!

18 items • Updated 15 days ago • 1

upvoted a paper about 2 months ago

Visual-RFT: Visual Reinforcement Fine-Tuning

Paper • 2503.01785 • Published Mar 3 • 77

liked a model about 2 months ago

microsoft/Phi-4-multimodal-instruct

Automatic Speech Recognition • Updated 14 days ago • 619k • 1.31k