- InfiR : Crafting Effective Small Language Models and Multimodal Small Language Models in Reasoning
  Paper • 2502.11573 • Published • 8
- Boosting Multimodal Reasoning with MCTS-Automated Structured Thinking
  Paper • 2502.02339 • Published • 22
- video-SALMONN-o1: Reasoning-enhanced Audio-visual Large Language Model
  Paper • 2502.11775 • Published • 8
- Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search
  Paper • 2412.18319 • Published • 37
Jaehyun Jun
btjhjeon
AI & ML interests
Multimodal
Recent Activity
updated a collection about 6 hours ago
Multimodal Benchmarks
upvoted a paper about 6 hours ago
PEBench: A Fictitious Dataset to Benchmark Machine Unlearning for Multimodal Large Language Models
upvoted a paper about 6 hours ago
Aligning Multimodal LLM with Human Preference: A Survey
Organizations
Collections 9
- Analyzing The Language of Visual Tokens
  Paper • 2411.05001 • Published • 24
- Large Multi-modal Models Can Interpret Features in Large Multi-modal Models
  Paper • 2411.14982 • Published • 16
- Rethinking Token Reduction in MLLMs: Towards a Unified Paradigm for Training-Free Acceleration
  Paper • 2411.17686 • Published • 20
- On the Limitations of Vision-Language Models in Understanding Image Transforms
  Paper • 2503.09837 • Published • 10
models
None public yet
datasets
None public yet