Jaehyun Jun

btjhjeon

https://btjhjeon.github.io/

AI & ML interests

Multimodal

Recent Activity

updated a collection about 13 hours ago

Multimodal Benchmarks

btjhjeon's activity

upvoted 3 papers about 13 hours ago

PEBench: A Fictitious Dataset to Benchmark Machine Unlearning for Multimodal Large Language Models

Paper • 2503.12545 • Published 3 days ago • 4

Aligning Multimodal LLM with Human Preference: A Survey

Paper • 2503.14504 • Published 1 day ago • 10

MPBench: A Comprehensive Multimodal Reasoning Benchmark for Process Errors Identification

Paper • 2503.12505 • Published 3 days ago • 8

upvoted 2 papers about 14 hours ago

DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding

Paper • 2503.12797 • Published 3 days ago • 24

Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM

Paper • 2503.14478 • Published 1 day ago • 37

upvoted a paper about 21 hours ago

R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization

Paper • 2503.12937 • Published 3 days ago • 23

upvoted 5 papers 1 day ago

VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning

Paper • 2503.13444 • Published 2 days ago • 12

V-STaR: Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning

Paper • 2503.11495 • Published 5 days ago • 11

Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

Paper • 2503.12605 • Published 3 days ago • 24

MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research

Paper • 2503.13399 • Published 2 days ago • 19

Being-0: A Humanoid Robotic Agent with Vision-Language Models and Modular Skills

Paper • 2503.12533 • Published 3 days ago • 56

upvoted a paper 2 days ago

ARMOR v0.1: Empowering Autoregressive Multimodal Understanding Model with Interleaved Multimodal Generation via Asymmetric Synergy

Paper • 2503.06542 • Published 10 days ago • 7

upvoted 3 papers 3 days ago

R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization

Paper • 2503.10615 • Published 6 days ago • 16

On the Limitations of Vision-Language Models in Understanding Image Transforms

Paper • 2503.09837 • Published 7 days ago • 10

TruthPrInt: Mitigating LVLM Object Hallucination Via Latent Truthful-Guided Pre-Intervention

Paper • 2503.10602 • Published 6 days ago • 4

upvoted 3 papers 4 days ago

VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search

Paper • 2503.10582 • Published 6 days ago • 18

VisualPRM: An Effective Process Reward Model for Multimodal Reasoning

Paper • 2503.10291 • Published 6 days ago • 31

GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing

Paper • 2503.10639 • Published 6 days ago • 45

upvoted a paper 6 days ago

R1-Omni: Explainable Omni-Multimodal Emotion Recognition with Reinforcing Learning

Paper • 2503.05379 • Published 12 days ago • 32

upvoted a paper 7 days ago

VisualSimpleQA: A Benchmark for Decoupled Evaluation of Large Vision-Language Models in Fact-Seeking Question Answering

Paper • 2503.06492 • Published 11 days ago • 9