Hyogun Lee

Haawron

AI & ML interests

Video understanding, multi-modal LLMs

Recent Activity

commented on a paper about 24 hours ago

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Organizations

None yet

Haawron's activity

upvoted a paper about 17 hours ago

Video-3D LLM: Learning Position-Aware Video Representation for 3D Scene Understanding

Paper • 2412.00493 • Published 18 days ago • 16

upvoted a paper about 24 hours ago

Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published 6 days ago • 55

upvoted 3 papers 3 days ago

SCBench: A KV Cache-Centric Analysis of Long-Context Methods

Paper • 2412.10319 • Published 5 days ago • 8

GenEx: Generating an Explorable World

Paper • 2412.09624 • Published 6 days ago • 77

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published 5 days ago • 121

upvoted 3 papers 7 days ago

TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation

Paper • 2412.03069 • Published 15 days ago • 30

VisionZip: Longer is Better but Not Necessary in Vision Language Models

Paper • 2412.04467 • Published 13 days ago • 103

EXAONE 3.5: Series of Large Language Models for Real-world Use Cases

Paper • 2412.04862 • Published 13 days ago • 46

upvoted a paper 7 months ago

Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality

Paper • 2405.21060 • Published May 31 • 63

upvoted 2 collections 7 months ago

LLaVA-1.5

A collection of LLaVA-1.5 checkpoints • 4 items • Updated Jan 31 • 18

LLaVA-1.6

A collection of LLaVA-1.6 checkpoints • 4 items • Updated Jan 31 • 67

upvoted 3 papers 7 months ago

MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning

Paper • 2405.12130 • Published May 20 • 46

FIFO-Diffusion: Generating Infinite Videos from Text without Training

Paper • 2405.11473 • Published May 19 • 53

Video ReCap: Recursive Captioning of Hour-Long Videos

Paper • 2402.13250 • Published Feb 20 • 25