Chi Chen

carboncoo

carboncoo's activity

upvoted a paper about 2 months ago

ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation

Paper • 2501.06598 • Published Jan 11 • 1

authored 2 papers about 2 months ago

ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation

Paper • 2501.06598 • Published Jan 11 • 1

Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models

Paper • 2501.05767 • Published Jan 10 • 28

upvoted a paper about 2 months ago

Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models

Paper • 2501.05767 • Published Jan 10 • 28

commented on a paper about 2 months ago

Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models

Paper • 2501.05767 • Published Jan 10 • 28

authored a paper 2 months ago

LLaVA-UHD v2: an MLLM Integrating High-Resolution Feature Pyramid via Hierarchical Window Transformer

Paper • 2412.13871 • Published Dec 18, 2024 • 18

upvoted 2 papers 2 months ago

StreamingBench: Assessing the Gap for MLLMs to Achieve Streaming Video Understanding

Paper • 2411.03628 • Published Nov 6, 2024 • 2

Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models

Paper • 2308.13437 • Published Aug 25, 2023 • 4

upvoted 2 papers 3 months ago

Densing Law of LLMs

Paper • 2412.04315 • Published Dec 5, 2024 • 19

LLaVA-UHD v2: an MLLM Integrating High-Resolution Feature Pyramid via Hierarchical Window Transformer

Paper • 2412.13871 • Published Dec 18, 2024 • 18

liked a dataset 4 months ago

mjuicem/StreamingBench

Viewer • Updated Nov 15, 2024 • 4.55k • 3.26k • 6

authored 6 papers 4 months ago

Mask-Align: Self-Supervised Neural Word Alignment

Paper • 2012.07162 • Published Dec 13, 2020

Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models

Paper • 2308.13437 • Published Aug 25, 2023 • 4

Browse and Concentrate: Comprehending Multimodal Content via prior-LLM Context Fusion

Paper • 2402.12195 • Published Feb 19, 2024

CODIS: Benchmarking Context-Dependent Visual Comprehension for Multimodal Large Language Models

Paper • 2402.13607 • Published Feb 21, 2024

ActiView: Evaluating Active Perception Ability for Multimodal Large Language Models

Paper • 2410.04659 • Published Oct 7, 2024

StreamingBench: Assessing the Gap for MLLMs to Achieve Streaming Video Understanding

Paper • 2411.03628 • Published Nov 6, 2024 • 2

upvoted a paper 5 months ago

LLMtimesMapReduce: Simplified Long-Sequence Processing using Large Language Models

Paper • 2410.09342 • Published Oct 12, 2024 • 39

upvoted a paper 7 months ago

MiniCPM-V: A GPT-4V Level MLLM on Your Phone

Paper • 2408.01800 • Published Aug 3, 2024 • 82

liked a model 7 months ago

openbmb/MiniCPM-V-2_6

Image-Text-to-Text • Updated Jan 15 • 74.6k • 953