Bin Wang

wanderkid

https://wangbindl.github.io/

wangbinDL

AI & ML interests

Computer Vision, Multimodal Large Language Model

Recent Activity

new activity 18 days ago

wanderkid/UniMER_Dataset:Add task category and link to CDM paper

upvoted a collection about 1 month ago

liked a model 2 months ago

deepseek-ai/DeepSeek-R1

wanderkid's activity

upvoted a collection about 1 month ago

olmOCR

olmOCR is a document recognition pipeline for efficiently converting documents into plain text. olmocr.allenai.org • 4 items • Updated 24 days ago • 104

upvoted 2 papers 4 months ago

OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations

Paper • 2412.07626 • Published Dec 10, 2024 • 22

OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation

Paper • 2412.02592 • Published Dec 3, 2024 • 22

upvoted 4 papers 6 months ago

Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction

Paper • 2410.21169 • Published Oct 28, 2024 • 31

DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception

Paper • 2410.12628 • Published Oct 16, 2024 • 37

LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models

Paper • 2410.09732 • Published Oct 13, 2024 • 56

MinerU: An Open-Source Solution for Precise Document Content Extraction

Paper • 2409.18839 • Published Sep 27, 2024 • 28

upvoted 2 papers 7 months ago

UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios

Paper • 2408.17267 • Published Aug 30, 2024 • 24

CDM: A Reliable Metric for Fair and Accurate Formula Recognition Evaluation

Paper • 2409.03643 • Published Sep 5, 2024 • 19

upvoted a paper 12 months ago

InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model

Paper • 2401.16420 • Published Jan 29, 2024 • 56

upvoted a paper about 1 year ago

InternLM2 Technical Report

Paper • 2403.17297 • Published Mar 26, 2024 • 33

upvoted a paper over 1 year ago

Parrot Captions Teach CLIP to Spot Text

Paper • 2312.14232 • Published Dec 21, 2023 • 12