Needle Threading: Can LLMs Follow Threads through Near-Million-Scale Haystacks? Paper • 2411.05000 • Published Nov 7 • 21
GRAB: A Challenging GRaph Analysis Benchmark for Large Multimodal Models Paper • 2408.11817 • Published Aug 21 • 8
Kosmos-2: Grounding Multimodal Large Language Models to the World Paper • 2306.14824 • Published Jun 26, 2023 • 34
GPT4GEO: How a Language Model Sees the World's Geography Paper • 2306.00020 • Published May 30, 2023 • 1
SATIN: A Multi-Task Metadataset for Classifying Satellite Imagery using Vision-Language Models Paper • 2304.11619 • Published Apr 23, 2023 • 2