GMAI-VL & GMAI-VL-5.5M: A Large Vision-Language Model and A Comprehensive Multimodal Dataset Towards General Medical AI Paper • 2411.14522 • Published Nov 21, 2024 • 38
ORIGEN: Zero-Shot 3D Orientation Grounding in Text-to-Image Generation Paper • 2503.22194 • Published 10 days ago • 22
Modifying Large Language Model Post-Training for Diverse Creative Writing Paper • 2503.17126 • Published 17 days ago • 34
MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research Paper • 2503.13399 • Published 21 days ago • 20
TxAgent: An AI Agent for Therapeutic Reasoning Across a Universe of Tools Paper • 2503.10970 • Published 25 days ago • 16