SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion Paper • 2503.11576 • Published Mar 14 • 97
mistralai/Mistral-Small-3.1-24B-Instruct-2503 Image-Text-to-Text • Updated 14 days ago • 97k • 1.14k