Post 583 New smolagents example landed on the Hugging Face cookbook 🤠 Learn how to create an inventory-managing multi-agent system with smolagents, MongoDB and DeepSeek Chat 📖 https://huggingface.co/learn/cookbook/mongodb_smolagents_multi_micro_agents
Post 3732 there's a new multimodal retrieval model in town 🤠 LlamaIndex released vdr-2b-multi-v1
> uses 70% fewer image tokens, yet outperforms other dse-qwen2 based models
> 3x faster inference with less VRAM 💨
> shrinkable with matryoshka 🪆
> can do cross-lingual retrieval!
Collection: llamaindex/visual-document-retrieval-678151d19d2758f78ce910e1 (with models and datasets)
Demo: llamaindex/multimodal_vdr_demo
Learn more from their blog post here: https://huggingface.co/blog/vdr-2b-multilingual 📖
Jan 10 Releases 🌨️
vikhyatk/moondream2 Image-Text-to-Text • Updated 8 days ago • 128k • 967
DAMO-NLP-SG/multimodal_textbook Updated 6 days ago • 7.91k • 111
ByteDance/Sa2VA-1B Image-Text-to-Text • Updated 3 days ago • 440 • 15
nvidia/Cosmos-1.0-Autoregressive-4B Updated 7 days ago • 1.68k • 42
Dec 6 Releases 🎄
meta-llama/Llama-3.3-70B-Instruct Text Generation • Updated 27 days ago • 477k • 1.66k
Qwen/Qwen2-VL-72B Image-Text-to-Text • Updated Dec 6, 2024 • 495 • 70
google/paligemma2-3b-pt-224 Image-Text-to-Text • Updated Dec 5, 2024 • 37.9k • 127
tencent/HunyuanVideo Text-to-Video • Updated about 1 month ago • 8.56k • 1.43k