Gemma 3 Collection All versions of Google's new multimodal models in 1B, 4B, 12B, and 27B sizes, available in GGUF, dynamic 4-bit, and 16-bit formats. • 27 items • Updated about 4 hours ago • 30
Post: Gemma-3-4B: Image and Video Inference 🖼️🎥🧤
Space: prithivMLmods/Gemma-3-Multimodal
@gemma3-4b : {Tag + Space_+ 'prompt'}
@video-infer : {Tag + Space_+ 'prompt'}
By default, it runs: prithivMLmods/Qwen2-VL-OCR-2B-Instruct
Gemma 3 Technical Report: https://storage.googleapis.com/deepmind-media/gemma/Gemma3Report.pdf
I have also tested Aya-Vision 8B against a custom Qwen2-VL-OCR model on messy-handwriting test samples, as an experiment in optimizing edge-device VLMs for Optical Character Recognition.
📜 Read the blog here: https://huggingface.co/blog/prithivMLmods/aya-vision-vs-qwen2vl-ocr-2b
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published Jan 14 • 276
DeepSeek R1 (All Versions) Collection DeepSeek R1, the most powerful open-source reasoning model, available in GGUF, original, and 4-bit formats. Includes Llama and Qwen distilled models. • 29 items • Updated 2 days ago • 209