NVIDIA Nemotron 3 family: NemotronH architecture combining Mamba state-space layers with standard attention. Mac-runnable, BatiAI-quantized and signed.
AI & ML interests
On-device AI, GGUF quantization, Apple Silicon, macOS automation
Latest Qwen 3.6 series with native tool calling, thinking mode, and Vision-Language. Best balance for 48-128GB Macs.
Qwen 3.5 dense and MoE quantizations. Reliable tool calling and JSON generation.
Largest open-weight LLMs, BatiAI-quantized. Mac-runnable from M4 Max 128GB to Mac Studio M3 Ultra 512GB.
Gemma 4 quantizations from Google's official weights. Best entry point for a 16GB Mac mini M4 (E4B Q4 = 57 tokens/s).
- batiai/gemma-4-E2B-it-GGUF
  Text Generation • 5B • Updated • 801 • 1
- batiai/gemma-4-E4B-it-GGUF
  Text Generation • 8B • Updated • 1.06k • 4
- batiai/Gemma-4-26B-A4B-it-GGUF
  Text Generation • 25B • Updated • 3.8k • 2
- batiai/gemma-4-31B-it-GGUF
  Text Generation • 31B • Updated • 516
Complete Mac-first on-device RAG stack: chat LLM + reranker + text/VL embedder, direct from BF16, BatiAI-signed. For BatiFlow.
- batiai/Qwen3-Embedding-4B-GGUF
  Sentence Similarity • 4B • Updated • 95
- batiai/Qwen3-Embedding-0.6B-GGUF
  Sentence Similarity • 0.6B • Updated • 171 • 1
- batiai/Qwen3-VL-Embedding-8B-GGUF
  Sentence Similarity • 8B • Updated • 1.83k • 6
- batiai/Qwen3-VL-Embedding-2B-GGUF
  Sentence Similarity • 2B • Updated • 248