How to use Prophet25200/petspeak-gemma3-vision-mlx-q4 with MLX:
# Download the model from the Hub
pip install "huggingface_hub[hf_xet]"
huggingface-cli download --local-dir petspeak-gemma3-vision-mlx-q4 Prophet25200/petspeak-gemma3-vision-mlx-q4
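For scripted setups, the same download can probably be done from Python. The sketch below assumes the `huggingface_hub` package is installed and uses its `snapshot_download` function; the helper name `download_model` and the default directory are illustrative choices, not part of this model card.

```python
REPO_ID = "Prophet25200/petspeak-gemma3-vision-mlx-q4"


def download_model(local_dir: str = "petspeak-gemma3-vision-mlx-q4") -> str:
    """Fetch every file of the repository into local_dir and return that path.

    Hypothetical helper: the import is done lazily so that defining the
    function does not require huggingface_hub or network access.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)


# Example (requires network access):
# path = download_model()
```

Calling `download_model()` should leave the weights in the same folder the CLI command targets.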
PetSpeak Gemma 3 4B Vision (MLX Q4)
MLX Q4 conversion of google/gemma-3-4b-it (VISION variant, including the SigLIP vision tower, projection, and cross-attention).
Used by the PetMind iOS app when the "Mode Vision IA" (AI Vision Mode) toggle is enabled. Inference runs 100% on-device on iPhone 15 Pro and later; no data is sent to Google.
Quantization
- 4 bits, group size 64
- ~3.5× smaller than bf16 (8 GB → 2.5 GB)
- Accuracy retained: ~98% on MMLU relative to full precision
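The ~3.5× figure can be sanity-checked with a little arithmetic. The sketch below assumes the group-wise affine scheme described in the MLX documentation, where each group of weights stores one scale and one bias alongside the 4-bit codes. The 16-bit width for scale and bias is an assumption, and the function names are illustrative.

```python
from typing import List, Tuple


def bits_per_weight(bits: int = 4, group_size: int = 64,
                    scale_bits: int = 16, bias_bits: int = 16) -> float:
    """Effective storage per weight: the code plus its share of the group's scale and bias."""
    return bits + (scale_bits + bias_bits) / group_size


def compression_ratio(baseline_bits: int = 16, **kwargs) -> float:
    """Size reduction relative to a baseline_bits-wide format such as bf16."""
    return baseline_bits / bits_per_weight(**kwargs)


def quantize_group(weights: List[float], bits: int = 4) -> Tuple[List[int], float, float]:
    """Affine-quantize one group; returns integer codes, scale, and bias (the group minimum)."""
    lo, hi = min(weights), max(weights)
    levels = (1 << bits) - 1            # 15 steps for 4 bits
    scale = (hi - lo) / levels or 1.0   # fall back to 1.0 for a constant group
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize_group(codes: List[int], scale: float, bias: float) -> List[float]:
    """Reconstruct approximate weights from codes, scale, and bias."""
    return [c * scale + bias for c in codes]


print(bits_per_weight())               # 4.5
print(round(compression_ratio(), 2))   # 3.56
```

Under these assumptions, 8 GB of bf16 weights would shrink to about 2.25 GB. The card's 2.5 GB figure is somewhat larger, which would be consistent with some tensors staying unquantized, although the card does not say so.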
Architecture
- Base: Gemma 3 4B IT (Google, 3.4B-parameter language model)
- Vision tower: SigLIP-So400M-patch14-384 (400M parameters, frozen)
- Projection: MLP 1152 → 2560
- Cross-attention: visual tokens inserted at <start_of_image>
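The data flow in the list above can be sketched in a few lines: vision features are projected to the language model's width, then spliced into the prompt at the `<start_of_image>` marker. This is a toy illustration, not the model's code. A single linear layer stands in for the MLP, the demo uses tiny widths instead of 1152 and 2560, and the helper names `project` and `splice_after_marker` are hypothetical.

```python
from typing import List, Sequence, Union

Vector = List[float]
Item = Union[str, Vector]


def project(features: Sequence[Vector], weight: Sequence[Vector]) -> List[Vector]:
    """Map each vision feature (length d_in) to the LM width (length d_out).

    weight has shape d_out x d_in (2560 x 1152 for the widths listed in the card).
    """
    return [[sum(w * x for w, x in zip(row, f)) for row in weight] for f in features]


def splice_after_marker(tokens: Sequence[Item], marker: str,
                        visual: Sequence[Vector]) -> List[Item]:
    """Return a copy of tokens with the visual vectors inserted after each marker."""
    out: List[Item] = []
    for item in tokens:
        out.append(item)
        if item == marker:
            out.extend(visual)
    return out


# Tiny stand-in sizes (d_in=3, d_out=2) so the demo runs instantly.
features = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]]
weight = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
visual = project(features, weight)

prompt = ["Describe", "<start_of_image>", "this", "pet"]
merged = splice_after_marker(prompt, "<start_of_image>", visual)
print(len(merged))  # 6
```

How the attention layers then consume the spliced sequence is not described in the card and is left out of the sketch.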
License
Gemma Terms of Use: https://ai.google.dev/gemma/terms
Built with Google Gemma 3.
Model tree for Prophet25200/petspeak-gemma3-vision-mlx-q4
- Base model: google/gemma-3-4b-pt
- Finetuned: google/gemma-3-4b-it