Instructions to use google/gemma-4-12B-it with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use google/gemma-4-12B-it with Transformers:
# Load model directly from transformers import AutoProcessor, AutoModelForMultimodalLM processor = AutoProcessor.from_pretrained("google/gemma-4-12B-it") model = AutoModelForMultimodalLM.from_pretrained("google/gemma-4-12B-it") - Notebooks
- Google Colab
- Kaggle
it is speaking Cantonese while user prompt is Simplified Chinese
I tried both 26b a4b and 31b, working fine.
but 12b is speaking Cantonese.
this is just annoying.
---update---
the 6th Jun updated gguf by unsloth fixed this issue.
so this might be issue of gguf, not the original model.
Hi @leiyang88 ,
Thanks for addressing the issue!
To help us investigate and reproduce this exactly, could you please share:
- The exact prompt you used.
- A screenshots of the unexpected output if possible.
I think all prompts (as simple as 你好) in simplified Chinese can trigger Cantonese response.
Hi @leiyang88 ,
Thanks for addressing the issue!
To help us investigate and reproduce this exactly, could you please share:
- The exact prompt you used.
- A screenshots of the unexpected output if possible.
I used https://modelscope.cn/models/unsloth/gemma-4-12b-it-GGUF gemma-4-12b-it-UD-Q8_K_XL.gguf usign llama.server latest version. default parameters.
windows 11, cuda 13 rtx 3090 24G.