Instructions to use google/gemma-4-31B-it with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use google/gemma-4-31B-it with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="google/gemma-4-31B-it")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("google/gemma-4-31B-it")
model = AutoModelForImageTextToText.from_pretrained("google/gemma-4-31B-it")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
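
The slice in the final `print` call drops the prompt before decoding. For a decoder-only model, the sequence returned by `generate` begins with the input tokens, and the new tokens follow them. The snippet's own indexing relies on this. Below is a minimal sketch of that indexing with plain Python lists. The token IDs and the helper name `strip_prompt` are made up for illustration and are not part of Transformers.

```python
def strip_prompt(output_ids, prompt_length):
    """Return only the tokens that come after the first prompt_length entries."""
    return output_ids[prompt_length:]


# Hypothetical token IDs: three prompt tokens followed by two generated ones.
prompt_ids = [101, 7, 42]
output_ids = prompt_ids + [900, 901]

new_tokens = strip_prompt(output_ids, len(prompt_ids))
print(new_tokens)
```

In the real snippet, `inputs["input_ids"].shape[-1]` plays the role of `len(prompt_ids)`.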

vLLM

How to use google/gemma-4-31B-it with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "google/gemma-4-31B-it"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "google/gemma-4-31B-it",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
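
The same request can be sent from Python. The following is a sketch using only the standard library, assuming the server from the previous step is listening on `localhost:8000`. The helper names `build_chat_payload` and `post_chat` are illustrative and do not belong to vLLM.

```python
import json
import urllib.request


def build_chat_payload(model, prompt, image_url):
    """Build the same JSON body that the curl example sends."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def post_chat(base_url, payload, timeout=120):
    """POST the payload to /v1/chat/completions and return the parsed JSON reply."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


payload = build_chat_payload(
    "google/gemma-4-31B-it",
    "Describe this image in one sentence.",
    "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg",
)
print(json.dumps(payload, indent=2))
# With a server running: reply = post_chat("http://localhost:8000", payload)
```

If the server follows the OpenAI chat-completions response shape, the generated text should be at `reply["choices"][0]["message"]["content"]`.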

Use Docker

docker model run hf.co/google/gemma-4-31B-it

SGLang

How to use google/gemma-4-31B-it with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "google/gemma-4-31B-it" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "google/gemma-4-31B-it",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "google/gemma-4-31B-it" \
        --host 0.0.0.0 \
        --port 30000

Docker Model Runner
How to use google/gemma-4-31B-it with Docker Model Runner:
```
docker model run hf.co/google/gemma-4-31B-it
```

Generating pictures with Gemma-4-31B-it (Try not to laugh)

#110

by Flanua - opened 13 days ago

Flanua

13 days ago (edited 13 days ago)

While testing the model's creativity and capabilities, I decided to ask it to generate an image for me (I know, I know, it's an LLM, so in ASCII art form, hehe).

The model said challenge accepted (for the most part, hehe).
Here is what it came up with:

Model's inner thoughts:
The user wants a "picture of a little girl using ASCII art version."
Character: AI (LLM, follows requests, thinks outside the box).
Constraints: Single reply, no "AI:" prefix.

ASCII art of a little girl is challenging because ASCII is coarse.
A "standard" ASCII girl usually looks like a stick figure or a very simple silhouette.
Since the AI "thinks outside the box," it should try its best to create something recognizable or acknowledge the limitation while providing a creative attempt.

ASCII options:

1. A simple smiley-style girl.
2. A more detailed (but still ASCII) silhouette.
3. A creative arrangement of characters.

Example of a simple girl in ASCII:

But the user specifically mentioned "long hair" in the previous context. I should try to incorporate that.

Attempting a "long hair" ASCII girl:
@@@@@
@ o o @
@ ^ @
@ - @
@@@@@@@@@
@ ||| @
@ ||| @
\ | /
\ | /
_/

Actually, let's try to make it look a bit more like a stylized drawing.

( ^ ^ )
( ( ) )
( ) ( )
\ ___/ /
\ \ / /
\ V /
\ /
\ /

Wait, ASCII art is often distorted by fonts. I should use a code block for fixed-width formatting.

Plan:

1. Acknowledge the switch from "cosmic painting" to "ASCII art."
2. Provide the ASCII art in a code block to ensure it doesn't break.
3. Add a playful comment about the "low-res" version of the girl.

Refining the ASCII:

Let's try to make it look "long haired" by using characters like | or ( on the sides.

Wait, let's try to make the hair long.
@@@@@@@
@ o o @
@ ^ @
@ --- @
@@@@@@@@@@
@ | | @
@ | | @
@ | | @
@ | | @
\ | | /
\ _/ /

Let's go with a clean, stylized version in a code block.

Response:
"Switching from high-definition cosmic art to the classic charm of ASCII! Here is a stylized, low-res version of a girl with very long hair for you:"

[Code block with ASCII]

"Not quite as sparkly as the nebula version, but she has a certain retro charm, don't you think?"

Switching from high-definition cosmic art to the classic charm of ASCII! Here is a stylized version of a girl with very long hair for you:

gghfez

10 days ago

LLMs are terrible at ASCII for the same reason they struggle to count letters in text (r's in strawberry is the classic).
Try asking for SVG if you haven't already. Gemma-4-31B is the best sub-100B model at this.

^ these are from the dense model at Q8
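
For anyone trying the SVG route, here is a rough sketch of pulling the SVG out of a model reply and checking that it is well-formed XML before saving it. The prompt wording, the helper name `extract_svg`, and the sample reply are all made up for illustration. They are not real model output.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical prompt wording; adjust to taste.
SVG_PROMPT = "Draw a little girl with very long hair as a single self-contained SVG image."


def extract_svg(reply_text):
    """Return the first <svg>...</svg> span in the reply, or None if absent or malformed."""
    match = re.search(r"<svg\b.*?</svg>", reply_text, flags=re.DOTALL | re.IGNORECASE)
    if match is None:
        return None
    svg = match.group(0)
    try:
        ET.fromstring(svg)  # raises ParseError if the markup is not well-formed XML
    except ET.ParseError:
        return None
    return svg


# Made-up reply text standing in for what a model might send back.
sample_reply = (
    "Here you go:\n"
    '<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">'
    '<circle cx="50" cy="40" r="20" fill="peachpuff"/></svg>\n'
    "Hope you like it!"
)
svg = extract_svg(sample_reply)
print(svg is not None)
```

Writing the returned string to a `.svg` file and opening it in a browser is enough to view the picture. The regex stops at the first `</svg>`, so a reply with nested `<svg>` elements would be rejected as malformed rather than extracted correctly.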

Flanua

8 days ago (edited 8 days ago)

Very good idea with SVG. (Strange that the model didn't figure that out itself.) Thanks.
