Gemma 3 Text-Only Model Card
Model Information
Original Model: Gemma 3 by Google DeepMind
Adaptation: Text-only version (Image processing capabilities removed)
Description
This is a text-only adaptation of Google's original multimodal Gemma 3 model: the image processing components have been removed while the text generation capabilities are preserved.
This text-only adaptation maintains the core language capabilities with a 128K context window and multilingual support in over 140 languages. The model is well-suited for a variety of text generation tasks, including question answering, summarization, and reasoning.
The adaptation makes the model more lightweight and suitable for environments where only text processing is needed, or where resource constraints make the full multimodal model impractical.
Inputs and outputs
- Input:
  - Text string, such as a question, a prompt, or a document to be summarized
  - Total input context of 128K tokens for the 27B size
- Output:
  - Generated text in response to the input, such as an answer to a question or a summary of a document
  - Total output context of 8192 tokens
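Because input is capped at 128K tokens, long-running conversations eventually need to be trimmed. The sketch below shows one simple strategy, dropping the oldest messages first; `count_tokens` here is a hypothetical stand-in (a crude whitespace split), whereas real code would count with the model's actual tokenizer.

```python
# Illustrative sketch: keep the most recent messages that fit a token budget.
# count_tokens() is a placeholder; real code would use the model tokenizer,
# e.g. len(tokenizer(text)["input_ids"]).

def count_tokens(text: str) -> int:
    # Crude stand-in for the real tokenizer's token count
    return len(text.split())

def trim_to_context(messages: list[dict], max_input_tokens: int) -> list[dict]:
    """Drop the oldest messages until the total fits within the budget."""
    kept: list[dict] = []
    total = 0
    for msg in reversed(messages):  # walk newest-first
        cost = count_tokens(msg["content"])
        if total + cost > max_input_tokens:
            break
        kept.append(msg)
        total += cost
    return list(reversed(kept))  # restore chronological order

history = [
    {"role": "user", "content": "first question with several words here"},
    {"role": "assistant", "content": "a long answer " * 10},
    {"role": "user", "content": "latest question"},
]
print(trim_to_context(history, max_input_tokens=35))
```

With a budget of 35 stand-in tokens, the oldest message is dropped and the two most recent survive.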
Adaptation Details
This adaptation:
- Removes the image processing components from the model
- Maintains the same text tokenization and generation capabilities
- Is compatible with standard text-only inference pipelines
- Can be used with the regular `AutoModelForCausalLM` class instead of requiring specialized multimodal classes
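The removal of image components can be pictured as filtering vision-related weights out of the checkpoint's state dict. The sketch below assumes hypothetical key prefixes (`vision_tower.`, `multi_modal_projector.`), which follow common multimodal layouts but are not necessarily the exact names in the Gemma 3 checkpoint.

```python
# Sketch: filter vision-related weights out of a multimodal state dict.
# The prefixes below are hypothetical examples, not confirmed Gemma 3 key names.

VISION_PREFIXES = ("vision_tower.", "multi_modal_projector.")

def strip_vision_weights(state_dict: dict) -> dict:
    """Return a copy of the state dict without vision components."""
    return {
        name: tensor
        for name, tensor in state_dict.items()
        if not name.startswith(VISION_PREFIXES)
    }

# Toy state dict; strings stand in for real tensors
checkpoint = {
    "language_model.embed_tokens.weight": "...",
    "language_model.layers.0.self_attn.q_proj.weight": "...",
    "vision_tower.patch_embed.weight": "...",
    "multi_modal_projector.linear.weight": "...",
}
text_only = strip_vision_weights(checkpoint)
print(sorted(text_only))
```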
Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "your-username/gemma-3-27b-text"  # Replace with your model path after uploading

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [
    {"role": "system", "content": "You are an AI assistant that provides helpful and accurate information."},
    {"role": "user", "content": "Hello. How's the weather today?"},
]

# Build the prompt with the model's chat template and tokenize it
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
)

outputs = model.generate(
    inputs,
    max_new_tokens=512,
    temperature=0.2,
    do_sample=True,
)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True)
print(response)
```
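If you ever need to inspect or build the prompt string by hand, Gemma-family templates conventionally wrap turns in `<start_of_turn>`/`<end_of_turn>` markers and use the role name `model` for assistant turns. The sketch below is an approximation for illustration only; the exact template can vary per tokenizer version, so prefer `tokenizer.apply_chat_template` as the authoritative source.

```python
# Approximate Gemma-style chat formatting. Prefer tokenizer.apply_chat_template,
# which carries the exact template for a given checkpoint.

def format_gemma_prompt(messages: list[dict]) -> str:
    parts = []
    for msg in messages:
        # Gemma templates conventionally map the assistant role to "model"
        role = "model" if msg["role"] == "assistant" else msg["role"]
        parts.append(f"<start_of_turn>{role}\n{msg['content']}<end_of_turn>\n")
    parts.append("<start_of_turn>model\n")  # generation prompt for the reply
    return "".join(parts)

prompt = format_gemma_prompt([
    {"role": "user", "content": "Hello. How's the weather today?"}
])
print(prompt)
```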