This model card describes a fine-tuned version of the 7B instruction-tuned Gemma model (google/gemma-7b-it).

Model Details

This is a general question-answer model finetuned on the web_questions dataset.

Model Description

This is a general question-answering LLM, obtained by fine-tuning Gemma on the web_questions dataset.

Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. They are text-to-text, decoder-only large language models, available in English, with open weights, pre-trained variants, and instruction-tuned variants. Gemma models are well-suited for a variety of text generation tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources, such as a laptop, a desktop, or your own cloud infrastructure, democratizing access to state-of-the-art AI models and helping foster innovation for everyone.

  • Developed by: Geerath Bhat
  • Model type: Fine-tuned Instruct LLM.
  • Language(s) (NLP): English
  • License: No
  • Finetuned from model: [google/gemma-7b-it]


Google has shared code snippets for getting started quickly with Gemma models. First make sure to run pip install -U transformers, then use the snippet below, which loads this fine-tuned model.

from transformers import AutoModelForCausalLM, AutoTokenizer

hf_model_repo = "Geerath/google-gemma-7b-it-finetuned-web-questions"

# Get the tokenizer
tokenizer = AutoTokenizer.from_pretrained(hf_model_repo)

# Load the model (default settings; pass dtype or device options as your hardware requires)
model = AutoModelForCausalLM.from_pretrained(hf_model_repo)

prompt = ["Question: Tell me something about IISc\n\nAnswer:\n"]

# Generate response (tokenize, then move the input IDs to the model's device)
input_ids = tokenizer(prompt, return_tensors="pt", truncation=True).input_ids.to(model.device)
outputs = model.generate(input_ids=input_ids,
                         do_sample=True)

result = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]

# Drop anything before the "Question:" marker and keep the text that follows it
result = "Question:" + result.split("Question:")[1]

# Print the result
print(f"Generated response:\n{result}")
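The prompt format used above ("Question: ... Answer:") can be wrapped in small helpers so that prompts are built consistently and the answer text is easy to pull out of the decoded output. The following is a minimal sketch; the function names build_prompt and extract_answer are hypothetical and not part of this repository, and the sample completion is a placeholder rather than real model output.

```python
def build_prompt(question: str) -> str:
    """Format a question in the same layout as the example prompt."""
    return f"Question: {question}\n\nAnswer:\n"


def extract_answer(generated_text: str) -> str:
    """Return the text after the first 'Answer:' marker, stripped.

    Falls back to the whole text if the marker is missing.
    """
    _, found, tail = generated_text.partition("Answer:")
    return tail.strip() if found else generated_text.strip()


prompt = build_prompt("Tell me something about IISc")

# Placeholder completion, used only to exercise the parsing helper.
fake_output = prompt + "A placeholder answer."
print(extract_answer(fake_output))
```

In practice, the string passed to extract_answer would be the decoded output of model.generate rather than the placeholder used here.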

Fine-tuning the model

You can find fine-tuning scripts and a notebook under the examples/ directory of the google/gemma-7b repository. To adapt them to this model, simply change the model-id to google/gemma-7b-it. That repository provides:

  • A script to perform Supervised Fine-Tuning (SFT) on the UltraChat dataset using QLoRA
  • A script to perform SFT using FSDP on TPU devices
  • A notebook that you can run on a free-tier Google Colab instance to perform SFT on an English quotes dataset

How to Get Started with the Model

Use the code provided by google/gemma-7b-it to get started with this finetuned model.

Training Details

Training Data

The model was fine-tuned on the web_questions dataset.

Training Procedure

The model was trained with SFTTrainer, using the following TrainingArguments:

num_train_epochs=1, # adjust based on the data size
per_device_train_batch_size=4, # use 2 or 4 if you have less GPU RAM
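To see how these two settings interact, the sketch below estimates the number of optimizer steps for a given dataset size. It is a back-of-the-envelope calculation, not the Trainer's exact internal logic: it assumes no gradient accumulation and that the last partial batch is kept. The function name estimate_steps and the dataset size of 1,000 examples are hypothetical, chosen only for illustration.

```python
import math


def estimate_steps(num_examples: int,
                   per_device_train_batch_size: int,
                   num_train_epochs: int = 1,
                   num_devices: int = 1) -> int:
    """Rough number of optimizer steps, assuming no gradient accumulation."""
    examples_per_step = per_device_train_batch_size * num_devices
    steps_per_epoch = math.ceil(num_examples / examples_per_step)
    return steps_per_epoch * num_train_epochs


# Hypothetical dataset size, for illustration only.
num_examples = 1000

print(estimate_steps(num_examples, per_device_train_batch_size=4))  # 250
print(estimate_steps(num_examples, per_device_train_batch_size=2))  # 500
```

Halving the batch size to fit in less GPU memory doubles the number of steps per epoch, so the run takes more steps to see the same data.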


The model was evaluated on the test set of the web_questions dataset.

Testing Data

So far, the model has been tested only on the test set of the web_questions dataset. Results on other datasets will be added soon. Thank you!


Perplexity, Accuracy, F1 Score


After 2 epochs, the training loss was 1.114500 and the validation loss was 1.592121.

Perplexity on test data from web_questions dataset: 5.13
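Perplexity and cross-entropy loss are commonly related by perplexity = exp(loss). The sketch below applies that relation to the numbers reported here. It assumes the reported losses are mean per-token cross-entropy values in natural-log units, which this card does not state explicitly; the function names are hypothetical.

```python
import math


def perplexity_from_loss(mean_cross_entropy: float) -> float:
    """Perplexity implied by a mean per-token cross-entropy (natural log)."""
    return math.exp(mean_cross_entropy)


def loss_from_perplexity(perplexity: float) -> float:
    """Mean per-token cross-entropy implied by a perplexity value."""
    return math.log(perplexity)


# Validation loss reported above, converted to perplexity (about 4.91).
print(round(perplexity_from_loss(1.592121), 2))

# Test perplexity reported above, converted to a loss (about 1.635).
print(round(loss_from_perplexity(5.13), 3))
```

Under this assumption, the test perplexity of 5.13 corresponds to a loss of about 1.635, slightly higher than the reported validation loss.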

