Gemma 4 E2B-it Full SFT (Nepali & English)
This repository contains the finalized fine-tuned weights for the himalaya-gemma-4-e2b-it model. It has been fully trained to understand and generate high-quality text in both English and Nepali.
⚙️ Training Details
Unlike standard QLoRA fine-tuning, this model uses Full-Parameter SFT. This means every parameter in the model is trainable.
The training run stabilized beautifully over approximately 125,000 steps. To fit this comprehensive training on a single 1x A100 GPU with a tight memory budget, we utilized:
- 8-bit AdamW optimizer
- Gradient Checkpointing
📚 Datasets
The training data is a 50/50 mix of two high-quality datasets:
- Nepali:
himalaya-ai/nepali-sft-dataset - English:
teknium/OpenHermes-2.5
🚀 How to Use for Benchmarking
You can load and test this model using the Hugging Face transformers library.
1. Install dependencies
First, make sure you have the required libraries installed:
pip install transformers accelerate torch
from transformers import AutoTokenizer, AutoModelForCausalLM import torch
Put the exact Hugging Face repository name here
model_id = "himalaya-ai/himalaya-gemma-4-e2b-it"
1. Load the tokenizer and model
print("Loading model for benchmarking...")
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
model_id,
device_map="auto",
torch_dtype=torch.float16
)
2. Set up your prompt
user_prompt = "Write a short poem about the mountains in Nepal."
Apply the chat template
3. Generate the response
print("Generating response...")
outputs = model.generate(
**inputs,
max_new_tokens=256,
do_sample=True,
temperature=0.7
)
4. Print the result
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("\n--- Output ---\n")
print(response)
- Downloads last month
- 413
