
Baby Nandi

Baby Nandi (part of the Nandi series of Telugu LLMs) is a Telugu instruction-tuned version of Gemma 2B, built as part of an effort to develop smaller, more efficient Indic LLMs for practical use. It beats the original gemma-2b overall, but still trails the latest gemma-2b-1.1-it.

πŸ† Benchmarks

| Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
|---|---|---|---|---|---|
| bharadwajswarna/gemma-2b-sft-telugu 📄 | 38.99 | 21.53 | 55.56 | 48.33 | 30.56 |
| google/gemma-2b-it 📄 | 36.1 | 23.76 | 43.6 | 47.64 | 29.41 |
| google/gemma-2b 📄 | 34.26 | 22.7 | 43.35 | 39.96 | 31.03 |

Training Process & Datasets:

  1. The Gemma 2B base model was further pretrained on part of the AI4Bharat Sangraha dataset (280k Telugu samples).
  2. SFT on a mix of Telugu Alpaca and Telugu GPTeacher from Telugu LLM Labs, plus English Alpaca (a rough data-formatting sketch follows below).

You can find the pretrained Telugu base model here: Gemma-2b-Telugu-Base-Model
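
As a rough illustration of step 2, the sketch below maps Alpaca-style records (instruction / input / output fields, which the Telugu Alpaca and GPTeacher sets broadly follow) into the prompt template shown later in this card. The field names, EOS handling, and example record are assumptions for illustration, not the exact training script.

```python
# Hedged sketch: turn an Alpaca-style SFT record into the prompt format used
# by this card. Field names ("instruction", "input", "output") are assumed.
PROMPT_TEMPLATE = """### Instruction:
{}

### Input:
{}

### Response:
{}"""

def format_example(example: dict, eos_token: str = "<eos>") -> str:
    # Fill all three slots (instruction, optional input, target response)
    # and append EOS so the model learns where an answer ends.
    return PROMPT_TEMPLATE.format(
        example["instruction"],
        example.get("input", ""),
        example["output"],
    ) + eos_token

# Hypothetical record, for illustration only.
record = {
    "instruction": "భారతదేశ రాజధాని ఏమిటి?",   # "What is the capital of India?"
    "input": "",
    "output": "భారతదేశ రాజధాని న్యూఢిల్లీ.",    # "The capital of India is New Delhi."
}
print(format_example(record))
```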

Training Duration:

  1. Pretraining for 6 epochs, roughly 35 hours (this may not have been enough)
  2. SFT for 3 epochs

Inference Prompt Template:

"""
### Instruction:
{}

### Input:
{}

### Response:
{}
"""

Developer: Bharadwaj Swarna
You can reach out to me for any questions/suggestions/collaborations.
