
Baby Nandi

Baby Nandi (part of the Nandi series of Telugu LLMs) is a Telugu instruction-tuned version of Gemma 2B, built as part of an effort to develop smaller, more efficient Indic LLMs for practical use. It outperforms the original gemma-2b overall, but still trails the newer gemma-1.1-2b-it.

πŸ† Benchmarks

| Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
|---|---|---|---|---|---|
| bharadwajswarna/gemma-2b-sft-telugu | 38.99 | 21.53 | 55.56 | 48.33 | 30.56 |
| google/gemma-2b-it | 36.1 | 23.76 | 43.6 | 47.64 | 29.41 |
| google/gemma-2b | 34.26 | 22.7 | 43.35 | 39.96 | 31.03 |

Training Process & Datasets:

  1. The Gemma 2B base model was further pretrained on part of the AI4Bharat Sangraha dataset (280k Telugu samples).
  2. SFT on a mix of Telugu Alpaca and Telugu GPTeacher from Telugu LLM Labs, plus English Alpaca.
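As an illustration of the SFT step, the sketch below renders one instruction record into an Alpaca-style training string. The field names (`instruction`, `input`, `output`) follow the usual Alpaca convention and are assumptions here, not confirmed details of the actual training pipeline.

```python
# Hedged sketch: turning an Alpaca-style record into a single training text.
# Field names follow the Alpaca convention and are assumed, not confirmed.
ALPACA_TEMPLATE = """### Instruction:
{instruction}

### Input:
{input}

### Response:
{output}"""

def to_training_text(record: dict) -> str:
    """Fill the template with one dataset record."""
    return ALPACA_TEMPLATE.format(
        instruction=record.get("instruction", ""),
        input=record.get("input", ""),
        output=record.get("output", ""),
    )

sample = {
    "instruction": "Translate to Telugu.",
    "input": "Good morning",
    "output": "శుభోదయం",
}
print(to_training_text(sample))
```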

You can find the base model here: Gemma-2b-Telugu-Base-Model

Training Duration:

  1. Pretraining: 6 epochs, roughly 35 hours (this may not be enough)
  2. SFT: 3 epochs

Inference Prompt Template:

"""
### Instruction:
{}

### Input:
{}

### Response:
{}
"""

Developer: Bharadwaj Swarna
You can reach out to me for any questions/suggestions/collaborations.

Model size: 2.51B params (Safetensors, FP16)