Baby Nandi
Baby Nandi (part of the Nandi series of Telugu LLMs) is a Telugu instruction-tuned version of Gemma 2B, built as part of an effort to develop smaller, more efficient Indic LLMs for practical use. It beats the original gemma-2b overall, but still trails the newer gemma-1.1-2b-it.
Benchmarks :
| Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
|---|---|---|---|---|---|
| bharadwajswarna/gemma-2b-sft-telugu | 38.99 | 21.53 | 55.56 | 48.33 | 30.56 |
| google/gemma-2b-it | 36.1 | 23.76 | 43.6 | 47.64 | 29.41 |
| google/gemma-2b | 34.26 | 22.7 | 43.35 | 39.96 | 31.03 |
Training Process & Datasets :
- The Gemma 2B base model was further pretrained on a part of the AI4Bharat Sangraha dataset (280k Telugu samples).
- SFT on a mix of Telugu Alpaca + Telugu GPTeacher from Telugu LLM Labs, plus English Alpaca.
You can find the pretrained base model here : Gemma-2b-Telugu-Base-Model
Training Duration :
- Pretraining for 6 epochs, nearly 35 hours (this may not be enough)
- SFT for 3 epochs
Inference Prompt Template:
"""
### Instruction:
{}
### Input:
{}
### Response:
{}
"""
Developer :
Bharadwaj Swarna
You can reach out to me for any questions/suggestions/collaborations.