---
license: apache-2.0
---
# Baby Nandi
Baby Nandi (part of the Nandi series of Telugu LLMs) is a Telugu instruction-tuned version of Gemma 2B, built as part of an effort to develop smaller, more efficient Indic LLMs for practical use.
It beats the original gemma-2b overall, but still trails the more recent gemma-1.1-2b-it.
**πŸ† Benchmarks**
| Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
|---|---:|---:|---:|---:|---:|
|[bharadwajswarna/gemma-2b-sft-telugu](https://huggingface.co/bharadwajswarna/gemma-2b-sft-telugu) [πŸ“„](https://gist.github.com/bharadwajswarna2/6d5088f1b86890249e5b9e509ca7a8ce)| 38.99 | 21.53 | 55.56 | 48.33 | 30.56 |
| [google/gemma-2b-it](https://huggingface.co/google/gemma-2b-it) [πŸ“„](https://gist.github.com/mlabonne/db0761e74175573292acf497da9e5d95) | 36.1 | 23.76 | 43.6 | 47.64 | 29.41 |
| [google/gemma-2b](https://huggingface.co/google/gemma-2b) [πŸ“„](https://gist.github.com/mlabonne/7df1f238c515a5f63a750c8792cef59e) | 34.26 | 22.7 | 43.35 | 39.96 | 31.03 |
**Training Process & Datasets:**
1. The Gemma 2B base model was further pretrained on a portion of the AI4Bharat Sangraha dataset (280k Telugu samples).
2. SFT on a mix of Telugu Alpaca + Telugu GPTeacher from Telugu LLM Labs, plus English Alpaca (see the formatting sketch below).
You can find the continually pretrained base model here: [Gemma-2b-Telugu-Base-Model](https://huggingface.co/bharadwajswarna/gemma-2b-tel-base-6ep)
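The SFT data follows the Alpaca-style template shown in the inference section below. As a minimal, hypothetical sketch (the field names `instruction`, `input`, and `output` are assumptions based on the standard Alpaca schema; the actual preprocessing used for this model is not published here), formatting one record could look like:
```python
# Hypothetical formatter for Alpaca-style SFT records.
# Assumes the standard Alpaca schema ("instruction", "input", "output");
# the exact preprocessing for this model is not documented in this card.
PROMPT_TEMPLATE = """### Instruction:
{instruction}

### Input:
{input}

### Response:
{output}"""

def format_record(record: dict) -> str:
    """Render one SFT example into the card's prompt template."""
    return PROMPT_TEMPLATE.format(
        instruction=record.get("instruction", ""),
        input=record.get("input", ""),
        output=record.get("output", ""),
    )

example = {
    "instruction": "Translate the following to Telugu.",
    "input": "Good morning",
    "output": "శుభోదయం",
}
print(format_record(example))
```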
**Training Duration:**
1. Pretraining: 6 epochs, roughly 35 hours (this may not be enough)
2. SFT: 3 epochs
**Inference Prompt Template:**
```
"""
### Instruction:
{}
### Input:
{}
### Response:
{}
"""
```
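As a minimal inference sketch using the standard πŸ€— Transformers API (the generation settings and the example instruction are illustrative assumptions, not values published with this model):
```python
# Minimal inference sketch with Transformers; generation settings are
# illustrative assumptions, not values published with this model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "bharadwajswarna/gemma-2b-sft-telugu"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Fill the card's prompt template, leaving the Response slot empty
# so the model completes it.
prompt = """### Instruction:
{}

### Input:
{}

### Response:
""".format("Translate the following sentence to Telugu.", "How are you?")

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```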
**Developer:**
[Bharadwaj Swarna](https://www.linkedin.com/in/bharadwajswarna/)\
You can reach out to me for any questions/suggestions/collaborations.