abhinand committed
Commit dbed5af
1 Parent(s): 0939cad

Update README.md

Files changed (1)
  1. README.md +17 -0
README.md CHANGED
@@ -15,6 +15,23 @@ base_model: abhinand/gemma-2b-tamil

  This is a Tamil finetuned version of Google's Gemma 2B model. This is an experiment to see if Gemma can be adapted for Tamil without expanding vocabulary. While the responses may be rusty at times, it shows a lot of promise for a 2B parameter model.

+ **Procedure:**
+
+ 1. The [Gemma base model](https://huggingface.co/google/gemma-2b) was continually pretrained on all available Tamil Wikipedia data for 3 epochs.
+ 2. The updated model was then finetuned on a mix of English and Tamil alpaca datasets for 5 epochs.
+
+ > **Note:** This project is currently under development. The initial pretraining phase may not have been extensive enough; the model's performance could likely be improved by extending pretraining on a larger Tamil corpus such as CulturaX.
+
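For context, here is a minimal sketch of the two-stage recipe described above, written against the Hugging Face `transformers` Trainer API. The dataset identifier, sequence length, and batch size are illustrative assumptions; only the base model id and the epoch counts come from the procedure itself.

```python
# Minimal sketch of the two-stage recipe, assuming the Hugging Face Trainer API.
# The dataset id, max_length, and batch size are illustrative assumptions.
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_id = "google/gemma-2b"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

# Stage 1: continual pretraining on Tamil Wikipedia with a plain causal-LM objective.
wiki = load_dataset("wikimedia/wikipedia", "20231101.ta", split="train")  # assumed dataset id

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=1024)

train_ds = wiki.map(tokenize, batched=True, remove_columns=wiki.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="gemma-2b-tamil-cpt",
        num_train_epochs=3,            # 3 epochs of continual pretraining, per the procedure above
        per_device_train_batch_size=2,
        bf16=True,                     # matches the bfloat16 training precision listed below
    ),
    train_dataset=train_ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

# Stage 2 (not shown): repeat with the mixed English/Tamil alpaca-style instruction data
# for 5 epochs, after rendering each record into a single prompt+response string.
```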
+ ## Model description
+
+ - **Model type:** A 2B parameter GPT-like model finetuned on 100,000 samples, split equally between English and Tamil.
+ - **Language(s):** Bilingual. English and Tamil.
+ - **License:** [Google Gemma Terms of Use](https://ai.google.dev/gemma/terms)
+ - **Finetuned from model:** [abhinand/gemma-2b-tamil](https://huggingface.co/abhinand/gemma-2b-tamil)
+ - **Training Precision:** `bfloat16`
+ - **Training Hardware:** 4x Nvidia RTX 3090 GPUs
+ - **Training Cost:** $20
+
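For completeness, a minimal usage sketch for loading the finetuned checkpoint in the same `bfloat16` precision it was trained in. The repository id placeholder and the raw Tamil prompt are assumptions, since neither the final repo id nor the instruction prompt template is specified above.

```python
# Minimal usage sketch: load the finetuned checkpoint in bfloat16.
# MODEL_ID is a placeholder; substitute this repository's id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "<this-repo-id>"  # placeholder, not specified above

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "தமிழ்நாட்டின் தலைநகரம் எது?"  # "What is the capital of Tamil Nadu?" (illustrative prompt)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```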

  ## Support my work