abhinand committed
Commit dbed5af
1 Parent(s): 0939cad

Update README.md

Files changed (1)
  1. README.md +17 -0
README.md CHANGED
@@ -15,6 +15,23 @@ base_model: abhinand/gemma-2b-tamil

  This is a Tamil finetuned version of Google's Gemma 2B model. This is an experiment to see if Gemma can be adapted for Tamil without expanding vocabulary. While the responses may be rusty at times, it shows a lot of promise for a 2B parameter model.

+ **Procedure:**
+
+ 1. The [Gemma base model](https://huggingface.co/google/gemma-2b) was continually pretrained on all available Tamil Wikipedia data for 3 epochs.
+ 2. The updated model was then finetuned on a mix of English and Tamil alpaca datasets for 5 epochs.
+
+ > **Note:** This project is currently under development. The initial pretraining phase may not have been extensive enough; the model's performance could likely be improved by extending pretraining on a larger Tamil corpus such as CulturaX.
+
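For context, here is a minimal sketch of the two-stage recipe described above, written against the Hugging Face `transformers` Trainer API. The dataset identifier, sequence length, and batch size are illustrative assumptions; only the base model id and the epoch counts come from the procedure itself.

```python
# Minimal sketch of the two-stage recipe, assuming the Hugging Face Trainer API.
# The dataset id, max_length, and batch size are illustrative assumptions.
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_id = "google/gemma-2b"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

# Stage 1: continual pretraining on Tamil Wikipedia with a plain causal-LM objective.
wiki = load_dataset("wikimedia/wikipedia", "20231101.ta", split="train")  # assumed dataset id

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=1024)

train_ds = wiki.map(tokenize, batched=True, remove_columns=wiki.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="gemma-2b-tamil-cpt",
        num_train_epochs=3,            # 3 epochs of continual pretraining, per the procedure above
        per_device_train_batch_size=2,
        bf16=True,                     # matches the bfloat16 training precision listed below
    ),
    train_dataset=train_ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

# Stage 2 (not shown): repeat with the mixed English/Tamil alpaca-style instruction data
# for 5 epochs, after rendering each record into a single prompt+response string.
```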
+ ## Model description
+
+ - **Model type:** A 2B parameter GPT-like model finetuned on 100,000 samples, split equally between English and Tamil.
+ - **Language(s):** Bilingual. English and Tamil.
+ - **License:** [Google Gemma Terms of Use](https://ai.google.dev/gemma/terms)
+ - **Finetuned from model:** [abhinand/gemma-2b-tamil](https://huggingface.co/abhinand/gemma-2b-tamil)
+ - **Training Precision:** `bfloat16`
+ - **Training Hardware:** 4x Nvidia RTX 3090 GPUs
+ - **Training Cost:** $20
+
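For completeness, a minimal usage sketch for loading the finetuned checkpoint in the same `bfloat16` precision it was trained in. The repository id placeholder and the raw Tamil prompt are assumptions, since neither the final repo id nor the instruction prompt template is specified above.

```python
# Minimal usage sketch: load the finetuned checkpoint in bfloat16.
# MODEL_ID is a placeholder; substitute this repository's id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "<this-repo-id>"  # placeholder, not specified above

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "தமிழ்நாட்டின் தலைநகரம் எது?"  # "What is the capital of Tamil Nadu?" (illustrative prompt)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```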

  ## Support my work