capleaf/T-Llama

1TuanPham committed on Mar 27

Commit bc66a0e • 1 Parent(s): e5daf15

Update README.md

Files changed (1)

README.md +3 -0


@@ -211,6 +211,9 @@ Here is a kaggle script to quickly test the model:
 * Learning rate: 2e-5 cosine
 * Optimizer: PagedLion8bit
 * QLora: rank: 64 /Q: 4-bit
+* Batch size: 2
+* Gradient accumulation: 128
+* Effective batch size: 256
   - 250k examples of 70% Vietnamese 30% English for 3.37 epoch
   - 350k examples of 60% Vietnamese 40% English for 1.4 epoch
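
The three added lines are related by simple arithmetic: the effective batch size is the per-step batch size multiplied by the number of gradient accumulation steps. The sketch below checks that relationship and estimates optimizer steps per epoch. It assumes a single training device (the README does not state a device count) and treats "250k" and "350k" as exact example counts, which they likely are not; the variable and function names are illustrative only.

```python
import math

# Values taken from the README lines added in this commit.
per_device_batch_size = 2
gradient_accumulation_steps = 128

# Assumption: one device. The README does not say how many were used.
num_devices = 1

# Gradients from 128 micro-batches of 2 examples are accumulated
# before each optimizer update.
effective_batch_size = (
    per_device_batch_size * gradient_accumulation_steps * num_devices
)


def optimizer_steps_per_epoch(num_examples: int, effective_batch: int) -> int:
    """Number of optimizer updates needed to see every example once."""
    return math.ceil(num_examples / effective_batch)


# Assumption: the rounded dataset sizes from the README are used as-is.
steps_250k = optimizer_steps_per_epoch(250_000, effective_batch_size)
steps_350k = optimizer_steps_per_epoch(350_000, effective_batch_size)

print(effective_batch_size, steps_250k, steps_350k)
```

Under these assumptions the product matches the "Effective batch size: 256" line, and each epoch corresponds to roughly a thousand optimizer updates for either dataset size.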