Update README.md
README.md
CHANGED
@@ -211,6 +211,9 @@ Here is a kaggle script to quickly test the model:
 * Learning rate: 2e-5 cosine
 * Optimizer: PagedLion8bit
 * QLora: rank: 64 /Q: 4-bit
+* Batch size: 2
+* Gradient accumulation: 128
+* Effective batch size: 256
 
 - 250k examples of 70% Vietnamese 30% English for 3.37 epoch
 - 350k examples of 60% Vietnamese 40% English for 1.4 epoch
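
The effective batch size in the added lines is the per-step batch size multiplied by the gradient accumulation steps (2 × 128 = 256). A minimal Python sketch of that arithmetic follows. It assumes a single training device, which the README does not state, and the function names are hypothetical, not part of any training library. The steps-per-epoch figure is only a rough estimate, since trainers may round or drop the last partial batch differently.

```python
import math


def effective_batch_size(per_device_batch: int,
                         grad_accum_steps: int,
                         num_devices: int = 1) -> int:
    """Number of examples contributing to one optimizer update.

    num_devices=1 is an assumption; the README does not state the device count.
    """
    return per_device_batch * grad_accum_steps * num_devices


def approx_steps_per_epoch(num_examples: int, eff_batch: int) -> int:
    """Rough count of optimizer updates in one pass over the data."""
    return math.ceil(num_examples / eff_batch)


# Values taken from the README lines in the diff.
eff = effective_batch_size(per_device_batch=2, grad_accum_steps=128)
print(eff)                                   # 256
print(approx_steps_per_epoch(250_000, eff))  # rough estimate for the 250k set
```

With accumulation, gradients from 128 micro-batches of 2 examples are summed before each optimizer update, so memory use stays at the micro-batch level while the update reflects 256 examples.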