Create README.md
README.md
ADDED
@@ -0,0 +1,14 @@
+---
+datasets:
+- QingyiSi/Alpaca-CoT
+- tatsu-lab/alpaca
+- GAIR/lima
+language:
+- vi
+---
+
++ LLaMA 2 7B Chat model, with the vocabulary extended to 44800 tokens for Vietnamese understanding.
++ Continual pre-training on 2B Vietnamese tokens drawn from the VnNews Corpus, 10K vnthuquan books, and wikipedia_vi.
++ Fine-tuning on the vietllama2-tiny dataset, a combination of [Alpaca, CoT, LIMA, daily chat] translated into Vietnamese using OpenAI GPT-3.
+
++ For more information: email me at duyhunghd6@gmail.com | http://fb.com/hungbui2013
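
The vocabulary extension mentioned in the README can be pictured as growing the token embedding table from LLaMA 2's original 32000 rows to 44800 rows. The sketch below is only an illustration under stated assumptions: the README does not say how the new rows were initialized, so mean initialization is assumed here, and `extend_embeddings`, `HIDDEN`, and the toy table are hypothetical names and values rather than the author's code. In practice this step is commonly done with a library call such as `resize_token_embeddings` in Hugging Face Transformers.

```python
import random

OLD_VOCAB = 32000  # LLaMA 2's original tokenizer size
NEW_VOCAB = 44800  # extended size stated in the README
HIDDEN = 8         # toy width for illustration; the 7B model uses 4096


def extend_embeddings(table, new_size):
    """Return a copy of `table` grown to `new_size` rows.

    New rows are set to the column-wise mean of the existing rows
    (an assumed initialization, not confirmed by the README).
    """
    if new_size < len(table):
        raise ValueError("new_size must not be smaller than the current table")
    width = len(table[0])
    mean_row = [sum(row[j] for row in table) / len(table) for j in range(width)]
    extra = [list(mean_row) for _ in range(new_size - len(table))]
    return [list(row) for row in table] + extra


# Toy stand-in for the pretrained embedding table.
rng = random.Random(0)
old_table = [[rng.gauss(0.0, 0.02) for _ in range(HIDDEN)] for _ in range(OLD_VOCAB)]
new_table = extend_embeddings(old_table, NEW_VOCAB)

# Number of rows added for the new Vietnamese tokens.
print(len(new_table) - len(old_table))
```

The original rows are left untouched, so existing tokens keep their pretrained representations, while the added rows are the ones that continual pre-training on Vietnamese text would need to learn.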