hungeni committed on
Commit
c3e3e7f
1 Parent(s): 7d4e982

Create README.md

Files changed (1)
  1. README.md +14 -0
README.md ADDED
@@ -0,0 +1,14 @@
+ ---
+ datasets:
+ - QingyiSi/Alpaca-CoT
+ - tatsu-lab/alpaca
+ - GAIR/lima
+ language:
+ - vi
+ ---
+
+ + LLaMA 2 7B Chat model, with the vocabulary extended to 44800 tokens for Vietnamese understanding.
+ + Continual pre-training on 2B Vietnamese tokens taken from the VnNews corpus, 10K vnthuquan books, and wikipedia_vi.
+ + Fine-tuning on the vietllama2-tiny dataset, a combination of [Alpaca, CoT, LIMA, daily chat] translated into Vietnamese using OpenAI GPT-3.
+
+ + For more information: email me at duyhunghd6@gmail.com | http://fb.com/hungbui2013