Afrizal Hasbi Azizy commited on
Commit
607ee93
β€’
1 Parent(s): a00ec2c

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +1 -1
README.md CHANGED
@@ -41,7 +41,7 @@ Selamat datang!
41
 
42
  I am ultra-overjoyed to introduce you... the 🦌 Kancil! It's a fine-tuned version of Llama 3 8B with the TumpengQA, an instruction dataset of 6.7 million words. Both the model and dataset is openly available in Huggingface.
43
 
44
- πŸ“š The dataset was synthetically generated from Llama 3 70B. A big problem with existing Indonesian instruction dataset is they're really badly translated versions of English datasets. Llama 3 70B can generate fluent Indonesian! (with minor caveats πŸ˜”)
45
 
46
  🦚 This follows previous efforts for collection of open, fine-tuned Indonesian models, like Merak and Cendol. However, Kancil solely leverages synthetic data in a very creative way, which makes it a very unique contribution!
47
 
 
41
 
42
  I am ultra-overjoyed to introduce you... the 🦌 Kancil! It's a fine-tuned version of Llama 3 8B with the TumpengQA, an instruction dataset of 6.7 million words. Both the model and dataset is openly available in Huggingface.
43
 
44
+ πŸ“š The dataset was synthetically generated from Llama 3 70B. A big problem with existing Indonesian instruction dataset is they're in reality not-very-good-translations of English datasets. Llama 3 70B can generate fluent Indonesian! (with minor caveats πŸ˜”)
45
 
46
  🦚 This follows previous efforts for collection of open, fine-tuned Indonesian models, like Merak and Cendol. However, Kancil solely leverages synthetic data in a very creative way, which makes it a very unique contribution!
47