matsuo-lab committed
Commit 112a5ad
1 Parent(s): ebb84f2

Update README.md

Files changed (1): README.md (+2 −2)

README.md CHANGED
@@ -17,14 +17,14 @@ This repository provides a Japanese-centric multilingual GPT-NeoX model of 10 bi
 
 * **Pre-training**
 
-  The model was trained on around **600B** tokens from a mixture of the following corpora
+  The model was trained on around **600B** tokens from a mixture of the following corpora.
 
   - [Japanese C4](https://huggingface.co/datasets/mc4)
   - [The Pile](https://huggingface.co/datasets/EleutherAI/pile)
 
 * **Instruction-supervised-finetuning**
 
- The model was finetuned on a subset records from a mixture of the following dataset
+  The model was finetuned on a subset of records from a mixture of the following datasets. Training epochs: 1.
 
   - [Alpaca (English)](https://github.com/gururise/AlpacaDataCleaned/blob/main/alpaca_data_cleaned.json)
   - [Alpaca (Japanese translation)](https://github.com/shi3z/alpaca_ja/blob/main/alpaca_cleaned_ja.json)
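The finetuning datasets listed above use Alpaca-format JSON records (`instruction`, optional `input`, `output`). The exact prompt template used for this model is not stated in the diff; the sketch below uses the common Alpaca-style template as an assumption, to show how one such record is typically turned into a supervised training example:

```python
# Minimal sketch: format an Alpaca-schema record into a training prompt.
# The template here is the widely used Alpaca convention, NOT confirmed
# as the one used for this model.
def format_record(record: dict) -> str:
    if record.get("input"):
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{record['instruction']}\n\n"
            f"### Input:\n{record['input']}\n\n"
            f"### Response:\n{record['output']}"
        )
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{record['instruction']}\n\n"
        f"### Response:\n{record['output']}"
    )

# Hypothetical record in the standard Alpaca schema.
example = {
    "instruction": "Translate the following text to Japanese.",
    "input": "Hello",
    "output": "こんにちは",
}
print(format_record(example))
```

Records from the English and Japanese-translated Alpaca files share this schema, so a mixture of the two can be formatted with the same function before tokenization.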