dkawahara committed
Commit 03d5bd1
1 Parent(s): 17a38ab

Updated README.md.

Files changed (1)
  1. README.md +3 -2
README.md CHANGED
@@ -32,11 +32,11 @@ You can fine-tune this model on downstream tasks.
 
 ## Tokenization
 
-The input text should be segmented into words by [Juman++](https://github.com/ku-nlp/jumanpp) in advance. Each word is tokenized into subwords by [sentencepiece](https://github.com/google/sentencepiece).
+The input text should be segmented into words by [Juman++](https://github.com/ku-nlp/jumanpp) in advance. Each word is tokenized into tokens by [sentencepiece](https://github.com/google/sentencepiece).
 
 ## Vocabulary
 
-The vocabulary consists of 32000 subwords induced by the unigram language model of [sentencepiece](https://github.com/google/sentencepiece).
+The vocabulary consists of 32000 tokens including words ([JumanDIC](https://github.com/ku-nlp/JumanDIC)) and subwords induced by the unigram language model of [sentencepiece](https://github.com/google/sentencepiece).
 
 ## Training procedure
 
@@ -53,6 +53,7 @@ The following hyperparameters were used during pretraining:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - training_steps: 700000
+- warmup_steps: 10000
 - mixed_precision_training: Native AMP
 
 ## Performance on JGLUE
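
As a usage note on the tokenization step described in the updated README: the sketch below shows one way to pre-segment text with Juman++ (via the pyknp bindings) before handing it to the model's sentencepiece-based tokenizer. It is only an illustration; `MODEL_NAME` is a placeholder for this model's repository id, and pyknp is just one available Juman++ wrapper.

```python
# Minimal sketch of the Juman++ pre-segmentation step; not the authors' official script.
# Assumes Juman++ is installed and reachable through pyknp, and that MODEL_NAME
# is replaced by this model's actual Hub id.
from pyknp import Juman
from transformers import AutoTokenizer

jumanpp = Juman()
tokenizer = AutoTokenizer.from_pretrained("MODEL_NAME")

text = "早稲田大学で自然言語処理を学ぶ。"

# Segment the raw text into words with Juman++ first ...
words = [mrph.midasi for mrph in jumanpp.analysis(text).mrph_list()]

# ... then pass the whitespace-joined words to the sentencepiece-based tokenizer.
encoded = tokenizer(" ".join(words), return_tensors="pt")
print(tokenizer.convert_ids_to_tokens(encoded["input_ids"][0]))
```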
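For the vocabulary change (32000 tokens mixing JumanDIC words and unigram-induced subwords), the following is a rough sketch of how such a vocabulary could be built with sentencepiece. The corpus path and the word list are placeholders, and this is not the authors' actual training procedure.

```python
# Illustrative only: training a 32000-token unigram sentencepiece model whose
# vocabulary also contains a fixed word list (e.g. entries drawn from JumanDIC).
import sentencepiece as spm

# Hypothetical dictionary words to keep as whole tokens in the vocabulary.
jumandic_words = ["早稲田", "大学", "自然", "言語", "処理"]

spm.SentencePieceTrainer.train(
    input="corpus.txt",              # placeholder path to a pre-segmented corpus
    model_prefix="spiece",
    model_type="unigram",            # unigram language model, as stated in the README
    vocab_size=32000,
    user_defined_symbols=jumandic_words,
)
```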
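Finally, the pretraining hyperparameters listed in the second hunk (including the newly added warmup_steps) can be expressed with `transformers.TrainingArguments` roughly as sketched below. Values not shown in the diff (learning rate, batch size, output directory, etc.) are placeholders or defaults, so this is a mapping of the listed settings, not a reproduction recipe.

```python
# Sketch: the hyperparameters from the README hunk mapped onto TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="out",            # placeholder
    max_steps=700_000,           # training_steps: 700000
    warmup_steps=10_000,         # warmup_steps: 10000 (added in this commit)
    lr_scheduler_type="linear",  # lr_scheduler_type: linear
    adam_beta1=0.9,              # optimizer: Adam with betas=(0.9,0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-08,          # and epsilon=1e-08
    fp16=True,                   # mixed_precision_training: Native AMP
)
```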