dkawahara committed on
Commit
17a38ab
1 Parent(s): 8341167

Updated README.md.

Files changed (1): README.md (+2 -4)
README.md CHANGED
@@ -1,7 +1,5 @@
 ---
 language: ja
-tags:
-- exbert
 license: cc-by-sa-4.0
 datasets:
 - wikipedia
@@ -30,7 +28,7 @@ encoding = tokenizer(sentence, return_tensors='pt')
 ...
 ```
 
-You can use this model for fine-tuning on downstream tasks.
+You can fine-tune this model on downstream tasks.
 
 ## Tokenization
 
@@ -42,7 +40,7 @@ The vocabulary consists of 32000 subwords induced by the unigram language model
 
 ## Training procedure
 
-This model was trained on Japanese Wikipedia and the Japanese portion of CC-100. It took a week using eight NVIDIA A100 GPUs.
+This model was trained on Japanese Wikipedia (as of 20210920) and the Japanese portion of CC-100. It took a week using eight NVIDIA A100 GPUs.
 
 The following hyperparameters were used during pretraining:
 - learning_rate: 1e-4
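The diff's Tokenization context line mentions a vocabulary of 32000 subwords induced by the unigram language model. As a rough sketch of how unigram-LM tokenization works, the following toy Viterbi segmenter picks the most probable subword split; the vocabulary and probabilities here are invented for illustration and are not taken from the model.

```python
import math

# Toy subword vocabulary with made-up unigram probabilities.
# A real SentencePiece unigram model learns 32000 such entries from the corpus.
VOCAB = {
    "自然": 0.05, "言語": 0.05, "処理": 0.05,
    "自": 0.01, "然": 0.01, "言": 0.01, "語": 0.01, "処": 0.01, "理": 0.01,
}

def unigram_segment(text, vocab):
    """Viterbi search for the most probable subword segmentation of `text`."""
    n = len(text)
    # best[i] = (log-probability of the best segmentation of text[:i], backpointer)
    best = [(-math.inf, -1)] * (n + 1)
    best[0] = (0.0, -1)
    for end in range(1, n + 1):
        for start in range(end):
            piece = text[start:end]
            if piece in vocab and best[start][0] > -math.inf:
                score = best[start][0] + math.log(vocab[piece])
                if score > best[end][0]:
                    best[end] = (score, start)
    # Recover the segmentation by following backpointers from the end.
    pieces, pos = [], n
    while pos > 0:
        start = best[pos][1]
        pieces.append(text[start:pos])
        pos = start
    return list(reversed(pieces))
```

Because two-character pieces are assigned higher probability than single characters, the segmenter prefers `["自然", "言語", "処理"]` over a character-by-character split.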