tianyuz committed on
Commit
1954765
1 Parent(s): 7b407b0

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -27,17 +27,17 @@ This repository provides a base-sized Japanese RoBERTa model. The model is provi
 ~~~~
 from transformers import T5Tokenizer, RobertaModel
 
-tokenizer = T5Tokenizer.from_pretrained("rinna/japanese-roberta-base")
+tokenizer = T5Tokenizer.from_pretrained("rinna/japanese-roberta-base", use_auth_token=True)
 tokenizer.do_lower_case = True # due to some bug of tokenizer config loading
 
-model = RobertaModel.from_pretrained("rinna/japanese-roberta-base")
+model = RobertaModel.from_pretrained("rinna/japanese-roberta-base", use_auth_token=True)
 ~~~~
 
 # Model architecture
 A 12-layer, 768-hidden-size transformer-based masked language model.
 
 # Training
-The model was trained on [Japanese CC-100](http://data.statmt.org/cc-100/ja.txt.xz) and [Japanese Wikipedia](https://dumps.wikimedia.org/jawiki/) to optimize a masked language modelling objective on 8\*V100 GPUs for around 15 days.
+The model was trained on [Japanese CC-100](http://data.statmt.org/cc-100/ja.txt.xz) and [Japanese Wikipedia](https://dumps.wikimedia.org/jawiki/) to optimize a masked language modelling objective on 8\\*V100 GPUs for around 15 days.
 
 # Tokenization
 The model uses a [sentencepiece](https://github.com/google/sentencepiece)-based tokenizer, the vocabulary was trained on the Japanese Wikipedia using the official sentencepiece training script.
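
For context on the change above, the snippet below is a minimal, self-contained sketch of the updated loading code in use. The sample sentence, variable names, and the shape check are illustrative additions rather than part of the model card; `use_auth_token=True` comes from this commit and assumes you are logged in to the Hugging Face Hub (newer transformers releases accept `token=True` instead).

~~~~
from transformers import T5Tokenizer, RobertaModel
import torch

# Load the tokenizer and encoder as in the updated README snippet.
# use_auth_token=True assumes access to the repository and a local login token.
tokenizer = T5Tokenizer.from_pretrained("rinna/japanese-roberta-base", use_auth_token=True)
tokenizer.do_lower_case = True  # workaround noted in the README for tokenizer config loading

model = RobertaModel.from_pretrained("rinna/japanese-roberta-base", use_auth_token=True)
model.eval()

# Encode an arbitrary Japanese sentence (illustrative only) and inspect the
# hidden states produced by the 12-layer, 768-hidden-size encoder.
inputs = tokenizer("こんにちは、世界。", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

print(outputs.last_hidden_state.shape)  # (1, sequence_length, 768)
~~~~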
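
The Tokenization section notes that the tokenizer is sentencepiece-based, with a vocabulary trained on Japanese Wikipedia. A small sketch like the one below (the input sentence is illustrative and not from the model card) shows how to inspect the resulting subword segmentation; the exact pieces depend on the trained vocabulary.

~~~~
from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("rinna/japanese-roberta-base", use_auth_token=True)
tokenizer.do_lower_case = True  # same workaround as above

text = "日本語の自然言語処理は面白い。"  # illustrative sentence, not from the model card

# Inspect the sentencepiece subword segmentation and the corresponding vocabulary ids.
tokens = tokenizer.tokenize(text)
ids = tokenizer.convert_tokens_to_ids(tokens)
print(tokens)
print(ids)

# Round-trip back to a string to confirm the segmentation is recoverable
# (up to sentencepiece's whitespace handling).
print(tokenizer.convert_tokens_to_string(tokens))
~~~~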