update readme; rinna pic
README.md
CHANGED
@@ -37,7 +37,7 @@ model = GPT2LMHeadModel.from_pretrained("rinna/japanese-gpt2-small")
 A 6-layer, 512-hidden-size transformer-based language model.
 
 # Training
-The model was trained on [Japanese CC-100](http://data.statmt.org/cc-100/ja.txt.xz) and [Japanese Wikipedia](https://dumps.wikimedia.org/
+The model was trained on [Japanese CC-100](http://data.statmt.org/cc-100/ja.txt.xz) and [Japanese Wikipedia](https://dumps.wikimedia.org/other/cirrussearch) to optimize a traditional language modelling objective on 8\*V100 GPUs for around 4 days. It reaches around 28 perplexity on a chosen validation set from CC-100.
 
 # Tokenization
 The model uses a [sentencepiece](https://github.com/google/sentencepiece)-based tokenizer; the vocabulary was trained on the Japanese Wikipedia using the official sentencepiece training script.
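
The hunk header above shows the model being loaded with `GPT2LMHeadModel.from_pretrained("rinna/japanese-gpt2-small")`. A minimal usage sketch under that context might look as follows; loading the sentencepiece vocabulary through `AutoTokenizer`, the prompt, and the sampling settings are illustrative assumptions, not something this diff confirms.

```python
# Minimal usage sketch. Only the GPT2LMHeadModel.from_pretrained call is
# confirmed by the hunk context above; resolving the sentencepiece
# vocabulary via AutoTokenizer is an assumption.
import torch
from transformers import AutoTokenizer, GPT2LMHeadModel

tokenizer = AutoTokenizer.from_pretrained("rinna/japanese-gpt2-small")
model = GPT2LMHeadModel.from_pretrained("rinna/japanese-gpt2-small")
model.eval()

# Encode a Japanese prompt and sample a short continuation.
inputs = tokenizer("こんにちは、", return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_length=30, do_sample=True, top_p=0.95)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```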
rinna.png
ADDED
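
The added README line reports around 28 perplexity on a CC-100 validation set. As a sketch of what that metric means for a causal language model (the exponential of the mean next-token cross-entropy), it could be computed as below; the actual validation split is not specified in the diff, so a placeholder sentence stands in for it.

```python
# Perplexity sketch: exp of the mean token-level cross-entropy.
# Assumption: the README's figure uses this standard definition; the
# CC-100 validation split itself is not given in the diff.
import torch
from transformers import AutoTokenizer, GPT2LMHeadModel

tokenizer = AutoTokenizer.from_pretrained("rinna/japanese-gpt2-small")
model = GPT2LMHeadModel.from_pretrained("rinna/japanese-gpt2-small")
model.eval()

enc = tokenizer("日本語のテキストの例です。", return_tensors="pt")
with torch.no_grad():
    # Passing labels=input_ids makes the model return the mean
    # cross-entropy loss over the shifted next-token predictions.
    out = model(**enc, labels=enc["input_ids"])
print(f"perplexity: {torch.exp(out.loss).item():.1f}")
```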