update readme; rinna pic
README.md
CHANGED
@@ -37,7 +37,7 @@ model = GPT2LMHeadModel.from_pretrained("rinna/japanese-gpt2-small")
 A 6-layer, 512-hidden-size transformer-based language model.
 
 # Training
-The model was trained on [Japanese CC-100](http://data.statmt.org/cc-100/ja.txt.xz) and [Japanese Wikipedia](https://dumps.wikimedia.org/
+The model was trained on [Japanese CC-100](http://data.statmt.org/cc-100/ja.txt.xz) and [Japanese Wikipedia](https://dumps.wikimedia.org/other/cirrussearch) to optimize a traditional language modelling objective on 8\*V100 GPUs for around 4 days. It reaches around 28 perplexity on a chosen validation set from CC-100.
 
 # Tokenization
 The model uses a [sentencepiece](https://github.com/google/sentencepiece)-based tokenizer; the vocabulary was trained on the Japanese Wikipedia using the official sentencepiece training script.
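
The hunk header above shows the model being loaded with `GPT2LMHeadModel.from_pretrained("rinna/japanese-gpt2-small")`. A minimal usage sketch under that context might look as follows; loading the sentencepiece vocabulary through `AutoTokenizer`, the prompt, and the sampling settings are illustrative assumptions, not something this diff confirms.

```python
# Minimal usage sketch. Only the GPT2LMHeadModel.from_pretrained call is
# confirmed by the hunk context above; resolving the sentencepiece
# vocabulary via AutoTokenizer is an assumption.
import torch
from transformers import AutoTokenizer, GPT2LMHeadModel

tokenizer = AutoTokenizer.from_pretrained("rinna/japanese-gpt2-small")
model = GPT2LMHeadModel.from_pretrained("rinna/japanese-gpt2-small")
model.eval()

# Encode a Japanese prompt and sample a short continuation.
inputs = tokenizer("こんにちは、", return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_length=30, do_sample=True, top_p=0.95)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```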
rinna.png
ADDED
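
The added README line reports around 28 perplexity on a CC-100 validation set. As a sketch of what that metric means for a causal language model (the exponential of the mean next-token cross-entropy), it could be computed as below; the actual validation split is not specified in the diff, so a placeholder sentence stands in for it.

```python
# Perplexity sketch: exp of the mean token-level cross-entropy.
# Assumption: the README's figure uses this standard definition; the
# CC-100 validation split itself is not given in the diff.
import torch
from transformers import AutoTokenizer, GPT2LMHeadModel

tokenizer = AutoTokenizer.from_pretrained("rinna/japanese-gpt2-small")
model = GPT2LMHeadModel.from_pretrained("rinna/japanese-gpt2-small")
model.eval()

enc = tokenizer("日本語のテキストの例です。", return_tensors="pt")
with torch.no_grad():
    # Passing labels=input_ids makes the model return the mean
    # cross-entropy loss over the shifted next-token predictions.
    out = model(**enc, labels=enc["input_ids"])
print(f"perplexity: {torch.exp(out.loss).item():.1f}")
```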