A 24-layer, 2048-hidden-size transformer-based language model.
# Training

The model was trained on [Japanese C4](https://huggingface.co/datasets/allenai/c4), [Japanese CC-100](http://data.statmt.org/cc-100/ja.txt.xz) and [Japanese Wikipedia](https://dumps.wikimedia.org/other/cirrussearch) to optimize a traditional language modelling objective. It reaches around 14 perplexity on a validation set held out from the same data.
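The reported perplexity is a direct transformation of the language-modelling loss: it is the exponential of the mean per-token negative log-likelihood. A minimal sketch in plain Python (not the actual evaluation code, and the token losses below are illustrative):

```python
import math

def perplexity(token_nlls):
    """Perplexity = exp of the mean per-token negative log-likelihood (nats)."""
    return math.exp(sum(token_nlls) / len(token_nlls))

# A perplexity of ~14 corresponds to a mean loss of ln(14) ≈ 2.64 nats/token.
loss = math.log(14)
print(round(perplexity([loss, loss, loss]), 2))  # → 14.0
```
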
# Tokenization

The model uses a [sentencepiece](https://github.com/google/sentencepiece)-based tokenizer. The vocabulary was first trained on a selected subset of the training data using the official sentencepiece training script, and then augmented with emojis and symbols.
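The augmentation step can be pictured as appending any emoji or symbol tokens the trained vocabulary lacks, assigning them fresh ids after the existing ones. A minimal sketch in pure Python with hypothetical token lists (the actual augmentation procedure is not published here):

```python
def augment_vocab(base_vocab, extra_tokens):
    """Extend a trained vocabulary (token -> id) with extra tokens,
    skipping tokens already present and assigning fresh ids at the end."""
    vocab = dict(base_vocab)
    for tok in extra_tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab

# Hypothetical base vocabulary from sentencepiece training.
base = {"<unk>": 0, "▁": 1, "日本": 2, "語": 3}
augmented = augment_vocab(base, ["😀", "♪", "日本"])
print(augmented)  # "日本" is already present; 😀 and ♪ get new ids 4 and 5
```
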
# How to cite

~~~
@misc{rinna-japanese-gpt-1b,
    title = {rinna/japanese-gpt-1b},
    author = {Zhao, Tianyu and Sawada, Kei},
    url = {https://huggingface.co/rinna/japanese-gpt-1b},
}

@inproceedings{sawada2024release,
    title = {Release of Pre-Trained Models for the {J}apanese Language},
    author = {Sawada, Kei and Zhao, Tianyu and Shing, Makoto and Mitsui, Kentaro and Kaga, Akio and Hono, Yukiya and Wakatsuki, Toshiaki and Mitsuda, Koh},
    booktitle = {Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)},
    month = {5},
    year = {2024},
    url = {https://arxiv.org/abs/2404.01657},
}
~~~
# License

[The MIT license](https://opensource.org/licenses/MIT)