keisawada committed
Commit 5db0508
1 parent: e50d65b

Update README.md

Files changed (1): README.md (+20, -0)
README.md CHANGED
@@ -62,7 +62,27 @@ A 24-layer, 2048-hidden-size transformer-based language model.
 
 # Training
 The model was trained on [Japanese C4](https://huggingface.co/datasets/allenai/c4), [Japanese CC-100](http://data.statmt.org/cc-100/ja.txt.xz) and [Japanese Wikipedia](https://dumps.wikimedia.org/other/cirrussearch) to optimize a traditional language modelling objective. It reaches around 14 perplexity on a chosen validation set from the same data.
+
 # Tokenization
 The model uses a [sentencepiece](https://github.com/google/sentencepiece)-based tokenizer. The vocabulary was first trained on a selected subset from the training data using the official sentencepiece training script, and then augmented with emojis and symbols.
+
+# How to cite
+~~~
+@misc{rinna-japanese-gpt-1b,
+    title = {rinna/japanese-gpt-1b},
+    author = {Zhao, Tianyu and Sawada, Kei},
+    url = {https://huggingface.co/rinna/japanese-gpt-1b},
+}
+
+@inproceedings{sawada2024release,
+    title = {Release of Pre-Trained Models for the {J}apanese Language},
+    author = {Sawada, Kei and Zhao, Tianyu and Shing, Makoto and Mitsui, Kentaro and Kaga, Akio and Hono, Yukiya and Wakatsuki, Toshiaki and Mitsuda, Koh},
+    booktitle = {Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)},
+    month = {5},
+    year = {2024},
+    url = {https://arxiv.org/abs/2404.01657},
+}
+~~~
86
+
87
  # Licenese
88
  [The MIT license](https://opensource.org/licenses/MIT)
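The Training section above reports "around 14 perplexity" on a held-out validation set. As a minimal sketch of what that figure means (not the authors' evaluation script): perplexity is the exponential of the average per-token negative log-likelihood under the language model.

```python
import math

def perplexity(token_nlls):
    """Perplexity from a list of per-token negative log-likelihoods (in nats)."""
    return math.exp(sum(token_nlls) / len(token_nlls))

# A model that assigns each token probability 1/14 (NLL = ln 14) has
# perplexity exactly 14: it is as uncertain as a uniform choice among
# 14 candidate tokens at every step.
nlls = [math.log(14.0)] * 8
print(perplexity(nlls))
```

So a perplexity of 14 corresponds to an average cross-entropy of ln(14) ≈ 2.64 nats per token; lower perplexity means the model concentrates more probability on the actual next token.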