tianyuz committed on
Commit 72440c8 • 1 Parent(s): e993984

update readme

Files changed (1)
README.md +16 -10
README.md CHANGED
@@ -20,15 +20,25 @@ inference: false
 ![rinna-icon](./rinna.png)
 
 # Overview
-This repository provides a Japanese GPT-NeoX model of 3.6 billion parameters. The model was trained using code based on [EleutherAI/gpt-neox](https://github.com/EleutherAI/gpt-neox).
+This repository provides a Japanese GPT-NeoX model of 3.6 billion parameters.
 
-## Model architecture
-A 36-layer, 2816-hidden-size transformer-based language model.
+* **Library**
+
+    The model was trained using code based on [EleutherAI/gpt-neox](https://github.com/EleutherAI/gpt-neox).
 
-# Pre-training
-The model was trained on around **312.5B** tokens from [Japanese CC-100](http://data.statmt.org/cc-100/ja.txt.xz), [Japanese C4](https://huggingface.co/datasets/mc4), and [Japanese Wikipedia](https://dumps.wikimedia.org/other/cirrussearch) to optimize a traditional language modelling objective.
+* **Model architecture**
 
-A final validation perplexity of **8.68** has been reached.
+    A 36-layer, 2816-hidden-size transformer-based language model.
+
+* **Pre-training**
+
+    The model was trained on around **312.5B** tokens from [Japanese CC-100](http://data.statmt.org/cc-100/ja.txt.xz), [Japanese C4](https://huggingface.co/datasets/mc4), and [Japanese Wikipedia](https://dumps.wikimedia.org/other/cirrussearch) to optimize a traditional language modelling objective.
+
+    A final validation perplexity of **8.68** has been reached.
+
+* **Authors**
+
+    [Tianyu Zhao](https://huggingface.co/tianyuz) and [Kei Sawada](https://huggingface.co/keisawada)
 
 # How to use the model
 
@@ -89,9 +99,5 @@ The model uses a [sentencepiece](https://github.com/google/sentencepiece)-based
 # 'გამარ[UNK]ობა 吾輩は 猫である </s>'
 ~~~
 
-# Authors
-* [Tianyu Zhao](https://huggingface.co/tianyuz)
-* [Kei Sawada](https://huggingface.co/keisawada)
-
 # License
 [The MIT license](https://opensource.org/licenses/MIT)
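
Two of the figures in this diff can be sanity-checked directly. First, the stated depth and width are consistent with the 3.6 billion parameter count. A back-of-envelope sketch (my own arithmetic, not part of the commit; the 32,000-entry vocabulary is an assumption, since the README does not state it):

~~~
# Rough parameter count for a 36-layer, 2816-hidden-size transformer.
# vocab = 32000 is an assumed value; it is not stated in the README.
hidden, layers, vocab = 2816, 36, 32000

per_layer = 12 * hidden ** 2     # ~4*d^2 for attention + ~8*d^2 for the MLP, biases ignored
embeddings = 2 * vocab * hidden  # input embedding plus an (assumed untied) output head
total = layers * per_layer + embeddings
print(f"{total / 1e9:.2f}B")     # ≈ 3.61B, in line with the stated 3.6B
~~~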
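Second, the validation perplexity of **8.68** is simply the exponential of the mean per-token cross-entropy, so it translates directly into a loss value:

~~~
import math

# Perplexity = exp(mean cross-entropy), so PPL 8.68 corresponds to a
# validation loss of about 2.16 nats (3.12 bits) per token.
print(math.log(8.68))   # ≈ 2.161
print(math.log2(8.68))  # ≈ 3.118
~~~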
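The "How to use the model" section is unchanged by this commit, so the diff collapses it. For orientation, a minimal generation sketch, assuming the repository id `rinna/japanese-gpt-neox-3.6b` (the id is not shown on this page) and the Hugging Face `transformers` API:

~~~
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# The repo id below is an assumption. use_fast=False selects the slow
# tokenizer, appropriate for the sentencepiece-based tokenizer the
# README describes.
tokenizer = AutoTokenizer.from_pretrained("rinna/japanese-gpt-neox-3.6b", use_fast=False)
model = AutoModelForCausalLM.from_pretrained("rinna/japanese-gpt-neox-3.6b")

token_ids = tokenizer.encode("西田幾多郎は、", add_special_tokens=False, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(
        token_ids,
        max_new_tokens=64,
        do_sample=True,
        temperature=0.8,
        pad_token_id=tokenizer.pad_token_id,
    )
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
~~~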
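The `# 'გამარ[UNK]ობა 吾輩は 猫である </s>'` comment in the second hunk shows the tokenizer's round-trip behavior: characters missing from the sentencepiece vocabulary come back as `[UNK]`. A sketch of reproducing it, under the same repo-id assumption as above:

~~~
from transformers import AutoTokenizer

# Repo id is an assumption, as in the previous sketch.
tokenizer = AutoTokenizer.from_pretrained("rinna/japanese-gpt-neox-3.6b", use_fast=False)

# The Georgian letter ჯ is unlikely to appear in a mostly-Japanese
# vocabulary, so it maps to the unknown token on encode.
ids = tokenizer.encode("გამარჯობა 吾輩は 猫である")  # special tokens appended by default
print(tokenizer.decode(ids))
# Per the diff comment above: 'გამარ[UNK]ობა 吾輩は 猫である </s>'
~~~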