Update README.md
README.md
CHANGED
@@ -15,7 +15,7 @@ widget:

 For further details read [our paper](http://real.mtak.hu/173960/1/TSD_2023_GPT.pdf) or testing our instruct model, see [our demo site](https://juniper.nytud.hu/demo/gptrio).

-- Hungarian-English-Chinese trilingual GPT-NeoX model (
+- Hungarian-English-Chinese trilingual GPT-NeoX model (7.67B billion parameter)
 - Trained with EleutherAI's GPT-NeoX [github](https://github.com/EleutherAI/gpt-neox)
 - Checkpoint: 410 000 steps

@@ -31,6 +31,7 @@ For further details read [our paper](http://real.mtak.hu/173960/1/TSD_2023_GPT.p

 - max_seq_length = 2048
 - float16
+- vocab size: 150 016


 ## Citation