Update README.md
README.md CHANGED
@@ -17,6 +17,8 @@ model-index:
 
 # Suzume
 
+[[Paper](https://arxiv.org/abs/2405.12612)] [[Dataset](https://huggingface.co/datasets/lightblue/tagengo-gpt4)]
+
 This Suzume 8B, a multilingual finetune of Llama 3.
 
 Llama 3 has exhibited excellent performance on many English language benchmarks.
@@ -254,3 +256,22 @@ The following hyperparameters were used during training:
 - Pytorch 2.2.1+cu121
 - Datasets 2.18.0
 - Tokenizers 0.15.0
+
+# How to cite
+
+Please cite [this paper](https://arxiv.org/abs/2405.12612) when referencing this model.
+
+```tex
+@misc{devine2024tagengo,
+      title={Tagengo: A Multilingual Chat Dataset},
+      author={Peter Devine},
+      year={2024},
+      eprint={2405.12612},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL}
+}
+```
+
+# Developer
+
+Peter Devine - ([ptrdvn](https://huggingface.co/ptrdvn))