Update README.md
README.md CHANGED
@@ -83,7 +83,7 @@ The dataset is a translation to Spanish of [alpaca_data_cleaned.json](https://gi
 ## Finetuning

-To fine-tune the BERTIN GPT-J-6B model we used the code available on [BERTIN's fork of `mesh-transformer-jax`](https://github.com/bertin-project/mesh-transformer-jax/blob/master/prepare_dataset_alpaca.py), which provides code to adapt an Alpaca dataset for fine-tuning any GPT-J-6B model. We ran fine-tuning for 3 epochs using a sequence length of 2048.
+To fine-tune the BERTIN GPT-J-6B model we used the code available on [BERTIN's fork of `mesh-transformer-jax`](https://github.com/bertin-project/mesh-transformer-jax/blob/master/prepare_dataset_alpaca.py), which provides code to adapt an Alpaca dataset for fine-tuning any GPT-J-6B model. We ran fine-tuning for 3 epochs using a sequence length of 2048 on a single TPUv3-8 for 3 hours, on top of BERTIN GPT-J-6B.

 ## Example outputs
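For context, below is a minimal sketch of the dataset-preparation step the linked script performs, assuming the standard Alpaca schema (`instruction`/`input`/`output`). The prompt template and the input filename are assumptions for illustration; the actual formatting lives in `prepare_dataset_alpaca.py`.

```python
import json

# Minimal sketch (not the fork's actual code): flatten each Alpaca-style
# record into one plain-text training example before tokenization.
# The real prompt template lives in prepare_dataset_alpaca.py and may differ.

def to_text(example: dict) -> str:
    """Join instruction, optional input, and output into a single string."""
    parts = [example["instruction"]]
    if example.get("input"):
        parts.append(example["input"])
    parts.append(example["output"])
    return "\n\n".join(parts)

# Hypothetical filename for the Spanish translation of alpaca_data_cleaned.json.
with open("alpaca_data_cleaned_spanish.json", encoding="utf-8") as f:
    records = json.load(f)

with open("train.txt", "w", encoding="utf-8") as out:
    for record in records:
        out.write(to_text(record) + "\n\n")
```

From there, fine-tuning follows the fork's `mesh-transformer-jax` workflow with the settings noted above: 3 epochs at sequence length 2048 on a single TPUv3-8.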