mia naomi committed
Commit 1caebaf
Parent(s): 23fe965
Update README.md

README.md CHANGED
@@ -38,11 +38,11 @@ This checkpoint was afterwards finetuned on [tiny_shakespeare](https://huggingfa
| batch size | 8 |
| context length (tokens) | 256 |

-Trained on 1 Tesla T4 from
+Trained on 1 Tesla T4 from [google colab](https://colab.research.google.com/)

```TrainOutput(global_step=147, training_loss=1.665000240818984, metrics={'train_runtime': 2828.7347, 'train_samples_per_second': 0.417, 'train_steps_per_second': 0.052, 'total_flos': 1555992281088.0, 'train_loss': 1.665000240818984, 'epoch': 1.0})```

-A good starting point to finetune your own gpt-j-6b would be [hivemind's 8bit training code](https://huggingface.co/hivemind/gpt-j-6B-8bit).
+A good starting point to finetune your own gpt-j-6b would be [hivemind's 8bit training code](https://huggingface.co/hivemind/gpt-j-6B-8bit), or with the notebook [in this repository](https://huggingface.co/crumb/gpt-j-6b-shakespeare/blob/main/gpt_j_6b_bias+norm_fit.ipynb) which you can download and open in [google colab](https://colab.research.google.com/) or any other ipynb service

No LORA adapters were used for the sake of easy loading and inference with 🤗. Only Linear biases and LayerNorm scales were passed to the optimizer.
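For readers who want to reproduce the bias + LayerNorm fit that the README describes, here is a minimal, hypothetical sketch of passing only Linear biases and LayerNorm scales to the optimizer with PyTorch and 🤗 transformers. It is not the code from the linked notebook; the base model id, optimizer choice, and learning rate are illustrative assumptions.

```python
# Hypothetical sketch of "only Linear biases and LayerNorm scales were passed
# to the optimizer" -- NOT the exact code from the linked notebook.
import torch
from torch import nn
from transformers import AutoModelForCausalLM

# Base model id is an assumption; swap in whatever checkpoint you start from.
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")

# Freeze everything first.
for p in model.parameters():
    p.requires_grad = False

# Collect only LayerNorm scales and Linear biases, then unfreeze just those.
trainable = []
for module in model.modules():
    if isinstance(module, nn.LayerNorm):
        trainable.append(module.weight)            # LayerNorm scale
    elif isinstance(module, nn.Linear) and module.bias is not None:
        trainable.append(module.bias)              # Linear bias
for p in trainable:
    p.requires_grad = True

# Only this small parameter set is handed to the optimizer.
# lr is a placeholder, not taken from the README.
optimizer = torch.optim.AdamW(trainable, lr=1e-4)
```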
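Because no LORA adapters were used, the checkpoint loads like any other 🤗 model. The following usage sketch is an assumption on my part (fp16 on a CUDA device, placeholder prompt and sampling settings), not something stated in the README:

```python
# Hypothetical inference sketch for this checkpoint; dtype, device, and
# generation settings are assumptions, not from the README.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "crumb/gpt-j-6b-shakespeare"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.float16).to("cuda")

inputs = tokenizer("ROMEO:", return_tensors="pt").to("cuda")
out = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.8)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```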