mia naomi committed
Commit ad1bad8 • Parent(s): f41b4c6
Update README.md
README.md
CHANGED
@@ -11,7 +11,8 @@ datasets:
 
 # GPT-J 6b Shakespeare
 
-<p style="color:green"> <b> The "Hosted inference API" does not work. Go to the <a href="https://huggingface.co/crumb/gpt-j-6b-shakespeare#how-to-use">How to Use</a> section
+<p style="color:green"> <b> 1.) The "Hosted inference API" does not work. Go to the <a href="https://huggingface.co/crumb/gpt-j-6b-shakespeare#how-to-use">How to Use</a> section. <br>
+2.) This is a proof of concept and not fully trained; a simple training script is also in the "How to Use" section. </b>
 
 ## Model Description
 
@@ -42,7 +43,7 @@ Trained on 1 Tesla T4 from [google colab](https://colab.research.google.com/)
 
 ```TrainOutput(global_step=147, training_loss=1.665000240818984, metrics={'train_runtime': 2828.7347, 'train_samples_per_second': 0.417, 'train_steps_per_second': 0.052, 'total_flos': 1555992281088.0, 'train_loss': 1.665000240818984, 'epoch': 1.0})```
 
-A good starting point to finetune your own gpt-j-6b would be [hivemind's 8bit training code](https://huggingface.co/hivemind/gpt-j-6B-8bit), or with the notebook
+A good starting point for finetuning your own gpt-j-6b would be [hivemind's 8bit training code](https://huggingface.co/hivemind/gpt-j-6B-8bit), or the notebook in [this repository](https://github.com/aicrumb/gpt-j-8bit), which you can download and open in [google colab](https://colab.research.google.com/) or any other ipynb service.
 
 No LoRA adapters were used, for the sake of easy loading and inference with 🤗. Only Linear biases and LayerNorm scales were passed to the optimizer.
 
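The diff's last paragraph says only Linear biases and LayerNorm scales were passed to the optimizer. A minimal sketch of what that parameter selection could look like in PyTorch — the toy model and the AdamW hyperparameters here are illustrative assumptions, not the author's actual training script:

```python
import torch
import torch.nn as nn

def bias_and_layernorm_params(model: nn.Module) -> list[nn.Parameter]:
    """Collect only Linear biases and LayerNorm weights/biases;
    freeze every other parameter in the model."""
    selected: list[nn.Parameter] = []
    for module in model.modules():
        if isinstance(module, nn.LayerNorm):
            # LayerNorm scale (weight) and shift (bias)
            selected.extend(module.parameters(recurse=False))
        elif isinstance(module, nn.Linear) and module.bias is not None:
            selected.append(module.bias)
    for p in model.parameters():
        p.requires_grad = False
    for p in selected:
        p.requires_grad = True
    return selected

# Toy stand-in for the transformer, just to show the effect
model = nn.Sequential(nn.Linear(8, 8), nn.LayerNorm(8), nn.Linear(8, 2))
params = bias_and_layernorm_params(model)
optimizer = torch.optim.AdamW(params, lr=1e-4)  # lr is an assumption
```

Because the Linear weight matrices stay frozen, the optimizer state stays tiny relative to full finetuning, which is what makes this practical on a single Tesla T4.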