mia naomi committed
Commit ad1bad8
1 Parent(s): f41b4c6

Update README.md

Files changed (1):
  1. README.md +3 -2
README.md CHANGED
@@ -11,7 +11,8 @@ datasets:
 
  # GPT-J 6b Shakespeare
 
- <p style="color:green"> <b> The "Hosted inference API" does not work. Go to the <a href="https://huggingface.co/crumb/gpt-j-6b-shakespeare#how-to-use">How to Use</a> section </b>
+ <p style="color:green"> <b> 1.) The "Hosted inference API" does not work. Go to the <a href="https://huggingface.co/crumb/gpt-j-6b-shakespeare#how-to-use">How to Use</a> section <br>
+ 2.) This is a "proof of concept" and not fully trained, simple training script also in "How to Use" section. </b>
 
  ## Model Description
 
@@ -42,7 +43,7 @@ Trained on 1 Tesla T4 from [google colab](https://colab.research.google.com/)
 
  ```TrainOutput(global_step=147, training_loss=1.665000240818984, metrics={'train_runtime': 2828.7347, 'train_samples_per_second': 0.417, 'train_steps_per_second': 0.052, 'total_flos': 1555992281088.0, 'train_loss': 1.665000240818984, 'epoch': 1.0})```
 
- A good starting point to finetune your own gpt-j-6b would be [hivemind's 8bit training code](https://huggingface.co/hivemind/gpt-j-6B-8bit), or with the notebook [in this repository](https://huggingface.co/crumb/gpt-j-6b-shakespeare/blob/main/gpt_j_6b_bias+norm_fit.ipynb) which you can download and open in [google colab](https://colab.research.google.com/) or any other ipynb service
+ A good starting point to finetune your own gpt-j-6b would be [hivemind's 8bit training code](https://huggingface.co/hivemind/gpt-j-6B-8bit), or with the notebook in [this repository](https://github.com/aicrumb/gpt-j-8bit) which you can download and open in [google colab](https://colab.research.google.com/) or any other ipynb service
 
  No LORA adapters were used for the sake of easy loading and inference with 🤗. Only Linear biases and LayerNorm scales were passed to the optimizer.
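The closing line of the README describes the training setup used here: no LoRA adapters, only the Linear biases and LayerNorm scales of GPT-J passed to the optimizer. Below is a minimal sketch of that parameter selection, assuming plain PyTorch and 🤗 transformers; the base checkpoint name, optimizer choice, and learning rate are placeholders rather than values taken from this commit or its notebook.

```python
import torch
from transformers import AutoModelForCausalLM

# Placeholder base checkpoint; the commit itself does not pin one here.
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")

# Freeze everything by default.
for param in model.parameters():
    param.requires_grad = False

# Re-enable only Linear biases and LayerNorm scales (weights), per the README.
trainable_params = []
for module in model.modules():
    if isinstance(module, torch.nn.Linear) and module.bias is not None:
        module.bias.requires_grad = True
        trainable_params.append(module.bias)
    elif isinstance(module, torch.nn.LayerNorm):
        module.weight.requires_grad = True
        trainable_params.append(module.weight)

optimizer = torch.optim.AdamW(trainable_params, lr=1e-4)  # lr is illustrative only
```

Because only this small subset of parameters receives gradients and optimizer state, a run like the one logged above stays within reach of the single Tesla T4 mentioned in the README.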