crumb committed
Commit 1d44aac
Parent: 110184d

Update README.md

Files changed (1): README.md (+3 −1)
README.md CHANGED
@@ -40,7 +40,9 @@ This checkpoint was afterwards finetuned on [tiny_shakespeare](https://huggingfa
 | batch size | 8 |
 | context length (tokens) | 256 |
 
-I used a modified version of [hivemind's 8bit training script](https://huggingface.co/hivemind/gpt-j-6B-8bit) on 1 Tesla T4 for ~15 minutes
+Trained on 1 Tesla T4 (à la [google colab](https://colab.research.google.com/)) for ~15 minutes
+
+A good starting point to finetune your own gpt-j-6b would be [hivemind's 8bit training code](https://huggingface.co/hivemind/gpt-j-6B-8bit).
 
 No LORA adapters were used for the sake of easy loading and inference with 🤗. Only Linear biases and LayerNorm scales were passed to the optimizer.
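The note that "Only Linear biases and LayerNorm scales were passed to the optimizer" describes a BitFit-style parameter selection. A minimal sketch of what that selection could look like in PyTorch, using a tiny `nn.Sequential` model as a hypothetical stand-in for the real gpt-j-6b checkpoint (shapes and learning rate are illustrative assumptions, not values from the commit):

```python
import torch
from torch import nn

# Toy stand-in for gpt-j-6b (hypothetical shapes for illustration).
model = nn.Sequential(
    nn.Linear(16, 16),
    nn.LayerNorm(16),
    nn.Linear(16, 4),
)

# Collect only Linear biases and LayerNorm scales, as the README describes.
trainable = []
for module in model.modules():
    if isinstance(module, nn.Linear) and module.bias is not None:
        trainable.append(module.bias)
    elif isinstance(module, nn.LayerNorm):
        trainable.append(module.weight)

# Freeze every parameter, then re-enable gradients on the selected ones.
for p in model.parameters():
    p.requires_grad = False
for p in trainable:
    p.requires_grad = True

# Only the selected tensors are handed to the optimizer.
optimizer = torch.optim.Adam(trainable, lr=1e-4)
```

Training only these few tensors keeps optimizer state tiny, which is what makes fine-tuning an 8bit 6B-parameter model feasible on a single T4.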