Commit 3380329 by crumb
Parent(s): 3fc66ba
Update README.md
README.md
CHANGED
@@ -42,7 +42,7 @@ This checkpoint was afterwards finetuned on [tiny_shakespeare](https://huggingfa
 
 I used a modified version of [hivemind's 8bit training script](https://huggingface.co/hivemind/gpt-j-6B-8bit) on 1 Tesla T4 for ~15 minutes
 
-No LORA adapters were used for the sake of easy loading and inference with 🤗.
+No LORA adapters were used for the sake of easy loading and inference with 🤗. Only feed forward biases and layernorms were finetuned.
 
 End loss: 0.1757839471101761
 
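The added README sentence says only feed-forward biases and layernorms were finetuned. A minimal sketch of how such a selection could be expressed by parameter name is shown below. It is not the author's actual script: the name patterns assume GPT-J-style parameter names (for example `transformer.h.0.mlp.fc_in.bias` and `transformer.h.0.ln_1.weight`), and the helpers `is_trainable` and `freeze_except_selected` are hypothetical names introduced here for illustration.

```python
import re

# ASSUMPTION: GPT-J-style parameter names; adjust the patterns for the
# architecture actually being finetuned.
TRAINABLE_PATTERNS = (
    re.compile(r"\.mlp\.fc_(in|out)\.bias$"),       # feed-forward biases
    re.compile(r"(^|\.)ln_(1|f)\.(weight|bias)$"),  # layernorm scale and shift
)


def is_trainable(name):
    """Return True if a parameter name matches one of the patterns above."""
    return any(pattern.search(name) for pattern in TRAINABLE_PATTERNS)


def freeze_except_selected(model):
    """Hypothetical helper: enable gradients only for the selected parameters.

    Works with any object exposing named_parameters() that yields
    (name, parameter) pairs, as torch.nn.Module does.
    Returns the names of the parameters left trainable.
    """
    trainable = []
    for name, param in model.named_parameters():
        param.requires_grad = is_trainable(name)
        if param.requires_grad:
            trainable.append(name)
    return trainable
```

With a PyTorch model, one would call `freeze_except_selected(model)` before building the optimizer and pass only the parameters with `requires_grad` set to it. Because no adapter modules are added, the resulting checkpoint keeps the same structure as the base model, which fits the README's remark about easy loading.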