crumb/gpt-j-6b-shakespeare

crumb committed on Jul 8, 2022

Commit 3380329 • 1 Parent(s): 3fc66ba

Update README.md

Files changed (1)

README.md +1 -1

@@ -42,7 +42,7 @@ This checkpoint was afterwards finetuned on [tiny_shakespeare](https://huggingfa
 I used a modified version of [hivemind's 8bit training script](https://huggingface.co/hivemind/gpt-j-6B-8bit) on 1 Tesla T4 for ~15 minutes
-No LORA adapters were used for the sake of easy loading and inference with 🤗. Finetuning was done traditionally (all parameters were passed to optimizer)
+No LORA adapters were used for the sake of easy loading and inference with 🤗. Only feed forward biases and layernorms were finetuned.
 End loss: 0.1757839471101761
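
The updated README line says that only feed forward biases and layernorms were finetuned. The training script itself is not part of this commit, so the following is only a minimal sketch of how such a selection could be made. It assumes GPT-J-style parameter names as used by 🤗 Transformers (`ln_1`, `ln_f`, `mlp.fc_in`, `mlp.fc_out`); the helper names `is_trainable` and `freeze_except_biases_and_layernorms` are hypothetical, and the modified 8bit script may name or select parameters differently.

```python
import re

# Assumption: parameter names follow the GPT-J layout in Hugging Face
# Transformers, e.g. "transformer.h.0.ln_1.weight" or
# "transformer.h.0.mlp.fc_in.bias". Adjust the patterns for other layouts.
_LAYERNORM = re.compile(r"(^|\.)ln_(1|f)\.(weight|bias)$")
_MLP_BIAS = re.compile(r"\.mlp\.fc_(in|out)\.bias$")


def is_trainable(name: str) -> bool:
    """Return True for layernorm parameters and feed forward (MLP) biases."""
    return bool(_LAYERNORM.search(name) or _MLP_BIAS.search(name))


def freeze_except_biases_and_layernorms(model):
    """Set requires_grad on every parameter of `model`.

    `model` is anything exposing named_parameters() like torch.nn.Module.
    Returns the names of the parameters left trainable.
    """
    trainable = []
    for name, param in model.named_parameters():
        param.requires_grad = is_trainable(name)
        if param.requires_grad:
            trainable.append(name)
    return trainable
```

With a PyTorch model, one would then pass only the parameters that still have `requires_grad` set to the optimizer. Whether the output head bias counts as a "feed forward bias" is not stated in the README; this sketch leaves it frozen.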