---

# pszemraj/opt-peter-1.3B

This model is a fine-tuned version of [pszemraj/opt-peter-1.3B-1E](https://huggingface.co/pszemraj/opt-peter-1.3B-1E) on 80k Whatsapp/iMessages (mine).

It achieves the following results on the evaluation set, after training for 1 epoch (_on top of the 1E checkpoint linked above_):

- eval_loss: 3.4220
- eval_runtime: 954.9678
- eval_samples_per_second: 9.114
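A minimal sketch of trying the checkpoint locally with the `transformers` text-generation pipeline. The prompt formatting and sampling parameters below are illustrative assumptions, not values specified by this card:

```python
# Hedged usage sketch: load the fine-tuned checkpoint with the
# transformers text-generation pipeline. Sampling parameters are
# illustrative defaults, not settings from the model card.
from transformers import pipeline


def generate_reply(message: str, model_id: str = "pszemraj/opt-peter-1.3B") -> str:
    """Generate a conversational continuation for `message`.

    Note: instantiating the pipeline downloads the ~1.3B-parameter
    weights on first use; CPU-only generation will be slow.
    """
    generator = pipeline("text-generation", model=model_id)
    out = generator(
        message.strip() + "\n",  # assumed single-turn prompt format
        max_new_tokens=64,
        do_sample=True,
        top_p=0.95,
    )
    return out[0]["generated_text"]
```

Calling `generate_reply("How was your day?")` returns the prompt plus a sampled continuation.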
## Model description

- Exploring how OPT does in terms of dialogue/conversational applications :)
- Seems to do a lot better than GPT-Neo with similar training parameters

## Intended uses & limitations

- OPT has a license that does not allow for commercial use; see the original model card for details
- **any statements or claims made by this model do not reflect actual claims/statements by me**

## Training procedure