pszemraj committed
Commit 2788720
1 parent: a31b85b

Update README.md

Files changed (1)
  1. README.md +5 -6
README.md CHANGED
@@ -35,10 +35,12 @@ inference:
 ---
 
 
-# opt-peter-1pt3B-1E-ps_DS-msgs_Ep-2_Bs-4
+# pszemraj/opt-peter-1.3B
 
 This model is a fine-tuned version of [pszemraj/opt-peter-1.3B-1E](https://huggingface.co/pszemraj/opt-peter-1.3B-1E) on 80k Whatsapp/iMessages (mine).
-It achieves the following results on the evaluation set:
+
+It achieves the following results on the evaluation set, after training for 1 epoch (_on top of the 1E checkpoint linked above_):
+
 - eval_loss: 3.4220
 - eval_runtime: 954.9678
 - eval_samples_per_second: 9.114
@@ -48,7 +50,7 @@ It achieves the following results on the evaluation set:
 
 ## Model description
 
-- Exploring to see how OPT does in terms of dialogue/conversational applications
+- Exploring to see how OPT does in terms of dialogue/conversational applications :)
 - Seems to do a lot better than GPT-Neo with similar training parameters
 
 ## Intended uses & limitations
@@ -56,9 +58,6 @@ It achieves the following results on the evaluation set:
 - OPT has a license that does not allow for commercial use, see original for details
 - **any statements or claims made by this model do not reflect actual claims/statements by me**
 
-## Training and evaluation data
-
-More information needed
 
 ## Training procedure
 
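
The evaluation numbers quoted in the README hunks above are easier to interpret once converted. The following is a minimal sketch, not part of the commit: it assumes `eval_loss` is a mean per-token cross-entropy in nats and that `eval_runtime` is in seconds (both are assumptions, the README does not state units). The helper names `perplexity` and `approx_eval_samples` are hypothetical.

```python
import math

# Values copied from the README diff above.
EVAL_LOSS = 3.4220
EVAL_RUNTIME = 954.9678          # assumed to be seconds
EVAL_SAMPLES_PER_SECOND = 9.114


def perplexity(loss: float) -> float:
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(loss)


def approx_eval_samples(runtime: float, samples_per_second: float) -> int:
    """Estimate the evaluation set size from runtime and throughput."""
    return round(runtime * samples_per_second)


if __name__ == "__main__":
    print(f"perplexity ~ {perplexity(EVAL_LOSS):.2f}")
    print(f"eval samples ~ {approx_eval_samples(EVAL_RUNTIME, EVAL_SAMPLES_PER_SECOND)}")
```

Under those assumptions the loss corresponds to a perplexity of about 30.6, and the evaluation set works out to roughly 8,700 samples. Both figures are derived here, not reported by the author.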