---

# pszemraj/opt-peter-1.3B

This model is a fine-tuned version of [pszemraj/opt-peter-1.3B-1E](https://huggingface.co/pszemraj/opt-peter-1.3B-1E) on 80k Whatsapp/iMessages (mine).

It achieves the following results on the evaluation set, after training for 1 epoch (_on top of the 1E checkpoint linked above_):

- eval_loss: 3.4220
- eval_runtime: 954.9678
- eval_samples_per_second: 9.114
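A minimal sketch of trying the checkpoint locally with the `transformers` text-generation pipeline. The prompt formatting and sampling parameters below are illustrative assumptions, not values specified by this card:

```python
# Hedged usage sketch: load the fine-tuned checkpoint with the
# transformers text-generation pipeline. Sampling parameters are
# illustrative defaults, not settings from the model card.
from transformers import pipeline


def generate_reply(message: str, model_id: str = "pszemraj/opt-peter-1.3B") -> str:
    """Generate a conversational continuation for `message`.

    Note: instantiating the pipeline downloads the ~1.3B-parameter
    weights on first use; CPU-only generation will be slow.
    """
    generator = pipeline("text-generation", model=model_id)
    out = generator(
        message.strip() + "\n",  # assumed single-turn prompt format
        max_new_tokens=64,
        do_sample=True,
        top_p=0.95,
    )
    return out[0]["generated_text"]
```

Calling `generate_reply("How was your day?")` returns the prompt plus a sampled continuation.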
## Model description

- Exploring how OPT does in terms of dialogue/conversational applications :)
- Seems to do a lot better than GPT-Neo with similar training parameters

## Intended uses & limitations

- OPT has a license that does not allow for commercial use; see the original model card for details
- **any statements or claims made by this model do not reflect actual claims/statements by me**

## Training procedure