Update README.md
README.md
CHANGED
@@ -32,16 +32,14 @@ inference:
 min_length: 2
 max_length: 64
 length_penalty: 0.7
-no_repeat_ngram_size:
+no_repeat_ngram_size: 2
 do_sample: True
-top_p: 0.
-top_k:
+top_p: 0.95
+top_k: 30
 repetition_penalty: 2.1
 
 ---
 
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
 
 # distilgpt2-tiny-conversational
 
@@ -55,15 +53,16 @@ It achieves the following results on the evaluation set:
 
 ## Intended uses & limitations
 
-
+- [ai-msgbot](https://github.com/pszemraj/ai-msgbot)
 
 ## Training and evaluation data
 
-
+- [wizard of Wikipedia](https://parl.ai/projects/wizard_of_wikipedia/) parsed, from parlAI
 
 ## Training procedure
 
-- deepspeed
+- deepspeed + huggingface trainer, an example notebook is in [ai-msgbot](https://github.com/pszemraj/ai-msgbot)
+
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
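
The updated inference parameters in the first hunk are the same keyword arguments that text generation in `transformers` accepts, so they can be tried locally. The following is a minimal sketch, not the author's own usage: `GENERATION_KWARGS` and `generate_reply` are hypothetical names, and the Hub model ID is not stated in this diff, so the caller must supply it.

```python
# Generation settings copied from the "+" side of the inference block in this diff.
GENERATION_KWARGS = {
    "min_length": 2,
    "max_length": 64,
    "length_penalty": 0.7,
    "no_repeat_ngram_size": 2,
    "do_sample": True,
    "top_p": 0.95,
    "top_k": 30,
    "repetition_penalty": 2.1,
}


def generate_reply(prompt: str, model_id: str) -> str:
    """Generate a continuation of `prompt` with the settings above.

    `model_id` is an assumption left to the caller: the diff names the model
    card (distilgpt2-tiny-conversational) but not its full Hub identifier.
    """
    # Imported lazily so the settings can be inspected without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_id)
    outputs = generator(prompt, **GENERATION_KWARGS)
    # The text-generation pipeline returns a list of dicts, one per returned sequence.
    return outputs[0]["generated_text"]
```

Calling `generate_reply("hello, how are you?", model_id=...)` downloads the model on first use. With `do_sample: True`, output varies between runs unless a seed is fixed.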