pszemraj committed
Commit ac37ba7
1 parent: 575c037

Update README.md

Files changed (1):
  1. README.md (+7 -8)
README.md CHANGED
@@ -32,16 +32,14 @@ inference:
   min_length: 2
   max_length: 64
   length_penalty: 0.7
-  no_repeat_ngram_size: 3
+  no_repeat_ngram_size: 2
   do_sample: True
-  top_p: 0.90
-  top_k: 15
+  top_p: 0.95
+  top_k: 30
   repetition_penalty: 2.1
 
 ---
 
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
 
 # distilgpt2-tiny-conversational
 
@@ -55,15 +53,16 @@ It achieves the following results on the evaluation set:
 
 ## Intended uses & limitations
 
-More information needed
+- [ai-msgbot](https://github.com/pszemraj/ai-msgbot)
 
 ## Training and evaluation data
 
-More information needed
+- [wizard of Wikipedia](https://parl.ai/projects/wizard_of_wikipedia/) parsed, from parlAI
 
 ## Training procedure
 
-- deepspeed
+- deepspeed + huggingface trainer, an example notebook is in [ai-msgbot](https://github.com/pszemraj/ai-msgbot)
+
 
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
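The commit widens the sampling pool (top_k 15 → 30, top_p 0.90 → 0.95). As a minimal pure-Python sketch — not the transformers implementation — of how these two filters interact when building the candidate set for each sampled token:

```python
def filter_candidates(probs, top_k=30, top_p=0.95):
    """Keep the top_k most likely tokens, then trim to the smallest
    prefix whose cumulative probability reaches top_p (nucleus sampling)."""
    # Rank token ids by probability, descending, and apply the top-k cutoff.
    ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)[:top_k]
    kept, cumulative = [], 0.0
    for token_id, p in ranked:
        kept.append(token_id)
        cumulative += p
        # Stop once the kept tokens cover top_p of the probability mass.
        if cumulative >= top_p:
            break
    return kept

# Toy 6-token vocabulary distribution: top_k=4 keeps tokens 0-3, and their
# cumulative mass (0.90) never reaches top_p=0.95, so all four survive.
print(filter_candidates([0.40, 0.25, 0.15, 0.10, 0.06, 0.04], top_k=4, top_p=0.95))  # [0, 1, 2, 3]
```

Raising top_k and top_p, as this commit does, enlarges the candidate set and makes generations more varied at the cost of occasional less-likely tokens.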
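The commit also tightens no_repeat_ngram_size from 3 to 2. A toy illustration of the idea — again a sketch, not the transformers implementation — with size 2, any token that would recreate a bigram already present in the output is banned at the next step:

```python
def banned_next_tokens(generated, ngram_size=2):
    """Return the set of tokens that would complete an n-gram
    already present in `generated` (no-repeat n-gram blocking)."""
    if len(generated) < ngram_size:
        return set()
    # The (n-1)-token prefix the next token would extend.
    prefix = tuple(generated[-(ngram_size - 1):])
    banned = set()
    for i in range(len(generated) - ngram_size + 1):
        ngram = tuple(generated[i:i + ngram_size])
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned

# After "the cat sat on the", the bigram ("the", "cat") already exists,
# so "cat" is banned as the next token.
print(banned_next_tokens(["the", "cat", "sat", "on", "the"]))  # {'cat'}
```

A smaller ngram_size is stricter: size 2 forbids repeating any word pair, whereas size 3 only forbids repeating triples.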