pszemraj committed on
Commit
11dbf7a
1 Parent(s): a2b4f17

add 4 epochs tuning

README.md CHANGED
@@ -2,55 +2,19 @@
  license: apache-2.0
  tags:
  - generated_from_trainer
- - distilgpt2
- - email generation
- - email
- datasets:
- - aeslc
- - postbot/multi-emails-100k
-
- widget:
- - text: "Good Morning Professor Beans,
-
- Hope you are doing well. I just wanted to reach out and ask if differential calculus will be on the exam"
- example_title: "email to prof"
- - text: "Hey <NAME>,\n\nThank you for signing up for my weekly newsletter. Before we get started, you'll have to confirm your email address."
- example_title: "newsletter"
- - text: "Hi <NAME>,\n\nI hope this email finds you well. I wanted to reach out and ask about office hours"
- example_title: "office hours"
- - text: "Greetings <NAME>,\n\nI hope you had a splendid evening at the Company sausage eating festival. I am reaching out because"
- example_title: "festival"
- - text: "Good Morning Harold,\n\nI was wondering when the next"
- example_title: "event"
- - text: "URGENT - I need the TPS reports"
- example_title: "URGENT"
- - text: "Hi Archibald,\n\nI hope this email finds you extremely well."
- example_title: "emails that find you"
- - text: "Hello there.\n\nI just wanted to reach out and check in to"
- example_title: "checking in"
- - text: "Hello <NAME>,\n\nI hope this email finds you well. I wanted to reach out and see if you've enjoyed your time with us"
- example_title: "work well"
- - text: "Hi <NAME>,\n\nI hope this email finds you well. I wanted to reach out and see if we could catch up"
- example_title: "catch up"
- - text: "I'm <NAME> and I just moved into the area and wanted to reach out and get some details on where I could get groceries and"
- example_title: "grocery"
- parameters:
- min_length: 4
- max_length: 128
- length_penalty: 0.8
- no_repeat_ngram_size: 2
- do_sample: False
- num_beams: 12
- early_stopping: True
- repetition_penalty: 2.5
+ model-index:
+ - name: distilgpt2-emailgen-V2-emailgen_DS-multi-clean-100k_Ep-4_Bs-16
+ results: []
  ---

+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->

- # distilgpt2-emailgen-v2
+ # distilgpt2-emailgen-V2-emailgen_DS-multi-clean-100k_Ep-4_Bs-16

- This model is a fine-tuned version of [distilgpt2](https://huggingface.co/distilgpt2) on the postbot/multi-emails-100k dataset.
+ This model is a fine-tuned version of [postbot/distilgpt2-emailgen-V2](https://huggingface.co/postbot/distilgpt2-emailgen-V2) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 2.0401
+ - Loss: 1.9126

  ## Model description

@@ -69,27 +33,26 @@ More information needed
  ### Training hyperparameters

  The following hyperparameters were used during training:
- - learning_rate: 0.001
- - train_batch_size: 8
- - eval_batch_size: 8
+ - learning_rate: 0.0006
+ - train_batch_size: 16
+ - eval_batch_size: 16
  - seed: 42
  - distributed_type: multi-GPU
- - gradient_accumulation_steps: 16
+ - gradient_accumulation_steps: 8
  - total_train_batch_size: 128
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: cosine
- - lr_scheduler_warmup_ratio: 0.02
- - num_epochs: 5
+ - lr_scheduler_warmup_ratio: 0.01
+ - num_epochs: 4

  ### Training results

  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:----:|:---------------:|
- | 2.4393 | 1.0 | 789 | 2.3821 |
- | 2.1549 | 2.0 | 1578 | 2.1982 |
- | 2.1424 | 3.0 | 2367 | 2.1065 |
- | 1.9885 | 4.0 | 3156 | 2.0514 |
- | 1.806 | 5.0 | 3945 | 2.0401 |
+ | 1.9045 | 1.0 | 789 | 2.0006 |
+ | 1.8115 | 2.0 | 1578 | 1.9557 |
+ | 1.8501 | 3.0 | 2367 | 1.9110 |
+ | 1.7376 | 4.0 | 3156 | 1.9126 |


  ### Framework versions
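
Reviewer note: the hyperparameters and results on the "+" side of the README.md diff can be cross-checked with a little arithmetic. The sketch below is only a sanity check, not part of the commit. It assumes that warmup steps are the warmup ratio times the total step count, rounded up, and that the reported loss is a mean per-token cross-entropy in nats (so its exponential is a perplexity). Both are assumptions, and all variable names are illustrative.

```python
import math

# Values copied from the "+" side of the README.md diff.
train_batch_size = 16
gradient_accumulation_steps = 8
total_train_batch_size = 128
num_epochs = 4
steps_per_epoch = 789        # optimizer step logged at epoch 1.0
warmup_ratio = 0.01
new_eval_loss = 1.9126
old_eval_loss = 2.0401       # from the "-" side (previous model card)

# Samples consumed per optimizer step from the two listed values alone.
effective_batch = train_batch_size * gradient_accumulation_steps

# Total optimizer steps; should match the last "Step" entry in the table.
total_steps = steps_per_epoch * num_epochs

# Upper bound on examples seen per epoch (the last step may be partial).
max_examples_per_epoch = steps_per_epoch * total_train_batch_size

# Assumption: warmup length is ratio * total steps, rounded up.
warmup_steps = math.ceil(total_steps * warmup_ratio)

# Assumption: loss is mean per-token cross-entropy in nats.
new_perplexity = math.exp(new_eval_loss)
old_perplexity = math.exp(old_eval_loss)

print(f"effective batch:        {effective_batch}")
print(f"total steps:            {total_steps}")
print(f"max examples per epoch: {max_examples_per_epoch}")
print(f"warmup steps:           {warmup_steps}")
print(f"perplexity old -> new:  {old_perplexity:.2f} -> {new_perplexity:.2f}")
```

The product of the listed batch size and accumulation steps already equals the listed total batch size, and 789 steps per epoch at 128 samples per step is consistent with a dataset of roughly 100k examples. The perplexity comparison is only meaningful if both runs used the same evaluation split, which the diff does not state.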
config.json CHANGED
@@ -1,5 +1,5 @@
  {
- "_name_or_path": "distilgpt2",
+ "_name_or_path": "postbot/distilgpt2-emailgen-V2",
  "_num_labels": 1,
  "activation_function": "gelu_new",
  "architectures": [
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:264051a4107113d7cccbff8e4c2b9afcaa6208ac97db37edb57fa161cdd1d5dc
+ oid sha256:5f564a5fc55c8e20acf4d8195073ee3c5a0ce012ab28beb9c5bdf357fa1b26b7
  size 333969117
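
Reviewer note: the `pytorch_model.bin` entry is a Git LFS pointer file, not the weights themselves. Only the `oid` changes here, while `size` stays the same, which is what one would expect when the weights are retrained without changing the architecture. A minimal sketch of reading such a pointer follows. The function name `parse_lfs_pointer` and the returned dictionary layout are hypothetical, and the parser assumes the simple `key value` line format visible in the diff.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file made of 'key value' lines."""
    fields = {}
    for line in text.strip().splitlines():
        # Split each line on the first space: "oid sha256:..." -> ("oid", "sha256:...")
        key, _, value = line.strip().partition(" ")
        fields[key] = value

    # The oid field has the form "<hash algorithm>:<hex digest>".
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


# Pointer contents taken from the "+" side of the pytorch_model.bin diff.
new_pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:5f564a5fc55c8e20acf4d8195073ee3c5a0ce012ab28beb9c5bdf357fa1b26b7
size 333969117
"""

parsed = parse_lfs_pointer(new_pointer)
print(parsed["algorithm"], parsed["digest"][:12], parsed["size"])
```

Comparing the parsed `digest` of the old and new pointers is a quick way to confirm that a commit actually replaced the weights.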
tokenizer_config.json CHANGED
@@ -19,7 +19,7 @@
  },
  "errors": "replace",
  "model_max_length": 1024,
- "name_or_path": "distilgpt2",
+ "name_or_path": "postbot/distilgpt2-emailgen-V2",
  "pad_token": null,
  "special_tokens_map_file": null,
  "tokenizer_class": "GPT2Tokenizer",
trainer_state.json CHANGED
The diff for this file is too large to render. See raw diff
 
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:dde570ff07cb025ff9030c6c32fd2d0b180fd30e7ee1096d7ce8a33b3d149c7f
- size 3567
+ oid sha256:d55d1b578701be043c5e54f68e065520d66dee6b2f36f998afd658361f854c85
+ size 3631