vickt committed
Commit f35d05c
1 Parent(s): 4a7228a

End of training

README.md ADDED
@@ -0,0 +1,80 @@
+ ---
+ tags:
+ - generated_from_trainer
+ metrics:
+ - rouge
+ - precision
+ - recall
+ - f1
+ model-index:
+ - name: LLM_Teached_PEGASUS_CNNDM_2
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # LLM_Teached_PEGASUS_CNNDM_2
+
+ This model was trained from scratch on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 1.7016
+ - Rouge1: 0.4651
+ - Rouge2: 0.2076
+ - Rougel: 0.3457
+ - Rougelsum: 0.3459
+ - Gen Len: 52.1582
+ - Precision: 0.906
+ - Recall: 0.9098
+ - F1: 0.9077
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 2e-05
+ - train_batch_size: 16
+ - eval_batch_size: 4
+ - seed: 42
+ - gradient_accumulation_steps: 4
+ - total_train_batch_size: 64
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 10
+ - mixed_precision_training: Native AMP
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Precision | Recall | F1 |
+ |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:---------:|:------:|:------:|
+ | No log | 1.0 | 312 | 1.7705 | 0.4551 | 0.1985 | 0.335 | 0.3351 | 51.6464 | 0.9043 | 0.9073 | 0.9056 |
+ | 1.8539 | 2.0 | 625 | 1.7468 | 0.4578 | 0.2016 | 0.3394 | 0.3397 | 51.0627 | 0.9054 | 0.908 | 0.9065 |
+ | 1.8539 | 3.0 | 937 | 1.7331 | 0.4595 | 0.2019 | 0.3389 | 0.3391 | 52.9318 | 0.9039 | 0.9089 | 0.9063 |
+ | 1.7903 | 4.0 | 1250 | 1.7226 | 0.4606 | 0.2032 | 0.3406 | 0.3405 | 52.8055 | 0.9046 | 0.9094 | 0.9068 |
+ | 1.746 | 5.0 | 1562 | 1.7132 | 0.4642 | 0.2068 | 0.3453 | 0.3453 | 51.7873 | 0.9062 | 0.9096 | 0.9077 |
+ | 1.746 | 6.0 | 1875 | 1.7117 | 0.463 | 0.2055 | 0.3435 | 0.3436 | 53.4382 | 0.905 | 0.91 | 0.9073 |
+ | 1.7173 | 7.0 | 2187 | 1.7057 | 0.4644 | 0.2073 | 0.3456 | 0.3457 | 52.1718 | 0.906 | 0.9099 | 0.9078 |
+ | 1.7004 | 8.0 | 2500 | 1.7033 | 0.4668 | 0.2084 | 0.3464 | 0.3466 | 51.9 | 0.9063 | 0.91 | 0.908 |
+ | 1.7004 | 9.0 | 2812 | 1.7027 | 0.4651 | 0.2074 | 0.3457 | 0.3458 | 52.3591 | 0.906 | 0.9099 | 0.9078 |
+ | 1.6888 | 9.98 | 3120 | 1.7016 | 0.4651 | 0.2076 | 0.3457 | 0.3459 | 52.1582 | 0.906 | 0.9098 | 0.9077 |
+
+
+ ### Framework versions
+
+ - Transformers 4.36.0
+ - Pytorch 2.0.1+cu117
+ - Datasets 2.7.1
+ - Tokenizers 0.15.2
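The batch-size and step figures in the README diff can be cross-checked with a little arithmetic. The sketch below is an illustration only: it assumes a single training device (the card does not state the device count), it infers a training-set size of roughly 20,000 examples from the logged steps (the card calls the dataset unknown), and it assumes optimizer steps per epoch are floored after gradient accumulation. The function names are hypothetical helpers, not part of any library.

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device_batch * grad_accum_steps * num_devices


def approx_train_examples(step, epoch, batch_size):
    """Estimate the training-set size from one (epoch, step) row of the log."""
    return round(step / epoch * batch_size)


def optimizer_steps(num_examples, per_device_batch, grad_accum_steps, num_epochs):
    """Total optimizer steps, assuming per-epoch steps are floored after accumulation."""
    micro_batches = -(-num_examples // per_device_batch)  # ceiling division
    return (micro_batches // grad_accum_steps) * num_epochs


total_batch = effective_batch_size(16, 4)                # train_batch_size * gradient_accumulation_steps
examples = approx_train_examples(625, 2.0, total_batch)  # row "2.0 | 625" of the results table
steps = optimizer_steps(examples, 16, 4, 10)
final_epoch = steps * total_batch / examples

print(total_batch, examples, steps, round(final_epoch, 2))
```

Under these assumptions the numbers line up with the card: 16 × 4 gives the reported total_train_batch_size of 64, and flooring 312.5 steps per epoch to 312 would explain why the last row stops at step 3120 and epoch 9.98 rather than a full 10.0.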
generation_config.json ADDED
@@ -0,0 +1,12 @@
+ {
+   "bos_token_id": 0,
+   "decoder_start_token_id": 0,
+   "eos_token_id": 1,
+   "forced_eos_token_id": 1,
+   "length_penalty": 0.8,
+   "max_length": 128,
+   "min_length": 32,
+   "num_beams": 8,
+   "pad_token_id": 0,
+   "transformers_version": "4.36.0"
+ }
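The `length_penalty` of 0.8 combined with `num_beams` of 8 affects how finished beam hypotheses are ranked. A minimal sketch, assuming the scoring used by Hugging Face beam search, where a hypothesis score is its summed log-probability divided by `length ** length_penalty`. The two candidate summaries and their log-probabilities are invented for illustration, and `beam_score` is a hypothetical helper, not a library function.

```python
def beam_score(sum_logprobs, length, length_penalty=0.8):
    """Length-normalised beam score: higher (closer to zero) ranks first."""
    return sum_logprobs / (length ** length_penalty)


# Two made-up finished hypotheses: (summed log-probability, length in tokens).
short_hyp = (-20.0, 40)
long_hyp = (-28.0, 60)

for penalty in (0.0, 0.8, 1.0):
    s = beam_score(*short_hyp, length_penalty=penalty)
    l = beam_score(*long_hyp, length_penalty=penalty)
    winner = "short" if s > l else "long"
    print(f"length_penalty={penalty}: short={s:.4f} long={l:.4f} -> {winner}")
```

With these illustrative numbers the shorter hypothesis ranks first at 0.8, while full normalisation at 1.0 would prefer the longer one, so a value below 1.0 nudges generation toward shorter summaries within the `min_length` and `max_length` bounds.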
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:9ebd6ccf0d5d009c8907c0c594e304c369b78333252ad9f1c2ca2dfd5adb1f12
+ oid sha256:434c1a5389f8b428de8155355c95aaf56971b2df5a8613999e500cb15c34432a
  size 2283652852
runs/Mar11_17-27-37_muyg4vctr1710140194365-gzplj/events.out.tfevents.1710149260.muyg4vctr1710140194365-gzplj.10947.0 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:a7bcfc4c0aaec3ac4d0b114aa610103cb0504a3f1bc28abdb6d0d4a867372c96
- size 12289
+ oid sha256:ca066f91ec3a0d7ef959c26a90c085066e2a3e3e35c4fd69f53517ea5ffa55cb
+ size 13317
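The `model.safetensors` and TensorBoard event hunks change only Git LFS pointer files, not the binaries themselves. Each pointer records a spec version, a content hash (`oid`), and a byte size. The sketch below shows one way to parse such a pointer and check a downloaded blob against it, using only the Python standard library. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the `b"hello"` payload is a stand-in for a real file.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(
        line.strip().split(" ", 1)
        for line in text.strip().splitlines()
        if line.strip()
    )
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """True if the bytes have the size and hash recorded in the pointer."""
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


# The new model.safetensors pointer from this commit.
model_pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:434c1a5389f8b428de8155355c95aaf56971b2df5a8613999e500cb15c34432a\n"
    "size 2283652852\n"
)
print(model_pointer["algo"], model_pointer["size"])

# Stand-in payload: build a pointer for b"hello" and verify it round-trips.
payload = b"hello"
demo_pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    f"oid sha256:{hashlib.sha256(payload).hexdigest()}\n"
    f"size {len(payload)}\n"
)
print(matches_pointer(payload, demo_pointer))
```

Note that the model's size is unchanged at 2283652852 bytes while its hash differs, which is what one would expect when weights are updated in place without changing the architecture.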