alexdg19 committed on
Commit
b56ab03
1 Parent(s): 158163e

End of training

README.md ADDED
@@ -0,0 +1,91 @@
+ ---
+ license: mit
+ base_model: alexdg19/bert_large_xsum_samsum
+ tags:
+ - generated_from_trainer
+ datasets:
+ - samsum
+ metrics:
+ - rouge
+ model-index:
+ - name: bert_large_xsum_samsum3
+   results:
+   - task:
+       name: Sequence-to-sequence Language Modeling
+       type: text2text-generation
+     dataset:
+       name: samsum
+       type: samsum
+       config: samsum
+       split: test
+       args: samsum
+     metrics:
+     - name: Rouge1
+       type: rouge
+       value: 0.5313
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # bert_large_xsum_samsum3
+
+ This model is a fine-tuned version of [alexdg19/bert_large_xsum_samsum](https://huggingface.co/alexdg19/bert_large_xsum_samsum) on the samsum dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 2.2354
+ - Rouge1: 0.5313
+ - Rouge2: 0.2827
+ - Rougel: 0.4367
+ - Rougelsum: 0.4357
+ - Gen Len: 30.939
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 2e-05
+ - train_batch_size: 2
+ - eval_batch_size: 2
+ - seed: 42
+ - gradient_accumulation_steps: 2
+ - total_train_batch_size: 4
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 10
+ - mixed_precision_training: Native AMP
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
+ |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
+ | No log | 1.0 | 164 | 1.1370 | 0.5599 | 0.3246 | 0.4748 | 0.4743 | 29.0122 |
+ | No log | 2.0 | 328 | 1.2659 | 0.5494 | 0.3033 | 0.4623 | 0.4612 | 27.0671 |
+ | No log | 3.0 | 492 | 1.4188 | 0.5198 | 0.2726 | 0.436 | 0.4346 | 28.6768 |
+ | 0.6603 | 4.0 | 656 | 1.5628 | 0.5391 | 0.2905 | 0.4555 | 0.4553 | 28.6159 |
+ | 0.6603 | 5.0 | 820 | 1.9045 | 0.5237 | 0.2774 | 0.4326 | 0.4321 | 31.5854 |
+ | 0.6603 | 6.0 | 984 | 2.0670 | 0.5199 | 0.2689 | 0.4251 | 0.4243 | 31.8049 |
+ | 0.1722 | 7.0 | 1148 | 1.9653 | 0.5269 | 0.2703 | 0.4342 | 0.4333 | 28.5122 |
+ | 0.1722 | 8.0 | 1312 | 2.1921 | 0.5296 | 0.2765 | 0.4393 | 0.4387 | 31.8354 |
+ | 0.1722 | 9.0 | 1476 | 2.4336 | 0.5299 | 0.2825 | 0.4399 | 0.4388 | 31.7988 |
+ | 0.052 | 10.0 | 1640 | 2.2354 | 0.5313 | 0.2827 | 0.4367 | 0.4357 | 30.939 |
+
+
+ ### Framework versions
+
+ - Transformers 4.35.0
+ - Pytorch 2.1.0+cu118
+ - Datasets 2.14.6
+ - Tokenizers 0.14.1
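
Reviewer note on the README hunk: the hyperparameter list and the Step column imply a few quantities the card does not state. The sketch below recomputes them in plain Python from values copied out of the card. The helper names are illustrative, the single-device default and the zero-warmup linear schedule are assumptions (the card lists neither a device count nor a warmup setting), and the per-epoch example count is only approximate because it depends on how a partial final batch is rounded.

```python
# Values copied from the model card added in this commit.
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 2
NUM_EPOCHS = 10
STEPS_PER_EPOCH = 164  # optimizer steps logged at epoch 1.0

# (epoch, validation loss, ROUGE-1) rows from the "Training results" table.
RESULTS = [
    (1, 1.1370, 0.5599),
    (2, 1.2659, 0.5494),
    (3, 1.4188, 0.5198),
    (4, 1.5628, 0.5391),
    (5, 1.9045, 0.5237),
    (6, 2.0670, 0.5199),
    (7, 1.9653, 0.5269),
    (8, 2.1921, 0.5296),
    (9, 2.4336, 0.5299),
    (10, 2.2354, 0.5313),
]


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accum_steps * num_devices


def linear_lr(step, total_steps, base_lr=LEARNING_RATE):
    """Linear decay from base_lr to 0. ASSUMPTION: no warmup steps."""
    return base_lr * max(0.0, 1.0 - step / total_steps)


total_batch = effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS)
total_steps = STEPS_PER_EPOCH * NUM_EPOCHS
# Approximate: ignores rounding of a partial final batch.
approx_examples_per_epoch = STEPS_PER_EPOCH * total_batch

best_by_loss = min(RESULTS, key=lambda row: row[1])
best_by_rouge1 = max(RESULTS, key=lambda row: row[2])

print("effective batch size:", total_batch)
print("total optimizer steps:", total_steps)
print("approx. examples per epoch:", approx_examples_per_epoch)
print("epoch with lowest validation loss:", best_by_loss[0])
print("epoch with highest ROUGE-1:", best_by_rouge1[0])
```

The computed batch size and step total agree with `total_train_batch_size: 4` and the final Step value of 1640 in the card. Both the lowest validation loss and the highest ROUGE-1 fall on epoch 1, while the headline evaluation numbers are those of epoch 10; validation loss rising as training loss drops to 0.052 may indicate overfitting, though the card itself draws no such conclusion.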
generation_config.json ADDED
@@ -0,0 +1,13 @@
+ {
+   "bos_token_id": 0,
+   "decoder_start_token_id": 2,
+   "early_stopping": true,
+   "eos_token_id": 2,
+   "forced_eos_token_id": 2,
+   "max_length": 62,
+   "min_length": 11,
+   "no_repeat_ngram_size": 3,
+   "num_beams": 6,
+   "pad_token_id": 1,
+   "transformers_version": "4.35.0"
+ }
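
Reviewer note on generation_config.json: the file sets beam-search decoding defaults (6 beams, 11 to 62 tokens, no repeated trigrams, early stopping). A minimal sketch of how these settings might be applied through the `transformers` summarization pipeline follows. The repository id is an assumption inferred from the author and model name in the card, the `summarize` helper is hypothetical, and the sample dialogue in the comment is made up for illustration.

```python
# Decoding settings copied from generation_config.json in this commit.
GENERATION_KWARGS = {
    "max_length": 62,
    "min_length": 11,
    "num_beams": 6,
    "no_repeat_ngram_size": 3,
    "early_stopping": True,
}

# ASSUMPTION: repo id inferred from the author and model name in the card.
MODEL_ID = "alexdg19/bert_large_xsum_samsum3"


def summarize(dialogue: str, model_id: str = MODEL_ID) -> str:
    """Summarize one dialogue with the settings above (downloads the model)."""
    # Imported lazily so the constants can be inspected without transformers.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=model_id)
    result = summarizer(dialogue, **GENERATION_KWARGS)
    return result[0]["summary_text"]


# Example call (hypothetical input; requires network access and transformers):
# print(summarize("Anna: Lunch at 12:30?\nBen: Sure, see you at the usual place."))
```

Passing the settings explicitly is redundant when the model is loaded from a repository that ships this generation_config.json, since `transformers` reads the file as the default; the explicit dictionary is shown only to make the decoding choices visible.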
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:3af9e27a60472f51a662041eb071dfba4a9e7b62464b102bebae6d2c07d76c48
+ oid sha256:5d2e1aa509ea7bbecb55bb07876cc9490cc24bba47c1d22aa232927a875bcd3c
  size 1625422896
runs/Nov07_22-56-48_98244a879ce7/events.out.tfevents.1699397815.98244a879ce7.409.0 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:e582d3e51f051f0afa8434313ab37a1fe4e054cfa48d75025a42a101b7bc23c4
- size 10527
+ oid sha256:4519810d065b00806816631caeeccf83ed53ee54f62f1f73ee699d4d21e7d45c
+ size 11406
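
Reviewer note on the Git LFS hunks: model.safetensors and the TensorBoard event file are stored as LFS pointers, so the diff shows only a new `oid` (SHA-256 of the file contents) and `size` in bytes. The sketch below, using only the Python standard library, parses a pointer and checks a local file against it. The function names are illustrative and not part of any Git LFS tooling.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's size and hash agree with the pointer."""
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Read in chunks so multi-gigabyte weights do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# New pointer for model.safetensors, copied from this commit.
NEW_MODEL_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:5d2e1aa509ea7bbecb55bb07876cc9490cc24bba47c1d22aa232927a875bcd3c
size 1625422896
"""

pointer = parse_lfs_pointer(NEW_MODEL_POINTER)
# Usage, assuming the weights were downloaded to the working directory:
# print(matches_pointer(Path("model.safetensors"), pointer))
```

The unchanged size of model.safetensors (1625422896 bytes) alongside a new hash is what one would expect when weights are updated without changing the architecture.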