RMWeerasinghe committed
Commit d4c37bd
1 Parent(s): 7f28205

Training complete

Files changed (3)
  1. README.md +34 -28
  2. model.safetensors +1 -1
  3. training_args.bin +1 -1
README.md CHANGED
@@ -9,7 +9,6 @@ metrics:
 model-index:
 - name: long-t5-tglobal-base-boardpapers-4096
   results: []
-pipeline_tag: summarization
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -19,11 +18,11 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [RMWeerasinghe/long-t5-tglobal-base-finetuned-govReport-4096](https://huggingface.co/RMWeerasinghe/long-t5-tglobal-base-finetuned-govReport-4096) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.5617
-- Rouge1: 0.0743
-- Rouge2: 0.0398
-- Rougel: 0.0589
-- Rougelsum: 0.0703
+- Loss: 0.5356
+- Rouge1: 0.0844
+- Rouge2: 0.0543
+- Rougel: 0.0716
+- Rougelsum: 0.0842
 
 ## Model description
 
@@ -50,32 +49,39 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 30
+- num_epochs: 40
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
-| No log | 0.67 | 1 | 0.6654 | 0.0514 | 0.0197 | 0.0386 | 0.0477 |
-| No log | 2.0 | 3 | 0.6378 | 0.0667 | 0.0309 | 0.0512 | 0.0596 |
-| No log | 2.67 | 4 | 0.6293 | 0.0646 | 0.0274 | 0.0515 | 0.0619 |
-| No log | 4.0 | 6 | 0.6128 | 0.0706 | 0.0377 | 0.0566 | 0.067 |
-| No log | 4.67 | 7 | 0.6049 | 0.0706 | 0.0377 | 0.0566 | 0.067 |
-| No log | 6.0 | 9 | 0.5935 | 0.0706 | 0.0377 | 0.0566 | 0.067 |
-| No log | 6.67 | 10 | 0.5891 | 0.0718 | 0.0385 | 0.0578 | 0.067 |
-| No log | 8.0 | 12 | 0.5815 | 0.0743 | 0.0398 | 0.0589 | 0.0703 |
-| No log | 8.67 | 13 | 0.5785 | 0.0743 | 0.0398 | 0.0589 | 0.0703 |
-| No log | 10.0 | 15 | 0.5742 | 0.0743 | 0.0398 | 0.0589 | 0.0703 |
-| No log | 10.67 | 16 | 0.5724 | 0.0743 | 0.0398 | 0.0589 | 0.0703 |
-| No log | 12.0 | 18 | 0.5694 | 0.0743 | 0.0398 | 0.0589 | 0.0703 |
-| No log | 12.67 | 19 | 0.5681 | 0.0743 | 0.0398 | 0.0589 | 0.0703 |
-| 0.7929 | 14.0 | 21 | 0.5661 | 0.0743 | 0.0398 | 0.0589 | 0.0703 |
-| 0.7929 | 14.67 | 22 | 0.5652 | 0.0743 | 0.0398 | 0.0589 | 0.0703 |
-| 0.7929 | 16.0 | 24 | 0.5636 | 0.0743 | 0.0398 | 0.0589 | 0.0703 |
-| 0.7929 | 16.67 | 25 | 0.5630 | 0.0743 | 0.0398 | 0.0589 | 0.0703 |
-| 0.7929 | 18.0 | 27 | 0.5621 | 0.0743 | 0.0398 | 0.0589 | 0.0703 |
-| 0.7929 | 18.67 | 28 | 0.5619 | 0.0743 | 0.0398 | 0.0589 | 0.0703 |
-| 0.7929 | 20.0 | 30 | 0.5617 | 0.0743 | 0.0398 | 0.0589 | 0.0703 |
+| No log | 0.67 | 1 | 0.6583 | 0.0647 | 0.03 | 0.0504 | 0.0595 |
+| No log | 2.0 | 3 | 0.6232 | 0.067 | 0.036 | 0.0527 | 0.0643 |
+| No log | 2.67 | 4 | 0.6134 | 0.067 | 0.036 | 0.0527 | 0.0643 |
+| No log | 4.0 | 6 | 0.5971 | 0.0742 | 0.0426 | 0.0654 | 0.0735 |
+| No log | 4.67 | 7 | 0.5897 | 0.0765 | 0.0462 | 0.0654 | 0.0762 |
+| No log | 6.0 | 9 | 0.5777 | 0.0803 | 0.0486 | 0.0665 | 0.0802 |
+| No log | 6.67 | 10 | 0.5729 | 0.0813 | 0.0498 | 0.0677 | 0.0801 |
+| No log | 8.0 | 12 | 0.5652 | 0.0813 | 0.0498 | 0.0677 | 0.0801 |
+| No log | 8.67 | 13 | 0.5622 | 0.0823 | 0.0544 | 0.0685 | 0.0811 |
+| No log | 10.0 | 15 | 0.5575 | 0.0823 | 0.0544 | 0.0685 | 0.0811 |
+| No log | 10.67 | 16 | 0.5559 | 0.0823 | 0.0544 | 0.0685 | 0.0811 |
+| No log | 12.0 | 18 | 0.5528 | 0.0823 | 0.0544 | 0.0685 | 0.0811 |
+| No log | 12.67 | 19 | 0.5513 | 0.0823 | 0.0544 | 0.0685 | 0.0811 |
+| 0.7235 | 14.0 | 21 | 0.5488 | 0.0823 | 0.0544 | 0.0685 | 0.0811 |
+| 0.7235 | 14.67 | 22 | 0.5476 | 0.0811 | 0.0544 | 0.0674 | 0.0794 |
+| 0.7235 | 16.0 | 24 | 0.5451 | 0.086 | 0.0574 | 0.074 | 0.0841 |
+| 0.7235 | 16.67 | 25 | 0.5438 | 0.086 | 0.0574 | 0.074 | 0.0841 |
+| 0.7235 | 18.0 | 27 | 0.5420 | 0.086 | 0.0574 | 0.074 | 0.0841 |
+| 0.7235 | 18.67 | 28 | 0.5412 | 0.086 | 0.0574 | 0.074 | 0.0841 |
+| 0.7235 | 20.0 | 30 | 0.5397 | 0.086 | 0.0574 | 0.074 | 0.0841 |
+| 0.7235 | 20.67 | 31 | 0.5390 | 0.086 | 0.0574 | 0.074 | 0.0841 |
+| 0.7235 | 22.0 | 33 | 0.5377 | 0.0844 | 0.0543 | 0.0716 | 0.0842 |
+| 0.7235 | 22.67 | 34 | 0.5372 | 0.0844 | 0.0543 | 0.0716 | 0.0842 |
+| 0.7235 | 24.0 | 36 | 0.5363 | 0.0844 | 0.0543 | 0.0716 | 0.0842 |
+| 0.7235 | 24.67 | 37 | 0.5360 | 0.0844 | 0.0543 | 0.0716 | 0.0842 |
+| 0.7235 | 26.0 | 39 | 0.5357 | 0.0844 | 0.0543 | 0.0716 | 0.0842 |
+| 0.6478 | 26.67 | 40 | 0.5356 | 0.0844 | 0.0543 | 0.0716 | 0.0842 |
 
 
 ### Framework versions
@@ -83,4 +89,4 @@ The following hyperparameters were used during training:
 - Transformers 4.37.0
 - Pytorch 2.1.2
 - Datasets 2.17.0
-- Tokenizers 0.15.1
+- Tokenizers 0.15.1
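The headline change in this commit is the updated evaluation set: Loss 0.5617 → 0.5356, Rouge1 0.0743 → 0.0844, Rouge2 0.0398 → 0.0543, RougeL 0.0589 → 0.0716, RougeLsum 0.0703 → 0.0842. A small sketch for sanity-checking the relative improvement from these before/after numbers (the values come from the card; the script itself is illustrative, not part of the repository):

```python
# Evaluation metrics taken from the card, before and after this commit.
before = {"loss": 0.5617, "rouge1": 0.0743, "rouge2": 0.0398,
          "rougeL": 0.0589, "rougeLsum": 0.0703}
after = {"loss": 0.5356, "rouge1": 0.0844, "rouge2": 0.0543,
         "rougeL": 0.0716, "rougeLsum": 0.0842}

def relative_change(old: float, new: float) -> float:
    """Signed relative change, e.g. -0.05 for a 5% drop."""
    return (new - old) / old

deltas = {k: relative_change(before[k], after[k]) for k in before}
for name, d in deltas.items():
    print(f"{name:10s} {d:+.1%}")
```

Loss falls by roughly 4.6% while every ROUGE score rises, consistent with the extra training epochs recorded in the diff (num_epochs 30 → 40).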
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6ecdae2e04691f3beb40df306c530950a4454e32459ae9a6067150aca8e8b73e
+oid sha256:0545d0943680974a6a7e6c6ff55f3bb2490f6d823804e593581e311fcf727117
 size 990386200
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:78e7af93fe1786412c9c5d0c4ae83624c02f3d1d9ead6b1478d334990a5d0abc
+oid sha256:3501f87da90202598dbc3f9cd899a33e587c8550eb3df15c2f091a45fb2ddbee
 size 4856