MarPla committed on
Commit
03ef218
1 Parent(s): e4b1a61

End of training

README.md CHANGED
@@ -17,12 +17,12 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [facebook/bart-large-cnn](https://huggingface.co/facebook/bart-large-cnn) on an unknown dataset.
 It achieves the following results on the evaluation set:
- - Loss: 6.2735
- - Rouge1: 47.2445
- - Rouge2: 15.068
- - Rougel: 32.1487
- - Rougelsum: 44.5885
- - Gen Len: 246.14
+ - Loss: 3.8706
+ - Rouge1: 57.2472
+ - Rouge2: 23.1787
+ - Rougel: 41.8726
+ - Rougelsum: 53.8183
+ - Gen Len: 234.4232
 
 ## Model description
 
@@ -54,11 +54,24 @@ The following hyperparameters were used during training:
 
 ### Training results
 
+ | Training Loss | Epoch  | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len  |
+ |:-------------:|:------:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:--------:|
+ | 5.8303        | 0.0835 | 100  | 5.6762          | 48.0404 | 16.526  | 33.0315 | 45.2714   | 234.4232 |
+ | 5.2419        | 0.1671 | 200  | 5.1330          | 49.5121 | 17.8978 | 34.5708 | 46.291    | 234.4232 |
+ | 5.0085        | 0.2506 | 300  | 4.8037          | 52.3507 | 19.2179 | 36.3445 | 48.7473   | 234.4232 |
+ | 4.676         | 0.3342 | 400  | 4.5745          | 51.4939 | 19.2534 | 37.2441 | 48.7288   | 234.4232 |
+ | 4.4521        | 0.4177 | 500  | 4.4154          | 52.9389 | 20.2028 | 38.4139 | 49.9981   | 234.4232 |
+ | 4.4572        | 0.5013 | 600  | 4.2389          | 54.6029 | 21.0796 | 39.2355 | 51.1397   | 234.4232 |
+ | 4.2836        | 0.5848 | 700  | 4.1267          | 55.5174 | 22.1184 | 40.2744 | 52.0886   | 234.4232 |
+ | 4.2862        | 0.6684 | 800  | 4.0549          | 56.305  | 22.433  | 40.8636 | 52.6987   | 234.4232 |
+ | 4.0806        | 0.7519 | 900  | 3.9673          | 57.3033 | 22.873  | 41.2543 | 53.5936   | 234.4232 |
+ | 4.0806        | 0.8355 | 1000 | 3.9154          | 56.3519 | 22.7588 | 41.4512 | 52.9385   | 234.4232 |
+ | 3.8885        | 0.9190 | 1100 | 3.8706          | 57.2472 | 23.1787 | 41.8726 | 53.8183   | 234.4232 |
 
 
 ### Framework versions
 
- - Transformers 4.41.1
- - Pytorch 2.1.2
- - Datasets 2.2.1
+ - Transformers 4.41.2
+ - Pytorch 2.3.1+cu121
+ - Datasets 2.19.2
  - Tokenizers 0.19.1
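The Rouge1/Rouge2/RougeL values in the card above are word-overlap F1 scores reported on a 0–100 scale. As a rough illustration of what Rouge1 measures, here is a minimal pure-Python sketch of ROUGE-1 F1 over whitespace tokens; the metrics above were presumably computed with the full `rouge_score` implementation, which additionally applies tokenization and stemming, so this sketch is not expected to reproduce them exactly:

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: unigram-overlap F1 between a candidate and a reference summary."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped overlap: each unigram counts at most as often as it appears in both.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# 5 of 6 unigrams overlap in each direction, so precision = recall = F1 = 5/6.
print(rouge1_f1("the cat sat on the mat", "the cat lay on the mat"))
```

ROUGE-2 is the same computation over bigrams, and ROUGE-L replaces unigram overlap with the longest common subsequence.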
config.json CHANGED
@@ -64,7 +64,7 @@
     }
   },
   "torch_dtype": "float32",
-  "transformers_version": "4.41.1",
+  "transformers_version": "4.41.2",
   "use_cache": true,
   "vocab_size": 50264
 }
generation_config.json CHANGED
@@ -12,5 +12,5 @@
   "no_repeat_ngram_size": 3,
   "num_beams": 4,
   "pad_token_id": 1,
-  "transformers_version": "4.41.1"
+  "transformers_version": "4.41.2"
 }
merges.txt CHANGED
The diff for this file is too large to render. See raw diff
 
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6cbceb7a08d9fdabe156c80c879d2d57fe2667613aaae1132a96b342011b7008
+oid sha256:06a013fcee634911902d07f197b15ed34abbb13213964640a998010841a9d277
 size 1625422896
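The model.safetensors diff above changes a Git LFS pointer file, not the binary weights themselves: the repository stores only the `version`/`oid`/`size` triple, and the 1.6 GB safetensors blob lives in LFS storage keyed by the SHA-256 oid. A small sketch of parsing such a pointer (the values below are taken from the diff above):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its space-separated key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:06a013fcee634911902d07f197b15ed34abbb13213964640a998010841a9d277
size 1625422896"""

info = parse_lfs_pointer(pointer)
print(info["size"])  # 1625422896 — the actual weight file size in bytes
```

Because only the oid changed while the size stayed at 1625422896 bytes, this commit replaced the weights with a retrained checkpoint of identical shape.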
tokenizer.json CHANGED
The diff for this file is too large to render. See raw diff
 
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d75032202883cefe71c517a8cf177079eeeba8343b262c501777bf1a53b81b4c
+oid sha256:198daf957d7a81932cfa966d9fd2764e3f93f8ac2577b77af2d9c33c443e1474
 size 5048
vocab.json CHANGED
The diff for this file is too large to render. See raw diff