RichardErkhov committed on
Commit 1b770ce
Parent(s): 2468b1d

uploaded readme

Files changed (1):
  1. README.md +103 -0
README.md ADDED
@@ -0,0 +1,103 @@
Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)

bart-base-finetuned-xsum - bnb 4bits
- Model creator: https://huggingface.co/Vexemous/
- Original model: https://huggingface.co/Vexemous/bart-base-finetuned-xsum/
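
Since this upload is a bnb 4-bit quantization, the sketch below shows the typical way such a checkpoint is loaded with `transformers` and `bitsandbytes`. The NF4 settings are common defaults and an assumption here, not confirmed settings for this upload; a CUDA GPU is required.

```python
# Minimal sketch of a bnb 4-bit load; the quantization settings below are
# common defaults, not necessarily the ones used for this upload.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, BitsAndBytesConfig

model_id = "Vexemous/bart-base-finetuned-xsum"  # original checkpoint

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # store weights in 4 bits
    bnb_4bit_quant_type="nf4",             # assumed: NF4 quant type
    bnb_4bit_compute_dtype=torch.float16,  # compute in fp16
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",                     # requires a CUDA device
)
```
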
Original model description:
---
license: apache-2.0
base_model: facebook/bart-base
tags:
- generated_from_trainer
datasets:
- xsum
metrics:
- rouge
model-index:
- name: bart-base-finetuned-xsum
  results:
  - task:
      name: Sequence-to-sequence Language Modeling
      type: text2text-generation
    dataset:
      name: xsum
      type: xsum
      config: default
      split: train[:10%]
      args: default
    metrics:
    - name: Rouge1
      type: rouge
      value: 35.8214
pipeline_tag: summarization
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You should probably proofread and complete it, then remove this comment. -->

# bart-base-finetuned-xsum

This model is a fine-tuned version of [facebook/bart-base](https://huggingface.co/facebook/bart-base) on the xsum dataset.
It achieves the following results on the evaluation set:
- Loss: 1.9356
- Rouge1: 35.8214
- Rouge2: 14.7565
- Rougel: 29.4566
- Rougelsum: 29.4496
- Gen Len: 19.562
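
The card tags the model for summarization (`pipeline_tag: summarization`), so inference can go through the standard pipeline API. A minimal usage sketch; the input text and generation lengths are illustrative.

```python
# Minimal usage sketch for the fine-tuned summarizer.
from transformers import pipeline

summarizer = pipeline("summarization", model="Vexemous/bart-base-finetuned-xsum")

text = (
    "Illustrative input: a few sentences of news text to be compressed "
    "into a single-sentence, XSum-style summary."
)
# Gen Len above averages ~19.6 tokens, so a short max_length fits.
print(summarizer(text, max_length=24, min_length=5, do_sample=False)[0]["summary_text"])
```
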
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `Seq2SeqTrainingArguments` sketch follows the list):
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
- mixed_precision_training: Native AMP
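
As a rough guide, these settings map onto `Seq2SeqTrainingArguments` as sketched below; `output_dir` and the evaluation/generation flags are assumptions inferred from the per-epoch results table, not values stated on the card.

```python
# Sketch: the listed hyperparameters expressed as Seq2SeqTrainingArguments.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="bart-base-finetuned-xsum",  # assumed
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-8 is the Trainer default.
    lr_scheduler_type="linear",
    num_train_epochs=5,
    fp16=True,                    # "Native AMP" mixed precision
    evaluation_strategy="epoch",  # assumed from the per-epoch table
    predict_with_generate=True,   # assumed; needed for ROUGE at eval time
)
```
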
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 2.301         | 1.0   | 1148 | 1.9684          | 34.4715 | 13.6638 | 28.1147 | 28.1204   | 19.5816 |
| 2.1197        | 2.0   | 2296 | 1.9442          | 35.2502 | 14.284  | 28.8462 | 28.8384   | 19.5546 |
| 1.9804        | 3.0   | 3444 | 1.9406          | 35.7799 | 14.7422 | 29.3669 | 29.3742   | 19.5326 |
| 1.8891        | 4.0   | 4592 | 1.9349          | 35.5151 | 14.4668 | 29.0359 | 29.0484   | 19.5492 |
| 1.827         | 5.0   | 5740 | 1.9356          | 35.8214 | 14.7565 | 29.4566 | 29.4496   | 19.562  |
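
For reference, the ROUGE numbers above are on the usual 0-100 scale. Below is a sketch of the corresponding computation with the `evaluate` library; the strings and the stemming choice are illustrative, as the exact evaluation code is not given on the card.

```python
# Sketch: computing ROUGE with the `evaluate` library.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["a model-generated summary"],  # illustrative
    references=["the reference summary"],       # illustrative
    use_stemmer=True,                           # assumed
)
# compute() returns fractions; the card reports them scaled by 100.
print({k: round(v * 100, 4) for k, v in scores.items()})
```
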
### Framework versions

- Transformers 4.40.1
- Pytorch 1.13.1+cu117
- Datasets 2.19.0
- Tokenizers 0.19.1