navjordj committed on
Commit
5fd06b5
1 Parent(s): 94920e0

update model card README.md

.ipynb_checkpoints/all_results-checkpoint.json ADDED
@@ -0,0 +1,22 @@
+ {
+     "eval_gen_len": 41.562881562881564,
+     "eval_loss": 1.869093418121338,
+     "eval_rouge1": 35.1506,
+     "eval_rouge2": 16.0888,
+     "eval_rougeL": 29.7007,
+     "eval_rougeLsum": 32.4251,
+     "eval_runtime": 261.235,
+     "eval_samples": 819,
+     "eval_samples_per_second": 3.135,
+     "eval_steps_per_second": 0.199,
+     "predict_gen_len": 41.73230769230769,
+     "predict_loss": 1.8758330345153809,
+     "predict_rouge1": 35.1974,
+     "predict_rouge2": 16.4972,
+     "predict_rougeL": 30.2616,
+     "predict_rougeLsum": 32.5539,
+     "predict_runtime": 419.3492,
+     "predict_samples": 1300,
+     "predict_samples_per_second": 3.1,
+     "predict_steps_per_second": 0.196
+ }
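
The throughput figures in the JSON above are internally consistent: `samples_per_second` is `samples` divided by `runtime`. A minimal sketch verifying this, with the metric values copied from the file (the rounding to three decimals is an assumption about how the Trainer reports these numbers):

```python
# Sanity-check the reported throughput metrics from all_results.json:
# samples_per_second should equal samples / runtime.
eval_samples = 819
eval_runtime = 261.235          # seconds
predict_samples = 1300
predict_runtime = 419.3492      # seconds

eval_sps = round(eval_samples / eval_runtime, 3)
predict_sps = round(predict_samples / predict_runtime, 3)

print(eval_sps)     # matches "eval_samples_per_second": 3.135
print(predict_sps)  # matches "predict_samples_per_second": 3.1
```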
README.md ADDED
@@ -0,0 +1,62 @@
+ ---
+ tags:
+ - generated_from_trainer
+ datasets:
+ - navjordj/SNL_summarization
+ model-index:
+ - name: t5-large-snl-2
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # t5-large-snl-2
+
+ This model is a fine-tuned version of [navjordj/t5-large-snl](https://huggingface.co/navjordj/t5-large-snl) on the navjordj/SNL_summarization dataset.
+ It achieves the following results on the evaluation set:
+ - eval_loss: 1.8691
+ - eval_rouge1: 35.1506
+ - eval_rouge2: 16.0888
+ - eval_rougeL: 29.7007
+ - eval_rougeLsum: 32.4251
+ - eval_gen_len: 41.5629
+ - eval_runtime: 261.235
+ - eval_samples_per_second: 3.135
+ - eval_steps_per_second: 0.199
+ - step: 0
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-05
+ - train_batch_size: 16
+ - eval_batch_size: 16
+ - seed: 42
+ - distributed_type: multi-GPU
+ - gradient_accumulation_steps: 4
+ - total_train_batch_size: 64
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 20.0
+
+ ### Framework versions
+
+ - Transformers 4.27.0.dev0
+ - Pytorch 1.13.1
+ - Datasets 2.10.1
+ - Tokenizers 0.13.2
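
The hyperparameter list in the model card implies an effective train batch size of 64: the per-device batch of 16 accumulated over 4 gradient steps. A minimal sketch of that relationship; note the process count of 1 is an inference from the reported total (despite `distributed_type: multi-GPU`), not something stated in the card:

```python
# Effective (total) train batch size as reported in the model card:
# per-device batch size x gradient accumulation steps x number of processes.
train_batch_size = 16            # per device, from the card
gradient_accumulation_steps = 4  # from the card
num_processes = 1                # assumption, inferred from total_train_batch_size: 64

total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_processes
print(total_train_batch_size)  # 64, matching the card
```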