psxjp5 committed on
Commit
9dcae90
1 Parent(s): 363f796

update model card README.md

Files changed (1)
  1. README.md +78 -0
README.md ADDED
@@ -0,0 +1,78 @@
+ ---
+ license: apache-2.0
+ base_model: google/mt5-small
+ tags:
+ - generated_from_trainer
+ metrics:
+ - rouge
+ - bleu
+ model-index:
+ - name: mt5-small_large_lr
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # mt5-small_large_lr
+
+ This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unspecified dataset.
+ It achieves the following results on the evaluation set (final logged evaluation, step 1400):
+ - Loss: 0.9688
+ - Rouge1: 38.8633
+ - Rouge2: 33.0802
+ - Rougel: 37.6956
+ - Rougelsum: 37.7116
+ - Bleu: 26.6301
+ - Gen Len: 11.5566
+ - Meteor: 0.3519
+ - No ans accuracy: 22.99
+ - Av cosine sim: 0.6861
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 0.005
+ - train_batch_size: 16
+ - eval_batch_size: 16
+ - seed: 9
+ - gradient_accumulation_steps: 8
+ - total_train_batch_size: 128
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 20
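
As a rough illustration of how these settings relate, the sketch below recomputes the effective batch size and the learning rate a linear schedule would give at a chosen step. It assumes a single device, no warmup (none is listed), roughly 175 optimizer steps per epoch (read off the training results table), and a run of the full 20 configured epochs. The helper names `effective_batch_size` and `linear_lr` are hypothetical and not part of any library.

```python
# Values copied from the hyperparameter list in this card.
learning_rate = 0.005
train_batch_size = 16
gradient_accumulation_steps = 8
num_epochs = 20

# Assumptions (not stated in the card): one device, no warmup,
# and about 175 optimizer steps per epoch (read off the results table).
num_devices = 1
steps_per_epoch = 175


def effective_batch_size(per_device: int, accumulation: int, devices: int) -> int:
    """Number of examples contributing to each optimizer update."""
    return per_device * accumulation * devices


def linear_lr(step: int, total_steps: int, base_lr: float) -> float:
    """Linear decay from base_lr at step 0 to zero at total_steps (no warmup)."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


total_steps = steps_per_epoch * num_epochs

print(effective_batch_size(train_batch_size, gradient_accumulation_steps, num_devices))  # 128
print(total_steps)  # 3500
print(linear_lr(1400, total_steps, learning_rate))  # approximately 0.003
```

Under these assumptions the effective batch size matches the reported `total_train_batch_size` of 128, and the learning rate at step 1400 would still be about 60% of its initial value.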
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Bleu | Gen Len | Meteor | No ans accuracy | Av cosine sim |
+ |:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|:-------:|:------:|:---------------:|:-------------:|
+ | 5.4434 | 1.0 | 175 | 2.1918 | 1.8449 | 1.2024 | 1.7039 | 1.7116 | 0.0 | 2.7672 | 0.0145 | 28.9700 | 0.1363 |
+ | 1.8436 | 1.99 | 350 | 1.1852 | 33.6062 | 26.8725 | 32.2258 | 32.241 | 20.3395 | 12.2528 | 0.2957 | 17.3800 | 0.636 |
+ | 1.2276 | 2.99 | 525 | 1.0630 | 33.186 | 27.4949 | 32.0715 | 32.0522 | 20.3232 | 11.0301 | 0.2957 | 21.18 | 0.6109 |
+ | 0.9589 | 3.98 | 700 | 1.0083 | 40.265 | 33.6652 | 38.9503 | 38.9661 | 28.0884 | 12.8545 | 0.3623 | 17.54 | 0.7157 |
+ | 0.7931 | 4.98 | 875 | 0.9682 | 37.9437 | 31.7611 | 36.7618 | 36.7671 | 25.7738 | 12.0286 | 0.3424 | 20.66 | 0.6825 |
+ | 0.6686 | 5.97 | 1050 | 0.9601 | 37.5742 | 31.9098 | 36.4225 | 36.4381 | 24.9584 | 11.4169 | 0.3398 | 22.56 | 0.6713 |
+ | 0.5686 | 6.97 | 1225 | 0.9620 | 43.1436 | 36.6363 | 41.7279 | 41.7571 | 32.4301 | 13.6142 | 0.3893 | 16.9400 | 0.757 |
+ | 0.4939 | 7.96 | 1400 | 0.9688 | 38.8633 | 33.0802 | 37.6956 | 37.7116 | 26.6301 | 11.5566 | 0.3519 | 22.99 | 0.6861 |
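
The headline metrics at the top of the card correspond to the last logged row (step 1400), which is not the best row on every metric. The sketch below, which copies three columns from the table, shows one way to find the rows with the lowest validation loss and the highest Rouge1. The card does not say whether checkpoints from those steps were saved, so this is only a reading aid for the table.

```python
# (step, validation_loss, rouge1) copied from the training results table.
rows = [
    (175, 2.1918, 1.8449),
    (350, 1.1852, 33.6062),
    (525, 1.0630, 33.186),
    (700, 1.0083, 40.265),
    (875, 0.9682, 37.9437),
    (1050, 0.9601, 37.5742),
    (1225, 0.9620, 43.1436),
    (1400, 0.9688, 38.8633),
]

# Lowest validation loss and highest Rouge1 may point to different steps.
best_by_loss = min(rows, key=lambda r: r[1])
best_by_rouge1 = max(rows, key=lambda r: r[2])
final = rows[-1]

print(best_by_loss[0])    # 1050
print(best_by_rouge1[0])  # 1225
print(final[0])           # 1400
```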
+
+
+ ### Framework versions
+
+ - Transformers 4.31.0
+ - PyTorch 2.0.1+cu118
+ - Datasets 2.13.1
+ - Tokenizers 0.13.3