sriram-sanjeev9s committed on
Commit
d8eb7e7
1 Parent(s): b5cc6cb

Model save

Files changed (3)
  1. README.md +95 -0
  2. generation_config.json +6 -0
  3. pytorch_model.bin +1 -1
README.md ADDED
@@ -0,0 +1,95 @@
---
license: apache-2.0
base_model: google-t5/t5-small
tags:
- generated_from_trainer
datasets:
- wmt14
metrics:
- bleu
model-index:
- name: T5_wmt14_En_Fr_1million
  results:
  - task:
      name: Sequence-to-sequence Language Modeling
      type: text2text-generation
    dataset:
      name: wmt14
      type: wmt14
      config: fr-en
      split: validation
      args: fr-en
    metrics:
    - name: Bleu
      type: bleu
      value: 8.7934
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# T5_wmt14_En_Fr_1million

This model is a fine-tuned version of [google-t5/t5-small](https://huggingface.co/google-t5/t5-small) on the wmt14 dataset.
It achieves the following results on the evaluation set:
- Loss: 1.3618
- Bleu: 8.7934
- Gen Len: 17.9953

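T5 casts translation as text-to-text with a natural-language task prefix. The card does not state the prefix used for this fine-tune, so the standard T5 prompt from the original paper is assumed here purely as an illustration:

```python
# Build a T5-style input for En->Fr translation. The prefix below is
# the standard one from the original T5 setup; whether this fine-tune
# used it is an assumption, since the card does not say.
TASK_PREFIX = "translate English to French: "

def build_input(sentence: str) -> str:
    return TASK_PREFIX + sentence.strip()

print(build_input("The house is wonderful."))
# prints: translate English to French: The house is wonderful.
```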
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 60
- eval_batch_size: 60
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20

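With a linear scheduler, the learning rate decays from 0.001 to 0 over the 33,340 training steps (20 epochs of 1,667 steps each, per the results table). A minimal sketch of that schedule; a warmup of 0 steps is an assumption, since the card reports none:

```python
def linear_lr(step: int, total_steps: int = 33340,
              base_lr: float = 1e-3, warmup_steps: int = 0) -> float:
    """Linear decay to zero after an optional linear warmup.

    warmup_steps=0 is an assumption; the hyperparameter list above
    does not mention any warmup.
    """
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, total_steps - step) / (total_steps - warmup_steps)
```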
### Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu   | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:-------:|
| 1.0796        | 1.0   | 1667  | 1.1872          | 9.2959 | 18.0253 |
| 1.01          | 2.0   | 3334  | 1.2029          | 9.1594 | 18.0187 |
| 0.9686        | 3.0   | 5001  | 1.2114          | 9.2836 | 18.0123 |
| 0.9366        | 4.0   | 6668  | 1.2261          | 9.18   | 17.995  |
| 0.8999        | 5.0   | 8335  | 1.2319          | 9.2754 | 17.9793 |
| 0.8769        | 6.0   | 10002 | 1.2413          | 9.1705 | 18.026  |
| 0.8536        | 7.0   | 11669 | 1.2502          | 9.036  | 17.9987 |
| 0.8273        | 8.0   | 13336 | 1.2633          | 9.2003 | 18.006  |
| 0.8125        | 9.0   | 15003 | 1.2740          | 9.0991 | 18.009  |
| 0.7905        | 10.0  | 16670 | 1.2835          | 8.9005 | 18.007  |
| 0.774         | 11.0  | 18337 | 1.2943          | 9.0676 | 17.9967 |
| 0.76          | 12.0  | 20004 | 1.3023          | 9.0644 | 18.0227 |
| 0.7358        | 13.0  | 21671 | 1.3125          | 8.9858 | 18.0027 |
| 0.7238        | 14.0  | 23338 | 1.3204          | 9.0178 | 18.0073 |
| 0.7143        | 15.0  | 25005 | 1.3317          | 8.9826 | 18.015  |
| 0.6988        | 16.0  | 26672 | 1.3402          | 8.9224 | 18.0073 |
| 0.6829        | 17.0  | 28339 | 1.3500          | 8.9307 | 17.996  |
| 0.6776        | 18.0  | 30006 | 1.3517          | 8.8798 | 17.9987 |
| 0.6695        | 19.0  | 31673 | 1.3585          | 8.895  | 17.9967 |
| 0.6637        | 20.0  | 33340 | 1.3618          | 8.7934 | 17.9953 |

### Framework versions

- Transformers 4.32.1
- Pytorch 1.12.1
- Datasets 2.18.0
- Tokenizers 0.13.2
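The Bleu column above is a corpus-level BLEU score: a brevity-penalized geometric mean of clipped 1- to 4-gram precisions. The actual numbers come from the Trainer's metric (typically sacrebleu, which also handles tokenization and smoothing); a minimal unsmoothed sketch of the computation on pre-tokenized sentences:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def corpus_bleu(candidates, references, max_n=4):
    """Corpus BLEU: geometric mean of clipped n-gram precisions
    times a brevity penalty. No smoothing, single reference per
    candidate; a sketch, not the sacrebleu implementation."""
    matches = [0] * max_n
    totals = [0] * max_n
    cand_len = ref_len = 0
    for cand, ref in zip(candidates, references):
        cand_len += len(cand)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            ref_counts = ngrams(ref, n)
            # Clip: a candidate n-gram is credited at most as many
            # times as it occurs in the reference.
            for gram, count in ngrams(cand, n).items():
                matches[n - 1] += min(count, ref_counts[gram])
            totals[n - 1] += max(len(cand) - n + 1, 0)
    if 0 in matches or 0 in totals:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity = 1.0 if cand_len > ref_len else math.exp(1.0 - ref_len / cand_len)
    return 100.0 * brevity * math.exp(log_precision)
```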
generation_config.json ADDED
@@ -0,0 +1,6 @@
{
  "decoder_start_token_id": 0,
  "eos_token_id": 1,
  "pad_token_id": 0,
  "transformers_version": "4.32.1"
}
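These fields seed the model's generation defaults and follow the usual T5 convention: the pad token (id 0) doubles as the decoder start token, and the `</s>` token (id 1) ends generation. A quick parse-and-check sketch of the saved config:

```python
import json

# The text below mirrors the generation_config.json content above.
config = json.loads("""
{
  "decoder_start_token_id": 0,
  "eos_token_id": 1,
  "pad_token_id": 0,
  "transformers_version": "4.32.1"
}
""")

# T5 convention: pad token (id 0) doubles as the decoder start token;
# </s> (id 1) terminates generation.
assert config["decoder_start_token_id"] == config["pad_token_id"] == 0
assert config["eos_token_id"] == 1
```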
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:90f732a6224c0ecf783fe80e70ce483bf3c30c06addbc97a02ad4b50da208e74
+ oid sha256:431fa6d6750b6f489dce4bd60fabdc56022de7f8241626e393089d41b793a12e
  size 242070267