Commit 1ba659d by ucsahin
1 Parent(s): d61830e

Training complete

Files changed (2)
  1. README.md +70 -0
  2. generation_config.json +6 -0
README.md ADDED
@@ -0,0 +1,70 @@
+ ---
+ license: apache-2.0
+ base_model: google/mt5-base
+ tags:
+ - Question Answering
+ - generated_from_trainer
+ metrics:
+ - rouge
+ model-index:
+ - name: mT5-base-turkish-qa
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # mT5-base-turkish-qa
+
+ This model is a fine-tuned version of [google/mt5-base](https://huggingface.co/google/mt5-base) on the None dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.5109
+ - Rouge1: 79.3283
+ - Rouge2: 68.0845
+ - Rougel: 79.3474
+ - Rougelsum: 79.2937
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 0.0001
+ - train_batch_size: 16
+ - eval_batch_size: 16
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 1
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
+ |:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|
+ | 2.0454 | 0.13 | 500 | 0.6771 | 73.1040 | 59.8915 | 73.1819 | 73.0558 |
+ | 0.8012 | 0.26 | 1000 | 0.6012 | 76.3357 | 64.1967 | 76.3796 | 76.2688 |
+ | 0.7703 | 0.39 | 1500 | 0.5844 | 76.8932 | 65.2509 | 76.9932 | 76.9418 |
+ | 0.6783 | 0.51 | 2000 | 0.5587 | 76.7284 | 64.8453 | 76.7416 | 76.6720 |
+ | 0.6546 | 0.64 | 2500 | 0.5362 | 78.2261 | 66.5893 | 78.2515 | 78.2142 |
+ | 0.6289 | 0.77 | 3000 | 0.5133 | 78.6917 | 67.1534 | 78.6852 | 78.6319 |
+ | 0.6292 | 0.9 | 3500 | 0.5109 | 79.3283 | 68.0845 | 79.3474 | 79.2937 |
+
+
+ ### Framework versions
+
+ - Transformers 4.36.2
+ - Pytorch 2.1.0+cu118
+ - Datasets 2.16.1
+ - Tokenizers 0.15.0
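
The card added above does not state the training set size or the total number of optimizer steps, but the Step and Epoch columns of the results table constrain both. The sketch below is an illustration only and not part of the commit. It assumes the Epoch column is `step / steps_per_epoch` rounded to two decimals, that training ran on a single device without gradient accumulation, and that the linear schedule used zero warmup steps; none of these is confirmed by the card. The names `consistent_steps_per_epoch` and `linear_lr` are hypothetical helpers introduced here.

```python
# Rows copied from the "Training results" table: (step, logged epoch).
LOGGED = [
    (500, 0.13), (1000, 0.26), (1500, 0.39), (2000, 0.51),
    (2500, 0.64), (3000, 0.77), (3500, 0.9),
]
TRAIN_BATCH_SIZE = 16  # from the hyperparameter list
LEARNING_RATE = 1e-4   # from the hyperparameter list


def consistent_steps_per_epoch(logged, search_limit=20_000):
    """Return every steps-per-epoch value that reproduces all logged epochs.

    Assumption: the Epoch column is step / steps_per_epoch rounded to 2 decimals.
    """
    return [
        n for n in range(1, search_limit)
        if all(round(step / n, 2) == epoch for step, epoch in logged)
    ]


def linear_lr(step, total_steps, base_lr=LEARNING_RATE):
    """Linear decay from base_lr to 0, assuming zero warmup steps."""
    return base_lr * max(0.0, (total_steps - step) / total_steps)


candidates = consistent_steps_per_epoch(LOGGED)
low, high = candidates[0], candidates[-1]
print(f"steps per epoch: between {low} and {high}")
# Upper bounds only: the last batch of an epoch may be partially filled.
print(f"training examples: at most {low * TRAIN_BATCH_SIZE} to {high * TRAIN_BATCH_SIZE}")
print(f"approximate lr at step 3500: {linear_lr(3500, high):.2e}")
```

Under these assumptions the search narrows to a small range just under 3,900 steps per epoch, which with a batch size of 16 corresponds to roughly 62,000 training examples. Because `num_epochs` is 1, the same figure would be the total step count that the linear schedule decays over.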
generation_config.json ADDED
@@ -0,0 +1,6 @@
+ {
+   "decoder_start_token_id": 0,
+   "eos_token_id": 1,
+   "pad_token_id": 0,
+   "transformers_version": "4.36.2"
+ }
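
The roles of the three token ids in generation_config.json can be illustrated with a toy decoding loop. This is a minimal sketch, not how Transformers implements generation: `scripted_model`, `greedy_decode`, and `pad_batch` are hypothetical helpers, and the token ids other than 0 and 1 are arbitrary values chosen for illustration. It shows that decoding begins from `decoder_start_token_id`, stops once `eos_token_id` is produced, and that shorter sequences in a batch are filled with `pad_token_id` (which here shares id 0 with the start token).

```python
import json

# The committed generation_config.json, reproduced verbatim.
GENERATION_CONFIG = json.loads("""
{
  "decoder_start_token_id": 0,
  "eos_token_id": 1,
  "pad_token_id": 0,
  "transformers_version": "4.36.2"
}
""")


def scripted_model(script):
    """Hypothetical stand-in for a model: replays a fixed list of token ids."""
    def next_token(tokens_so_far):
        # tokens_so_far always begins with the start token, hence the -1.
        return script[len(tokens_so_far) - 1]
    return next_token


def greedy_decode(next_token, config, max_new_tokens=8):
    """Toy loop: start from decoder_start_token_id, stop at eos_token_id."""
    tokens = [config["decoder_start_token_id"]]
    for _ in range(max_new_tokens):
        token = next_token(tokens)
        tokens.append(token)
        if token == config["eos_token_id"]:
            break
    return tokens


def pad_batch(sequences, config):
    """Right-pad sequences to equal length with pad_token_id."""
    width = max(len(seq) for seq in sequences)
    pad = config["pad_token_id"]
    return [seq + [pad] * (width - len(seq)) for seq in sequences]


# Arbitrary illustrative ids; 1 is the end-of-sequence id, so 99 is never reached.
decoded = greedy_decode(scripted_model([259, 1234, 1, 99]), GENERATION_CONFIG)
batch = pad_batch([decoded, [0, 42, 1]], GENERATION_CONFIG)
print(decoded)
print(batch)
```

In real use these values are read by the library when the model is loaded, so callers normally do not handle them directly; the loop above only makes their meaning concrete.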