---
license: apache-2.0
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: t5-v1_1-base-gramatika-final-e8-b16
  results: []
---


# t5-v1_1-base-gramatika-final-e8-b16

This model is a fine-tuned version of [google/t5-v1_1-base](https://huggingface.co/google/t5-v1_1-base) on an unspecified dataset.
It achieves the following results on the evaluation set (a sketch of how such scores are computed follows the list):
- Loss: 0.1723
- Rouge1: 43.8331
- Rouge2: 34.7609
- RougeL: 43.5803
- RougeLsum: 43.5467
- Gen Len: 18.9287
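
The ROUGE values above are on a 0–100 scale. As a reference point, here is a minimal sketch of computing such scores with the `evaluate` library; the prediction and reference strings below are placeholders, since the evaluation data is not documented:

```python
import evaluate

# Placeholders: substitute decoded model outputs and gold target
# sentences from the evaluation set.
predictions = ["She goes to school every day."]
references = ["She goes to school every day."]

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=predictions,
    references=references,
    use_stemmer=True,
)

# `evaluate` returns fractions in [0, 1]; scaling by 100 matches the
# 0-100 scale of the numbers reported above.
print({k: round(v * 100, 4) for k, v in scores.items()})
```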

## Model description

More information needed

## Intended uses & limitations

More information needed
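
Pending details from the author, a minimal inference sketch with `transformers` is shown below. The model path is hypothetical (replace it with the actual Hub repo id or a local checkpoint directory), and any task prefix used during fine-tuning is not documented:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Hypothetical path: replace with the actual Hub repo id or a local
# checkpoint directory for t5-v1_1-base-gramatika-final-e8-b16.
model_id = "t5-v1_1-base-gramatika-final-e8-b16"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# The exact input format (e.g. a task prefix) used during fine-tuning
# is not documented in this card.
text = "Your input sentence here."
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# Evaluation outputs average ~18.9 tokens (Gen Len), so a modest
# generation budget suffices.
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```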

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 0.001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adafactor
- lr_scheduler_type: linear
- num_epochs: 8
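
A sketch of how these settings map onto `Seq2SeqTrainingArguments` in the `transformers` Trainer API. The datasets are placeholders since the training data is not documented, and the 300-step evaluation interval is inferred from the results table below:

```python
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("google/t5-v1_1-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/t5-v1_1-base")

args = Seq2SeqTrainingArguments(
    output_dir="t5-v1_1-base-gramatika-final-e8-b16",
    learning_rate=1e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adafactor",            # Adafactor optimizer
    lr_scheduler_type="linear",
    num_train_epochs=8,
    evaluation_strategy="steps",  # inferred: metrics logged every 300 steps
    eval_steps=300,
    predict_with_generate=True,   # inferred: ROUGE/Gen Len are reported
)

# Placeholders: the tokenized train/eval datasets are not documented.
train_dataset = None  # replace with a tokenized datasets.Dataset
eval_dataset = None   # replace with a tokenized datasets.Dataset

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    tokenizer=tokenizer,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
# trainer.train()  # runs once real datasets are supplied
```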

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 2.6434        | 0.37  | 300  | 0.4530          | 38.4418 | 26.1528 | 37.8295 | 37.7894   | 18.9434 |
| 0.5551        | 0.73  | 600  | 0.3368          | 39.8830 | 28.2471 | 39.2883 | 39.2822   | 18.9345 |
| 0.4523        | 1.10  | 900  | 0.2959          | 40.3084 | 29.2298 | 39.8742 | 39.8747   | 18.9350 |
| 0.4165        | 1.46  | 1200 | 0.2610          | 41.0422 | 30.4902 | 40.6542 | 40.6354   | 18.9350 |
| 0.3196        | 1.83  | 1500 | 0.2292          | 41.6111 | 31.1549 | 41.2572 | 41.2477   | 18.9355 |
| 0.2718        | 2.20  | 1800 | 0.2153          | 41.9295 | 31.6902 | 41.5757 | 41.5624   | 18.9334 |
| 0.2446        | 2.56  | 2100 | 0.2055          | 42.2918 | 32.4861 | 42.0541 | 42.0135   | 18.9324 |
| 0.2301        | 2.93  | 2400 | 0.2232          | 42.6172 | 33.0243 | 42.3474 | 42.3224   | 18.9334 |
| 0.1997        | 3.29  | 2700 | 0.1859          | 42.8442 | 33.4479 | 42.6294 | 42.6121   | 18.9350 |
| 0.1860        | 3.66  | 3000 | 0.1816          | 42.9407 | 33.5872 | 42.7248 | 42.7125   | 18.9277 |
| 0.1736        | 4.02  | 3300 | 0.1771          | 43.1994 | 34.0513 | 43.0334 | 42.9982   | 18.9308 |
| 0.1439        | 4.39  | 3600 | 0.1818          | 43.2146 | 33.9970 | 43.0221 | 42.9893   | 18.9282 |
| 0.1429        | 4.76  | 3900 | 0.1732          | 43.4458 | 34.3770 | 43.3072 | 43.2600   | 18.9277 |
| 0.1320        | 5.12  | 4200 | 0.1795          | 43.7156 | 34.6069 | 43.4982 | 43.4810   | 18.9292 |
| 0.1151        | 5.49  | 4500 | 0.1767          | 43.7618 | 34.7345 | 43.5565 | 43.5181   | 18.9287 |
| 0.1127        | 5.85  | 4800 | 0.1723          | 43.8331 | 34.7609 | 43.5803 | 43.5467   | 18.9287 |
| 0.0994        | 6.22  | 5100 | 0.1757          | 43.8866 | 34.9216 | 43.6410 | 43.6214   | 18.9287 |
| 0.0892        | 6.59  | 5400 | 0.1779          | 43.9415 | 34.9905 | 43.7332 | 43.7063   | 18.9292 |
| 0.0914        | 6.95  | 5700 | 0.1725          | 43.9439 | 35.0456 | 43.7419 | 43.7266   | 18.9298 |
| 0.0772        | 7.32  | 6000 | 0.1776          | 44.1132 | 35.3173 | 43.9301 | 43.9135   | 18.9287 |
| 0.0755        | 7.68  | 6300 | 0.1778          | 44.0494 | 35.3179 | 43.8797 | 43.8587   | 18.9282 |


### Framework versions

- Transformers 4.30.1
- Pytorch 1.11.0a0+b6df043
- Datasets 2.12.0
- Tokenizers 0.13.3