mt5-large-gramatika161k-b16-lr0.001

This model is a fine-tuned version of google/mt5-large; the fine-tuning dataset is not documented in this card. It achieves the following results on the evaluation set (a minimal usage sketch follows the results):

  • Loss: 0.1429
  • Rouge1: 71.0622
  • Rouge2: 65.0219
  • RougeL: 70.921
  • RougeLsum: 70.9407
  • Gen Len: 18.3295
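
The sketch below shows one way to load the checkpoint for inference with the standard transformers API. The repository id is assumed from the card title (replace it with the full "user/repo" path), and the example input and task (grammatical error correction, implied by the "gramatika" name) are illustrative placeholders rather than details taken from this card.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "mt5-large-gramatika161k-b16-lr0.001"  # assumed repo id; replace with the full "user/repo" path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Example input sentence to correct."  # placeholder input
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```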

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adafactor
  • lr_scheduler_type: linear
  • num_epochs: 5
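
As a rough illustration, the listed hyperparameters map onto a standard transformers Seq2SeqTrainingArguments configuration as sketched below. The output directory and predict_with_generate setting are assumptions for illustration, not values taken from this card.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-large-gramatika161k-b16-lr0.001",  # assumed output path
    learning_rate=0.001,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adafactor",
    lr_scheduler_type="linear",
    num_train_epochs=5,
    predict_with_generate=True,  # assumed, since ROUGE is reported on generated text
)
```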

Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
|---------------|-------|-------|-----------------|---------|---------|---------|-----------|---------|
| 0.3954        | 0.63  | 5000  | 0.1851          | 69.5715 | 62.3503 | 69.3784 | 69.3899   | 18.3461 |
| 0.1746        | 1.27  | 10000 | 0.1537          | 70.6244 | 64.1779 | 70.4518 | 70.4717   | 18.3410 |
| 0.123         | 1.9   | 15000 | 0.1429          | 71.0622 | 65.0219 | 70.921  | 70.9407   | 18.3295 |
| 0.0758        | 2.54  | 20000 | 0.1468          | 71.5151 | 65.7486 | 71.3742 | 71.3959   | 18.3246 |
| 0.0568        | 3.17  | 25000 | 0.1603          | 71.6869 | 66.1031 | 71.5594 | 71.5794   | 18.3302 |
| 0.0327        | 3.81  | 30000 | 0.1556          | 71.9011 | 66.4738 | 71.7817 | 71.8013   | 18.3311 |
| 0.0196        | 4.44  | 35000 | 0.1782          | 72.0041 | 66.6645 | 71.886  | 71.9038   | 18.3293 |

Framework versions

  • Transformers 4.30.1
  • Pytorch 1.11.0a0+b6df043
  • Datasets 2.12.0
  • Tokenizers 0.13.3