
flan-t5-large-coref

This model is a fine-tuned version of google/flan-t5-large on the winograd_wsc dataset.

The model was trained on the task of coreference resolution.

It achieves the following results on the evaluation set:

  • Loss: 0.2404
  • Rouge1: 0.9495
  • Rouge2: 0.9107
  • Rougel: 0.9494
  • Rougelsum: 0.9494
  • Gen Len: 23.4828
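
As a rough illustration of what the Rouge1 score measures, the sketch below computes a unigram-overlap F-measure between a generated string and a reference. It is a simplified stand-in that assumes lowercasing and whitespace tokenization with no stemming. The function name `rouge1_f` and the example sentences are hypothetical, and the scores reported above were presumably produced by a standard ROUGE implementation rather than by this code.

```python
from collections import Counter


def rouge1_f(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F-measure: unigram overlap with clipped counts."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # A unigram is credited at most as often as it occurs in both texts.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical example pair (not taken from the evaluation set).
reference = "the trophy did not fit in the suitcase because the trophy was too big"
prediction = "the trophy did not fit in the suitcase because the suitcase was too big"
print(round(rouge1_f(prediction, reference), 4))
```

Because most of the sentence is shared between prediction and reference in a pair like this, a single wrong token still leaves a high score, so Rouge1 here is best read alongside Rouge2 and the validation loss.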

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
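
The learning-rate schedule implied by these settings can be sketched in plain Python. This assumes no warmup steps (none are listed above) and 16 optimizer steps per epoch, the step count reported at epoch 1.0 in the training results. `linear_lr` is a hypothetical helper written for illustration, not a library function.

```python
BASE_LR = 2e-05        # learning_rate from the list above
NUM_EPOCHS = 20        # num_epochs from the list above
STEPS_PER_EPOCH = 16   # step count at epoch 1.0 in the training results
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH
WARMUP_STEPS = 0       # assumption: warmup is not reported in this card


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


for epoch in (0, 5, 10, 20):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch:2d} (step {step:3d}): lr = {linear_lr(step):.2e}")
```

With a train batch size of 16, 16 steps per epoch would correspond to at most 256 training examples per epoch, assuming a single device and no gradient accumulation (neither is stated here).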

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---------------|-------|------|-----------------|--------|--------|--------|-----------|---------|
| 1.0169 | 1.0 | 16 | 0.6742 | 0.7918 | 0.6875 | 0.7836 | 0.7847 | 18.2414 |
| 0.6275 | 2.0 | 32 | 0.5093 | 0.8776 | 0.7947 | 0.8734 | 0.8732 | 21.5517 |
| 0.596 | 3.0 | 48 | 0.4246 | 0.9104 | 0.8486 | 0.9085 | 0.9091 | 22.5172 |
| 0.743 | 4.0 | 64 | 0.3632 | 0.9247 | 0.8661 | 0.9235 | 0.9231 | 22.8621 |
| 0.5007 | 5.0 | 80 | 0.3301 | 0.9353 | 0.8845 | 0.9357 | 0.9353 | 22.8621 |
| 0.2567 | 6.0 | 96 | 0.3093 | 0.9388 | 0.8962 | 0.9392 | 0.9388 | 22.9655 |
| 0.4146 | 7.0 | 112 | 0.2978 | 0.9449 | 0.907 | 0.9455 | 0.9458 | 23.1034 |
| 0.1991 | 8.0 | 128 | 0.2853 | 0.9454 | 0.9064 | 0.946 | 0.9462 | 23.069 |
| 0.1786 | 9.0 | 144 | 0.2794 | 0.9475 | 0.9097 | 0.9475 | 0.9477 | 23.069 |
| 0.3559 | 10.0 | 160 | 0.2701 | 0.9424 | 0.9013 | 0.9428 | 0.9426 | 23.0345 |
| 0.2059 | 11.0 | 176 | 0.2636 | 0.9472 | 0.9069 | 0.9472 | 0.9472 | 23.0345 |
| 0.199 | 12.0 | 192 | 0.2592 | 0.9523 | 0.9141 | 0.9521 | 0.9524 | 23.4483 |
| 0.1634 | 13.0 | 208 | 0.2553 | 0.9523 | 0.9141 | 0.9521 | 0.9524 | 23.4483 |
| 0.2006 | 14.0 | 224 | 0.2518 | 0.9523 | 0.9141 | 0.9521 | 0.9524 | 23.4483 |
| 0.1419 | 15.0 | 240 | 0.2487 | 0.9523 | 0.9141 | 0.9521 | 0.9524 | 23.4483 |
| 0.2089 | 16.0 | 256 | 0.2456 | 0.9523 | 0.9141 | 0.9521 | 0.9524 | 23.4483 |
| 0.1007 | 17.0 | 272 | 0.2431 | 0.9523 | 0.9141 | 0.9521 | 0.9524 | 23.4483 |
| 0.1598 | 18.0 | 288 | 0.2415 | 0.9495 | 0.9107 | 0.9494 | 0.9494 | 23.4828 |
| 0.3088 | 19.0 | 304 | 0.2407 | 0.9495 | 0.9107 | 0.9494 | 0.9494 | 23.4828 |
| 0.2003 | 20.0 | 320 | 0.2404 | 0.9495 | 0.9107 | 0.9494 | 0.9494 | 23.4828 |

Framework versions

  • Transformers 4.25.1
  • PyTorch 1.13.0+cu116
  • Datasets 2.7.1
  • Tokenizers 0.13.2

Model size: 783M params (Safetensors, tensor type F32)
