---
license: apache-2.0
base_model: google/long-t5-tglobal-base
tags:
  - generated_from_trainer
datasets:
  - eur-lex-sum
model-index:
  - name: LongT5_no_extraction_V1
    results: []
---

LongT5_no_extraction_V1

This model is a fine-tuned version of google/long-t5-tglobal-base on the eur-lex-sum dataset. It achieves the following results on the evaluation set:

  • Loss: 1.3639

Model description

More information needed

Intended uses & limitations

More information needed
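The card leaves this section open, but the base model and dataset imply long-document summarization of EU legal acts. A minimal inference sketch follows; the Hub repo id is an assumption assembled from this card's author and model name, and the length and generation settings are illustrative guesses, not documented values.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumption: the checkpoint is published under this repo id; the card
# itself only gives the model name LongT5_no_extraction_V1.
model_id = "MikaSie/LongT5_no_extraction_V1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

document = "..."  # full text of a long EU legal act

# LongT5's transient-global attention targets long inputs; the 16384-token
# cap and the beam-search settings here are illustrative, not documented.
inputs = tokenizer(document, return_tensors="pt", truncation=True, max_length=16384)
summary_ids = model.generate(**inputs, max_new_tokens=512, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```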

Training and evaluation data

More information needed
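Pending a fuller description, the metadata names eur-lex-sum. Below is a hedged loading sketch, assuming the dennlinger/eur-lex-sum dataset on the Hugging Face Hub with its english configuration; the card itself only says "eur-lex-sum".

```python
from datasets import load_dataset

# Assumption: "eur-lex-sum" refers to the dennlinger/eur-lex-sum dataset
# on the Hugging Face Hub, loaded with its English configuration.
dataset = load_dataset("dennlinger/eur-lex-sum", "english")

print(dataset)                     # expect train / validation / test splits
print(dataset["train"][0].keys())  # inspect the document and summary field names
```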

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a Seq2SeqTrainingArguments sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • total_eval_batch_size: 4
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 40
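
As noted above, these settings map directly onto transformers.Seq2SeqTrainingArguments. The sketch below is an assumed reconstruction rather than the original training script: output_dir is a placeholder, and the per-epoch evaluation cadence is inferred from the results table.

```python
from transformers import Seq2SeqTrainingArguments

# Assumed reconstruction of the listed hyperparameters. With 4 GPUs,
# per-device batch size 1, and 4 gradient-accumulation steps, the
# effective train batch size is 4 * 1 * 4 = 16, matching the card.
training_args = Seq2SeqTrainingArguments(
    output_dir="longt5_no_extraction_v1",  # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=4,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=40,
    # Adam betas (0.9, 0.999) and epsilon 1e-08 are the optimizer defaults.
    evaluation_strategy="epoch",  # inferred: the table logs one eval per epoch
)
```

The multi-GPU launch itself (4 devices) would typically come from torchrun or accelerate; the card does not record which launcher was used.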

Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 3.2571        | 0.9963  | 68   | 1.8571          |
| 2.6516        | 1.9927  | 136  | 1.7238          |
| 2.2687        | 2.9890  | 204  | 1.6153          |
| 2.0466        | 4.0     | 273  | 1.5414          |
| 1.9659        | 4.9963  | 341  | 1.4955          |
| 1.8813        | 5.9927  | 409  | 1.4752          |
| 1.8277        | 6.9890  | 477  | 1.4571          |
| 1.7626        | 8.0     | 546  | 1.4437          |
| 1.7528        | 8.9963  | 614  | 1.4315          |
| 1.7249        | 9.9927  | 682  | 1.4229          |
| 1.6981        | 10.9890 | 750  | 1.4126          |
| 1.6559        | 12.0    | 819  | 1.4061          |
| 1.6599        | 12.9963 | 887  | 1.3983          |
| 1.6465        | 13.9927 | 955  | 1.3994          |
| 1.6282        | 14.9890 | 1023 | 1.3923          |
| 1.5906        | 16.0    | 1092 | 1.3873          |
| 1.6035        | 16.9963 | 1160 | 1.3878          |
| 1.5909        | 17.9927 | 1228 | 1.3851          |
| 1.5802        | 18.9890 | 1296 | 1.3799          |
| 1.5481        | 20.0    | 1365 | 1.3860          |
| 1.5607        | 20.9963 | 1433 | 1.3745          |
| 1.5517        | 21.9927 | 1501 | 1.3736          |
| 1.5436        | 22.9890 | 1569 | 1.3735          |
| 1.5126        | 24.0    | 1638 | 1.3728          |
| 1.5289        | 24.9963 | 1706 | 1.3739          |
| 1.5234        | 25.9927 | 1774 | 1.3706          |
| 1.5179        | 26.9890 | 1842 | 1.3671          |
| 1.4908        | 28.0    | 1911 | 1.3680          |
| 1.5057        | 28.9963 | 1979 | 1.3688          |
| 1.5026        | 29.9927 | 2047 | 1.3649          |
| 1.498         | 30.9890 | 2115 | 1.3662          |
| 1.4866        | 32.0    | 2184 | 1.3655          |
| 1.493         | 32.9963 | 2252 | 1.3644          |
| 1.4877        | 33.9927 | 2320 | 1.3669          |
| 1.4858        | 34.9890 | 2388 | 1.3650          |
| 1.465         | 36.0    | 2457 | 1.3649          |
| 1.4822        | 36.9963 | 2525 | 1.3647          |
| 1.4797        | 37.9927 | 2593 | 1.3644          |
| 1.4803        | 38.9890 | 2661 | 1.3640          |
| 1.4548        | 39.8535 | 2720 | 1.3639          |

Framework versions

  • Transformers 4.40.1
  • PyTorch 2.2.1+cu121
  • Datasets 2.17.1
  • Tokenizers 0.19.1