---
license: apache-2.0
base_model: google/long-t5-tglobal-base
tags:
  - generated_from_trainer
datasets:
  - arrow
model-index:
  - name: RoBERTa_LongT5_dependent_V1
    results: []
---

# RoBERTa_LongT5_dependent_V1

This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on the arrow dataset. It achieves the following results on the evaluation set:

- Loss: 1.5152
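
Since this card is auto-generated and does not state the intended task, here is a minimal usage sketch. It assumes the checkpoint is hosted as `MikaSie/RoBERTa_LongT5_dependent_V1` on the Hub and is meant for sequence-to-sequence generation (e.g. long-document summarization, as suggested by the LongT5 base model); adjust the generation settings to your use case.

```python
# Minimal sketch, assuming a seq2seq (summarization-style) checkpoint.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "MikaSie/RoBERTa_LongT5_dependent_V1"  # assumed Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Long input document goes here..."
inputs = tokenizer(text, return_tensors="pt", truncation=True)
output_ids = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```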

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 5e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- total_eval_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 40
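
For reproducibility, the sketch below shows how these values would map onto `Seq2SeqTrainingArguments` in the Hugging Face `Trainer` API. This is an assumption about the training setup, not the author's actual script; the per-device batch size of 1 with 4 GPUs and gradient accumulation of 4 yields the total train batch size of 16 reported above.

```python
# Sketch only: the reported hyperparameters expressed as TrainingArguments.
# Argument names match Transformers 4.40.x; output_dir is a placeholder.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="RoBERTa_LongT5_dependent_V1",
    learning_rate=5e-5,
    per_device_train_batch_size=1,  # train_batch_size: 1
    per_device_eval_batch_size=1,   # eval_batch_size: 1
    seed=42,
    gradient_accumulation_steps=4,  # x 4 GPUs -> total train batch size 16
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=40,
    # Adam betas and epsilon below are the library defaults reported above.
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```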

### Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 3.4528        | 0.9963  | 68   | 1.9964          |
| 2.8835        | 1.9927  | 136  | 1.8580          |
| 2.5241        | 2.9890  | 204  | 1.7485          |
| 2.2845        | 4.0     | 273  | 1.6858          |
| 2.1993        | 4.9963  | 341  | 1.6422          |
| 2.1193        | 5.9927  | 409  | 1.6253          |
| 2.067         | 6.9890  | 477  | 1.6027          |
| 1.9859        | 8.0     | 546  | 1.5902          |
| 1.9823        | 8.9963  | 614  | 1.5784          |
| 1.9528        | 9.9927  | 682  | 1.5714          |
| 1.9304        | 10.9890 | 750  | 1.5636          |
| 1.8756        | 12.0    | 819  | 1.5591          |
| 1.891         | 12.9963 | 887  | 1.5537          |
| 1.8688        | 13.9927 | 955  | 1.5506          |
| 1.8497        | 14.9890 | 1023 | 1.5423          |
| 1.8089        | 16.0    | 1092 | 1.5425          |
| 1.8222        | 16.9963 | 1160 | 1.5369          |
| 1.8087        | 17.9927 | 1228 | 1.5376          |
| 1.7963        | 18.9890 | 1296 | 1.5328          |
| 1.7618        | 20.0    | 1365 | 1.5321          |
| 1.7753        | 20.9963 | 1433 | 1.5267          |
| 1.7671        | 21.9927 | 1501 | 1.5280          |
| 1.7578        | 22.9890 | 1569 | 1.5248          |
| 1.7261        | 24.0    | 1638 | 1.5268          |
| 1.7427        | 24.9963 | 1706 | 1.5265          |
| 1.7338        | 25.9927 | 1774 | 1.5221          |
| 1.7303        | 26.9890 | 1842 | 1.5214          |
| 1.6963        | 28.0    | 1911 | 1.5201          |
| 1.7173        | 28.9963 | 1979 | 1.5178          |
| 1.7132        | 29.9927 | 2047 | 1.5180          |
| 1.7088        | 30.9890 | 2115 | 1.5167          |
| 1.6809        | 32.0    | 2184 | 1.5155          |
| 1.7037        | 32.9963 | 2252 | 1.5162          |
| 1.699         | 33.9927 | 2320 | 1.5161          |
| 1.6964        | 34.9890 | 2388 | 1.5152          |
| 1.6718        | 36.0    | 2457 | 1.5154          |
| 1.6944        | 36.9963 | 2525 | 1.5151          |
| 1.6909        | 37.9927 | 2593 | 1.5154          |
| 1.6884        | 38.9890 | 2661 | 1.5152          |
| 1.6631        | 39.8535 | 2720 | 1.5152          |

### Framework versions

- Transformers 4.40.1
- Pytorch 2.2.1+cu121
- Datasets 2.17.1
- Tokenizers 0.19.1