
fine-tuned-t5-small

This model is a fine-tuned version of t5-small on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 4.5422
  • Precision: nan
  • Recall: 0.7117
  • F1: 0.5635
  • Hashcode: roberta-large_L17_idf_version=0.3.12(hug_trans=4.30.2)
  • Gen Len: 19.0
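The Hashcode is BERTScore's configuration fingerprint: it records the scoring model, the encoder layer used, whether idf weighting was on, and the library versions, so scores are only comparable when hashcodes match. As a sketch, the string can be unpacked like this (the field layout below is an assumption based on the bert_score naming convention, not something stated in this card):

```python
import re

def parse_bertscore_hashcode(hashcode: str) -> dict:
    """Split a BERTScore hashcode into its settings.

    Assumed shape (based on bert_score's convention):
    <model>_L<layer>[_idf]_version=<ver>(hug_trans=<ver>)
    """
    m = re.fullmatch(
        r"(?P<model>.+)_L(?P<layer>\d+)(?P<idf>_idf)?"
        r"_version=(?P<version>[^(]+)\(hug_trans=(?P<hug_trans>[^)]+)\)",
        hashcode,
    )
    if m is None:
        raise ValueError(f"unrecognized hashcode: {hashcode!r}")
    return {
        "model": m.group("model"),        # scoring model, e.g. roberta-large
        "layer": int(m.group("layer")),   # encoder layer used for embeddings
        "idf": m.group("idf") is not None,  # idf weighting enabled?
        "version": m.group("version"),    # bert_score version
        "hug_trans": m.group("hug_trans"),  # transformers version
    }

cfg = parse_bertscore_hashcode(
    "roberta-large_L17_idf_version=0.3.12(hug_trans=4.30.2)"
)
```

For this card the hashcode decodes to: roberta-large embeddings from layer 17, with idf weighting, computed by bert_score 0.3.12 under transformers 4.30.2.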

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 128
  • eval_batch_size: 128
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
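With lr_scheduler_type set to linear, the learning rate decays linearly from the base value to zero over the course of training; the results table shows one optimization step per epoch, so that is 50 steps here. A minimal sketch of the decay, assuming no warmup (the card does not state a warmup setting):

```python
def linear_lr(step: int, base_lr: float = 2e-05, total_steps: int = 50) -> float:
    """Linearly decay base_lr to 0 over total_steps (no warmup assumed)."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps

# The schedule starts at the configured learning_rate and ends at 0:
start = linear_lr(0)     # base rate, 2e-05
halfway = linear_lr(25)  # half the base rate
end = linear_lr(50)      # fully decayed
```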

Training results

Every row shares the same values for three columns: Training Loss was never logged ("No log"), Gen Len was 19.0, and the Hashcode was roberta-large_L17_idf_version=0.3.12(hug_trans=4.30.2). The remaining per-epoch results:

| Epoch | Step | Validation Loss | Precision | Recall | F1     |
|------:|-----:|----------------:|----------:|-------:|-------:|
| 1.0   | 1    | 12.9679         | 0.7745    | 0.7227 | 0.7474 |
| 2.0   | 2    | 12.1426         | 0.7811    | 0.7221 | 0.7503 |
| 3.0   | 3    | 11.2809         | 0.7811    | 0.7221 | 0.7503 |
| 4.0   | 4    | 10.4669         | 0.7821    | 0.7273 | 0.7536 |
| 5.0   | 5    | 9.7061          | 0.7821    | 0.7273 | 0.7536 |
| 6.0   | 6    | 9.0054          | 0.7821    | 0.7273 | 0.7536 |
| 7.0   | 7    | 8.3875          | 0.7821    | 0.7273 | 0.7536 |
| 8.0   | 8    | 7.8287          | 0.7772    | 0.7278 | 0.7515 |
| 9.0   | 9    | 7.3385          | 0.7772    | 0.7278 | 0.7515 |
| 10.0  | 10   | 6.9141          | 0.7772    | 0.7278 | 0.7515 |
| 11.0  | 11   | 6.5516          | 0.7801    | 0.7240 | 0.7509 |
| 12.0  | 12   | 6.2399          | 0.7801    | 0.7240 | 0.7509 |
| 13.0  | 13   | 5.9851          | 0.7801    | 0.7240 | 0.7509 |
| 14.0  | 14   | 5.7744          | 0.7801    | 0.7240 | 0.7509 |
| 15.0  | 15   | 5.5976          | 0.7801    | 0.7240 | 0.7509 |
| 16.0  | 16   | 5.4546          | 0.7873    | 0.7158 | 0.7497 |
| 17.0  | 17   | 5.3403          | 0.7873    | 0.7158 | 0.7497 |
| 18.0  | 18   | 5.2461          | 0.7873    | 0.7158 | 0.7497 |
| 19.0  | 19   | 5.1688          | 0.7873    | 0.7158 | 0.7497 |
| 20.0  | 20   | 5.1052          | 0.7922    | 0.7169 | 0.7525 |
| 21.0  | 21   | 5.0489          | 0.7922    | 0.7169 | 0.7525 |
| 22.0  | 22   | 5.0025          | 0.7941    | 0.7122 | 0.7508 |
| 23.0  | 23   | 4.9621          | 0.7941    | 0.7122 | 0.7508 |
| 24.0  | 24   | 4.9263          | 0.7941    | 0.7122 | 0.7508 |
| 25.0  | 25   | 4.8933          | 0.7941    | 0.7122 | 0.7508 |
| 26.0  | 26   | 4.8623          | 0.7941    | 0.7122 | 0.7508 |
| 27.0  | 27   | 4.8327          | 0.7941    | 0.7122 | 0.7508 |
| 28.0  | 28   | 4.8060          | 0.7941    | 0.7122 | 0.7508 |
| 29.0  | 29   | 4.7811          | 0.7941    | 0.7122 | 0.7508 |
| 30.0  | 30   | 4.7583          | 0.7712    | 0.7105 | 0.7392 |
| 31.0  | 31   | 4.7361          | 0.7712    | 0.7105 | 0.7392 |
| 32.0  | 32   | 4.7152          | nan       | 0.7117 | 0.5635 |
| 33.0  | 33   | 4.6964          | nan       | 0.7117 | 0.5635 |
| 34.0  | 34   | 4.6789          | nan       | 0.7117 | 0.5635 |
| 35.0  | 35   | 4.6627          | nan       | 0.7117 | 0.5635 |
| 36.0  | 36   | 4.6475          | nan       | 0.7117 | 0.5635 |
| 37.0  | 37   | 4.6330          | nan       | 0.7117 | 0.5635 |
| 38.0  | 38   | 4.6192          | nan       | 0.7117 | 0.5635 |
| 39.0  | 39   | 4.6066          | nan       | 0.7117 | 0.5635 |
| 40.0  | 40   | 4.5957          | nan       | 0.7117 | 0.5635 |
| 41.0  | 41   | 4.5859          | nan       | 0.7117 | 0.5635 |
| 42.0  | 42   | 4.5771          | nan       | 0.7117 | 0.5635 |
| 43.0  | 43   | 4.5693          | nan       | 0.7117 | 0.5635 |
| 44.0  | 44   | 4.5625          | nan       | 0.7117 | 0.5635 |
| 45.0  | 45   | 4.5567          | nan       | 0.7117 | 0.5635 |
| 46.0  | 46   | 4.5518          | nan       | 0.7117 | 0.5635 |
| 47.0  | 47   | 4.5480          | nan       | 0.7117 | 0.5635 |
| 48.0  | 48   | 4.5451          | nan       | 0.7117 | 0.5635 |
| 49.0  | 49   | 4.5432          | nan       | 0.7117 | 0.5635 |
| 50.0  | 50   | 4.5422          | nan       | 0.7117 | 0.5635 |

Note that Precision is reported as nan from epoch 32 onward; the final-epoch row matches the evaluation figures at the top of this card.
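BERTScore F1 is the harmonic mean of precision and recall, which can be sanity-checked against the non-nan rows above (small discrepancies come from rounding in the reported figures):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Epoch 1 row: Precision 0.7745, Recall 0.7227 -> reported F1 0.7474
approx_f1 = f1_score(0.7745, 0.7227)
```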

Framework versions

  • Transformers 4.30.2
  • Pytorch 2.0.1+cu117
  • Datasets 2.13.1
  • Tokenizers 0.13.3