scenario-KD-PO-CDF-CL-D2_data-cl-cardiff_cl_only_beta-jason

This model is a fine-tuned version of haryoaw/scenario-TCR_data-cl-cardiff_cl_only2. The training dataset was not recorded in the card (it is listed as None). At the final logged checkpoint (step 6750) it achieves the following results on the evaluation set:

  • Loss: 16.2119
  • Accuracy: 0.3974
  • F1: 0.3962

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 6666
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
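The linear schedule listed above can be sketched in plain Python to show how the learning rate would decay over the run. This is a minimal illustration, not the training code itself. Two values are assumptions not stated in the card: `STEPS_PER_EPOCH = 230` is inferred from the training log (step 5750 corresponds to epoch 25.0), and `WARMUP_STEPS = 0` is assumed because no warmup setting is listed.

```python
BASE_LR = 5e-05        # learning_rate from the card
NUM_EPOCHS = 30        # num_epochs from the card
STEPS_PER_EPOCH = 230  # assumption: inferred from the log (step 5750 = epoch 25.0)
WARMUP_STEPS = 0       # assumption: the card lists no warmup setting

TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 6900 optimizer steps


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < WARMUP_STEPS:
        # Ramp up linearly from 0 to BASE_LR during warmup.
        return BASE_LR * (step / max(1, WARMUP_STEPS))
    # Decay linearly from BASE_LR to 0 over the remaining steps.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * (remaining / max(1, TOTAL_STEPS - WARMUP_STEPS))


if __name__ == "__main__":
    for step in (0, 1725, 3450, 6750, 6900):
        print(f"step {step:>4}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the learning rate has fallen to half its starting value by step 3450 and is close to zero at the last logged checkpoint (step 6750).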

Training results

Training Loss  Epoch  Step  Validation Loss  Accuracy  F1
No log         1.09   250   11.7960          0.3650    0.3427
13.5624        2.17   500   11.1810          0.3927    0.3922
13.5624        3.26   750   10.7353          0.3889    0.3725
11.0045        4.35   1000  11.0340          0.4113    0.4092
11.0045        5.43   1250  11.1511          0.3958    0.3764
9.4397         6.52   1500  11.3893          0.4128    0.4098
9.4397         7.61   1750  11.7867          0.4174    0.4136
8.0563         8.70   2000  12.7215          0.4020    0.3934
8.0563         9.78   2250  13.1991          0.4159    0.4158
7.0710         10.87  2500  13.4791          0.3966    0.3938
7.0710         11.96  2750  12.9321          0.4005    0.3939
5.9842         13.04  3000  13.5185          0.3873    0.3775
5.9842         14.13  3250  14.5623          0.4028    0.3984
5.0429         15.22  3500  14.7614          0.4012    0.3954
5.0429         16.30  3750  13.8500          0.4151    0.4149
4.4814         17.39  4000  14.0842          0.4051    0.4022
4.4814         18.48  4250  14.4055          0.3904    0.3804
4.0867         19.57  4500  15.5442          0.3858    0.3809
4.0867         20.65  4750  14.6236          0.3966    0.3951
3.5928         21.74  5000  14.9268          0.4005    0.3920
3.5928         22.83  5250  15.2065          0.3897    0.3884
3.2444         23.91  5500  16.5178          0.3889    0.3860
3.2444         25.00  5750  15.1592          0.3920    0.3879
3.0552         26.09  6000  15.4594          0.3974    0.3952
3.0552         27.17  6250  15.8492          0.3843    0.3842
2.8654         28.26  6500  15.8468          0.3827    0.3806
2.8654         29.35  6750  16.2119          0.3974    0.3962
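The reported metrics come from the last checkpoint, which is not the best one in the log. As a rough sketch, the rows above can be scanned for the checkpoint with the lowest validation loss and the highest accuracy and F1. The helper name `best` and the tuple layout are illustrative choices, not part of the original training setup.

```python
# (step, validation_loss, accuracy, f1), copied from the training results table
ROWS = [
    (250, 11.7960, 0.3650, 0.3427), (500, 11.1810, 0.3927, 0.3922),
    (750, 10.7353, 0.3889, 0.3725), (1000, 11.0340, 0.4113, 0.4092),
    (1250, 11.1511, 0.3958, 0.3764), (1500, 11.3893, 0.4128, 0.4098),
    (1750, 11.7867, 0.4174, 0.4136), (2000, 12.7215, 0.4020, 0.3934),
    (2250, 13.1991, 0.4159, 0.4158), (2500, 13.4791, 0.3966, 0.3938),
    (2750, 12.9321, 0.4005, 0.3939), (3000, 13.5185, 0.3873, 0.3775),
    (3250, 14.5623, 0.4028, 0.3984), (3500, 14.7614, 0.4012, 0.3954),
    (3750, 13.8500, 0.4151, 0.4149), (4000, 14.0842, 0.4051, 0.4022),
    (4250, 14.4055, 0.3904, 0.3804), (4500, 15.5442, 0.3858, 0.3809),
    (4750, 14.6236, 0.3966, 0.3951), (5000, 14.9268, 0.4005, 0.3920),
    (5250, 15.2065, 0.3897, 0.3884), (5500, 16.5178, 0.3889, 0.3860),
    (5750, 15.1592, 0.3920, 0.3879), (6000, 15.4594, 0.3974, 0.3952),
    (6250, 15.8492, 0.3843, 0.3842), (6500, 15.8468, 0.3827, 0.3806),
    (6750, 16.2119, 0.3974, 0.3962),
]


def best(rows, column, highest=True):
    """Return the row with the highest (or lowest) value in the given column."""
    pick = max if highest else min
    return pick(rows, key=lambda row: row[column])


lowest_loss = best(ROWS, 1, highest=False)
best_accuracy = best(ROWS, 2)
best_f1 = best(ROWS, 3)

if __name__ == "__main__":
    print("lowest validation loss:", lowest_loss)
    print("highest accuracy:      ", best_accuracy)
    print("highest F1:            ", best_f1)
```

In this log the validation loss is lowest at step 750, accuracy peaks at step 1750, and F1 peaks at step 2250, while training loss keeps falling through step 6750. That pattern is consistent with overfitting, so an earlier checkpoint may be preferable if one was saved.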

Framework versions

  • Transformers 4.33.3
  • Pytorch 2.1.1+cu121
  • Datasets 2.14.5
  • Tokenizers 0.13.3