scenario-KD-PO-CDF-CL-D2_data-cl-cardiff_cl_only_gamma-jason

This model is a fine-tuned version of haryoaw/scenario-TCR_data-cl-cardiff_cl_only2. The training dataset is not specified in the card metadata. It achieves the following results on the evaluation set:

  • Loss: 17.5985
  • Accuracy: 0.3958
  • F1: 0.3956

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 88458
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
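
As a rough illustration of what these settings imply, the sketch below restates them as a plain dictionary and computes the learning rate that a linear schedule would give at a given step. The key names follow the parameter names of `transformers.TrainingArguments`; mapping the card's batch sizes to per-device values assumes a single device, and zero warmup steps is assumed because the card lists no warmup setting. The steps-per-epoch figure is inferred from the results table (step 5750 is logged at epoch 25.0), not stated by the author.

```python
# Hyperparameters as listed on the card. Key names follow
# transformers.TrainingArguments; the per-device mapping assumes
# training ran on a single device (not stated on the card).
HPARAMS = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 32,
    "per_device_eval_batch_size": 32,
    "seed": 88458,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 30,
}

# Inferred from the results table: step 5750 corresponds to epoch 25.0.
STEPS_PER_EPOCH = 5750 // 25
TOTAL_STEPS = STEPS_PER_EPOCH * HPARAMS["num_train_epochs"]


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate at `step` under linear decay to zero.

    warmup_steps=0 is an assumption; the card does not list a warmup value.
    """
    base = HPARAMS["learning_rate"]
    if step < warmup_steps:
        # Linear ramp from 0 up to the base rate during warmup.
        return base * step / max(1, warmup_steps)
    # Linear decay from the base rate down to 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return base * remaining / max(1, TOTAL_STEPS - warmup_steps)


if __name__ == "__main__":
    for step in (0, 1500, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(f"step {step:>5}: lr = {linear_lr(step):.2e}")
```

Under these assumptions, the learning rate has already dropped by more than a fifth by step 1500, which is worth keeping in mind when comparing early and late checkpoints.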

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|---|
| No log | 1.09 | 250 | 11.6483 | 0.3650 | 0.3609 |
| 13.703 | 2.17 | 500 | 11.1328 | 0.3843 | 0.3821 |
| 13.703 | 3.26 | 750 | 10.9310 | 0.3997 | 0.3991 |
| 11.1159 | 4.35 | 1000 | 11.8248 | 0.4035 | 0.3976 |
| 11.1159 | 5.43 | 1250 | 11.2452 | 0.4074 | 0.4047 |
| 9.5324 | 6.52 | 1500 | 12.0093 | 0.4082 | 0.4079 |
| 9.5324 | 7.61 | 1750 | 12.4283 | 0.4043 | 0.4042 |
| 8.2184 | 8.7 | 2000 | 12.1651 | 0.3897 | 0.3860 |
| 8.2184 | 9.78 | 2250 | 13.2395 | 0.4012 | 0.4006 |
| 7.0214 | 10.87 | 2500 | 13.4757 | 0.4028 | 0.4030 |
| 7.0214 | 11.96 | 2750 | 14.5726 | 0.3943 | 0.3878 |
| 6.0532 | 13.04 | 3000 | 15.9024 | 0.3966 | 0.3918 |
| 6.0532 | 14.13 | 3250 | 16.2467 | 0.3819 | 0.3672 |
| 5.3338 | 15.22 | 3500 | 14.5700 | 0.3912 | 0.3911 |
| 5.3338 | 16.3 | 3750 | 14.7870 | 0.3989 | 0.3969 |
| 4.7168 | 17.39 | 4000 | 16.6837 | 0.3804 | 0.3758 |
| 4.7168 | 18.48 | 4250 | 16.3479 | 0.3835 | 0.3807 |
| 4.052 | 19.57 | 4500 | 16.3096 | 0.3897 | 0.3860 |
| 4.052 | 20.65 | 4750 | 16.4666 | 0.3958 | 0.3947 |
| 3.6691 | 21.74 | 5000 | 16.7052 | 0.3935 | 0.3851 |
| 3.6691 | 22.83 | 5250 | 16.9439 | 0.4012 | 0.3985 |
| 3.3107 | 23.91 | 5500 | 16.8777 | 0.4051 | 0.4025 |
| 3.3107 | 25.0 | 5750 | 17.4662 | 0.3897 | 0.3866 |
| 2.9893 | 26.09 | 6000 | 17.5858 | 0.3951 | 0.3939 |
| 2.9893 | 27.17 | 6250 | 17.6884 | 0.3935 | 0.3928 |
| 2.8471 | 28.26 | 6500 | 17.7042 | 0.3881 | 0.3871 |
| 2.8471 | 29.35 | 6750 | 17.5985 | 0.3958 | 0.3956 |
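
The table above shows training loss falling steadily while validation loss rises after the first few evaluations, so the final checkpoint is not necessarily the strongest one. The following sketch scans the logged values to find the evaluation step with the highest F1 and the one with the lowest validation loss. The numbers are copied from the table; the variable names are illustrative, and nothing here indicates which checkpoint the author actually published beyond the headline metrics matching the last row.

```python
# (step, validation_loss, accuracy, f1), copied from the results table.
ROWS = [
    (250, 11.6483, 0.3650, 0.3609),
    (500, 11.1328, 0.3843, 0.3821),
    (750, 10.9310, 0.3997, 0.3991),
    (1000, 11.8248, 0.4035, 0.3976),
    (1250, 11.2452, 0.4074, 0.4047),
    (1500, 12.0093, 0.4082, 0.4079),
    (1750, 12.4283, 0.4043, 0.4042),
    (2000, 12.1651, 0.3897, 0.3860),
    (2250, 13.2395, 0.4012, 0.4006),
    (2500, 13.4757, 0.4028, 0.4030),
    (2750, 14.5726, 0.3943, 0.3878),
    (3000, 15.9024, 0.3966, 0.3918),
    (3250, 16.2467, 0.3819, 0.3672),
    (3500, 14.5700, 0.3912, 0.3911),
    (3750, 14.7870, 0.3989, 0.3969),
    (4000, 16.6837, 0.3804, 0.3758),
    (4250, 16.3479, 0.3835, 0.3807),
    (4500, 16.3096, 0.3897, 0.3860),
    (4750, 16.4666, 0.3958, 0.3947),
    (5000, 16.7052, 0.3935, 0.3851),
    (5250, 16.9439, 0.4012, 0.3985),
    (5500, 16.8777, 0.4051, 0.4025),
    (5750, 17.4662, 0.3897, 0.3866),
    (6000, 17.5858, 0.3951, 0.3939),
    (6250, 17.6884, 0.3935, 0.3928),
    (6500, 17.7042, 0.3881, 0.3871),
    (6750, 17.5985, 0.3958, 0.3956),
]

# Evaluation with the highest F1 and the one with the lowest validation loss.
best_f1 = max(ROWS, key=lambda row: row[3])
lowest_loss = min(ROWS, key=lambda row: row[1])
final = ROWS[-1]

if __name__ == "__main__":
    print(f"best F1:      step {best_f1[0]}, F1 {best_f1[3]:.4f}")
    print(f"lowest loss:  step {lowest_loss[0]}, loss {lowest_loss[1]:.4f}")
    print(f"final:        step {final[0]}, F1 {final[3]:.4f}, loss {final[1]:.4f}")
```

On the logged values, the highest F1 (0.4079) occurs at step 1500 and the lowest validation loss (10.9310) at step 750, while the reported final metrics come from step 6750.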

Framework versions

  • Transformers 4.33.3
  • PyTorch 2.1.1+cu121
  • Datasets 2.14.5
  • Tokenizers 0.13.3