scenario-NON-KD-PO-COPY-CDF-CL-D2_data-cl-cardiff_cl_only_delta

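This model is a fine-tuned version of haryoaw/scenario-TCR_data-cl-cardiff_cl_only2. The training dataset is not recorded (the auto-generated card lists it as None). At the final logged evaluation (step 6750), it achieves the following results on the evaluation set: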

  • Loss: 5.5684
  • Accuracy: 0.4676
  • F1: 0.4662

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 11213
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
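
As a rough illustration of what the linear scheduler listed above implies, the sketch below computes the learning rate at a given optimizer step. It rests on assumptions the card does not confirm: no warmup (no warmup setting is listed), decay to zero at the last step, and 230 steps per epoch, which is inferred from the training results table (step 5750 is logged at epoch 25.0). The names `linear_lr`, `STEPS_PER_EPOCH`, and `TOTAL_STEPS` are hypothetical helpers, not part of any library.

```python
# Values taken from the hyperparameter list in this card.
LEARNING_RATE = 5e-05
NUM_EPOCHS = 30

# Assumption: inferred from the results table, where step 5750 is logged at epoch 25.0.
STEPS_PER_EPOCH = 5750 // 25
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate under linear decay to zero.

    warmup_steps defaults to 0 because the card lists no warmup setting;
    that default is an assumption, not a documented value.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return LEARNING_RATE * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - warmup_steps)


if __name__ == "__main__":
    # Start, midpoint, last logged evaluation step, and end of training.
    for step in (0, TOTAL_STEPS // 2, 6750, TOTAL_STEPS):
        print(f"step {step:>5}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the rate falls to about half its peak midway through training and is already close to zero by the last logged evaluation at step 6750.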

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     |
|---------------|-------|------|-----------------|----------|--------|
| No log        | 1.09  | 250  | 1.8346          | 0.4684   | 0.4686 |
| 0.5182        | 2.17  | 500  | 2.4903          | 0.4745   | 0.4735 |
| 0.5182        | 3.26  | 750  | 2.5207          | 0.4830   | 0.4775 |
| 0.2149        | 4.35  | 1000 | 3.1950          | 0.4468   | 0.4376 |
| 0.2149        | 5.43  | 1250 | 3.2340          | 0.4738   | 0.4669 |
| 0.1358        | 6.52  | 1500 | 3.7896          | 0.4761   | 0.4750 |
| 0.1358        | 7.61  | 1750 | 3.2005          | 0.4954   | 0.4933 |
| 0.0968        | 8.7   | 2000 | 3.7302          | 0.4491   | 0.4429 |
| 0.0968        | 9.78  | 2250 | 4.0015          | 0.4691   | 0.4662 |
| 0.0624        | 10.87 | 2500 | 4.1400          | 0.4514   | 0.4462 |
| 0.0624        | 11.96 | 2750 | 4.6716          | 0.4460   | 0.4388 |
| 0.0508        | 13.04 | 3000 | 4.6683          | 0.4545   | 0.4531 |
| 0.0508        | 14.13 | 3250 | 4.9959          | 0.4398   | 0.4311 |
| 0.0408        | 15.22 | 3500 | 4.6543          | 0.4537   | 0.4541 |
| 0.0408        | 16.3  | 3750 | 4.7483          | 0.4753   | 0.4765 |
| 0.0272        | 17.39 | 4000 | 5.0125          | 0.4468   | 0.4446 |
| 0.0272        | 18.48 | 4250 | 5.1208          | 0.4475   | 0.4438 |
| 0.0241        | 19.57 | 4500 | 5.0070          | 0.4599   | 0.4586 |
| 0.0241        | 20.65 | 4750 | 4.8782          | 0.4761   | 0.4759 |
| 0.0187        | 21.74 | 5000 | 5.1882          | 0.4599   | 0.4608 |
| 0.0187        | 22.83 | 5250 | 5.1202          | 0.4761   | 0.4761 |
| 0.0086        | 23.91 | 5500 | 5.4158          | 0.4622   | 0.4616 |
| 0.0086        | 25.0  | 5750 | 5.4933          | 0.4637   | 0.4620 |
| 0.0088        | 26.09 | 6000 | 5.5752          | 0.4622   | 0.4604 |
| 0.0088        | 27.17 | 6250 | 5.5790          | 0.4668   | 0.4648 |
| 0.0063        | 28.26 | 6500 | 5.5069          | 0.4614   | 0.4603 |
| 0.0063        | 29.35 | 6750 | 5.5684          | 0.4676   | 0.4662 |
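
The headline metrics at the top of this card match the last logged row (step 6750), which is not the strongest row in the table. The sketch below copies the step, validation loss, accuracy, and F1 columns from the table and picks out the best checkpoints. The names `RESULTS`, `best_by_f1`, and `lowest_val_loss` are hypothetical, and whether those intermediate checkpoints were actually saved is not stated in the card.

```python
# (step, validation_loss, accuracy, f1), copied from the training results table.
RESULTS = [
    (250, 1.8346, 0.4684, 0.4686), (500, 2.4903, 0.4745, 0.4735),
    (750, 2.5207, 0.4830, 0.4775), (1000, 3.1950, 0.4468, 0.4376),
    (1250, 3.2340, 0.4738, 0.4669), (1500, 3.7896, 0.4761, 0.4750),
    (1750, 3.2005, 0.4954, 0.4933), (2000, 3.7302, 0.4491, 0.4429),
    (2250, 4.0015, 0.4691, 0.4662), (2500, 4.1400, 0.4514, 0.4462),
    (2750, 4.6716, 0.4460, 0.4388), (3000, 4.6683, 0.4545, 0.4531),
    (3250, 4.9959, 0.4398, 0.4311), (3500, 4.6543, 0.4537, 0.4541),
    (3750, 4.7483, 0.4753, 0.4765), (4000, 5.0125, 0.4468, 0.4446),
    (4250, 5.1208, 0.4475, 0.4438), (4500, 5.0070, 0.4599, 0.4586),
    (4750, 4.8782, 0.4761, 0.4759), (5000, 5.1882, 0.4599, 0.4608),
    (5250, 5.1202, 0.4761, 0.4761), (5500, 5.4158, 0.4622, 0.4616),
    (5750, 5.4933, 0.4637, 0.4620), (6000, 5.5752, 0.4622, 0.4604),
    (6250, 5.5790, 0.4668, 0.4648), (6500, 5.5069, 0.4614, 0.4603),
    (6750, 5.5684, 0.4676, 0.4662),
]

# Row with the highest F1 (index 3) and row with the lowest validation loss (index 1).
best_by_f1 = max(RESULTS, key=lambda row: row[3])
lowest_val_loss = min(RESULTS, key=lambda row: row[1])
final = RESULTS[-1]

if __name__ == "__main__":
    print("best F1:         step %d, loss %.4f, acc %.4f, f1 %.4f" % best_by_f1)
    print("lowest val loss: step %d, loss %.4f, acc %.4f, f1 %.4f" % lowest_val_loss)
    print("final row:       step %d, loss %.4f, acc %.4f, f1 %.4f" % final)
```

Validation loss rises from 1.8346 to above 5.5 while training loss falls toward zero and accuracy stays between roughly 0.44 and 0.50. That pattern is consistent with overfitting, though the card gives no further evidence either way.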

Framework versions

  • Transformers 4.33.3
  • PyTorch 2.1.1+cu121
  • Datasets 2.14.5
  • Tokenizers 0.13.3