scenario-KD-PO-CDF-CL-D2_data-cl-cardiff_cl_only_delta-jason

This model is a fine-tuned version of haryoaw/scenario-TCR_data-cl-cardiff_cl_only2 on an unspecified dataset (the card records the dataset as None). It achieves the following results on the evaluation set, taken at the final logged step (6750):

  • Loss: 15.4781
  • Accuracy: 0.4043
  • F1: 0.3989

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 7777
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
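
To make the linear scheduler setting concrete, here is a minimal sketch in plain Python of how the learning rate would decay over training. It rests on two assumptions that the card does not state: zero warmup steps, and a total of 6900 optimizer steps, inferred from the results table where step 5750 is logged as epoch 25.0 (230 steps per epoch over 30 epochs). The function name `linear_lr` is hypothetical and is not part of any library.

```python
def linear_lr(step, base_lr=5e-05, total_steps=6900, warmup_steps=0):
    """Learning rate at an optimizer step under linear warmup then linear decay.

    warmup_steps=0 and total_steps=6900 are assumptions, not values from the card.
    """
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Step 5750 is logged as epoch 25.0, which implies 230 steps per epoch.
STEPS_PER_EPOCH = 5750 // 25
TOTAL_STEPS = STEPS_PER_EPOCH * 30

for step in (0, 3450, 6750, TOTAL_STEPS):
    print(step, linear_lr(step, total_steps=TOTAL_STEPS))
```

Under these assumptions the rate is half its initial value midway through training and close to zero at the last logged step (6750).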

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|---|
| No log | 1.09 | 250 | 11.8950 | 0.3735 | 0.3391 |
| 13.6959 | 2.17 | 500 | 11.1790 | 0.3850 | 0.3767 |
| 13.6959 | 3.26 | 750 | 11.1554 | 0.4082 | 0.4075 |
| 11.138 | 4.35 | 1000 | 11.3070 | 0.4090 | 0.4082 |
| 11.138 | 5.43 | 1250 | 11.4384 | 0.4043 | 0.4029 |
| 9.5889 | 6.52 | 1500 | 11.3483 | 0.4090 | 0.4061 |
| 9.5889 | 7.61 | 1750 | 12.4116 | 0.4020 | 0.3985 |
| 8.2638 | 8.7 | 2000 | 11.7820 | 0.3997 | 0.3798 |
| 8.2638 | 9.78 | 2250 | 11.9709 | 0.4012 | 0.4001 |
| 6.9206 | 10.87 | 2500 | 12.7419 | 0.4198 | 0.4197 |
| 6.9206 | 11.96 | 2750 | 13.1813 | 0.4074 | 0.4078 |
| 5.993 | 13.04 | 3000 | 13.5405 | 0.4074 | 0.4062 |
| 5.993 | 14.13 | 3250 | 13.6698 | 0.3920 | 0.3860 |
| 5.1998 | 15.22 | 3500 | 13.8872 | 0.3989 | 0.3991 |
| 5.1998 | 16.3 | 3750 | 13.8954 | 0.4074 | 0.4076 |
| 4.6117 | 17.39 | 4000 | 14.2214 | 0.4090 | 0.4090 |
| 4.6117 | 18.48 | 4250 | 13.7015 | 0.4097 | 0.4033 |
| 4.0848 | 19.57 | 4500 | 14.8333 | 0.3912 | 0.3798 |
| 4.0848 | 20.65 | 4750 | 14.0279 | 0.4035 | 0.3989 |
| 3.7518 | 21.74 | 5000 | 14.6064 | 0.4035 | 0.4024 |
| 3.7518 | 22.83 | 5250 | 14.6257 | 0.3943 | 0.3914 |
| 3.285 | 23.91 | 5500 | 14.5868 | 0.4020 | 0.3968 |
| 3.285 | 25.0 | 5750 | 15.5230 | 0.3843 | 0.3756 |
| 3.0056 | 26.09 | 6000 | 14.9209 | 0.3904 | 0.3811 |
| 3.0056 | 27.17 | 6250 | 14.9466 | 0.3981 | 0.3968 |
| 2.8364 | 28.26 | 6500 | 15.5711 | 0.3881 | 0.3824 |
| 2.8364 | 29.35 | 6750 | 15.4781 | 0.4043 | 0.3989 |
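
The figures above show training loss falling steadily while validation loss rises after roughly epoch 3, so the final checkpoint is not necessarily the strongest one. The following sketch, offered only as an illustration, scans the logged rows for the lowest validation loss and the highest F1. The tuple layout and variable names are hypothetical; the numbers are copied from the results above.

```python
# (step, validation_loss, accuracy, f1), copied from the training results.
ROWS = [
    (250, 11.8950, 0.3735, 0.3391), (500, 11.1790, 0.3850, 0.3767),
    (750, 11.1554, 0.4082, 0.4075), (1000, 11.3070, 0.4090, 0.4082),
    (1250, 11.4384, 0.4043, 0.4029), (1500, 11.3483, 0.4090, 0.4061),
    (1750, 12.4116, 0.4020, 0.3985), (2000, 11.7820, 0.3997, 0.3798),
    (2250, 11.9709, 0.4012, 0.4001), (2500, 12.7419, 0.4198, 0.4197),
    (2750, 13.1813, 0.4074, 0.4078), (3000, 13.5405, 0.4074, 0.4062),
    (3250, 13.6698, 0.3920, 0.3860), (3500, 13.8872, 0.3989, 0.3991),
    (3750, 13.8954, 0.4074, 0.4076), (4000, 14.2214, 0.4090, 0.4090),
    (4250, 13.7015, 0.4097, 0.4033), (4500, 14.8333, 0.3912, 0.3798),
    (4750, 14.0279, 0.4035, 0.3989), (5000, 14.6064, 0.4035, 0.4024),
    (5250, 14.6257, 0.3943, 0.3914), (5500, 14.5868, 0.4020, 0.3968),
    (5750, 15.5230, 0.3843, 0.3756), (6000, 14.9209, 0.3904, 0.3811),
    (6250, 14.9466, 0.3981, 0.3968), (6500, 15.5711, 0.3881, 0.3824),
    (6750, 15.4781, 0.4043, 0.3989),
]

best_loss = min(ROWS, key=lambda r: r[1])  # row with the lowest validation loss
best_f1 = max(ROWS, key=lambda r: r[3])    # row with the highest F1
final = ROWS[-1]                           # row reported in the card summary

print("lowest validation loss:", best_loss)  # step 750
print("highest F1:", best_f1)                # step 2500
print("final logged step:", final)           # step 6750
```

By these logs, validation loss is lowest at step 750 (11.1554), and both accuracy and F1 peak at step 2500 (0.4198 and 0.4197), compared with 0.4043 and 0.3989 at the final step. Whether any of those intermediate checkpoints was saved is not stated in the card.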

Framework versions

  • Transformers 4.33.3
  • Pytorch 2.1.1+cu121
  • Datasets 2.14.5
  • Tokenizers 0.13.3