GUE_tf_3-seqsight_8192_512_30M-L32_all

This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_8192_512_30M on the mahdibaghbanzadeh/GUE_tf_3 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6945
  • F1 Score: 0.6306
  • Accuracy: 0.634
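
The framework versions listed below include PEFT, so this repository appears to hold a parameter-efficient adapter rather than full model weights. The snippet below is a minimal, hedged loading sketch only: the sequence-classification head, the number of labels, and the use of `trust_remote_code` are assumptions that this card does not state.

```python
# Hedged sketch: assumes the checkpoint is a PEFT adapter on top of the
# seqsight base model, used for sequence classification.
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from peft import PeftModel

base_id = "mahdibaghbanzadeh/seqsight_8192_512_30M"
adapter_id = "mahdibaghbanzadeh/GUE_tf_3-seqsight_8192_512_30M-L32_all"

# trust_remote_code and num_labels=2 are assumptions, not stated in this card.
tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
base_model = AutoModelForSequenceClassification.from_pretrained(
    base_id, num_labels=2, trust_remote_code=True
)
model = PeftModel.from_pretrained(base_model, adapter_id)

# Toy DNA sequence, for illustration only.
inputs = tokenizer("ACGTACGTACGT", return_tensors="pt")
logits = model(**inputs).logits
```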

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 2048
  • eval_batch_size: 2048
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • training_steps: 10000
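
As an illustration only, the listed values can be mapped onto Hugging Face `TrainingArguments` roughly as sketched below. The output directory, the per-device interpretation of the batch sizes, and the evaluation cadence (every 200 steps, inferred from the results table) are assumptions; anything not listed above is a placeholder.

```python
from transformers import TrainingArguments

# Hedged sketch mapping the listed hyperparameters onto TrainingArguments.
training_args = TrainingArguments(
    output_dir="GUE_tf_3-seqsight_8192_512_30M-L32_all",  # placeholder
    learning_rate=5e-4,
    per_device_train_batch_size=2048,  # assumption: card may report total batch size
    per_device_eval_batch_size=2048,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    max_steps=10_000,
    evaluation_strategy="steps",  # assumption: results table evaluates every 200 steps
    eval_steps=200,
)
```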

Training results

| Training Loss | Epoch  | Step  | Validation Loss | F1 Score | Accuracy |
|:-------------:|:------:|:-----:|:---------------:|:--------:|:--------:|
| 0.6652        | 14.29  | 200   | 0.6261          | 0.6351   | 0.647    |
| 0.6027        | 28.57  | 400   | 0.6331          | 0.6487   | 0.655    |
| 0.5569        | 42.86  | 600   | 0.6640          | 0.6571   | 0.657    |
| 0.5209        | 57.14  | 800   | 0.6659          | 0.6667   | 0.667    |
| 0.494         | 71.43  | 1000  | 0.7023          | 0.6501   | 0.65     |
| 0.4694        | 85.71  | 1200  | 0.7381          | 0.646    | 0.646    |
| 0.452         | 100.0  | 1400  | 0.7667          | 0.6200   | 0.622    |
| 0.4332        | 114.29 | 1600  | 0.7595          | 0.6270   | 0.627    |
| 0.4193        | 128.57 | 1800  | 0.7789          | 0.6348   | 0.635    |
| 0.405         | 142.86 | 2000  | 0.7961          | 0.6230   | 0.623    |
| 0.393         | 157.14 | 2200  | 0.8005          | 0.6279   | 0.628    |
| 0.3814        | 171.43 | 2400  | 0.9150          | 0.6064   | 0.608    |
| 0.3679        | 185.71 | 2600  | 0.8467          | 0.6221   | 0.622    |
| 0.3581        | 200.0  | 2800  | 0.8222          | 0.6150   | 0.616    |
| 0.3458        | 214.29 | 3000  | 0.8990          | 0.616    | 0.616    |
| 0.3343        | 228.57 | 3200  | 0.9159          | 0.6185   | 0.619    |
| 0.3241        | 242.86 | 3400  | 0.9124          | 0.6011   | 0.601    |
| 0.3145        | 257.14 | 3600  | 0.9340          | 0.6141   | 0.614    |
| 0.3054        | 271.43 | 3800  | 0.9421          | 0.6161   | 0.618    |
| 0.2955        | 285.71 | 4000  | 0.9610          | 0.6050   | 0.605    |
| 0.2851        | 300.0  | 4200  | 0.9503          | 0.6132   | 0.614    |
| 0.2787        | 314.29 | 4400  | 0.9691          | 0.6088   | 0.609    |
| 0.2713        | 328.57 | 4600  | 0.9770          | 0.6107   | 0.611    |
| 0.2643        | 342.86 | 4800  | 1.0160          | 0.5997   | 0.6      |
| 0.2568        | 357.14 | 5000  | 1.0290          | 0.6181   | 0.618    |
| 0.2495        | 371.43 | 5200  | 1.0194          | 0.6058   | 0.606    |
| 0.2435        | 385.71 | 5400  | 1.0307          | 0.6058   | 0.606    |
| 0.2382        | 400.0  | 5600  | 1.0560          | 0.6014   | 0.602    |
| 0.2318        | 414.29 | 5800  | 1.0271          | 0.6011   | 0.601    |
| 0.2279        | 428.57 | 6000  | 1.0710          | 0.6041   | 0.604    |
| 0.2202        | 442.86 | 6200  | 1.1111          | 0.5997   | 0.6      |
| 0.218         | 457.14 | 6400  | 1.0763          | 0.6051   | 0.605    |
| 0.2131        | 471.43 | 6600  | 1.0867          | 0.6120   | 0.612    |
| 0.2079        | 485.71 | 6800  | 1.1044          | 0.6080   | 0.608    |
| 0.2051        | 500.0  | 7000  | 1.0884          | 0.6141   | 0.614    |
| 0.2003        | 514.29 | 7200  | 1.1269          | 0.6081   | 0.608    |
| 0.1964        | 528.57 | 7400  | 1.1436          | 0.6058   | 0.606    |
| 0.1954        | 542.86 | 7600  | 1.1151          | 0.6030   | 0.603    |
| 0.1917        | 557.14 | 7800  | 1.1323          | 0.6081   | 0.608    |
| 0.1886        | 571.43 | 8000  | 1.1501          | 0.5968   | 0.597    |
| 0.1874        | 585.71 | 8200  | 1.1396          | 0.6041   | 0.604    |
| 0.1845        | 600.0  | 8400  | 1.1702          | 0.6050   | 0.605    |
| 0.1821        | 614.29 | 8600  | 1.1690          | 0.6031   | 0.603    |
| 0.1804        | 628.57 | 8800  | 1.1632          | 0.5978   | 0.598    |
| 0.1786        | 642.86 | 9000  | 1.1731          | 0.6009   | 0.601    |
| 0.1776        | 657.14 | 9200  | 1.1736          | 0.6030   | 0.603    |
| 0.177         | 671.43 | 9400  | 1.1712          | 0.5960   | 0.596    |
| 0.1747        | 685.71 | 9600  | 1.1700          | 0.6050   | 0.605    |
| 0.1731        | 700.0  | 9800  | 1.1720          | 0.5990   | 0.599    |
| 0.1742        | 714.29 | 10000 | 1.1726          | 0.6000   | 0.6      |
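
The F1 Score and Accuracy columns above are the kind of values typically produced by a `compute_metrics` callback passed to the Trainer. A minimal sketch using scikit-learn is shown below; the F1 averaging mode (macro) is an assumption, since the card does not state how F1 was computed.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

# Hedged sketch of a Trainer compute_metrics callback that could produce
# the "F1 Score" and "Accuracy" columns above. The averaging mode for F1
# ("macro") is an assumption, not stated in this card.
def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {
        "f1": f1_score(labels, preds, average="macro"),
        "accuracy": accuracy_score(labels, preds),
    }
```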

Framework versions

  • PEFT 0.9.0
  • Transformers 4.38.2
  • Pytorch 2.2.0+cu121
  • Datasets 2.17.1
  • Tokenizers 0.15.2