
GUE_tf_2-seqsight_8192_512_30M-L32_all

This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_8192_512_30M on the mahdibaghbanzadeh/GUE_tf_2 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6338
  • F1 Score: 0.6884
  • Accuracy: 0.689
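
Below is a minimal, hedged loading sketch, not taken from the card itself: it assumes this checkpoint is a PEFT adapter for binary sequence classification on top of the base model, and the adapter repo id, `num_labels=2`, and `trust_remote_code=True` are assumptions rather than documented facts.

```python
# Sketch: load the base seqsight model and apply this checkpoint as a PEFT adapter.
# Assumptions (not confirmed by the card): adapter repo id, num_labels=2,
# and that the base model requires trust_remote_code=True.
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from peft import PeftModel

base_id = "mahdibaghbanzadeh/seqsight_8192_512_30M"
adapter_id = "mahdibaghbanzadeh/GUE_tf_2-seqsight_8192_512_30M-L32_all"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
base_model = AutoModelForSequenceClassification.from_pretrained(
    base_id, num_labels=2, trust_remote_code=True
)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# Score a DNA sequence (binary label set assumed).
inputs = tokenizer("ACGTACGTACGT", return_tensors="pt")
logits = model(**inputs).logits
```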

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 1536
  • eval_batch_size: 1536
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • training_steps: 10000
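
As a rough illustration only, the listed hyperparameters map onto a `transformers.TrainingArguments` configuration like the sketch below. The evaluation cadence of 200 steps is inferred from the results table, and the output directory name is an assumption; dataset loading, the model head, and the `Trainer` wiring are not specified by the card.

```python
# Hedged sketch of the training configuration implied by the hyperparameter list.
# eval_steps=200 is inferred from the results table; output_dir is assumed.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="GUE_tf_2-seqsight_8192_512_30M-L32_all",
    learning_rate=5e-4,
    per_device_train_batch_size=1536,
    per_device_eval_batch_size=1536,
    seed=42,
    lr_scheduler_type="linear",
    max_steps=10_000,
    evaluation_strategy="steps",
    eval_steps=200,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```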

Training results

| Training Loss | Epoch  | Step  | Validation Loss | F1 Score | Accuracy |
|:-------------:|:------:|:-----:|:---------------:|:--------:|:--------:|
| 0.63          | 15.38  | 200   | 0.6302          | 0.6461   | 0.647    |
| 0.5337        | 30.77  | 400   | 0.6630          | 0.6424   | 0.644    |
| 0.4737        | 46.15  | 600   | 0.6934          | 0.6516   | 0.654    |
| 0.4271        | 61.54  | 800   | 0.7213          | 0.6735   | 0.674    |
| 0.3909        | 76.92  | 1000  | 0.7608          | 0.6702   | 0.671    |
| 0.362         | 92.31  | 1200  | 0.7714          | 0.6624   | 0.663    |
| 0.3431        | 107.69 | 1400  | 0.8214          | 0.6710   | 0.671    |
| 0.3246        | 123.08 | 1600  | 0.8769          | 0.6568   | 0.657    |
| 0.3089        | 138.46 | 1800  | 0.8430          | 0.6725   | 0.673    |
| 0.2939        | 153.85 | 2000  | 0.9266          | 0.6689   | 0.669    |
| 0.2794        | 169.23 | 2200  | 0.9087          | 0.6697   | 0.67     |
| 0.2673        | 184.62 | 2400  | 0.9141          | 0.6609   | 0.661    |
| 0.2546        | 200.0  | 2600  | 0.9812          | 0.6516   | 0.652    |
| 0.245         | 215.38 | 2800  | 0.9577          | 0.6570   | 0.657    |
| 0.2333        | 230.77 | 3000  | 0.9936          | 0.6489   | 0.649    |
| 0.2256        | 246.15 | 3200  | 0.9704          | 0.6550   | 0.655    |
| 0.2166        | 261.54 | 3400  | 1.0434          | 0.6478   | 0.648    |
| 0.208         | 276.92 | 3600  | 1.0574          | 0.664    | 0.664    |
| 0.1987        | 292.31 | 3800  | 1.1171          | 0.6540   | 0.654    |
| 0.191         | 307.69 | 4000  | 1.0810          | 0.6529   | 0.653    |
| 0.1841        | 323.08 | 4200  | 1.0971          | 0.6434   | 0.645    |
| 0.1783        | 338.46 | 4400  | 1.1030          | 0.6538   | 0.654    |
| 0.1729        | 353.85 | 4600  | 1.0723          | 0.6549   | 0.655    |
| 0.1663        | 369.23 | 4800  | 1.1525          | 0.6540   | 0.654    |
| 0.1611        | 384.62 | 5000  | 1.1418          | 0.6589   | 0.659    |
| 0.156         | 400.0  | 5200  | 1.1778          | 0.6520   | 0.652    |
| 0.1516        | 415.38 | 5400  | 1.1558          | 0.6560   | 0.656    |
| 0.1481        | 430.77 | 5600  | 1.1824          | 0.6470   | 0.647    |
| 0.1441        | 446.15 | 5800  | 1.1839          | 0.6510   | 0.651    |
| 0.1399        | 461.54 | 6000  | 1.1635          | 0.6460   | 0.646    |
| 0.1354        | 476.92 | 6200  | 1.2265          | 0.6527   | 0.653    |
| 0.1324        | 492.31 | 6400  | 1.2001          | 0.6590   | 0.659    |
| 0.1304        | 507.69 | 6600  | 1.2135          | 0.6508   | 0.651    |
| 0.1257        | 523.08 | 6800  | 1.2496          | 0.6550   | 0.655    |
| 0.1236        | 538.46 | 7000  | 1.2449          | 0.6470   | 0.647    |
| 0.1205        | 553.85 | 7200  | 1.2688          | 0.6550   | 0.655    |
| 0.1188        | 569.23 | 7400  | 1.2710          | 0.6639   | 0.664    |
| 0.1157        | 584.62 | 7600  | 1.2893          | 0.6540   | 0.654    |
| 0.1135        | 600.0  | 7800  | 1.2557          | 0.6520   | 0.652    |
| 0.1117        | 615.38 | 8000  | 1.2621          | 0.6490   | 0.649    |
| 0.1097        | 630.77 | 8200  | 1.2867          | 0.6460   | 0.646    |
| 0.1081        | 646.15 | 8400  | 1.2929          | 0.6510   | 0.651    |
| 0.1077        | 661.54 | 8600  | 1.2848          | 0.6598   | 0.66     |
| 0.1061        | 676.92 | 8800  | 1.2900          | 0.6479   | 0.648    |
| 0.1043        | 692.31 | 9000  | 1.2882          | 0.648    | 0.648    |
| 0.1062        | 707.69 | 9200  | 1.2893          | 0.6560   | 0.656    |
| 0.1035        | 723.08 | 9400  | 1.3024          | 0.6560   | 0.656    |
| 0.1025        | 738.46 | 9600  | 1.2972          | 0.6620   | 0.662    |
| 0.1017        | 753.85 | 9800  | 1.3034          | 0.6580   | 0.658    |
| 0.1013        | 769.23 | 10000 | 1.3126          | 0.6560   | 0.656    |

Framework versions

  • PEFT 0.9.0
  • Transformers 4.38.2
  • Pytorch 2.2.0+cu121
  • Datasets 2.17.1
  • Tokenizers 0.15.2