
GUE_tf_1-seqsight_8192_512_30M-L32_all

This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_8192_512_30M on the mahdibaghbanzadeh/GUE_tf_1 dataset. It achieves the following results on the evaluation set (a sketch of how the adapter might be loaded follows the metrics):

  • Loss: 0.5174
  • F1 Score: 0.7418
  • Accuracy: 0.746
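
The card does not document how to load or run this checkpoint, so the following is only a minimal sketch. It assumes the adapter is published as mahdibaghbanzadeh/GUE_tf_1-seqsight_8192_512_30M-L32_all (the repo id is inferred from the base model's namespace), that the task is sequence classification (consistent with the F1 and accuracy metrics), and that the base model's tokenizer is reused; none of these details are stated in the card.

```python
# Minimal loading sketch; the adapter repo id, task head, and any
# trust_remote_code requirement are assumptions, not documented facts.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from peft import PeftModel

base_id = "mahdibaghbanzadeh/seqsight_8192_512_30M"
adapter_id = "mahdibaghbanzadeh/GUE_tf_1-seqsight_8192_512_30M-L32_all"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
# The base model may need trust_remote_code=True if it ships custom modeling code.
base_model = AutoModelForSequenceClassification.from_pretrained(base_id)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# Placeholder DNA sequence; replace with a real GUE_tf_1 example.
inputs = tokenizer("ACGTACGTACGT", return_tensors="pt")
with torch.no_grad():
    predicted_class = model(**inputs).logits.argmax(dim=-1).item()
print(predicted_class)
```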

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an equivalent TrainingArguments setup is sketched after the list):

  • learning_rate: 0.0005
  • train_batch_size: 2048
  • eval_batch_size: 2048
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • training_steps: 10000
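
For reference, the listed values map onto a transformers TrainingArguments configuration roughly as follows. This is a sketch rather than the actual training script: whether the batch size of 2048 was per device or global, the output directory, and how the dataset, tokenizer, and PEFT adapter were prepared are not documented in the card.

```python
# Sketch of a TrainingArguments object matching the hyperparameters above.
# output_dir and the per-device reading of the batch size are assumptions.
# The card says "Adam"; Trainer's default optimizer class is AdamW, and the
# exact optimizer used here is not specified beyond the betas/epsilon listed.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="GUE_tf_1-seqsight_8192_512_30M-L32_all",
    learning_rate=5e-4,                # 0.0005
    per_device_train_batch_size=2048,
    per_device_eval_batch_size=2048,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    max_steps=10000,                   # training_steps
)
```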

Training results

| Training Loss | Epoch  | Step  | Validation Loss | F1 Score | Accuracy |
|:-------------:|:------:|:-----:|:---------------:|:--------:|:--------:|
| 0.6225        | 13.33  | 200   | 0.5973          | 0.6770   | 0.677    |
| 0.5426        | 26.67  | 400   | 0.6050          | 0.6616   | 0.663    |
| 0.5017        | 40.0   | 600   | 0.6217          | 0.6740   | 0.674    |
| 0.4668        | 53.33  | 800   | 0.6290          | 0.6902   | 0.692    |
| 0.4393        | 66.67  | 1000  | 0.6491          | 0.6888   | 0.689    |
| 0.4151        | 80.0   | 1200  | 0.6627          | 0.6889   | 0.689    |
| 0.3961        | 93.33  | 1400  | 0.6513          | 0.6840   | 0.684    |
| 0.3797        | 106.67 | 1600  | 0.6851          | 0.6879   | 0.688    |
| 0.3656        | 120.0  | 1800  | 0.7099          | 0.6855   | 0.686    |
| 0.3537        | 133.33 | 2000  | 0.7395          | 0.6800   | 0.68     |
| 0.3408        | 146.67 | 2200  | 0.7374          | 0.6830   | 0.683    |
| 0.3307        | 160.0  | 2400  | 0.7293          | 0.6840   | 0.684    |
| 0.3191        | 173.33 | 2600  | 0.7739          | 0.6810   | 0.681    |
| 0.3083        | 186.67 | 2800  | 0.7673          | 0.6770   | 0.677    |
| 0.2991        | 200.0  | 3000  | 0.8049          | 0.6789   | 0.679    |
| 0.289         | 213.33 | 3200  | 0.7730          | 0.6768   | 0.677    |
| 0.2784        | 226.67 | 3400  | 0.8322          | 0.6779   | 0.678    |
| 0.2716        | 240.0  | 3600  | 0.8422          | 0.6690   | 0.67     |
| 0.262         | 253.33 | 3800  | 0.8461          | 0.6730   | 0.673    |
| 0.2521        | 266.67 | 4000  | 0.8696          | 0.6776   | 0.678    |
| 0.2461        | 280.0  | 4200  | 0.8740          | 0.6739   | 0.674    |
| 0.2383        | 293.33 | 4400  | 0.9173          | 0.6850   | 0.685    |
| 0.2307        | 306.67 | 4600  | 0.9165          | 0.6779   | 0.678    |
| 0.2255        | 320.0  | 4800  | 0.9309          | 0.6857   | 0.686    |
| 0.2192        | 333.33 | 5000  | 0.9353          | 0.6709   | 0.671    |
| 0.2138        | 346.67 | 5200  | 0.9088          | 0.6780   | 0.678    |
| 0.2083        | 360.0  | 5400  | 0.9699          | 0.6704   | 0.671    |
| 0.2018        | 373.33 | 5600  | 0.9811          | 0.6769   | 0.677    |
| 0.1975        | 386.67 | 5800  | 0.9467          | 0.6687   | 0.669    |
| 0.1925        | 400.0  | 6000  | 0.9813          | 0.6755   | 0.676    |
| 0.1886        | 413.33 | 6200  | 0.9830          | 0.6779   | 0.678    |
| 0.184         | 426.67 | 6400  | 0.9905          | 0.6770   | 0.677    |
| 0.1806        | 440.0  | 6600  | 1.0004          | 0.6721   | 0.673    |
| 0.1771        | 453.33 | 6800  | 1.0257          | 0.6809   | 0.681    |
| 0.1726        | 466.67 | 7000  | 1.0673          | 0.6677   | 0.668    |
| 0.1702        | 480.0  | 7200  | 1.0637          | 0.6689   | 0.669    |
| 0.1674        | 493.33 | 7400  | 1.0590          | 0.6670   | 0.667    |
| 0.1655        | 506.67 | 7600  | 1.0730          | 0.6680   | 0.668    |
| 0.1629        | 520.0  | 7800  | 1.0953          | 0.6730   | 0.673    |
| 0.1594        | 533.33 | 8000  | 1.0809          | 0.6679   | 0.668    |
| 0.1588        | 546.67 | 8200  | 1.0749          | 0.6650   | 0.665    |
| 0.1565        | 560.0  | 8400  | 1.0858          | 0.6709   | 0.671    |
| 0.1543        | 573.33 | 8600  | 1.1003          | 0.6650   | 0.665    |
| 0.1528        | 586.67 | 8800  | 1.0985          | 0.6680   | 0.668    |
| 0.1504        | 600.0  | 9000  | 1.1135          | 0.6670   | 0.667    |
| 0.1502        | 613.33 | 9200  | 1.1064          | 0.6669   | 0.667    |
| 0.1491        | 626.67 | 9400  | 1.1020          | 0.6678   | 0.668    |
| 0.1492        | 640.0  | 9600  | 1.1107          | 0.6670   | 0.667    |
| 0.1482        | 653.33 | 9800  | 1.1083          | 0.6690   | 0.669    |
| 0.1475        | 666.67 | 10000 | 1.1123          | 0.6630   | 0.663    |

Framework versions

  • PEFT 0.9.0
  • Transformers 4.38.2
  • Pytorch 2.2.0+cu121
  • Datasets 2.17.1
  • Tokenizers 0.15.2
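
To reproduce results with the environment above, the installed libraries should match the listed versions. The small check below only verifies versions; the Python and system CUDA versions used for training are not stated in the card.

```python
# Compare installed library versions against those listed in the card.
import datasets, peft, tokenizers, torch, transformers

expected = {
    "peft": "0.9.0",
    "transformers": "4.38.2",
    "torch": "2.2.0+cu121",
    "datasets": "2.17.1",
    "tokenizers": "0.15.2",
}
installed = {
    "peft": peft.__version__,
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    have = installed[name]
    status = "ok" if have == want else f"mismatch (installed {have})"
    print(f"{name}=={want}: {status}")
```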