GUE_tf_4-seqsight_8192_512_30M-L32_all

This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_8192_512_30M on the mahdibaghbanzadeh/GUE_tf_4 dataset. It achieves the following results on the evaluation set:

  • Loss: 1.1072
  • F1 Score: 0.6985
  • Accuracy: 0.7
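
Since the framework versions below list PEFT, the fine-tuned weights are presumably a PEFT adapter on top of the base model. The following is a minimal inference sketch under that assumption; the adapter repo id is inferred from the card title, the binary label count is assumed from the GUE transcription-factor task, and `trust_remote_code` may or may not be required depending on the base model's implementation.

```python
# Minimal loading/inference sketch (assumptions: adapter repo id, num_labels=2,
# and that the base model works with AutoModelForSequenceClassification).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from peft import PeftModel

base_id = "mahdibaghbanzadeh/seqsight_8192_512_30M"                       # base model from the card
adapter_id = "mahdibaghbanzadeh/GUE_tf_4-seqsight_8192_512_30M-L32_all"   # assumed adapter repo id

tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
model = AutoModelForSequenceClassification.from_pretrained(
    base_id, num_labels=2, trust_remote_code=True
)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the fine-tuned PEFT weights
model.eval()

# Score a hypothetical DNA input sequence.
inputs = tokenizer("ACGTACGTACGTACGT", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))
```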

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 2048
  • eval_batch_size: 2048
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • training_steps: 10000
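
The training script is not included in the card; as a rough guide, the listed values map onto transformers.TrainingArguments roughly as sketched below. The output directory, the per-device batch-size interpretation, and the 200-step evaluation interval (taken from the results table) are assumptions.

```python
# Hypothetical reconstruction of the listed hyperparameters as TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="GUE_tf_4-seqsight_8192_512_30M-L32_all",  # assumed
    learning_rate=5e-4,
    per_device_train_batch_size=2048,  # card lists total train_batch_size: 2048
    per_device_eval_batch_size=2048,
    seed=42,
    lr_scheduler_type="linear",
    max_steps=10_000,
    evaluation_strategy="steps",
    eval_steps=200,   # matches the 200-step evaluation interval in the table below
    logging_steps=200,
)
# The Adam betas/epsilon above correspond to the Transformers defaults
# (adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-8).
```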

Training results

| Training Loss | Epoch  | Step  | Validation Loss | F1 Score | Accuracy |
|:-------------:|:------:|:-----:|:---------------:|:--------:|:--------:|
| 0.5938        | 20.0   | 200   | 0.5839          | 0.6927   | 0.696    |
| 0.4624        | 40.0   | 400   | 0.5618          | 0.7407   | 0.741    |
| 0.3879        | 60.0   | 600   | 0.5554          | 0.7666   | 0.767    |
| 0.3327        | 80.0   | 800   | 0.5816          | 0.7678   | 0.771    |
| 0.2946        | 100.0  | 1000  | 0.5931          | 0.7744   | 0.776    |
| 0.2647        | 120.0  | 1200  | 0.5808          | 0.7855   | 0.787    |
| 0.2412        | 140.0  | 1400  | 0.6176          | 0.7794   | 0.781    |
| 0.2206        | 160.0  | 1600  | 0.6405          | 0.7669   | 0.77     |
| 0.2049        | 180.0  | 1800  | 0.6688          | 0.7695   | 0.772    |
| 0.1907        | 200.0  | 2000  | 0.6833          | 0.7732   | 0.775    |
| 0.1827        | 220.0  | 2200  | 0.6694          | 0.7772   | 0.779    |
| 0.1707        | 240.0  | 2400  | 0.7068          | 0.7844   | 0.786    |
| 0.1623        | 260.0  | 2600  | 0.6585          | 0.7922   | 0.793    |
| 0.1527        | 280.0  | 2800  | 0.7206          | 0.7775   | 0.78     |
| 0.1459        | 300.0  | 3000  | 0.7293          | 0.7797   | 0.782    |
| 0.1402        | 320.0  | 3200  | 0.6942          | 0.7992   | 0.8      |
| 0.1342        | 340.0  | 3400  | 0.7153          | 0.7863   | 0.788    |
| 0.1307        | 360.0  | 3600  | 0.7720          | 0.7765   | 0.779    |
| 0.1232        | 380.0  | 3800  | 0.7279          | 0.7822   | 0.784    |
| 0.1181        | 400.0  | 4000  | 0.7732          | 0.7808   | 0.783    |
| 0.1138        | 420.0  | 4200  | 0.7846          | 0.7840   | 0.786    |
| 0.1092        | 440.0  | 4400  | 0.7541          | 0.7829   | 0.785    |
| 0.1072        | 460.0  | 4600  | 0.7809          | 0.7938   | 0.796    |
| 0.102         | 480.0  | 4800  | 0.7725          | 0.7924   | 0.794    |
| 0.0999        | 500.0  | 5000  | 0.7435          | 0.7949   | 0.796    |
| 0.0964        | 520.0  | 5200  | 0.7584          | 0.7758   | 0.778    |
| 0.0933        | 540.0  | 5400  | 0.7664          | 0.7843   | 0.786    |
| 0.0899        | 560.0  | 5600  | 0.8301          | 0.7762   | 0.779    |
| 0.0883        | 580.0  | 5800  | 0.7747          | 0.7928   | 0.794    |
| 0.0857        | 600.0  | 6000  | 0.7789          | 0.7941   | 0.795    |
| 0.0847        | 620.0  | 6200  | 0.7575          | 0.7899   | 0.791    |
| 0.0822        | 640.0  | 6400  | 0.7835          | 0.7949   | 0.796    |
| 0.0781        | 660.0  | 6600  | 0.8146          | 0.7873   | 0.789    |
| 0.0774        | 680.0  | 6800  | 0.8272          | 0.7817   | 0.784    |
| 0.0749        | 700.0  | 7000  | 0.8346          | 0.7940   | 0.795    |
| 0.0741        | 720.0  | 7200  | 0.8273          | 0.7859   | 0.788    |
| 0.0726        | 740.0  | 7400  | 0.8139          | 0.7902   | 0.792    |
| 0.0712        | 760.0  | 7600  | 0.8389          | 0.7893   | 0.791    |
| 0.0689        | 780.0  | 7800  | 0.8566          | 0.7893   | 0.791    |
| 0.0686        | 800.0  | 8000  | 0.8251          | 0.7977   | 0.799    |
| 0.067         | 820.0  | 8200  | 0.8071          | 0.7884   | 0.79     |
| 0.0662        | 840.0  | 8400  | 0.8441          | 0.7874   | 0.789    |
| 0.0646        | 860.0  | 8600  | 0.8219          | 0.7937   | 0.795    |
| 0.0633        | 880.0  | 8800  | 0.8501          | 0.7894   | 0.791    |
| 0.0634        | 900.0  | 9000  | 0.8174          | 0.7862   | 0.788    |
| 0.0628        | 920.0  | 9200  | 0.8389          | 0.7884   | 0.79     |
| 0.0619        | 940.0  | 9400  | 0.8552          | 0.7861   | 0.788    |
| 0.0606        | 960.0  | 9600  | 0.8563          | 0.7891   | 0.791    |
| 0.0617        | 980.0  | 9800  | 0.8554          | 0.7862   | 0.788    |
| 0.0607        | 1000.0 | 10000 | 0.8497          | 0.7863   | 0.788    |
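
The card does not say how the F1 Score and Accuracy columns were computed. A plausible `compute_metrics` function producing both values is sketched below; the macro averaging for F1 is an assumption, not stated in the card.

```python
# Sketch of a compute_metrics function matching the F1 / Accuracy columns above
# (assumption: macro-averaged F1 over the two classes).
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {
        "f1": f1_score(labels, preds, average="macro"),
        "accuracy": accuracy_score(labels, preds),
    }
```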

Framework versions

  • PEFT 0.9.0
  • Transformers 4.38.2
  • Pytorch 2.2.0+cu121
  • Datasets 2.17.1
  • Tokenizers 0.15.2