# GUE_tf_4-seqsight_8192_512_30M-L32_all
This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_8192_512_30M on the mahdibaghbanzadeh/GUE_tf_4 dataset. It achieves the following results on the evaluation set:
- Loss: 1.1072
- F1 Score: 0.6985
- Accuracy: 0.7
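Since this checkpoint is a PEFT adapter (see the framework versions below), it would typically be loaded on top of the base model. A minimal, unverified sketch, assuming this repository is published under the id `mahdibaghbanzadeh/GUE_tf_4-seqsight_8192_512_30M-L32_all` and exposes a sequence-classification head:

```python
# Hypothetical usage sketch: load the PEFT adapter on top of the seqsight
# base model for DNA sequence classification. The repo id and the task
# head are assumptions, not confirmed by this card.
from transformers import AutoTokenizer
from peft import AutoPeftModelForSequenceClassification

repo = "mahdibaghbanzadeh/GUE_tf_4-seqsight_8192_512_30M-L32_all"  # assumed id
model = AutoPeftModelForSequenceClassification.from_pretrained(repo)
tokenizer = AutoTokenizer.from_pretrained("mahdibaghbanzadeh/seqsight_8192_512_30M")

inputs = tokenizer("ACGTACGTACGT", return_tensors="pt")
logits = model(**inputs).logits  # one logit per class
```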
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 2048
- eval_batch_size: 2048
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 10000
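With the `linear` scheduler and no warmup listed, the learning rate decays linearly from its initial value to zero over the 10,000 training steps. A pure-Python sketch of that decay rule, mirroring the shape of the Hugging Face `linear` schedule under the assumption of zero warmup steps:

```python
def linear_lr(step, initial_lr=5e-4, total_steps=10_000, warmup_steps=0):
    """Learning rate at `step` under a linear warmup-then-decay schedule.

    Mirrors the shape of the Hugging Face `linear` scheduler: ramp up
    linearly during warmup, then decay linearly to zero at `total_steps`.
    """
    if step < warmup_steps:
        return initial_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return initial_lr * remaining / max(1, total_steps - warmup_steps)

print(linear_lr(0))       # 0.0005 at the start (no warmup)
print(linear_lr(5_000))   # 0.00025 halfway through
print(linear_lr(10_000))  # 0.0 at the final step
```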
### Training results

| Training Loss | Epoch | Step | Validation Loss | F1 Score | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|
0.5938 | 20.0 | 200 | 0.5839 | 0.6927 | 0.696 |
0.4624 | 40.0 | 400 | 0.5618 | 0.7407 | 0.741 |
0.3879 | 60.0 | 600 | 0.5554 | 0.7666 | 0.767 |
0.3327 | 80.0 | 800 | 0.5816 | 0.7678 | 0.771 |
0.2946 | 100.0 | 1000 | 0.5931 | 0.7744 | 0.776 |
0.2647 | 120.0 | 1200 | 0.5808 | 0.7855 | 0.787 |
0.2412 | 140.0 | 1400 | 0.6176 | 0.7794 | 0.781 |
0.2206 | 160.0 | 1600 | 0.6405 | 0.7669 | 0.77 |
0.2049 | 180.0 | 1800 | 0.6688 | 0.7695 | 0.772 |
0.1907 | 200.0 | 2000 | 0.6833 | 0.7732 | 0.775 |
0.1827 | 220.0 | 2200 | 0.6694 | 0.7772 | 0.779 |
0.1707 | 240.0 | 2400 | 0.7068 | 0.7844 | 0.786 |
0.1623 | 260.0 | 2600 | 0.6585 | 0.7922 | 0.793 |
0.1527 | 280.0 | 2800 | 0.7206 | 0.7775 | 0.78 |
0.1459 | 300.0 | 3000 | 0.7293 | 0.7797 | 0.782 |
0.1402 | 320.0 | 3200 | 0.6942 | 0.7992 | 0.8 |
0.1342 | 340.0 | 3400 | 0.7153 | 0.7863 | 0.788 |
0.1307 | 360.0 | 3600 | 0.7720 | 0.7765 | 0.779 |
0.1232 | 380.0 | 3800 | 0.7279 | 0.7822 | 0.784 |
0.1181 | 400.0 | 4000 | 0.7732 | 0.7808 | 0.783 |
0.1138 | 420.0 | 4200 | 0.7846 | 0.7840 | 0.786 |
0.1092 | 440.0 | 4400 | 0.7541 | 0.7829 | 0.785 |
0.1072 | 460.0 | 4600 | 0.7809 | 0.7938 | 0.796 |
0.102 | 480.0 | 4800 | 0.7725 | 0.7924 | 0.794 |
0.0999 | 500.0 | 5000 | 0.7435 | 0.7949 | 0.796 |
0.0964 | 520.0 | 5200 | 0.7584 | 0.7758 | 0.778 |
0.0933 | 540.0 | 5400 | 0.7664 | 0.7843 | 0.786 |
0.0899 | 560.0 | 5600 | 0.8301 | 0.7762 | 0.779 |
0.0883 | 580.0 | 5800 | 0.7747 | 0.7928 | 0.794 |
0.0857 | 600.0 | 6000 | 0.7789 | 0.7941 | 0.795 |
0.0847 | 620.0 | 6200 | 0.7575 | 0.7899 | 0.791 |
0.0822 | 640.0 | 6400 | 0.7835 | 0.7949 | 0.796 |
0.0781 | 660.0 | 6600 | 0.8146 | 0.7873 | 0.789 |
0.0774 | 680.0 | 6800 | 0.8272 | 0.7817 | 0.784 |
0.0749 | 700.0 | 7000 | 0.8346 | 0.7940 | 0.795 |
0.0741 | 720.0 | 7200 | 0.8273 | 0.7859 | 0.788 |
0.0726 | 740.0 | 7400 | 0.8139 | 0.7902 | 0.792 |
0.0712 | 760.0 | 7600 | 0.8389 | 0.7893 | 0.791 |
0.0689 | 780.0 | 7800 | 0.8566 | 0.7893 | 0.791 |
0.0686 | 800.0 | 8000 | 0.8251 | 0.7977 | 0.799 |
0.067 | 820.0 | 8200 | 0.8071 | 0.7884 | 0.79 |
0.0662 | 840.0 | 8400 | 0.8441 | 0.7874 | 0.789 |
0.0646 | 860.0 | 8600 | 0.8219 | 0.7937 | 0.795 |
0.0633 | 880.0 | 8800 | 0.8501 | 0.7894 | 0.791 |
0.0634 | 900.0 | 9000 | 0.8174 | 0.7862 | 0.788 |
0.0628 | 920.0 | 9200 | 0.8389 | 0.7884 | 0.79 |
0.0619 | 940.0 | 9400 | 0.8552 | 0.7861 | 0.788 |
0.0606 | 960.0 | 9600 | 0.8563 | 0.7891 | 0.791 |
0.0617 | 980.0 | 9800 | 0.8554 | 0.7862 | 0.788 |
0.0607 | 1000.0 | 10000 | 0.8497 | 0.7863 | 0.788 |
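The F1 Score column tracks accuracy closely throughout, which is consistent with a roughly balanced binary classification task. A small self-contained sketch of how the two metrics are computed from predictions (the card does not state which F1 averaging was used; macro averaging is assumed here):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    """Macro F1: per-class F1 averaged over all classes."""
    scores = []
    for c in set(y_true) | set(y_pred):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        scores.append(2 * precision * recall / (precision + recall)
                      if precision + recall else 0.0)
    return sum(scores) / len(scores)

# Toy binary example (labels are illustrative, not from this dataset).
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]
```

On a balanced task with symmetric errors like this toy example, macro F1 and accuracy coincide, much as the two columns stay within a fraction of a point of each other in the table above.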
### Framework versions
- PEFT 0.9.0
- Transformers 4.38.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.1
- Tokenizers 0.15.2