# GUE_tf_4-seqsight_65536_512_47M-L32_all
This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_65536_512_47M on the mahdibaghbanzadeh/GUE_tf_4 dataset. It achieves the following results on the evaluation set:
- Loss: 1.0826
- F1 Score: 0.6434
- Accuracy: 0.647
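The card reports both F1 and accuracy but does not state how the F1 score is averaged. As an illustration only, here is a minimal pure-Python sketch of these metrics, assuming binary labels and macro-averaged F1 (the toy labels below are invented, not taken from the evaluation set):

```python
# Illustrative sketch of the reported metrics. Assumes binary labels
# and macro-averaged F1; the card does not specify the averaging mode.

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_for_class(y_true, y_pred, cls):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    if 2 * tp + fp + fn == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)

def macro_f1(y_true, y_pred, classes=(0, 1)):
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)

# Hypothetical toy labels, purely for demonstration.
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]
print(accuracy(y_true, y_pred))   # 4 of 6 predictions correct
print(macro_f1(y_true, y_pred))
```

Note that F1 and accuracy can diverge when the classes are imbalanced, which is why the card reports both.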
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 2048
- eval_batch_size: 2048
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 10000
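The hyperparameters above imply a linear learning-rate schedule decaying from 5e-4 to 0 over 10,000 steps. The sketch below reproduces that decay in plain Python; it assumes zero warmup steps, which the card does not specify:

```python
# Sketch of the linear LR schedule implied by the hyperparameters above
# (learning_rate=5e-4, lr_scheduler_type=linear, training_steps=10000).
# Assumes zero warmup steps; the card does not state a warmup value.

BASE_LR = 5e-4
TOTAL_STEPS = 10_000

def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS):
    """Linearly decay the learning rate from base_lr to 0 over total_steps."""
    return base_lr * max(0.0, 1.0 - step / total_steps)

print(linear_lr(0))       # full learning rate at the start of training
print(linear_lr(5_000))   # half the base rate at the midpoint
print(linear_lr(10_000))  # decayed to zero at the final step
```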
### Training results
| Training Loss | Epoch | Step | Validation Loss | F1 Score | Accuracy |
|---|---|---|---|---|---|
0.6429 | 20.0 | 200 | 0.6242 | 0.6495 | 0.65 |
0.5598 | 40.0 | 400 | 0.5938 | 0.6990 | 0.699 |
0.5 | 60.0 | 600 | 0.5678 | 0.7125 | 0.715 |
0.4497 | 80.0 | 800 | 0.5621 | 0.7270 | 0.727 |
0.4262 | 100.0 | 1000 | 0.5559 | 0.7432 | 0.744 |
0.4118 | 120.0 | 1200 | 0.5539 | 0.7485 | 0.75 |
0.3982 | 140.0 | 1400 | 0.5559 | 0.7379 | 0.738 |
0.39 | 160.0 | 1600 | 0.5506 | 0.7376 | 0.738 |
0.3813 | 180.0 | 1800 | 0.5543 | 0.7500 | 0.751 |
0.3721 | 200.0 | 2000 | 0.5692 | 0.7418 | 0.742 |
0.3627 | 220.0 | 2200 | 0.5774 | 0.7394 | 0.741 |
0.3552 | 240.0 | 2400 | 0.5622 | 0.7492 | 0.75 |
0.3475 | 260.0 | 2600 | 0.5459 | 0.7529 | 0.753 |
0.3372 | 280.0 | 2800 | 0.5509 | 0.7562 | 0.757 |
0.3274 | 300.0 | 3000 | 0.5506 | 0.7618 | 0.762 |
0.3182 | 320.0 | 3200 | 0.5787 | 0.7554 | 0.758 |
0.3076 | 340.0 | 3400 | 0.5501 | 0.7782 | 0.779 |
0.2999 | 360.0 | 3600 | 0.5493 | 0.7640 | 0.766 |
0.2889 | 380.0 | 3800 | 0.5461 | 0.7793 | 0.78 |
0.2791 | 400.0 | 4000 | 0.5430 | 0.7828 | 0.783 |
0.2711 | 420.0 | 4200 | 0.5613 | 0.7844 | 0.786 |
0.2613 | 440.0 | 4400 | 0.5767 | 0.7811 | 0.783 |
0.2525 | 460.0 | 4600 | 0.5546 | 0.7789 | 0.781 |
0.2441 | 480.0 | 4800 | 0.5489 | 0.7917 | 0.793 |
0.2355 | 500.0 | 5000 | 0.5749 | 0.7831 | 0.785 |
0.2295 | 520.0 | 5200 | 0.5618 | 0.7925 | 0.794 |
0.2219 | 540.0 | 5400 | 0.5502 | 0.8067 | 0.807 |
0.2162 | 560.0 | 5600 | 0.5644 | 0.7957 | 0.797 |
0.2106 | 580.0 | 5800 | 0.5789 | 0.8058 | 0.807 |
0.2077 | 600.0 | 6000 | 0.5623 | 0.8074 | 0.808 |
0.1995 | 620.0 | 6200 | 0.5720 | 0.8083 | 0.809 |
0.1954 | 640.0 | 6400 | 0.5754 | 0.8072 | 0.808 |
0.1907 | 660.0 | 6600 | 0.5907 | 0.8071 | 0.808 |
0.1859 | 680.0 | 6800 | 0.5828 | 0.8091 | 0.81 |
0.183 | 700.0 | 7000 | 0.5844 | 0.8153 | 0.816 |
0.1777 | 720.0 | 7200 | 0.5739 | 0.8196 | 0.82 |
0.1752 | 740.0 | 7400 | 0.6080 | 0.8060 | 0.807 |
0.1738 | 760.0 | 7600 | 0.6083 | 0.8036 | 0.805 |
0.1711 | 780.0 | 7800 | 0.6113 | 0.8121 | 0.813 |
0.1684 | 800.0 | 8000 | 0.6043 | 0.8120 | 0.813 |
0.1669 | 820.0 | 8200 | 0.6051 | 0.8112 | 0.812 |
0.164 | 840.0 | 8400 | 0.6015 | 0.8133 | 0.814 |
0.1612 | 860.0 | 8600 | 0.6188 | 0.8124 | 0.813 |
0.1595 | 880.0 | 8800 | 0.6013 | 0.8123 | 0.813 |
0.1576 | 900.0 | 9000 | 0.5933 | 0.8164 | 0.817 |
0.1579 | 920.0 | 9200 | 0.6078 | 0.8081 | 0.809 |
0.1551 | 940.0 | 9400 | 0.6100 | 0.8132 | 0.814 |
0.1543 | 960.0 | 9600 | 0.6119 | 0.8111 | 0.812 |
0.1545 | 980.0 | 9800 | 0.6110 | 0.8112 | 0.812 |
0.1536 | 1000.0 | 10000 | 0.6102 | 0.8122 | 0.813 |
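Validation F1 peaks mid-training (0.8196 at step 7200) rather than at the final step, so the best checkpoint is worth selecting explicitly. The sketch below scans a sample of rows from the log above for the highest-F1 checkpoint:

```python
# Pick the best checkpoint by validation F1 from a sample of the
# training log above: (step, validation_loss, f1, accuracy).
rows = [
    (2_000, 0.5692, 0.7418, 0.742),
    (4_000, 0.5430, 0.7828, 0.783),
    (6_000, 0.5623, 0.8074, 0.808),
    (7_200, 0.5739, 0.8196, 0.820),
    (10_000, 0.6102, 0.8122, 0.813),
]
best = max(rows, key=lambda r: r[2])  # maximize the F1 column
print(best[0], best[2])  # step 7200 has the highest F1 in this sample
```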
### Framework versions
- PEFT 0.9.0
- Transformers 4.38.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.1
- Tokenizers 0.15.2