# GUE_prom_prom_300_tata-seqsight_4096_512_15M-L32_all
This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_4096_512_15M on the mahdibaghbanzadeh/GUE_prom_prom_300_tata dataset. It achieves the following results on the evaluation set:
- Loss: 1.9825
- F1 Score: 0.6573
- Accuracy: 0.6574
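The F1 score and accuracy above are nearly identical, which is expected when the class distribution is roughly balanced. As a minimal sketch of how these two metrics are computed (the averaging mode used by the evaluation script is not stated in this card; macro averaging over the label set is assumed here):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    """Macro-averaged F1: per-class F1 scores, averaged with equal weight."""
    labels = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if (tp + fp) else 0.0
        rec = tp / (tp + fn) if (tp + fn) else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if (prec + rec) else 0.0)
    return sum(f1s) / len(f1s)

# Toy binary example (promoter = 1, non-promoter = 0); not from the dataset.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1]
print(accuracy(y_true, y_pred))  # 4 of 6 correct
print(macro_f1(y_true, y_pred))
```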
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 2048
- eval_batch_size: 2048
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 10000
### Training results

| Training Loss | Epoch | Step | Validation Loss | F1 Score | Accuracy |
|---|---|---|---|---|---|
0.5881 | 66.67 | 200 | 0.7620 | 0.6185 | 0.6183 |
0.374 | 133.33 | 400 | 0.9147 | 0.6573 | 0.6574 |
0.272 | 200.0 | 600 | 1.0506 | 0.6653 | 0.6656 |
0.2261 | 266.67 | 800 | 1.1062 | 0.6469 | 0.6476 |
0.2015 | 333.33 | 1000 | 1.1267 | 0.6414 | 0.6411 |
0.1875 | 400.0 | 1200 | 1.1849 | 0.6363 | 0.6362 |
0.1758 | 466.67 | 1400 | 1.1932 | 0.6455 | 0.6460 |
0.1634 | 533.33 | 1600 | 1.2416 | 0.6440 | 0.6444 |
0.1555 | 600.0 | 1800 | 1.3574 | 0.6488 | 0.6493 |
0.1475 | 666.67 | 2000 | 1.4170 | 0.6350 | 0.6378 |
0.1374 | 733.33 | 2200 | 1.4038 | 0.6441 | 0.6444 |
0.1286 | 800.0 | 2400 | 1.4878 | 0.6465 | 0.6476 |
0.1229 | 866.67 | 2600 | 1.4307 | 0.6577 | 0.6574 |
0.1162 | 933.33 | 2800 | 1.5280 | 0.6427 | 0.6427 |
0.1091 | 1000.0 | 3000 | 1.4177 | 0.6488 | 0.6493 |
0.1018 | 1066.67 | 3200 | 1.6755 | 0.6524 | 0.6525 |
0.0973 | 1133.33 | 3400 | 1.5230 | 0.6463 | 0.6460 |
0.0917 | 1200.0 | 3600 | 1.5559 | 0.6550 | 0.6558 |
0.0877 | 1266.67 | 3800 | 1.6510 | 0.6602 | 0.6607 |
0.0819 | 1333.33 | 4000 | 1.6203 | 0.6586 | 0.6591 |
0.0777 | 1400.0 | 4200 | 1.6706 | 0.6600 | 0.6607 |
0.0736 | 1466.67 | 4400 | 1.5861 | 0.6652 | 0.6656 |
0.0698 | 1533.33 | 4600 | 1.6971 | 0.6623 | 0.6623 |
0.0671 | 1600.0 | 4800 | 1.7818 | 0.6717 | 0.6721 |
0.0634 | 1666.67 | 5000 | 1.8030 | 0.6590 | 0.6591 |
0.0615 | 1733.33 | 5200 | 1.7842 | 0.6615 | 0.6623 |
0.0587 | 1800.0 | 5400 | 1.7741 | 0.6591 | 0.6607 |
0.0568 | 1866.67 | 5600 | 1.8269 | 0.6577 | 0.6591 |
0.0551 | 1933.33 | 5800 | 1.8929 | 0.6661 | 0.6672 |
0.0531 | 2000.0 | 6000 | 1.9567 | 0.6641 | 0.6639 |
0.0505 | 2066.67 | 6200 | 1.8462 | 0.6526 | 0.6525 |
0.0494 | 2133.33 | 6400 | 1.8927 | 0.6600 | 0.6607 |
0.0473 | 2200.0 | 6600 | 2.0680 | 0.6575 | 0.6574 |
0.046 | 2266.67 | 6800 | 1.8894 | 0.6526 | 0.6525 |
0.0447 | 2333.33 | 7000 | 1.9051 | 0.6543 | 0.6542 |
0.0444 | 2400.0 | 7200 | 2.1094 | 0.6511 | 0.6509 |
0.0423 | 2466.67 | 7400 | 1.9778 | 0.6729 | 0.6737 |
0.0411 | 2533.33 | 7600 | 1.9854 | 0.6618 | 0.6623 |
0.0407 | 2600.0 | 7800 | 1.9483 | 0.6687 | 0.6688 |
0.04 | 2666.67 | 8000 | 1.9649 | 0.6575 | 0.6574 |
0.039 | 2733.33 | 8200 | 1.9644 | 0.6606 | 0.6607 |
0.0388 | 2800.0 | 8400 | 2.0501 | 0.6670 | 0.6672 |
0.0375 | 2866.67 | 8600 | 2.0106 | 0.6622 | 0.6623 |
0.0368 | 2933.33 | 8800 | 2.0446 | 0.6586 | 0.6591 |
0.0363 | 3000.0 | 9000 | 2.0473 | 0.6555 | 0.6558 |
0.0363 | 3066.67 | 9200 | 2.0159 | 0.6602 | 0.6607 |
0.0358 | 3133.33 | 9400 | 2.0621 | 0.6618 | 0.6623 |
0.0355 | 3200.0 | 9600 | 2.0734 | 0.6686 | 0.6688 |
0.0357 | 3266.67 | 9800 | 2.0886 | 0.6639 | 0.6639 |
0.0358 | 3333.33 | 10000 | 2.0690 | 0.6606 | 0.6607 |
### Framework versions
- PEFT 0.9.0
- Transformers 4.38.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.1
- Tokenizers 0.15.2