GUE_tf_2-seqsight_65536_512_47M-L32_all
This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_65536_512_47M on the mahdibaghbanzadeh/GUE_tf_2 dataset. It achieves the following results on the evaluation set:
- Loss: 0.6364
- F1 Score: 0.6850
- Accuracy: 0.685
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 2048
- eval_batch_size: 2048
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 10000
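The `linear` scheduler together with `learning_rate` and `training_steps` determines the learning rate at every step. The following is a minimal sketch of that schedule in plain Python, not the training code itself. It assumes zero warmup steps, since the card lists no warmup setting; `WARMUP_STEPS` and `linear_lr` are illustrative names.

```python
# Values taken from the hyperparameter list above.
LEARNING_RATE = 0.0005
TRAINING_STEPS = 10_000
# Assumption: the card does not list a warmup setting, so none is applied here.
WARMUP_STEPS = 0


def linear_lr(step: int) -> float:
    """Learning rate at `step`: linear warmup (if any), then linear decay to zero."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TRAINING_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TRAINING_STEPS - WARMUP_STEPS)


# Print the rate at the start, at the first evaluation, midway, and at the end.
for step in (0, 200, 5_000, 10_000):
    print(f"step {step:>6}: lr = {linear_lr(step):.6f}")
```

Under this assumption the rate starts at the configured value and reaches zero at the final step.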
Training results
Training Loss | Epoch | Step | Validation Loss | F1 Score | Accuracy |
---|---|---|---|---|---|
0.6523 | 20.0 | 200 | 0.6602 | 0.6107 | 0.615 |
0.5889 | 40.0 | 400 | 0.6631 | 0.6142 | 0.617 |
0.5457 | 60.0 | 600 | 0.6771 | 0.6209 | 0.621 |
0.5153 | 80.0 | 800 | 0.6813 | 0.6135 | 0.614 |
0.4954 | 100.0 | 1000 | 0.6920 | 0.6136 | 0.614 |
0.4847 | 120.0 | 1200 | 0.6981 | 0.6200 | 0.620 |
0.4753 | 140.0 | 1400 | 0.6823 | 0.6309 | 0.631 |
0.4698 | 160.0 | 1600 | 0.7015 | 0.6447 | 0.645 |
0.4634 | 180.0 | 1800 | 0.6763 | 0.6356 | 0.636 |
0.4559 | 200.0 | 2000 | 0.6808 | 0.6389 | 0.639 |
0.4485 | 220.0 | 2200 | 0.7041 | 0.6408 | 0.641 |
0.4435 | 240.0 | 2400 | 0.6797 | 0.6558 | 0.656 |
0.4352 | 260.0 | 2600 | 0.7195 | 0.6430 | 0.643 |
0.4283 | 280.0 | 2800 | 0.7155 | 0.6500 | 0.650 |
0.4200 | 300.0 | 3000 | 0.7098 | 0.6516 | 0.653 |
0.4135 | 320.0 | 3200 | 0.7060 | 0.6532 | 0.655 |
0.4048 | 340.0 | 3400 | 0.7106 | 0.6423 | 0.644 |
0.3943 | 360.0 | 3600 | 0.7462 | 0.6417 | 0.642 |
0.3849 | 380.0 | 3800 | 0.7403 | 0.6528 | 0.653 |
0.3768 | 400.0 | 4000 | 0.7351 | 0.6432 | 0.645 |
0.3665 | 420.0 | 4200 | 0.7459 | 0.6371 | 0.638 |
0.3585 | 440.0 | 4400 | 0.7503 | 0.6372 | 0.640 |
0.3501 | 460.0 | 4600 | 0.7474 | 0.6424 | 0.643 |
0.3425 | 480.0 | 4800 | 0.7972 | 0.6375 | 0.638 |
0.3354 | 500.0 | 5000 | 0.7901 | 0.6448 | 0.645 |
0.3266 | 520.0 | 5200 | 0.8136 | 0.6310 | 0.631 |
0.3200 | 540.0 | 5400 | 0.7967 | 0.6369 | 0.637 |
0.3145 | 560.0 | 5600 | 0.7992 | 0.6369 | 0.637 |
0.3082 | 580.0 | 5800 | 0.8255 | 0.6330 | 0.633 |
0.3038 | 600.0 | 6000 | 0.8006 | 0.6268 | 0.627 |
0.2966 | 620.0 | 6200 | 0.8352 | 0.6329 | 0.633 |
0.2906 | 640.0 | 6400 | 0.8417 | 0.6247 | 0.625 |
0.2872 | 660.0 | 6600 | 0.8195 | 0.6369 | 0.637 |
0.2801 | 680.0 | 6800 | 0.8518 | 0.6330 | 0.633 |
0.2764 | 700.0 | 7000 | 0.8594 | 0.6380 | 0.638 |
0.2728 | 720.0 | 7200 | 0.8553 | 0.6320 | 0.632 |
0.2662 | 740.0 | 7400 | 0.8691 | 0.6319 | 0.632 |
0.2665 | 760.0 | 7600 | 0.8889 | 0.6310 | 0.631 |
0.2623 | 780.0 | 7800 | 0.8657 | 0.6300 | 0.630 |
0.2598 | 800.0 | 8000 | 0.8847 | 0.6280 | 0.628 |
0.2553 | 820.0 | 8200 | 0.8976 | 0.6270 | 0.627 |
0.2528 | 840.0 | 8400 | 0.8937 | 0.6320 | 0.632 |
0.2509 | 860.0 | 8600 | 0.8924 | 0.6370 | 0.637 |
0.2480 | 880.0 | 8800 | 0.9017 | 0.6249 | 0.625 |
0.2473 | 900.0 | 9000 | 0.8995 | 0.6330 | 0.633 |
0.2459 | 920.0 | 9200 | 0.9111 | 0.6260 | 0.626 |
0.2453 | 940.0 | 9400 | 0.9009 | 0.6209 | 0.621 |
0.2441 | 960.0 | 9600 | 0.9082 | 0.6270 | 0.627 |
0.2433 | 980.0 | 9800 | 0.9084 | 0.6249 | 0.625 |
0.2430 | 1000.0 | 10000 | 0.9080 | 0.6250 | 0.625 |
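In the table, training loss falls steadily while validation loss trends upward after the first evaluations, so the final checkpoint is not necessarily the strongest one on the validation data. As a rough sketch, the snippet below copies the Validation Loss and F1 Score columns from the table and finds the step with the lowest validation loss and the step with the highest F1. It also derives the steps per epoch from the first row. The variable names are illustrative, and the per-epoch example bound assumes a single device with no gradient accumulation, which the card does not state.

```python
EVAL_EVERY = 200  # steps between evaluations, from the Step column

# Validation Loss column, in table order (steps 200 to 10000).
val_loss = [
    0.6602, 0.6631, 0.6771, 0.6813, 0.6920, 0.6981, 0.6823, 0.7015, 0.6763, 0.6808,
    0.7041, 0.6797, 0.7195, 0.7155, 0.7098, 0.7060, 0.7106, 0.7462, 0.7403, 0.7351,
    0.7459, 0.7503, 0.7474, 0.7972, 0.7901, 0.8136, 0.7967, 0.7992, 0.8255, 0.8006,
    0.8352, 0.8417, 0.8195, 0.8518, 0.8594, 0.8553, 0.8691, 0.8889, 0.8657, 0.8847,
    0.8976, 0.8937, 0.8924, 0.9017, 0.8995, 0.9111, 0.9009, 0.9082, 0.9084, 0.9080,
]

# F1 Score column, in table order.
f1 = [
    0.6107, 0.6142, 0.6209, 0.6135, 0.6136, 0.6200, 0.6309, 0.6447, 0.6356, 0.6389,
    0.6408, 0.6558, 0.6430, 0.65, 0.6516, 0.6532, 0.6423, 0.6417, 0.6528, 0.6432,
    0.6371, 0.6372, 0.6424, 0.6375, 0.6448, 0.6310, 0.6369, 0.6369, 0.6330, 0.6268,
    0.6329, 0.6247, 0.6369, 0.6330, 0.638, 0.632, 0.6319, 0.6310, 0.63, 0.6280,
    0.6270, 0.6320, 0.6370, 0.6249, 0.6330, 0.6260, 0.6209, 0.6270, 0.6249, 0.6250,
]

steps = [EVAL_EVERY * (i + 1) for i in range(len(val_loss))]

# Index of the evaluation with the lowest validation loss and the highest F1.
best_loss_idx = min(range(len(val_loss)), key=val_loss.__getitem__)
best_f1_idx = max(range(len(f1)), key=f1.__getitem__)

# First table row: step 200 corresponds to epoch 20.0.
steps_per_epoch = 200 / 20.0

# Assumption: one device, no gradient accumulation, so each step sees one batch.
max_examples_per_epoch = int(steps_per_epoch) * 2048

print(f"lowest validation loss: {val_loss[best_loss_idx]} at step {steps[best_loss_idx]}")
print(f"highest F1 score: {f1[best_f1_idx]} at step {steps[best_f1_idx]}")
print(f"steps per epoch: {steps_per_epoch:g}")
print(f"upper bound on training examples per epoch: {max_examples_per_epoch}")
```

The same comparison could be used to choose which checkpoint to keep, depending on whether validation loss or F1 matters more for the intended use.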
Framework versions
- PEFT 0.9.0
- Transformers 4.38.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.1
- Tokenizers 0.15.2