GUE_prom_prom_core_all-seqsight_4096_512_27M-L8_f
This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_4096_512_27M on the mahdibaghbanzadeh/GUE_prom_prom_core_all dataset. It achieves the following results on the evaluation set:
- Loss: 0.4057
- F1 Score: 0.8135
- Accuracy: 0.8137
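The card does not include a usage snippet, but since this checkpoint is a PEFT adapter on top of a sequence-classification base model, loading it for inference typically looks like the sketch below. The adapter repo id, the binary label count, and the trust_remote_code flag are assumptions, not facts stated in this card.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from peft import PeftModel

# Assumed repo ids: the base model comes from the card text; the adapter id is
# inferred from the card title and may need its namespace prefix adjusted.
base_id = "mahdibaghbanzadeh/seqsight_4096_512_27M"
adapter_id = "mahdibaghbanzadeh/GUE_prom_prom_core_all-seqsight_4096_512_27M-L8_f"

# trust_remote_code and num_labels=2 (promoter vs. non-promoter) are assumptions.
tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
base_model = AutoModelForSequenceClassification.from_pretrained(
    base_id, num_labels=2, trust_remote_code=True
)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# Placeholder DNA sequence; real GUE core-promoter inputs are short fixed-length sequences.
sequence = "ACGT" * 16
inputs = tokenizer(sequence, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))
```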
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training (an equivalent Trainer configuration is sketched after the list):
- learning_rate: 0.0005
- train_batch_size: 128
- eval_batch_size: 128
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 10000
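For orientation, the list above maps onto a Hugging Face TrainingArguments object roughly as follows. This is a sketch of the configuration only, not the exact training script: dataset preparation and the LoRA/PEFT setup implied by the PEFT framework version are omitted, and the output directory, evaluation cadence, and logging cadence are inferred rather than stated in the card.

```python
from transformers import TrainingArguments

# Rough equivalent of the hyperparameters above. output_dir, evaluation_strategy,
# eval_steps, and logging_steps are assumptions (the results table reports every
# 200 steps); Adam betas=(0.9, 0.999) and epsilon=1e-08 are the library defaults,
# so they need no explicit arguments.
training_args = TrainingArguments(
    output_dir="GUE_prom_prom_core_all-seqsight_4096_512_27M-L8_f",  # assumed
    learning_rate=5e-4,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    seed=42,
    lr_scheduler_type="linear",
    max_steps=10_000,
    evaluation_strategy="steps",
    eval_steps=200,
    logging_steps=200,
)
```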
Training results
Training Loss | Epoch | Step | Validation Loss | F1 Score | Accuracy |
---|---|---|---|---|---|
0.5111 | 0.54 | 200 | 0.4623 | 0.7827 | 0.7828 |
0.4532 | 1.08 | 400 | 0.4563 | 0.7913 | 0.7916 |
0.4358 | 1.62 | 600 | 0.4413 | 0.7949 | 0.7949 |
0.4289 | 2.16 | 800 | 0.4435 | 0.7948 | 0.7951 |
0.4251 | 2.7 | 1000 | 0.4364 | 0.7980 | 0.7981 |
0.4242 | 3.24 | 1200 | 0.4312 | 0.7990 | 0.7990 |
0.4202 | 3.78 | 1400 | 0.4326 | 0.8022 | 0.8024 |
0.4104 | 4.32 | 1600 | 0.4300 | 0.8044 | 0.8044 |
0.4156 | 4.86 | 1800 | 0.4318 | 0.8021 | 0.8022 |
0.414 | 5.41 | 2000 | 0.4270 | 0.8057 | 0.8057 |
0.4105 | 5.95 | 2200 | 0.4289 | 0.8042 | 0.8042 |
0.4127 | 6.49 | 2400 | 0.4269 | 0.8049 | 0.8049 |
0.4054 | 7.03 | 2600 | 0.4302 | 0.8003 | 0.8005 |
0.4056 | 7.57 | 2800 | 0.4284 | 0.8052 | 0.8052 |
0.3989 | 8.11 | 3000 | 0.4282 | 0.8022 | 0.8024 |
0.3991 | 8.65 | 3200 | 0.4223 | 0.8084 | 0.8084 |
0.4032 | 9.19 | 3400 | 0.4259 | 0.8056 | 0.8056 |
0.3989 | 9.73 | 3600 | 0.4270 | 0.8056 | 0.8059 |
0.4032 | 10.27 | 3800 | 0.4242 | 0.8063 | 0.8064 |
0.3962 | 10.81 | 4000 | 0.4330 | 0.8023 | 0.8025 |
0.3967 | 11.35 | 4200 | 0.4260 | 0.8047 | 0.8047 |
0.3943 | 11.89 | 4400 | 0.4209 | 0.8074 | 0.8076 |
0.395 | 12.43 | 4600 | 0.4256 | 0.8027 | 0.8029 |
0.3926 | 12.97 | 4800 | 0.4204 | 0.8057 | 0.8057 |
0.3915 | 13.51 | 5000 | 0.4242 | 0.8039 | 0.8042 |
0.3892 | 14.05 | 5200 | 0.4224 | 0.8068 | 0.8068 |
0.3872 | 14.59 | 5400 | 0.4224 | 0.8078 | 0.8078 |
0.3911 | 15.14 | 5600 | 0.4237 | 0.8055 | 0.8056 |
0.388 | 15.68 | 5800 | 0.4240 | 0.8068 | 0.8071 |
0.3837 | 16.22 | 6000 | 0.4212 | 0.8058 | 0.8059 |
0.3872 | 16.76 | 6200 | 0.4185 | 0.8084 | 0.8084 |
0.3894 | 17.3 | 6400 | 0.4171 | 0.8057 | 0.8057 |
0.3832 | 17.84 | 6600 | 0.4202 | 0.8068 | 0.8068 |
0.3817 | 18.38 | 6800 | 0.4240 | 0.8071 | 0.8074 |
0.3824 | 18.92 | 7000 | 0.4159 | 0.8059 | 0.8059 |
0.3768 | 19.46 | 7200 | 0.4198 | 0.8062 | 0.8063 |
0.3883 | 20.0 | 7400 | 0.4204 | 0.8059 | 0.8059 |
0.3796 | 20.54 | 7600 | 0.4196 | 0.8076 | 0.8076 |
0.3825 | 21.08 | 7800 | 0.4205 | 0.8074 | 0.8074 |
0.3811 | 21.62 | 8000 | 0.4194 | 0.8037 | 0.8037 |
0.379 | 22.16 | 8200 | 0.4171 | 0.8077 | 0.8078 |
0.385 | 22.7 | 8400 | 0.4169 | 0.8101 | 0.8101 |
0.3771 | 23.24 | 8600 | 0.4182 | 0.8032 | 0.8032 |
0.3759 | 23.78 | 8800 | 0.4191 | 0.8084 | 0.8084 |
0.3766 | 24.32 | 9000 | 0.4184 | 0.8076 | 0.8076 |
0.3776 | 24.86 | 9200 | 0.4181 | 0.8056 | 0.8056 |
0.3806 | 25.41 | 9400 | 0.4177 | 0.8064 | 0.8064 |
0.3726 | 25.95 | 9600 | 0.4186 | 0.8066 | 0.8066 |
0.3789 | 26.49 | 9800 | 0.4186 | 0.8072 | 0.8073 |
0.3735 | 27.03 | 10000 | 0.4188 | 0.8073 | 0.8073 |
Framework versions
- PEFT 0.9.0
- Transformers 4.38.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.1
- Tokenizers 0.15.2