
GUE_tf_2-seqsight_65536_512_47M-L32_all

This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_65536_512_47M on the mahdibaghbanzadeh/GUE_tf_2 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6364
  • F1 Score: 0.6850
  • Accuracy: 0.685
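
The snippet below is a minimal, hedged sketch of how this adapter might be loaded for inference with PEFT and Transformers. It assumes the adapter repository id is `mahdibaghbanzadeh/GUE_tf_2-seqsight_65536_512_47M-L32_all` and that the task is binary sequence classification (`num_labels=2`); neither detail is stated in this card, so adjust as needed.

```python
# Hedged usage sketch, not an official example from the model author.
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from peft import PeftModel

base_id = "mahdibaghbanzadeh/seqsight_65536_512_47M"                      # base model named in this card
adapter_id = "mahdibaghbanzadeh/GUE_tf_2-seqsight_65536_512_47M-L32_all"  # assumed repository id for this adapter

# trust_remote_code and num_labels=2 are assumptions; adjust to the actual
# base-model implementation and the GUE_tf_2 label set.
tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
base_model = AutoModelForSequenceClassification.from_pretrained(
    base_id, num_labels=2, trust_remote_code=True
)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

inputs = tokenizer("ACGTACGTACGTACGT", return_tensors="pt")
logits = model(**inputs).logits
print(logits.argmax(dim=-1))
```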

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

  • learning_rate: 0.0005
  • train_batch_size: 2048
  • eval_batch_size: 2048
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • training_steps: 10000
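
As a rough guide only, the hyperparameters above could map onto `transformers.TrainingArguments` as sketched below. The actual training script is not part of this card, so `output_dir` and the exact argument mapping are assumptions.

```python
# Hedged sketch: one possible TrainingArguments mapping for the values above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="GUE_tf_2-seqsight_65536_512_47M-L32_all",  # assumed output directory
    learning_rate=5e-4,
    per_device_train_batch_size=2048,
    per_device_eval_batch_size=2048,
    seed=42,
    lr_scheduler_type="linear",
    max_steps=10000,  # "training_steps" above
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```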

Training results

| Training Loss | Epoch  | Step  | Validation Loss | F1 Score | Accuracy |
|:-------------:|:------:|:-----:|:---------------:|:--------:|:--------:|
| 0.6523        | 20.0   | 200   | 0.6602          | 0.6107   | 0.615    |
| 0.5889        | 40.0   | 400   | 0.6631          | 0.6142   | 0.617    |
| 0.5457        | 60.0   | 600   | 0.6771          | 0.6209   | 0.621    |
| 0.5153        | 80.0   | 800   | 0.6813          | 0.6135   | 0.614    |
| 0.4954        | 100.0  | 1000  | 0.6920          | 0.6136   | 0.614    |
| 0.4847        | 120.0  | 1200  | 0.6981          | 0.6200   | 0.62     |
| 0.4753        | 140.0  | 1400  | 0.6823          | 0.6309   | 0.631    |
| 0.4698        | 160.0  | 1600  | 0.7015          | 0.6447   | 0.645    |
| 0.4634        | 180.0  | 1800  | 0.6763          | 0.6356   | 0.636    |
| 0.4559        | 200.0  | 2000  | 0.6808          | 0.6389   | 0.639    |
| 0.4485        | 220.0  | 2200  | 0.7041          | 0.6408   | 0.641    |
| 0.4435        | 240.0  | 2400  | 0.6797          | 0.6558   | 0.656    |
| 0.4352        | 260.0  | 2600  | 0.7195          | 0.6430   | 0.643    |
| 0.4283        | 280.0  | 2800  | 0.7155          | 0.65     | 0.65     |
| 0.42          | 300.0  | 3000  | 0.7098          | 0.6516   | 0.653    |
| 0.4135        | 320.0  | 3200  | 0.7060          | 0.6532   | 0.655    |
| 0.4048        | 340.0  | 3400  | 0.7106          | 0.6423   | 0.644    |
| 0.3943        | 360.0  | 3600  | 0.7462          | 0.6417   | 0.642    |
| 0.3849        | 380.0  | 3800  | 0.7403          | 0.6528   | 0.653    |
| 0.3768        | 400.0  | 4000  | 0.7351          | 0.6432   | 0.645    |
| 0.3665        | 420.0  | 4200  | 0.7459          | 0.6371   | 0.638    |
| 0.3585        | 440.0  | 4400  | 0.7503          | 0.6372   | 0.64     |
| 0.3501        | 460.0  | 4600  | 0.7474          | 0.6424   | 0.643    |
| 0.3425        | 480.0  | 4800  | 0.7972          | 0.6375   | 0.638    |
| 0.3354        | 500.0  | 5000  | 0.7901          | 0.6448   | 0.645    |
| 0.3266        | 520.0  | 5200  | 0.8136          | 0.6310   | 0.631    |
| 0.32          | 540.0  | 5400  | 0.7967          | 0.6369   | 0.637    |
| 0.3145        | 560.0  | 5600  | 0.7992          | 0.6369   | 0.637    |
| 0.3082        | 580.0  | 5800  | 0.8255          | 0.6330   | 0.633    |
| 0.3038        | 600.0  | 6000  | 0.8006          | 0.6268   | 0.627    |
| 0.2966        | 620.0  | 6200  | 0.8352          | 0.6329   | 0.633    |
| 0.2906        | 640.0  | 6400  | 0.8417          | 0.6247   | 0.625    |
| 0.2872        | 660.0  | 6600  | 0.8195          | 0.6369   | 0.637    |
| 0.2801        | 680.0  | 6800  | 0.8518          | 0.6330   | 0.633    |
| 0.2764        | 700.0  | 7000  | 0.8594          | 0.638    | 0.638    |
| 0.2728        | 720.0  | 7200  | 0.8553          | 0.632    | 0.632    |
| 0.2662        | 740.0  | 7400  | 0.8691          | 0.6319   | 0.632    |
| 0.2665        | 760.0  | 7600  | 0.8889          | 0.6310   | 0.631    |
| 0.2623        | 780.0  | 7800  | 0.8657          | 0.63     | 0.63     |
| 0.2598        | 800.0  | 8000  | 0.8847          | 0.6280   | 0.628    |
| 0.2553        | 820.0  | 8200  | 0.8976          | 0.6270   | 0.627    |
| 0.2528        | 840.0  | 8400  | 0.8937          | 0.6320   | 0.632    |
| 0.2509        | 860.0  | 8600  | 0.8924          | 0.6370   | 0.637    |
| 0.248         | 880.0  | 8800  | 0.9017          | 0.6249   | 0.625    |
| 0.2473        | 900.0  | 9000  | 0.8995          | 0.6330   | 0.633    |
| 0.2459        | 920.0  | 9200  | 0.9111          | 0.6260   | 0.626    |
| 0.2453        | 940.0  | 9400  | 0.9009          | 0.6209   | 0.621    |
| 0.2441        | 960.0  | 9600  | 0.9082          | 0.6270   | 0.627    |
| 0.2433        | 980.0  | 9800  | 0.9084          | 0.6249   | 0.625    |
| 0.243         | 1000.0 | 10000 | 0.9080          | 0.6250   | 0.625    |
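
The F1 Score and Accuracy columns above are standard classification metrics. The sketch below shows one common way to compute them with scikit-learn; the averaging mode used for this card is not stated, so `average="macro"` is an assumption, and the labels and predictions are made up for illustration.

```python
# Hedged metric sketch with hypothetical labels and predictions.
from sklearn.metrics import accuracy_score, f1_score

y_true = [0, 1, 1, 0, 1, 0]  # hypothetical ground-truth labels
y_pred = [0, 1, 0, 0, 1, 1]  # hypothetical model predictions

print("Accuracy:", accuracy_score(y_true, y_pred))
print("F1 Score:", f1_score(y_true, y_pred, average="macro"))
```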

Framework versions

  • PEFT 0.9.0
  • Transformers 4.38.2
  • PyTorch 2.2.0+cu121
  • Datasets 2.17.1
  • Tokenizers 0.15.2