Hyperparameters:

  • learning rate: 2e-5
  • weight decay: 0.01
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • gradient_accumulation_steps: 1
  • eval steps: 6000
  • max_length: 512
  • num_epochs: 2
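
As a rough illustration of how these settings combine, the sketch below computes the effective batch size per optimizer step and the number of training examples seen between evaluations. The key names with underscores and the `num_devices` parameter are assumptions made for the example; the card does not state how many devices were used.

```python
# Values copied from the hyperparameter list above.
# Key spellings are normalized for illustration only.
HPARAMS = {
    "learning_rate": 2e-5,
    "weight_decay": 0.01,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "gradient_accumulation_steps": 1,
    "eval_steps": 6000,
    "max_length": 512,
    "num_epochs": 2,
}


def effective_batch_size(hp, num_devices=1):
    """Examples consumed per optimizer step.

    num_devices is an assumption; the card does not report it.
    """
    return (
        hp["per_device_train_batch_size"]
        * hp["gradient_accumulation_steps"]
        * num_devices
    )


def examples_between_evals(hp, num_devices=1):
    """Training examples processed between two consecutive evaluations."""
    return effective_batch_size(hp, num_devices) * hp["eval_steps"]


print(effective_batch_size(HPARAMS))
print(examples_between_evals(HPARAMS))
```

With a single device, each optimizer step covers 8 examples, so one evaluation interval covers 48,000 examples. Multiply by the real device count if it differs.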

Dataset version:

  • “craffel/tasky_or_not”, “10xp3_10xc4”, “15f88c8”
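
One plausible reading of the three strings above is dataset repository, configuration name, and revision (commit hash). Under that assumption, a minimal sketch for loading the same data with the `datasets` library could look like this. The helper names `dataset_kwargs` and `load_tasky_dataset` are hypothetical.

```python
def dataset_kwargs():
    """Arguments for datasets.load_dataset.

    Assumption: the card's three strings map to path, name, and revision.
    """
    return {
        "path": "craffel/tasky_or_not",
        "name": "10xp3_10xc4",
        "revision": "15f88c8",
    }


def load_tasky_dataset():
    """Download the dataset; requires the `datasets` package and network access."""
    from datasets import load_dataset  # imported lazily so the sketch loads without it

    return load_dataset(**dataset_kwargs())


print(dataset_kwargs())
```

Pinning `revision` keeps the data fixed even if the dataset repository changes later.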

Checkpoint:

  • 48000 steps

Results on the validation set:

Step    Training Loss   Validation Loss   Accuracy   Precision   Recall     F1
6000    0.031900        0.163412          0.982194   0.999211    0.980462   0.989748
12000   0.014700        0.106132          0.976666   0.999639    0.973733   0.986516
18000   0.010700        0.043012          0.995743   0.999223    0.995918   0.997568
24000   0.007400        0.095047          0.984724   0.999857    0.982714   0.991211
30000   0.004100        0.087274          0.990400   0.999829    0.989217   0.994495
36000   0.003100        0.162909          0.981972   1.000000    0.979434   0.989610
42000   0.002200        0.148721          0.980454   0.999986    0.977717   0.988726
48000   0.001000        0.094455          0.990437   0.999943    0.989147   0.994516
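
The F1 column can be cross-checked against the precision and recall columns, since F1 is their harmonic mean: F1 = 2PR / (P + R). The sketch below recomputes it for every row of the table. The tolerance of 1e-5 is an assumption chosen to absorb the six-decimal rounding of the reported values.

```python
# (step, precision, recall, reported F1), copied from the validation table.
ROWS = [
    (6000, 0.999211, 0.980462, 0.989748),
    (12000, 0.999639, 0.973733, 0.986516),
    (18000, 0.999223, 0.995918, 0.997568),
    (24000, 0.999857, 0.982714, 0.991211),
    (30000, 0.999829, 0.989217, 0.994495),
    (36000, 1.000000, 0.979434, 0.989610),
    (42000, 0.999986, 0.977717, 0.988726),
    (48000, 0.999943, 0.989147, 0.994516),
]


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def max_f1_discrepancy(rows):
    """Largest absolute gap between recomputed and reported F1."""
    return max(abs(f1_score(p, r) - f1) for _, p, r, f1 in rows)


TOLERANCE = 1e-5  # assumed allowance for rounding in the reported values
print(max_f1_discrepancy(ROWS) < TOLERANCE)
```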

Dataset used to train taskydata/deberta-v3-base_10xp3_10xc4_512