130000

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 6.0491
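
To put the reported loss on a more interpretable scale, it can be converted to a perplexity. This is only a sketch: the card does not say what kind of loss this is, so the code assumes it is a mean per-token cross-entropy measured in nats, as is typical for language-model fine-tuning.

```python
import math

# Final validation loss reported in this card.
eval_loss = 6.0491

# Assumption (not stated in the card): the loss is a mean per-token
# cross-entropy in nats, so exp(loss) is the perplexity.
perplexity = math.exp(eval_loss)

print(f"perplexity: {perplexity:.1f}")
```

If the loss is some other quantity (for example a regression or contrastive loss), this conversion does not apply.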

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • num_epochs: 50
  • mixed_precision_training: Native AMP
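
The relationship between the batch-size settings and the shape of the cosine schedule can be sketched in plain Python. The values come from the list above; `num_devices = 1`, `warmup_steps = 0`, and the helper `cosine_lr` are assumptions for illustration, since the card lists neither a device count nor a warmup setting. The total of 150 steps is the last step in the training results.

```python
import math

# Values taken from the hyperparameter list.
learning_rate = 0.0005
train_batch_size = 8
gradient_accumulation_steps = 8

# Assumption: training ran on a single device.
num_devices = 1
total_train_batch_size = (
    train_batch_size * gradient_accumulation_steps * num_devices
)


def cosine_lr(step, total_steps, base_lr=learning_rate, warmup_steps=0):
    """Hypothetical helper: half-cosine decay from base_lr down to 0.

    warmup_steps=0 is an assumption; the card lists no warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total_steps = 150  # last optimizer step in the training results
print("effective batch size:", total_train_batch_size)
for step in (0, 75, 150):
    print(step, cosine_lr(step, total_steps))
```

Under these assumptions the learning rate starts at 0.0005, is halved at the midpoint, and reaches roughly zero at the final step, which fits the way the validation loss flattens out late in training.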

Training results

Training Loss | Epoch | Step | Validation Loss
No log | 0.92 | 3 | 6.2222
No log | 1.85 | 6 | 6.2146
No log | 2.77 | 9 | 6.2032
5.9665 | 4.0 | 13 | 6.1877
5.9665 | 4.92 | 16 | 6.1734
5.9665 | 5.85 | 19 | 6.1620
5.8921 | 6.77 | 22 | 6.1539
5.8921 | 8.0 | 26 | 6.1426
5.8921 | 8.92 | 29 | 6.1335
5.8324 | 9.85 | 32 | 6.1277
5.8324 | 10.77 | 35 | 6.1178
5.8324 | 12.0 | 39 | 6.1105
5.8012 | 12.92 | 42 | 6.1059
5.8012 | 13.85 | 45 | 6.0992
5.8012 | 14.77 | 48 | 6.0959
5.7449 | 16.0 | 52 | 6.0910
5.7449 | 16.92 | 55 | 6.0859
5.7449 | 17.85 | 58 | 6.0819
5.7303 | 18.77 | 61 | 6.0767
5.7303 | 20.0 | 65 | 6.0734
5.7303 | 20.92 | 68 | 6.0721
5.6687 | 21.85 | 71 | 6.0694
5.6687 | 22.77 | 74 | 6.0658
5.6687 | 24.0 | 78 | 6.0628
5.6839 | 24.92 | 81 | 6.0627
5.6839 | 25.85 | 84 | 6.0600
5.6839 | 26.77 | 87 | 6.0586
5.6499 | 28.0 | 91 | 6.0572
5.6499 | 28.92 | 94 | 6.0558
5.6499 | 29.85 | 97 | 6.0555
5.6703 | 30.77 | 100 | 6.0545
5.6703 | 32.0 | 104 | 6.0533
5.6703 | 32.92 | 107 | 6.0520
5.6404 | 33.85 | 110 | 6.0518
5.6404 | 34.77 | 113 | 6.0511
5.6404 | 36.0 | 117 | 6.0509
5.6414 | 36.92 | 120 | 6.0504
5.6414 | 37.85 | 123 | 6.0498
5.6414 | 38.77 | 126 | 6.0498
5.6347 | 40.0 | 130 | 6.0496
5.6347 | 40.92 | 133 | 6.0493
5.6347 | 41.85 | 136 | 6.0491
5.6347 | 42.77 | 139 | 6.0491
5.638 | 44.0 | 143 | 6.0491
5.638 | 44.92 | 146 | 6.0491
5.638 | 45.85 | 149 | 6.0491
5.6249 | 46.15 | 150 | 6.0491
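
The log ends at epoch 46.15 even though num_epochs is 50. The arithmetic below shows one reading that is consistent with the table: the logged epochs imply 3.25 optimizer steps per epoch, and if the trainer rounds that down to 3 when planning the run (an assumption about the trainer's behavior, not something the card states), 50 epochs become 150 steps, which is reached at epoch 46.15.

```python
import math

# Two (step, epoch) pairs read from the training results.
step_a, epoch_a = 13, 4.0
last_step, last_epoch = 150, 46.15

# Values from the hyperparameter list.
gradient_accumulation_steps = 8
num_epochs = 50

# Optimizer steps per epoch implied by the logged epoch values.
steps_per_epoch = step_a / epoch_a

# Each optimizer step accumulates 8 mini-batches, so this is the
# implied number of mini-batches per epoch.
batches_per_epoch = steps_per_epoch * gradient_accumulation_steps

# Assumption: the planned step count uses the floored steps per epoch.
planned_steps = math.floor(steps_per_epoch) * num_epochs

# Epoch value at which the planned step count is exhausted.
final_epoch = planned_steps / steps_per_epoch

print("steps per epoch:", steps_per_epoch)
print("mini-batches per epoch:", batches_per_epoch)
print("planned steps:", planned_steps)
print("final epoch:", round(final_epoch, 2))
```

Under this reading each epoch covers 26 mini-batches of 8 examples, so the training set is small, and the run stops about four epochs short of the nominal 50.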

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2

Model size: 258k params (Safetensors, tensor type F32)