
fresh-2-layer-piqa-distill-of-fresh-2-layer-gpqa

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 16.0175
  • Accuracy: 0.3586

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 321
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 20
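
The linear scheduler with warmup listed above can be sketched as a pure function of the optimizer step. This is an illustrative re-implementation, not the exact Transformers code; the total of 1260 optimizer steps is inferred from the results table below (63 steps per epoch × 20 epochs).

```python
def linear_warmup_lr(step, base_lr=5e-4, warmup_steps=500, total_steps=1260):
    """Learning rate at a given optimizer step under linear warmup + linear decay.

    Ramps linearly from 0 to base_lr over the first `warmup_steps` steps,
    then decays linearly back to 0 by `total_steps`.
    """
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    # Linear decay from base_lr (at end of warmup) to 0 (at total_steps).
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))


# Example values across training:
print(linear_warmup_lr(250))   # halfway through warmup -> 0.00025
print(linear_warmup_lr(500))   # warmup complete -> 0.0005
print(linear_warmup_lr(1260))  # final step -> 0.0
```

Note that with 500 warmup steps out of 1260 total, roughly 40% of training runs at less than the peak learning rate of 0.0005.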

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 1.0   | 63   | 18.2723         | 0.2273   |
| No log        | 2.0   | 126  | 15.2877         | 0.2576   |
| No log        | 3.0   | 189  | 16.4507         | 0.3131   |
| No log        | 4.0   | 252  | 13.5118         | 0.3485   |
| No log        | 5.0   | 315  | 13.6895         | 0.3384   |
| No log        | 6.0   | 378  | 15.5376         | 0.3182   |
| No log        | 7.0   | 441  | 17.7696         | 0.3182   |
| 2.8866        | 8.0   | 504  | 14.3626         | 0.3232   |
| 2.8866        | 9.0   | 567  | 15.1187         | 0.3232   |
| 2.8866        | 10.0  | 630  | 16.8223         | 0.3384   |
| 2.8866        | 11.0  | 693  | 16.0992         | 0.3081   |
| 2.8866        | 12.0  | 756  | 15.7492         | 0.3283   |
| 2.8866        | 13.0  | 819  | 14.8661         | 0.3485   |
| 2.8866        | 14.0  | 882  | 16.7236         | 0.3485   |
| 2.8866        | 15.0  | 945  | 16.0175         | 0.3586   |
| 0.4396        | 16.0  | 1008 | 15.5002         | 0.3333   |
| 0.4396        | 17.0  | 1071 | 16.1226         | 0.3586   |
| 0.4396        | 18.0  | 1134 | 16.3226         | 0.3232   |
| 0.4396        | 19.0  | 1197 | 16.2413         | 0.3434   |
| 0.4396        | 20.0  | 1260 | 16.0952         | 0.3384   |

Framework versions

  • Transformers 4.34.0.dev0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.5
  • Tokenizers 0.14.0