fresh-2-layer-medmcqa-distill-of-fresh-2-layer-gpqa

This model is a fine-tuned version of an unspecified base model, trained on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 11.4170
  • Accuracy: 0.5404

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 321
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 20
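The linear scheduler with warmup ramps the learning rate from 0 up to `learning_rate` over the first 500 steps, then decays it linearly to 0 by the final step. A minimal stdlib-only sketch of that schedule, assuming the total step count of 1260 inferred from the training results below (63 steps/epoch × 20 epochs); this mirrors the behavior of `transformers.get_linear_schedule_with_warmup` but is not the exact training code:

```python
def linear_warmup_lr(step, base_lr=5e-4, warmup_steps=500, total_steps=1260):
    """Learning rate at a given optimizer step under linear warmup + linear decay.

    Assumed values: base_lr and warmup_steps come from the hyperparameters
    above; total_steps is inferred from the training-results table.
    """
    if step < warmup_steps:
        # Linear warmup: 0 -> base_lr over the first warmup_steps steps.
        return base_lr * step / warmup_steps
    # Linear decay: base_lr -> 0 between warmup_steps and total_steps.
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))


print(linear_warmup_lr(0))     # start of warmup
print(linear_warmup_lr(500))   # peak learning rate, 5e-4
print(linear_warmup_lr(1260))  # end of training, decayed to 0
```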

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 1.0   | 63   | 14.3363         | 0.2929   |
| No log        | 2.0   | 126  | 13.8007         | 0.4040   |
| No log        | 3.0   | 189  | 13.1932         | 0.4697   |
| No log        | 4.0   | 252  | 12.4231         | 0.4899   |
| No log        | 5.0   | 315  | 11.6190         | 0.5101   |
| No log        | 6.0   | 378  | 11.4170         | 0.5404   |
| No log        | 7.0   | 441  | 12.2002         | 0.4899   |
| 3.3802        | 8.0   | 504  | 11.9545         | 0.4646   |
| 3.3802        | 9.0   | 567  | 13.2518         | 0.5202   |
| 3.3802        | 10.0  | 630  | 11.9140         | 0.5000   |
| 3.3802        | 11.0  | 693  | 11.4793         | 0.4545   |
| 3.3802        | 12.0  | 756  | 11.6963         | 0.4798   |
| 3.3802        | 13.0  | 819  | 11.2862         | 0.4848   |
| 3.3802        | 14.0  | 882  | 11.1868         | 0.4949   |
| 3.3802        | 15.0  | 945  | 10.9490         | 0.4646   |
| 0.479         | 16.0  | 1008 | 11.0089         | 0.4899   |
| 0.479         | 17.0  | 1071 | 11.1883         | 0.4798   |
| 0.479         | 18.0  | 1134 | 11.2915         | 0.4697   |
| 0.479         | 19.0  | 1197 | 11.1116         | 0.4747   |
| 0.479         | 20.0  | 1260 | 11.0499         | 0.4747   |

Framework versions

  • Transformers 4.34.0.dev0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.5
  • Tokenizers 0.14.0