
fresh-8-layer-medmcqa-distill-of-fresh-8-layer-gpqa

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 17.1123
  • Accuracy: 0.5455

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 321
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 20
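
As a reference for the schedule above, the learning rate ramps linearly from 0 to 5e-4 over the first 500 steps and then decays linearly to 0. The sketch below is a plain-Python approximation of the behaviour of `transformers`' linear schedule with warmup; the function name is illustrative, and the total step count of 1260 (20 epochs × 63 steps per epoch) is taken from the training-results table below.

```python
def linear_warmup_lr(step, base_lr=5e-4, warmup_steps=500, total_steps=1260):
    """Learning rate at a given optimizer step under a linear schedule
    with warmup (approximating get_linear_schedule_with_warmup)."""
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay phase: fall linearly from base_lr back to 0 at total_steps.
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

print(linear_warmup_lr(250))   # halfway through warmup
print(linear_warmup_lr(500))   # peak learning rate
print(linear_warmup_lr(880))   # halfway through the decay phase
```

Note that with 63 steps per epoch, the 500-step warmup spans roughly the first 8 epochs, which lines up with the noisier validation losses early in the table below.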

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 1.0   | 63   | 22.9570         | 0.2273   |
| No log        | 2.0   | 126  | 21.2108         | 0.3636   |
| No log        | 3.0   | 189  | 26.1433         | 0.4040   |
| No log        | 4.0   | 252  | 24.1795         | 0.3838   |
| No log        | 5.0   | 315  | 17.9657         | 0.4747   |
| No log        | 6.0   | 378  | 20.0576         | 0.5354   |
| No log        | 7.0   | 441  | 17.5133         | 0.5000   |
| 10.1769       | 8.0   | 504  | 22.3248         | 0.5101   |
| 10.1769       | 9.0   | 567  | 20.7352         | 0.4848   |
| 10.1769       | 10.0  | 630  | 22.9071         | 0.4596   |
| 10.1769       | 11.0  | 693  | 17.8100         | 0.4899   |
| 10.1769       | 12.0  | 756  | 17.9827         | 0.5202   |
| 10.1769       | 13.0  | 819  | 19.2382         | 0.5000   |
| 10.1769       | 14.0  | 882  | 18.8849         | 0.4949   |
| 10.1769       | 15.0  | 945  | 17.6397         | 0.5202   |
| 2.2143        | 16.0  | 1008 | 19.0081         | 0.5101   |
| 2.2143        | 17.0  | 1071 | 17.8718         | 0.5152   |
| 2.2143        | 18.0  | 1134 | 17.5239         | 0.5303   |
| 2.2143        | 19.0  | 1197 | 17.1123         | 0.5455   |
| 2.2143        | 20.0  | 1260 | 17.7756         | 0.5404   |

Framework versions

  • Transformers 4.34.0.dev0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.5
  • Tokenizers 0.14.0
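
To reproduce this environment, the released packages above can be pinned with pip; this is a sketch, and the exact CUDA-enabled PyTorch wheel and the Transformers development build may need to be installed per their official instructions.

```shell
# Pin the released packages listed above.
pip install "torch==2.0.1" "datasets==2.14.5" "tokenizers==0.14.0"
# Transformers 4.34.0.dev0 was a development build, so it is not on PyPI;
# installing from the GitHub source tree is the usual route (the commit
# used here is unknown, so results may differ slightly).
pip install "git+https://github.com/huggingface/transformers"
```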