
fresh-2-layer-medmcqa20000-distill-of-fresh-2-layer-mmlu_EVAL_mmlu

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 172.8253
  • Accuracy: 0.4615
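The card does not include the evaluation code, so the following is only a minimal sketch of how the accuracy figure above is typically produced with a Trainer-style `compute_metrics` hook; the `evaluate` metric name and the shape of `eval_pred` are assumptions based on the standard Transformers workflow, and the reported loss is computed separately by the training loop.

```python
import numpy as np
import evaluate  # Hugging Face evaluate library

# Standard accuracy metric; assumes logits over answer choices and integer labels.
accuracy_metric = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)  # pick the highest-scoring class per example
    return accuracy_metric.compute(predictions=predictions, references=labels)
```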

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 321
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 5000
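A minimal sketch of `TrainingArguments` matching the hyperparameters listed above; the actual training script is not part of this card, so the output directory is a placeholder, and `eval_steps=100` / `logging_steps=500` are inferred from the training results table below rather than stated explicitly.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="fresh-2-layer-medmcqa20000-distill-of-fresh-2-layer-mmlu_EVAL_mmlu",  # placeholder
    learning_rate=5e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=321,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=5000,
    evaluation_strategy="steps",
    eval_steps=100,      # assumed: validation rows appear every 100 steps
    logging_steps=500,   # assumed: training loss is first logged at step 500
)
```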

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 0.16  | 100  | 200.5686        | 0.248    |
| No log        | 0.32  | 200  | 201.9314        | 0.324    |
| No log        | 0.48  | 300  | 181.0271        | 0.372    |
| No log        | 0.64  | 400  | 206.5410        | 0.364    |
| 143.7199      | 0.8   | 500  | 184.4744        | 0.42     |
| 143.7199      | 0.96  | 600  | 181.4189        | 0.402    |
| 143.7199      | 1.12  | 700  | 186.8587        | 0.414    |
| 143.7199      | 1.28  | 800  | 195.9331        | 0.396    |
| 143.7199      | 1.44  | 900  | 182.1619        | 0.426    |
| 88.7183       | 1.6   | 1000 | 178.5117        | 0.428    |
| 88.7183       | 1.76  | 1100 | 180.1005        | 0.432    |
| 88.7183       | 1.92  | 1200 | 177.7711        | 0.418    |
| 88.7183       | 2.08  | 1300 | 184.9631        | 0.426    |
| 88.7183       | 2.24  | 1400 | 170.4556        | 0.41     |
| 71.5399       | 2.4   | 1500 | 180.7118        | 0.446    |
| 71.5399       | 2.56  | 1600 | 171.3761        | 0.438    |
| 71.5399       | 2.72  | 1700 | 165.6044        | 0.432    |
| 71.5399       | 2.88  | 1800 | 168.3776        | 0.456    |
| 71.5399       | 3.04  | 1900 | 165.8044        | 0.428    |
| 59.1947       | 3.2   | 2000 | 181.0893        | 0.44     |
| 59.1947       | 3.36  | 2100 | 174.6589        | 0.454    |
| 59.1947       | 3.52  | 2200 | 174.3077        | 0.448    |
| 59.1947       | 3.68  | 2300 | 169.3694        | 0.464    |
| 59.1947       | 3.84  | 2400 | 172.2202        | 0.47     |
| 48.8665       | 4.0   | 2500 | 161.0428        | 0.468    |
| 48.8665       | 4.16  | 2600 | 174.7397        | 0.468    |
| 48.8665       | 4.32  | 2700 | 167.8463        | 0.462    |
| 48.8665       | 4.48  | 2800 | 176.5635        | 0.47     |
| 48.8665       | 4.64  | 2900 | 168.6186        | 0.464    |

Framework versions

  • Transformers 4.34.0.dev0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.5
  • Tokenizers 0.14.0