
fresh-2-layer-qasc-distill-of-fresh-2-layer-gpqa

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 15.8650
  • Accuracy: 0.4040

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 321
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 20
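
As a sanity check, the training schedule implied by these hyperparameters can be reconstructed from the results log below: epoch 1 ends at step 63, so 20 epochs give 1260 optimizer steps, matching the final step count in the table. A minimal sketch in plain Python (the steps-per-epoch figure is read from the log; the training-set size is inferred, not documented):

```python
# Hyperparameters from the model card.
train_batch_size = 16
num_epochs = 20
warmup_steps = 500

# From the training log: epoch 1.0 ends at step 63.
steps_per_epoch = 63

# Total optimizer steps; matches the step 1260 shown at epoch 20.0.
total_steps = steps_per_epoch * num_epochs

# Inferred (not documented): with no gradient accumulation, the training
# set holds at most steps_per_epoch * train_batch_size examples.
max_train_examples = steps_per_epoch * train_batch_size

# Fraction of training spent in linear warmup (a large share here).
warmup_fraction = warmup_steps / total_steps

print(total_steps, max_train_examples, round(warmup_fraction, 3))
```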

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 1.0   | 63   | 16.9688         | 0.2778   |
| No log        | 2.0   | 126  | 15.2240         | 0.3737   |
| No log        | 3.0   | 189  | 17.7866         | 0.3485   |
| No log        | 4.0   | 252  | 17.3558         | 0.3535   |
| No log        | 5.0   | 315  | 14.5362         | 0.3232   |
| No log        | 6.0   | 378  | 15.8903         | 0.3687   |
| No log        | 7.0   | 441  | 15.9517         | 0.3939   |
| 1.9038        | 8.0   | 504  | 18.0730         | 0.3939   |
| 1.9038        | 9.0   | 567  | 15.1385         | 0.3586   |
| 1.9038        | 10.0  | 630  | 16.3576         | 0.3737   |
| 1.9038        | 11.0  | 693  | 16.6174         | 0.3586   |
| 1.9038        | 12.0  | 756  | 15.8650         | 0.4040   |
| 1.9038        | 13.0  | 819  | 15.8556         | 0.3636   |
| 1.9038        | 14.0  | 882  | 15.6212         | 0.3788   |
| 1.9038        | 15.0  | 945  | 15.3199         | 0.3586   |
| 0.2405        | 16.0  | 1008 | 15.4362         | 0.3737   |
| 0.2405        | 17.0  | 1071 | 15.7245         | 0.3737   |
| 0.2405        | 18.0  | 1134 | 15.4229         | 0.3687   |
| 0.2405        | 19.0  | 1197 | 15.5387         | 0.3889   |
| 0.2405        | 20.0  | 1260 | 15.5974         | 0.3737   |

Framework versions

  • Transformers 4.34.0.dev0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.5
  • Tokenizers 0.14.0