
mistral-7b-scientific-mcq

This model is a fine-tuned version of mistralai/Mistral-7B-Instruct-v0.2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7480
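Since this repository ships a PEFT adapter rather than full model weights, inference requires loading the base model first and applying the adapter on top. Below is a minimal loading sketch; the adapter repo id and the prompt are illustrative placeholders, not taken from this card.

```python
# Minimal inference sketch for a PEFT adapter on Mistral-7B-Instruct-v0.2.
# The adapter_id below is a hypothetical placeholder; substitute the actual
# Hub path of this adapter.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Mistral-7B-Instruct-v0.2"
adapter_id = "your-username/mistral-7b-scientific-mcq"  # hypothetical placeholder

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)  # apply the fine-tuned adapter
model.eval()

# Mistral-Instruct chat format; the exact prompt template this adapter
# expects is not documented in the card.
prompt = (
    "[INST] Which gas makes up most of Earth's atmosphere? "
    "A) Oxygen B) Nitrogen C) Argon D) Carbon dioxide [/INST]"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```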

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch reproducing them follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 3.0
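For reference, here is a hedged sketch of how these settings map onto a transformers TrainingArguments object. The dataset, model, and PEFT configuration used for this run are not documented in the card and are omitted; the eval/logging cadence of 100 steps is an assumption inferred from the results table below.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mistral-7b-scientific-mcq",
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=4,  # effective train batch size: 4 * 4 = 16
    seed=42,
    lr_scheduler_type="cosine",
    warmup_steps=100,
    num_train_epochs=3.0,
    optim="adamw_torch",            # Adam betas=(0.9, 0.999), eps=1e-8 are the defaults
    eval_strategy="steps",          # assumption: evaluate every 100 steps, per the log
    eval_steps=100,
    logging_steps=100,
)
```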

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.9911        | 0.0581 | 100  | 0.8124          |
| 0.879         | 0.1162 | 200  | 0.7703          |
| 0.9359        | 0.1743 | 300  | 0.7576          |
| 0.7608        | 0.2325 | 400  | 0.7523          |
| 0.8144        | 0.2906 | 500  | 0.7469          |
| 0.8655        | 0.3487 | 600  | 0.7435          |
| 0.6748        | 0.4068 | 700  | 0.7390          |
| 0.7004        | 0.4649 | 800  | 0.7369          |
| 0.7561        | 0.5230 | 900  | 0.7351          |
| 0.7053        | 0.5811 | 1000 | 0.7317          |
| 0.7122        | 0.6393 | 1100 | 0.7294          |
| 0.7431        | 0.6974 | 1200 | 0.7279          |
| 0.6102        | 0.7555 | 1300 | 0.7255          |
| 0.7041        | 0.8136 | 1400 | 0.7244          |
| 0.7339        | 0.8717 | 1500 | 0.7227          |
| 0.6648        | 0.9298 | 1600 | 0.7207          |
| 0.5682        | 0.9879 | 1700 | 0.7192          |
| 0.6745        | 1.0461 | 1800 | 0.7242          |
| 0.6003        | 1.1042 | 1900 | 0.7258          |
| 0.6755        | 1.1623 | 2000 | 0.7273          |
| 0.6815        | 1.2204 | 2100 | 0.7265          |
| 0.5531        | 1.2785 | 2200 | 0.7253          |
| 0.5           | 1.3366 | 2300 | 0.7250          |
| 0.666         | 1.3947 | 2400 | 0.7236          |
| 0.518         | 1.4529 | 2500 | 0.7247          |
| 0.6223        | 1.5110 | 2600 | 0.7240          |
| 0.565         | 1.5691 | 2700 | 0.7234          |
| 0.5541        | 1.6272 | 2800 | 0.7220          |
| 0.7622        | 1.6853 | 2900 | 0.7220          |
| 0.5212        | 1.7434 | 3000 | 0.7223          |
| 0.6089        | 1.8015 | 3100 | 0.7205          |
| 0.6908        | 1.8597 | 3200 | 0.7210          |
| 0.6138        | 1.9178 | 3300 | 0.7204          |
| 0.6425        | 1.9759 | 3400 | 0.7199          |
| 0.4918        | 2.0340 | 3500 | 0.7416          |
| 0.5432        | 2.0921 | 3600 | 0.7468          |
| 0.6497        | 2.1502 | 3700 | 0.7463          |
| 0.5068        | 2.2083 | 3800 | 0.7448          |
| 0.5502        | 2.2665 | 3900 | 0.7475          |
| 0.4795        | 2.3246 | 4000 | 0.7482          |
| 0.5718        | 2.3827 | 4100 | 0.7486          |
| 0.5154        | 2.4408 | 4200 | 0.7474          |
| 0.6959        | 2.4989 | 4300 | 0.7479          |
| 0.5848        | 2.5570 | 4400 | 0.7473          |
| 0.5662        | 2.6151 | 4500 | 0.7479          |
| 0.4357        | 2.6733 | 4600 | 0.7482          |
| 0.5318        | 2.7314 | 4700 | 0.7476          |
| 0.4631        | 2.7895 | 4800 | 0.7480          |
| 0.5852        | 2.8476 | 4900 | 0.7481          |
| 0.5633        | 2.9057 | 5000 | 0.7480          |
| 0.5831        | 2.9638 | 5100 | 0.7480          |

Framework versions

  • PEFT 0.11.1
  • Transformers 4.41.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.2
  • Tokenizers 0.19.1