
mistral_docs_sum_p1_full

This model is a fine-tuned version of mistralai/Mistral-7B-Instruct-v0.1 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5829
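
Since the base model is mistralai/Mistral-7B-Instruct-v0.1 and the model name suggests document summarization, a minimal inference sketch with transformers might look like the following. The repo id `your-username/mistral_docs_sum_p1_full` is a placeholder, as the card does not give the full Hub path for the fine-tuned weights.

```python
# Minimal inference sketch. The repo id below is hypothetical; the card
# does not state where the fine-tuned weights are hosted.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/mistral_docs_sum_p1_full"  # placeholder Hub path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float32,  # the card lists F32 tensors; float16/bfloat16 would save memory
    device_map="auto",
)

# Mistral-Instruct models expect the [INST] chat format; apply_chat_template handles it.
messages = [{"role": "user", "content": "Summarize the following document:\n\n<document text>"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```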

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a sketch mapping them onto TrainingArguments follows the list:

  • learning_rate: 3.6e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1
  • mixed_precision_training: Native AMP
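
For reproducibility, here is a sketch of how the listed values map onto `transformers.TrainingArguments`. Only the values above come from the card; the output directory and the evaluation/logging cadence (inferred from the 200-step intervals in the results table) are assumptions.

```python
# Sketch mapping the listed hyperparameters onto TrainingArguments.
# Everything not in the list above (output_dir, eval/logging cadence) is assumed.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mistral_docs_sum_p1_full",  # hypothetical
    learning_rate=3.6e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    adam_beta1=0.9,               # Adam betas and epsilon as listed on the card
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=1,
    fp16=True,                    # "Native AMP" mixed-precision training
    evaluation_strategy="steps",  # the results table logs validation loss every 200 steps
    eval_steps=200,
    logging_steps=200,
)
```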

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 2.1167        | 0.0277 | 200  | 2.1333          |
| 2.3428        | 0.0553 | 400  | 1.6966          |
| 1.3784        | 0.0830 | 600  | 1.4972          |
| 1.456         | 0.1107 | 800  | 1.3942          |
| 1.3227        | 0.1383 | 1000 | 1.3084          |
| 1.2535        | 0.1660 | 1200 | 1.2001          |
| 1.0612        | 0.1937 | 1400 | 1.0451          |
| 0.8815        | 0.2213 | 1600 | 0.9632          |
| 0.8971        | 0.2490 | 1800 | 0.9132          |
| 0.7908        | 0.2767 | 2000 | 0.8712          |
| 0.7549        | 0.3043 | 2200 | 0.8309          |
| 0.8099        | 0.3320 | 2400 | 0.8058          |
| 0.6891        | 0.3597 | 2600 | 0.7879          |
| 0.5204        | 0.3873 | 2800 | 0.7684          |
| 0.6249        | 0.4150 | 3000 | 0.7515          |
| 0.6764        | 0.4427 | 3200 | 0.7342          |
| 0.6996        | 0.4703 | 3400 | 0.7214          |
| 0.6371        | 0.4980 | 3600 | 0.7084          |
| 0.6694        | 0.5257 | 3800 | 0.6951          |
| 0.7048        | 0.5533 | 4000 | 0.6845          |
| 0.7265        | 0.5810 | 4200 | 0.6778          |
| 0.5663        | 0.6087 | 4400 | 0.6657          |
| 0.6222        | 0.6363 | 4600 | 0.6595          |
| 0.6463        | 0.6640 | 4800 | 0.6488          |
| 0.5754        | 0.6917 | 5000 | 0.6410          |
| 0.6208        | 0.7193 | 5200 | 0.6363          |
| 0.5613        | 0.7470 | 5400 | 0.6275          |
| 0.6316        | 0.7747 | 5600 | 0.6227          |
| 0.6564        | 0.8023 | 5800 | 0.6159          |
| 0.633         | 0.8300 | 6000 | 0.6077          |
| 0.5268        | 0.8577 | 6200 | 0.6022          |
| 0.4166        | 0.8853 | 6400 | 0.5978          |
| 0.6539        | 0.9130 | 6600 | 0.5926          |
| 0.5695        | 0.9407 | 6800 | 0.5875          |
| 0.6358        | 0.9683 | 7000 | 0.5845          |
| 0.5318        | 0.9960 | 7200 | 0.5829          |

Framework versions

  • Transformers 4.40.1
  • PyTorch 2.2.1+cu121
  • Datasets 2.19.0
  • Tokenizers 0.19.1
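
To reproduce the training environment, the pinned versions above can be verified at import time; a minimal sketch:

```python
# Quick environment check against the versions listed above.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": (transformers.__version__, "4.40.1"),
    "torch": (torch.__version__, "2.2.1+cu121"),
    "datasets": (datasets.__version__, "2.19.0"),
    "tokenizers": (tokenizers.__version__, "0.19.1"),
}
for name, (got, want) in expected.items():
    assert got == want, f"{name}: expected {want}, found {got}"
```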