Mistral-NWPU

This model is a fine-tuned version of a base model that the card does not name, trained on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8049
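
The loss above can be converted to a perplexity if one assumes it is a mean per-token cross-entropy measured in nats. The card does not state this, so the snippet below is only a sketch under that assumption.

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 0.8049

# ASSUMPTION (not stated in the card): the loss is a mean per-token
# cross-entropy in nats, so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity ~= {perplexity:.3f}")
```

If the loss were computed differently (for example with a different reduction or logarithm base), this conversion would not apply.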

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: OptimizerNames.ADAMW_TORCH_FUSED (fused PyTorch AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 64
  • mixed_precision_training: Native AMP
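
To make the linear schedule concrete, the sketch below computes the learning rate at a few points in training using only the standard library. It takes the 2215 steps per epoch logged in the results table and assumes zero warmup steps, because the card lists none. The function name `linear_lr` and the `warmup_steps` parameter are illustrative and not part of any library.

```python
BASE_LR = 1e-4          # learning_rate from the list above
NUM_EPOCHS = 64         # num_epochs from the list above
STEPS_PER_EPOCH = 2215  # step count logged at epoch 1.0 in the results table
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # equals the last logged step


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero.

    ASSUMPTION: warmup_steps defaults to 0 because the card lists no warmup.
    """
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - warmup_steps)


# Learning rate at the start, at each quarter, and at the end of training.
for epoch in (0, 16, 32, 48, 64):
    print(epoch, linear_lr(epoch * STEPS_PER_EPOCH))
```

Under these assumptions the rate falls from 1e-4 at step 0 to half that value at epoch 32 and reaches zero at the final step.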

Training results

Training Loss  Epoch  Step    Validation Loss
1.3107         1.0    2215    1.0127
0.9907         2.0    4430    0.9383
0.9338         3.0    6645    0.9026
0.9008         4.0    8860    0.8811
0.8784         5.0    11075   0.8699
0.8630         6.0    13290   0.8576
0.8522         7.0    15505   0.8509
0.8438         8.0    17720   0.8450
0.8369         9.0    19935   0.8403
0.8313         10.0   22150   0.8374
0.8269         11.0   24365   0.8347
0.8230         12.0   26580   0.8303
0.8194         13.0   28795   0.8286
0.8163         14.0   31010   0.8283
0.8134         15.0   33225   0.8254
0.8112         16.0   35440   0.8228
0.8090         17.0   37655   0.8210
0.8066         18.0   39870   0.8219
0.8047         19.0   42085   0.8199
0.8027         20.0   44300   0.8183
0.8013         21.0   46515   0.8171
0.7999         22.0   48730   0.8168
0.7982         23.0   50945   0.8152
0.7970         24.0   53160   0.8146
0.7955         25.0   55375   0.8144
0.7948         26.0   57590   0.8137
0.7932         27.0   59805   0.8134
0.7923         28.0   62020   0.8119
0.7910         29.0   64235   0.8116
0.7901         30.0   66450   0.8111
0.7894         31.0   68665   0.8099
0.7881         32.0   70880   0.8102
0.7874         33.0   73095   0.8102
0.7864         34.0   75310   0.8087
0.7858         35.0   77525   0.8086
0.7849         36.0   79740   0.8079
0.7840         37.0   81955   0.8083
0.7831         38.0   84170   0.8075
0.7825         39.0   86385   0.8080
0.7821         40.0   88600   0.8073
0.7813         41.0   90815   0.8074
0.7805         42.0   93030   0.8062
0.7796         43.0   95245   0.8067
0.7789         44.0   97460   0.8066
0.7783         45.0   99675   0.8062
0.7779         46.0   101890  0.8059
0.7770         47.0   104105  0.8061
0.7765         48.0   106320  0.8053
0.7760         49.0   108535  0.8053
0.7752         50.0   110750  0.8053
0.7746         51.0   112965  0.8052
0.7741         52.0   115180  0.8052
0.7733         53.0   117395  0.8050
0.7729         54.0   119610  0.8049
0.7720         55.0   121825  0.8048
0.7714         56.0   124040  0.8048
0.7705         57.0   126255  0.8044
0.7698         58.0   128470  0.8045
0.7690         59.0   130685  0.8045
0.7684         60.0   132900  0.8046
0.7675         61.0   135115  0.8047
0.7666         62.0   137330  0.8047
0.7657         63.0   139545  0.8048
0.7649         64.0   141760  0.8049
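
In the table, validation loss reaches its minimum of 0.8044 at epoch 57 and then rises slightly to 0.8049 by epoch 64, while training loss keeps falling. The sketch below picks out that minimum from an excerpt of the table (epochs 54 to 64 only) and computes the final gap between training and validation loss. Whether the published weights come from the last epoch or an earlier one is not stated in the card; the reported evaluation loss of 0.8049 matches the final row.

```python
# Excerpt of the results table above, epochs 54-64:
# (epoch, training loss, validation loss)
rows = [
    (54, 0.7729, 0.8049),
    (55, 0.7720, 0.8048),
    (56, 0.7714, 0.8048),
    (57, 0.7705, 0.8044),
    (58, 0.7698, 0.8045),
    (59, 0.7690, 0.8045),
    (60, 0.7684, 0.8046),
    (61, 0.7675, 0.8047),
    (62, 0.7666, 0.8047),
    (63, 0.7657, 0.8048),
    (64, 0.7649, 0.8049),
]

# Epoch with the lowest validation loss in the excerpt.
best_epoch, _, best_val = min(rows, key=lambda r: r[2])

# Gap between validation and training loss at the final epoch.
final_epoch, final_train, final_val = rows[-1]
gap = final_val - final_train

print(f"best validation loss {best_val:.4f} at epoch {best_epoch}")
print(f"final gap (validation - training) at epoch {final_epoch}: {gap:.4f}")
```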

Framework versions

  • Transformers 5.12.0
  • PyTorch 2.12.0+cu130
  • Datasets 4.8.5
  • Tokenizers 0.22.2

Safetensors

  • Model size: 0.1B params
  • Tensor type: F32
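
As a rough guide to what these figures imply, the weights alone take about 4 bytes per parameter in F32. The estimate below is a back-of-the-envelope sketch: "0.1B" is a rounded figure, and activations, optimizer state, and framework overhead are not counted.

```python
# "0.1B params" as listed above; a rounded figure, so the result is approximate.
params = 0.1e9

# F32 tensors use 4 bytes per value.
bytes_per_param = 4

# Approximate size of the weights alone, in gigabytes (1 GB = 1e9 bytes).
weights_gb = params * bytes_per_param / 1e9

print(f"approximate weight size: {weights_gb:.1f} GB")
```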