Mistral-SYDNEY

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5562

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: OptimizerNames.ADAMW_TORCH_FUSED (fused PyTorch AdamW) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 64
  • mixed_precision_training: Native AMP
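
As a rough illustration of what the linear schedule implies, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup (none is listed above), decay to zero at the final planned step, and 44 optimizer steps per epoch as reported in the training results table. `linear_lr` and the constants are illustrative names, not taken from the training script.

```python
# Sketch of a linear learning-rate decay under the hyperparameters listed above.
# Assumptions (not stated in the card): zero warmup steps, decay to 0 at the last step.

LEARNING_RATE = 1e-4      # learning_rate
NUM_EPOCHS = 64           # num_epochs
STEPS_PER_EPOCH = 44      # from the training results table (step 44 at epoch 1.0)
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 2816 planned optimizer steps


def linear_lr(step: int, base_lr: float = LEARNING_RATE, total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate after `step` optimizer steps with linear decay and no warmup."""
    remaining = max(0, total_steps - step)  # clamp so steps past the end give 0
    return base_lr * remaining / total_steps


if __name__ == "__main__":
    for epoch in (0, 16, 32, 44, 64):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:>2} (step {step:>4}): lr = {linear_lr(step):.3e}")
```

Under these assumptions, the last logged step in the results table (step 1936, epoch 44) would fall at about 31% of the initial learning rate.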

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 2.0454 | 1.0 | 44 | 0.9482 |
| 0.8165 | 2.0 | 88 | 0.7148 |
| 0.6887 | 3.0 | 132 | 0.6592 |
| 0.6282 | 4.0 | 176 | 0.6342 |
| 0.6038 | 5.0 | 220 | 0.6053 |
| 0.5782 | 6.0 | 264 | 0.6054 |
| 0.5601 | 7.0 | 308 | 0.5857 |
| 0.5489 | 8.0 | 352 | 0.5795 |
| 0.5386 | 9.0 | 396 | 0.5744 |
| 0.5362 | 10.0 | 440 | 0.5665 |
| 0.5271 | 11.0 | 484 | 0.5617 |
| 0.5238 | 12.0 | 528 | 0.5617 |
| 0.5204 | 13.0 | 572 | 0.5604 |
| 0.5157 | 14.0 | 616 | 0.5614 |
| 0.5132 | 15.0 | 660 | 0.5622 |
| 0.5137 | 16.0 | 704 | 0.5624 |
| 0.5101 | 17.0 | 748 | 0.5580 |
| 0.5096 | 18.0 | 792 | 0.5571 |
| 0.5071 | 19.0 | 836 | 0.5572 |
| 0.5055 | 20.0 | 880 | 0.5591 |
| 0.5029 | 21.0 | 924 | 0.5574 |
| 0.5051 | 22.0 | 968 | 0.5564 |
| 0.5019 | 23.0 | 1012 | 0.5584 |
| 0.5025 | 24.0 | 1056 | 0.5574 |
| 0.5000 | 25.0 | 1100 | 0.5549 |
| 0.4982 | 26.0 | 1144 | 0.5558 |
| 0.4990 | 27.0 | 1188 | 0.5570 |
| 0.4991 | 28.0 | 1232 | 0.5581 |
| 0.4973 | 29.0 | 1276 | 0.5558 |
| 0.4967 | 30.0 | 1320 | 0.5599 |
| 0.4954 | 31.0 | 1364 | 0.5587 |
| 0.4951 | 32.0 | 1408 | 0.5555 |
| 0.4935 | 33.0 | 1452 | 0.5565 |
| 0.4935 | 34.0 | 1496 | 0.5547 |
| 0.4925 | 35.0 | 1540 | 0.5573 |
| 0.4933 | 36.0 | 1584 | 0.5563 |
| 0.4917 | 37.0 | 1628 | 0.5563 |
| 0.4915 | 38.0 | 1672 | 0.5583 |
| 0.4904 | 39.0 | 1716 | 0.5559 |
| 0.4909 | 40.0 | 1760 | 0.5551 |
| 0.4889 | 41.0 | 1804 | 0.5558 |
| 0.4891 | 42.0 | 1848 | 0.5554 |
| 0.4882 | 43.0 | 1892 | 0.5557 |
| 0.4877 | 44.0 | 1936 | 0.5562 |
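
To make the plateau in these results easier to see, the following sketch picks out the epoch with the lowest validation loss and the train/validation gap at the last logged epoch. The values are copied from the table above; the variable and function names are illustrative.

```python
# Per-epoch losses copied from the training results table (epochs 1..44).
TRAIN_LOSS = [
    2.0454, 0.8165, 0.6887, 0.6282, 0.6038, 0.5782, 0.5601, 0.5489, 0.5386, 0.5362,
    0.5271, 0.5238, 0.5204, 0.5157, 0.5132, 0.5137, 0.5101, 0.5096, 0.5071, 0.5055,
    0.5029, 0.5051, 0.5019, 0.5025, 0.5000, 0.4982, 0.4990, 0.4991, 0.4973, 0.4967,
    0.4954, 0.4951, 0.4935, 0.4935, 0.4925, 0.4933, 0.4917, 0.4915, 0.4904, 0.4909,
    0.4889, 0.4891, 0.4882, 0.4877,
]
VAL_LOSS = [
    0.9482, 0.7148, 0.6592, 0.6342, 0.6053, 0.6054, 0.5857, 0.5795, 0.5744, 0.5665,
    0.5617, 0.5617, 0.5604, 0.5614, 0.5622, 0.5624, 0.5580, 0.5571, 0.5572, 0.5591,
    0.5574, 0.5564, 0.5584, 0.5574, 0.5549, 0.5558, 0.5570, 0.5581, 0.5558, 0.5599,
    0.5587, 0.5555, 0.5565, 0.5547, 0.5573, 0.5563, 0.5563, 0.5583, 0.5559, 0.5551,
    0.5558, 0.5554, 0.5557, 0.5562,
]


def best_epoch(val_loss):
    """Return (epoch, loss) for the lowest validation loss; epochs are 1-indexed."""
    idx = min(range(len(val_loss)), key=val_loss.__getitem__)
    return idx + 1, val_loss[idx]


def final_gap(train_loss, val_loss):
    """Validation loss minus training loss at the last logged epoch."""
    return round(val_loss[-1] - train_loss[-1], 4)


if __name__ == "__main__":
    epoch, loss = best_epoch(VAL_LOSS)
    print(f"lowest validation loss: {loss} at epoch {epoch}")
    print(f"train/validation gap at epoch {len(VAL_LOSS)}: {final_gap(TRAIN_LOSS, VAL_LOSS)}")
```

On the logged values, the lowest validation loss is 0.5547 at epoch 34, while the training loss continues to drift down through epoch 44, so the later epochs mainly widen the train/validation gap.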

Framework versions

  • Transformers 5.12.0
  • Pytorch 2.12.0+cu130
  • Datasets 4.8.5
  • Tokenizers 0.22.2
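
When trying to reproduce these results, it can help to compare the locally installed packages against the versions listed above. The sketch below does this with the standard library only; `EXPECTED` and `compare_versions` are illustrative names, and the PyPI distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are assumed to correspond to the frameworks in the list.

```python
from importlib import metadata

# Versions listed in this card, keyed by the assumed PyPI distribution name.
EXPECTED = {
    "transformers": "5.12.0",
    "torch": "2.12.0+cu130",
    "datasets": "4.8.5",
    "tokenizers": "0.22.2",
}


def compare_versions(expected=EXPECTED):
    """Map each package to (expected, installed); installed is None if not found."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed)
    return report


if __name__ == "__main__":
    for name, (wanted, installed) in compare_versions().items():
        status = "ok" if installed == wanted else "differs"
        print(f"{name}: expected {wanted}, installed {installed} ({status})")
```
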

Model size: 0.1B params (Safetensors, tensor type F32)