mistral_7b_v_Magiccoder_evol_10k

This model is a fine-tuned version of mistralai/Mistral-7B-v0.3 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.1309
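
For readers who prefer perplexity, the reported loss can be converted with a single exponential. This is a minimal sketch that assumes the value is a mean per-token cross-entropy in nats, which is the usual convention for causal language model training in Transformers but is not stated in this card.

```python
import math

eval_loss = 1.1309  # final evaluation loss reported above

# Assumption: the loss is mean per-token cross-entropy measured in nats.
perplexity = math.exp(eval_loss)
print(f"perplexity ~ {perplexity:.3f}")
```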

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 0.02
  • num_epochs: 1
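
The batch-size and scheduler entries above can be checked with a few lines of arithmetic. The sketch below is illustrative only: `num_devices = 1` is an assumption (the card does not state the device count), `cosine_lr` is a hypothetical helper rather than the Trainer's own scheduler, and warmup is left as a parameter defaulting to zero because the listed value of 0.02 is not a whole number of steps.

```python
import math

per_device_batch = 16   # train_batch_size
grad_accum_steps = 4    # gradient_accumulation_steps
num_devices = 1         # assumption: single device, not stated in the card

# Effective batch size per optimizer step.
total_batch = per_device_batch * grad_accum_steps * num_devices
assert total_batch == 64  # matches total_train_batch_size


def cosine_lr(step, total_steps, base_lr=1e-4, warmup_steps=0):
    """Hypothetical helper: linear warmup, then half-cosine decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Learning rate at the start, middle, and end of a 152-step run.
for step in (0, 76, 152):
    print(step, cosine_lr(step, total_steps=152))
```

With no warmup the schedule starts at the full learning rate of 0.0001, reaches half of it at the midpoint, and decays to zero at the final step.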

Training results

Training Loss   Epoch    Step   Validation Loss
1.1655          0.0261   4      1.1621
1.0912          0.0523   8      1.1461
1.1696          0.0784   12     1.1691
1.0845          0.1046   16     1.1686
1.1548          0.1307   20     1.1627
1.0495          0.1569   24     1.1600
1.192           0.1830   28     1.1692
1.1397          0.2092   32     1.1666
1.0956          0.2353   36     1.1564
1.1748          0.2614   40     1.1647
1.1924          0.2876   44     1.1664
1.1258          0.3137   48     1.1596
1.1319          0.3399   52     1.1619
1.1099          0.3660   56     1.1577
1.122           0.3922   60     1.1573
1.1749          0.4183   64     1.1538
1.0708          0.4444   68     1.1579
1.0763          0.4706   72     1.1419
1.0635          0.4967   76     1.1494
1.1717          0.5229   80     1.1519
1.0674          0.5490   84     1.1404
1.1492          0.5752   88     1.1620
1.2029          0.6013   92     1.1477
1.1744          0.6275   96     1.1346
1.104           0.6536   100    1.1438
1.1398          0.6797   104    1.1436
1.1296          0.7059   108    1.1409
1.0167          0.7320   112    1.1469
1.1048          0.7582   116    1.1396
1.1004          0.7843   120    1.1358
1.1283          0.8105   124    1.1333
1.1287          0.8366   128    1.1322
1.1421          0.8627   132    1.1315
1.0848          0.8889   136    1.1303
1.184           0.9150   140    1.1300
1.0453          0.9412   144    1.1304
1.0604          0.9673   148    1.1307
1.2116          0.9935   152    1.1309
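
A short script can pull two things out of the log above: the evaluation step with the lowest validation loss, and a rough estimate of the training set size. This is a sketch under stated assumptions: `rows` holds only a subset of the table (including its lowest validation loss), the epoch column is assumed to grow linearly with the step count, and every optimizer step is assumed to consume 64 examples.

```python
# (step, epoch, validation loss) for a few rows copied from the table above
rows = [
    (4, 0.0261, 1.1621),
    (72, 0.4706, 1.1419),
    (96, 0.6275, 1.1346),
    (136, 0.8889, 1.1303),
    (140, 0.9150, 1.1300),
    (144, 0.9412, 1.1304),
    (152, 0.9935, 1.1309),
]

# Evaluation step with the lowest validation loss.
best_step, best_epoch, best_loss = min(rows, key=lambda r: r[2])

# Steps per epoch inferred from the last logged row, then scaled by the
# total_train_batch_size of 64 to estimate the number of training examples.
last_step, last_epoch, _ = rows[-1]
steps_per_epoch = round(last_step / last_epoch)
approx_examples = steps_per_epoch * 64

print(best_step, best_loss, steps_per_epoch, approx_examples)
```

Under these assumptions the lowest validation loss (1.1300) occurs at step 140, slightly below the final value of 1.1309, and the estimate comes out a little under 10,000 examples, which would be consistent with the "10k" in the model name.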

Framework versions

  • PEFT 0.7.1
  • Transformers 4.40.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1
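
Since PEFT is listed among the framework versions, the weights are presumably published as an adapter on top of the base model. The following is a minimal loading sketch, not a tested recipe from the author: `load_adapter` is a hypothetical helper, the adapter repository id is taken from this page and should be treated as an assumption, and loading the tokenizer from the base model is likewise an assumption.

```python
def load_adapter(adapter_id="imdatta0/mistral_7b_v_Magiccoder_evol_10k",
                 base_id="mistralai/Mistral-7B-v0.3"):
    """Hypothetical helper: load the base model and attach the PEFT adapter."""
    # Local imports so this file can be imported without the libraries installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    # Assumption: the adapter reuses the base model's tokenizer unchanged.
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)

    # Wrap the base model with the fine-tuned adapter weights.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    return tokenizer, model
```

Calling `load_adapter()` downloads the full 7B base model, so it needs substantial disk space and memory.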