meta_llama_3_Magiccoder_evol_10k_ortho

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.2120
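The loss can be easier to interpret as a perplexity. The sketch below assumes the reported value is the mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated on this card. Under that assumption the perplexity comes out at roughly 3.36.

```python
import math

# Evaluation loss reported on the card.
eval_loss = 1.2120

# Assumption: the loss is the mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")
```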

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 0.02
  • num_epochs: 1
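The relationship between the batch-size settings and the shape of a cosine schedule can be sketched in plain Python. This is an illustration under stated assumptions, not the trainer's actual code: `num_devices = 1` is assumed because 8 × 8 already gives the listed total of 64, `total_steps = 154` is a hypothetical value chosen only for the demo, and warmup is set to zero because the listed value of 0.02 looks more like a ratio than a step count and the card does not say how it was applied.

```python
import math

# Values copied from the hyperparameter list.
learning_rate = 1e-4
train_batch_size = 8
gradient_accumulation_steps = 8

# Assumption: a single device, so there is no extra data-parallel factor.
num_devices = 1
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices


def cosine_lr(step: int, total_steps: int,
              peak_lr: float = learning_rate, warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by half-cosine decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


# Hypothetical step count, used only to show the shape of the curve.
total_steps = 154
for step in (0, total_steps // 2, total_steps):
    print(step, cosine_lr(step, total_steps))
```

With zero warmup the rate starts at the peak value, reaches half of it at the midpoint, and decays to zero at the final step.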

Training results

Training Loss  Epoch   Step  Validation Loss
1.2542         0.0259  4     1.3547
1.2886         0.0518  8     1.2904
1.2416         0.0777  12    1.2577
1.2563         0.1036  16    1.2476
1.2632         0.1296  20    1.2418
1.1767         0.1555  24    1.2388
1.2299         0.1814  28    1.2364
1.203          0.2073  32    1.2333
1.2295         0.2332  36    1.2313
1.2796         0.2591  40    1.2307
1.2378         0.2850  44    1.2274
1.162          0.3109  48    1.2276
1.2157         0.3368  52    1.2251
1.2534         0.3628  56    1.2245
1.2336         0.3887  60    1.2226
1.2928         0.4146  64    1.2219
1.1455         0.4405  68    1.2216
1.2152         0.4664  72    1.2194
1.1637         0.4923  76    1.2201
1.2462         0.5182  80    1.2180
1.1747         0.5441  84    1.2157
1.218          0.5700  88    1.2160
1.3152         0.5960  92    1.2149
1.1314         0.6219  96    1.2152
1.2156         0.6478  100   1.2149
1.134          0.6737  104   1.2151
1.1619         0.6996  108   1.2150
1.1718         0.7255  112   1.2150
1.2274         0.7514  116   1.2142
1.211          0.7773  120   1.2136
1.233          0.8032  124   1.2131
1.2209         0.8291  128   1.2126
1.2179         0.8551  132   1.2122
1.179          0.8810  136   1.2119
1.1764         0.9069  140   1.2120
1.1622         0.9328  144   1.2120
1.1853         0.9587  148   1.2120
1.1599         0.9846  152   1.2120
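
The step and epoch columns allow a rough estimate of the training set size, which the card otherwise leaves unspecified. The sketch below takes the last logged row and the total train batch size of 64 from the hyperparameter list. The epoch values are rounded, so the result is approximate. It lands a little under 10,000 examples, which would be consistent with the "10k" in the model name, although the card does not confirm this.

```python
# Last logged row of the results table: (epoch, step).
last_epoch, last_step = 0.9846, 152

# Total train batch size from the hyperparameter list.
total_train_batch_size = 64

# Optimizer steps needed to complete one full epoch (estimate).
steps_per_epoch = last_step / last_epoch

# Each optimizer step consumes one total batch of examples.
approx_examples = steps_per_epoch * total_train_batch_size

print(f"steps per epoch ~ {steps_per_epoch:.1f}")
print(f"training examples ~ {approx_examples:.0f}")
```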

Framework versions

  • PEFT 0.7.1
  • Transformers 4.40.2
  • PyTorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1
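
Since PEFT is listed among the framework versions, the weights are presumably published as an adapter to be applied on top of the base model. The following is an untested sketch of how that might look. It assumes the repository ID shown on the model page (`imdatta0/meta_llama_3_Magiccoder_evol_10k_ortho`) hosts a standard PEFT adapter, that the tokenizer is taken from the base model, and that you have access to `meta-llama/Meta-Llama-3-8B`. The helper names `load_model` and `complete` are hypothetical.

```python
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B"
# Assumption: repository ID as shown on the model page.
ADAPTER_ID = "imdatta0/meta_llama_3_Magiccoder_evol_10k_ortho"


def load_model(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the adapter weights (untested sketch)."""
    # Imported lazily because these packages are heavy.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
    # Wrap the base model with the fine-tuned adapter.
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return tokenizer, model


def complete(prompt: str, max_new_tokens: int = 64) -> str:
    """Generate a continuation for a prompt (hypothetical helper)."""
    tokenizer, model = load_model()
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `complete("def fibonacci(n):")` would download both the base model and the adapter on first use, so it needs network access and enough memory for an 8B-parameter model.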

Model tree for imdatta0/meta_llama_3_Magiccoder_evol_10k_ortho: this model is an adapter for meta-llama/Meta-Llama-3-8B.