
final_llama

This model is a PEFT adapter fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct; the training dataset is not specified. It achieves the following results on the evaluation set:

  • Loss: 1.0641
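
Because this repository contains a PEFT adapter rather than full model weights, it must be loaded on top of the base model. Below is a minimal loading-and-generation sketch, assuming a standard transformers + peft setup; the adapter path "your-username/final_llama" is a placeholder for this repository's actual Hub id.

```python
# Minimal sketch: load the PEFT adapter on top of the base model and generate.
# "your-username/final_llama" is a placeholder for this adapter's actual Hub id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, "your-username/final_llama")

inputs = tokenizer("Hello, how can I", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```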

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
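
The values above map directly onto transformers TrainingArguments. A minimal sketch, assuming the standard Trainer API was used; the output directory is a placeholder, and the Adam betas and epsilon shown are also the Trainer defaults:

```python
from transformers import TrainingArguments

# Sketch of the hyperparameters listed above as TrainingArguments.
# output_dir is a placeholder, not taken from the card.
training_args = TrainingArguments(
    output_dir="final_llama",
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100,
)
```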

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.767         | 1.0   | 1    | 2.7479          |
| 2.767         | 2.0   | 2    | 2.7247          |
| 2.767         | 3.0   | 3    | 2.7010          |
| 2.767         | 4.0   | 4    | 2.6766          |
| 2.767         | 5.0   | 5    | 2.6526          |
| 2.767         | 6.0   | 6    | 2.6286          |
| 2.767         | 7.0   | 7    | 2.6044          |
| 2.767         | 8.0   | 8    | 2.5795          |
| 2.767         | 9.0   | 9    | 2.5543          |
| 2.767         | 10.0  | 10   | 2.5293          |
| 2.767         | 11.0  | 11   | 2.5038          |
| 2.767         | 12.0  | 12   | 2.4779          |
| 2.767         | 13.0  | 13   | 2.4519          |
| 2.767         | 14.0  | 14   | 2.4254          |
| 2.767         | 15.0  | 15   | 2.3991          |
| 2.767         | 16.0  | 16   | 2.3726          |
| 2.767         | 17.0  | 17   | 2.3458          |
| 2.767         | 18.0  | 18   | 2.3199          |
| 2.767         | 19.0  | 19   | 2.2934          |
| 2.767         | 20.0  | 20   | 2.2677          |
| 2.767         | 21.0  | 21   | 2.2425          |
| 2.767         | 22.0  | 22   | 2.2178          |
| 2.767         | 23.0  | 23   | 2.1940          |
| 2.767         | 24.0  | 24   | 2.1701          |
| 2.767         | 25.0  | 25   | 2.1468          |
| 2.767         | 26.0  | 26   | 2.1236          |
| 2.767         | 27.0  | 27   | 2.1001          |
| 2.767         | 28.0  | 28   | 2.0772          |
| 2.767         | 29.0  | 29   | 2.0543          |
| 2.767         | 30.0  | 30   | 2.0314          |
| 2.767         | 31.0  | 31   | 2.0088          |
| 2.767         | 32.0  | 32   | 1.9860          |
| 2.767         | 33.0  | 33   | 1.9644          |
| 2.767         | 34.0  | 34   | 1.9425          |
| 2.767         | 35.0  | 35   | 1.9207          |
| 2.767         | 36.0  | 36   | 1.8995          |
| 2.767         | 37.0  | 37   | 1.8785          |
| 2.767         | 38.0  | 38   | 1.8575          |
| 2.767         | 39.0  | 39   | 1.8370          |
| 2.767         | 40.0  | 40   | 1.8163          |
| 2.767         | 41.0  | 41   | 1.7959          |
| 2.767         | 42.0  | 42   | 1.7752          |
| 2.767         | 43.0  | 43   | 1.7550          |
| 2.767         | 44.0  | 44   | 1.7349          |
| 2.767         | 45.0  | 45   | 1.7146          |
| 2.767         | 46.0  | 46   | 1.6944          |
| 2.767         | 47.0  | 47   | 1.6746          |
| 2.767         | 48.0  | 48   | 1.6544          |
| 2.767         | 49.0  | 49   | 1.6346          |
| 2.767         | 50.0  | 50   | 1.6150          |
| 2.767         | 51.0  | 51   | 1.5955          |
| 2.767         | 52.0  | 52   | 1.5760          |
| 2.767         | 53.0  | 53   | 1.5566          |
| 2.767         | 54.0  | 54   | 1.5377          |
| 2.767         | 55.0  | 55   | 1.5191          |
| 2.767         | 56.0  | 56   | 1.5005          |
| 2.767         | 57.0  | 57   | 1.4819          |
| 2.767         | 58.0  | 58   | 1.4646          |
| 2.767         | 59.0  | 59   | 1.4469          |
| 2.767         | 60.0  | 60   | 1.4297          |
| 2.767         | 61.0  | 61   | 1.4130          |
| 2.767         | 62.0  | 62   | 1.3963          |
| 2.767         | 63.0  | 63   | 1.3803          |
| 2.767         | 64.0  | 64   | 1.3645          |
| 2.767         | 65.0  | 65   | 1.3488          |
| 2.767         | 66.0  | 66   | 1.3336          |
| 2.767         | 67.0  | 67   | 1.3183          |
| 2.767         | 68.0  | 68   | 1.3041          |
| 2.767         | 69.0  | 69   | 1.2896          |
| 2.767         | 70.0  | 70   | 1.2756          |
| 2.767         | 71.0  | 71   | 1.2623          |
| 2.767         | 72.0  | 72   | 1.2494          |
| 2.767         | 73.0  | 73   | 1.2368          |
| 2.767         | 74.0  | 74   | 1.2244          |
| 2.767         | 75.0  | 75   | 1.2128          |
| 2.767         | 76.0  | 76   | 1.2019          |
| 2.767         | 77.0  | 77   | 1.1909          |
| 2.767         | 78.0  | 78   | 1.1804          |
| 2.767         | 79.0  | 79   | 1.1706          |
| 2.767         | 80.0  | 80   | 1.1607          |
| 2.767         | 81.0  | 81   | 1.1516          |
| 2.767         | 82.0  | 82   | 1.1430          |
| 2.767         | 83.0  | 83   | 1.1347          |
| 2.767         | 84.0  | 84   | 1.1268          |
| 2.767         | 85.0  | 85   | 1.1196          |
| 2.767         | 86.0  | 86   | 1.1125          |
| 2.767         | 87.0  | 87   | 1.1058          |
| 2.767         | 88.0  | 88   | 1.0998          |
| 2.767         | 89.0  | 89   | 1.0939          |
| 2.767         | 90.0  | 90   | 1.0889          |
| 2.767         | 91.0  | 91   | 1.0843          |
| 2.767         | 92.0  | 92   | 1.0799          |
| 2.767         | 93.0  | 93   | 1.0766          |
| 2.767         | 94.0  | 94   | 1.0734          |
| 2.767         | 95.0  | 95   | 1.0707          |
| 2.767         | 96.0  | 96   | 1.0682          |
| 2.767         | 97.0  | 97   | 1.0669          |
| 2.767         | 98.0  | 98   | 1.0655          |
| 2.767         | 99.0  | 99   | 1.0646          |
| 1.7143        | 100.0 | 100  | 1.0641          |

Framework versions

  • PEFT 0.11.1
  • Transformers 4.41.2
  • Pytorch 2.1.2
  • Datasets 2.19.2
  • Tokenizers 0.19.1