---
license: llama3
library_name: peft
tags:
  - generated_from_trainer
base_model: meta-llama/Meta-Llama-3-8B-Instruct
model-index:
  - name: final_llama
    results: []
---

# final_llama

This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 3.4315
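
Since this repository contains a PEFT adapter rather than full model weights, it can be loaded with `AutoPeftModelForCausalLM`. A minimal inference sketch follows; the repo id `abhi317/final_llama` is inferred from this card's location, and access to the gated Llama 3 base model is required.

```python
# Minimal inference sketch (repo id abhi317/final_llama is an assumption).
# Loading the adapter also downloads the gated base model
# meta-llama/Meta-Llama-3-8B-Instruct, so an authorized HF token is needed.
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model = AutoPeftModelForCausalLM.from_pretrained(
    "abhi317/final_llama",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

messages = [{"role": "user", "content": "Hello!"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```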

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):

- learning_rate: 2e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100
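
For reference, a hedged sketch of how these values map onto `transformers.TrainingArguments`. Only the values listed above come from the card; the output directory is a placeholder, and the card does not record the LoRA configuration, so none is shown here.

```python
# Sketch of the listed hyperparameters as transformers.TrainingArguments.
# output_dir is a placeholder; the Adam betas/epsilon simply restate the
# optimizer settings from the list above (they are also the library defaults).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="final_llama",        # placeholder, not recorded on the card
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100,
)
```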

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 20.3519 | 1.0 | 1 | 20.2665 |
| 20.3519 | 2.0 | 2 | 20.1486 |
| 20.3519 | 3.0 | 3 | 20.0148 |
| 20.3519 | 4.0 | 4 | 19.8717 |
| 20.3519 | 5.0 | 5 | 19.7162 |
| 20.3519 | 6.0 | 6 | 19.5484 |
| 20.3519 | 7.0 | 7 | 19.3671 |
| 20.3519 | 8.0 | 8 | 19.1726 |
| 20.3519 | 9.0 | 9 | 18.9658 |
| 20.3519 | 10.0 | 10 | 18.7509 |
| 20.3519 | 11.0 | 11 | 18.5206 |
| 20.3519 | 12.0 | 12 | 18.2873 |
| 20.3519 | 13.0 | 13 | 18.0473 |
| 20.3519 | 14.0 | 14 | 17.8122 |
| 20.3519 | 15.0 | 15 | 17.5813 |
| 20.3519 | 16.0 | 16 | 17.3524 |
| 20.3519 | 17.0 | 17 | 17.1373 |
| 20.3519 | 18.0 | 18 | 16.9325 |
| 20.3519 | 19.0 | 19 | 16.7373 |
| 20.3519 | 20.0 | 20 | 16.5515 |
| 20.3519 | 21.0 | 21 | 16.3717 |
| 20.3519 | 22.0 | 22 | 16.1991 |
| 20.3519 | 23.0 | 23 | 16.0269 |
| 20.3519 | 24.0 | 24 | 15.8545 |
| 20.3519 | 25.0 | 25 | 15.6744 |
| 20.3519 | 26.0 | 26 | 15.4867 |
| 20.3519 | 27.0 | 27 | 15.2966 |
| 20.3519 | 28.0 | 28 | 15.0983 |
| 20.3519 | 29.0 | 29 | 14.8900 |
| 20.3519 | 30.0 | 30 | 14.6800 |
| 20.3519 | 31.0 | 31 | 14.4623 |
| 20.3519 | 32.0 | 32 | 14.2411 |
| 20.3519 | 33.0 | 33 | 14.0190 |
| 20.3519 | 34.0 | 34 | 13.7946 |
| 20.3519 | 35.0 | 35 | 13.5692 |
| 20.3519 | 36.0 | 36 | 13.3422 |
| 20.3519 | 37.0 | 37 | 13.1188 |
| 20.3519 | 38.0 | 38 | 12.8947 |
| 20.3519 | 39.0 | 39 | 12.6666 |
| 20.3519 | 40.0 | 40 | 12.4454 |
| 20.3519 | 41.0 | 41 | 12.2206 |
| 20.3519 | 42.0 | 42 | 11.9955 |
| 20.3519 | 43.0 | 43 | 11.7648 |
| 20.3519 | 44.0 | 44 | 11.5387 |
| 20.3519 | 45.0 | 45 | 11.3104 |
| 20.3519 | 46.0 | 46 | 11.0794 |
| 20.3519 | 47.0 | 47 | 10.8506 |
| 20.3519 | 48.0 | 48 | 10.6189 |
| 20.3519 | 49.0 | 49 | 10.3891 |
| 20.3519 | 50.0 | 50 | 10.1577 |
| 20.3519 | 51.0 | 51 | 9.9252 |
| 20.3519 | 52.0 | 52 | 9.6967 |
| 20.3519 | 53.0 | 53 | 9.4668 |
| 20.3519 | 54.0 | 54 | 9.2420 |
| 20.3519 | 55.0 | 55 | 9.0153 |
| 20.3519 | 56.0 | 56 | 8.7923 |
| 20.3519 | 57.0 | 57 | 8.5711 |
| 20.3519 | 58.0 | 58 | 8.3488 |
| 20.3519 | 59.0 | 59 | 8.1307 |
| 20.3519 | 60.0 | 60 | 7.9147 |
| 20.3519 | 61.0 | 61 | 7.7034 |
| 20.3519 | 62.0 | 62 | 7.4925 |
| 20.3519 | 63.0 | 63 | 7.2867 |
| 20.3519 | 64.0 | 64 | 7.0867 |
| 20.3519 | 65.0 | 65 | 6.8855 |
| 20.3519 | 66.0 | 66 | 6.6916 |
| 20.3519 | 67.0 | 67 | 6.5061 |
| 20.3519 | 68.0 | 68 | 6.3185 |
| 20.3519 | 69.0 | 69 | 6.1380 |
| 20.3519 | 70.0 | 70 | 5.9656 |
| 20.3519 | 71.0 | 71 | 5.7981 |
| 20.3519 | 72.0 | 72 | 5.6359 |
| 20.3519 | 73.0 | 73 | 5.4777 |
| 20.3519 | 74.0 | 74 | 5.3250 |
| 20.3519 | 75.0 | 75 | 5.1813 |
| 20.3519 | 76.0 | 76 | 5.0366 |
| 20.3519 | 77.0 | 77 | 4.9050 |
| 20.3519 | 78.0 | 78 | 4.7747 |
| 20.3519 | 79.0 | 79 | 4.6546 |
| 20.3519 | 80.0 | 80 | 4.5398 |
| 20.3519 | 81.0 | 81 | 4.4281 |
| 20.3519 | 82.0 | 82 | 4.3260 |
| 20.3519 | 83.0 | 83 | 4.2277 |
| 20.3519 | 84.0 | 84 | 4.1377 |
| 20.3519 | 85.0 | 85 | 4.0487 |
| 20.3519 | 86.0 | 86 | 3.9703 |
| 20.3519 | 87.0 | 87 | 3.8984 |
| 20.3519 | 88.0 | 88 | 3.8298 |
| 20.3519 | 89.0 | 89 | 3.7664 |
| 20.3519 | 90.0 | 90 | 3.7087 |
| 20.3519 | 91.0 | 91 | 3.6577 |
| 20.3519 | 92.0 | 92 | 3.6108 |
| 20.3519 | 93.0 | 93 | 3.5709 |
| 20.3519 | 94.0 | 94 | 3.5354 |
| 20.3519 | 95.0 | 95 | 3.5040 |
| 20.3519 | 96.0 | 96 | 3.4798 |
| 20.3519 | 97.0 | 97 | 3.4595 |
| 20.3519 | 98.0 | 98 | 3.4442 |
| 20.3519 | 99.0 | 99 | 3.4354 |
| 10.6394 | 100.0 | 100 | 3.4315 |

### Framework versions

- PEFT 0.11.1
- Transformers 4.41.2
- Pytorch 2.1.2
- Datasets 2.19.2
- Tokenizers 0.19.1
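
To reproduce this environment, the listed versions can be pinned directly; this is a suggested install command, not one recorded on the card (the PyTorch package is published as `torch`).

```bash
pip install peft==0.11.1 transformers==4.41.2 torch==2.1.2 datasets==2.19.2 tokenizers==0.19.1
```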