---
license: llama3
library_name: peft
tags:
  - generated_from_trainer
base_model: meta-llama/Meta-Llama-3-8B-Instruct
model-index:
  - name: final_llama
    results: []
---

# final_llama

This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 3.4315
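
Since this repository contains a PEFT adapter rather than full model weights, it can be loaded with `AutoPeftModelForCausalLM`. A minimal inference sketch follows; the repo id `abhi317/final_llama` is inferred from this card's location, and access to the gated Llama 3 base model is required.

```python
# Minimal inference sketch (repo id abhi317/final_llama is an assumption).
# Loading the adapter also downloads the gated base model
# meta-llama/Meta-Llama-3-8B-Instruct, so an authorized HF token is needed.
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model = AutoPeftModelForCausalLM.from_pretrained(
    "abhi317/final_llama",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

messages = [{"role": "user", "content": "Hello!"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```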

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):

- learning_rate: 2e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100
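
For reference, a hedged sketch of how these values map onto `transformers.TrainingArguments`. Only the values listed above come from the card; the output directory is a placeholder, and the card does not record the LoRA configuration, so none is shown here.

```python
# Sketch of the listed hyperparameters as transformers.TrainingArguments.
# output_dir is a placeholder; the Adam betas/epsilon simply restate the
# optimizer settings from the list above (they are also the library defaults).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="final_llama",        # placeholder, not recorded on the card
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100,
)
```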

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 20.3519 | 1.0 | 1 | 20.2665 |
| 20.3519 | 2.0 | 2 | 20.1486 |
| 20.3519 | 3.0 | 3 | 20.0148 |
| 20.3519 | 4.0 | 4 | 19.8717 |
| 20.3519 | 5.0 | 5 | 19.7162 |
| 20.3519 | 6.0 | 6 | 19.5484 |
| 20.3519 | 7.0 | 7 | 19.3671 |
| 20.3519 | 8.0 | 8 | 19.1726 |
| 20.3519 | 9.0 | 9 | 18.9658 |
| 20.3519 | 10.0 | 10 | 18.7509 |
| 20.3519 | 11.0 | 11 | 18.5206 |
| 20.3519 | 12.0 | 12 | 18.2873 |
| 20.3519 | 13.0 | 13 | 18.0473 |
| 20.3519 | 14.0 | 14 | 17.8122 |
| 20.3519 | 15.0 | 15 | 17.5813 |
| 20.3519 | 16.0 | 16 | 17.3524 |
| 20.3519 | 17.0 | 17 | 17.1373 |
| 20.3519 | 18.0 | 18 | 16.9325 |
| 20.3519 | 19.0 | 19 | 16.7373 |
| 20.3519 | 20.0 | 20 | 16.5515 |
| 20.3519 | 21.0 | 21 | 16.3717 |
| 20.3519 | 22.0 | 22 | 16.1991 |
| 20.3519 | 23.0 | 23 | 16.0269 |
| 20.3519 | 24.0 | 24 | 15.8545 |
| 20.3519 | 25.0 | 25 | 15.6744 |
| 20.3519 | 26.0 | 26 | 15.4867 |
| 20.3519 | 27.0 | 27 | 15.2966 |
| 20.3519 | 28.0 | 28 | 15.0983 |
| 20.3519 | 29.0 | 29 | 14.8900 |
| 20.3519 | 30.0 | 30 | 14.6800 |
| 20.3519 | 31.0 | 31 | 14.4623 |
| 20.3519 | 32.0 | 32 | 14.2411 |
| 20.3519 | 33.0 | 33 | 14.0190 |
| 20.3519 | 34.0 | 34 | 13.7946 |
| 20.3519 | 35.0 | 35 | 13.5692 |
| 20.3519 | 36.0 | 36 | 13.3422 |
| 20.3519 | 37.0 | 37 | 13.1188 |
| 20.3519 | 38.0 | 38 | 12.8947 |
| 20.3519 | 39.0 | 39 | 12.6666 |
| 20.3519 | 40.0 | 40 | 12.4454 |
| 20.3519 | 41.0 | 41 | 12.2206 |
| 20.3519 | 42.0 | 42 | 11.9955 |
| 20.3519 | 43.0 | 43 | 11.7648 |
| 20.3519 | 44.0 | 44 | 11.5387 |
| 20.3519 | 45.0 | 45 | 11.3104 |
| 20.3519 | 46.0 | 46 | 11.0794 |
| 20.3519 | 47.0 | 47 | 10.8506 |
| 20.3519 | 48.0 | 48 | 10.6189 |
| 20.3519 | 49.0 | 49 | 10.3891 |
| 20.3519 | 50.0 | 50 | 10.1577 |
| 20.3519 | 51.0 | 51 | 9.9252 |
| 20.3519 | 52.0 | 52 | 9.6967 |
| 20.3519 | 53.0 | 53 | 9.4668 |
| 20.3519 | 54.0 | 54 | 9.2420 |
| 20.3519 | 55.0 | 55 | 9.0153 |
| 20.3519 | 56.0 | 56 | 8.7923 |
| 20.3519 | 57.0 | 57 | 8.5711 |
| 20.3519 | 58.0 | 58 | 8.3488 |
| 20.3519 | 59.0 | 59 | 8.1307 |
| 20.3519 | 60.0 | 60 | 7.9147 |
| 20.3519 | 61.0 | 61 | 7.7034 |
| 20.3519 | 62.0 | 62 | 7.4925 |
| 20.3519 | 63.0 | 63 | 7.2867 |
| 20.3519 | 64.0 | 64 | 7.0867 |
| 20.3519 | 65.0 | 65 | 6.8855 |
| 20.3519 | 66.0 | 66 | 6.6916 |
| 20.3519 | 67.0 | 67 | 6.5061 |
| 20.3519 | 68.0 | 68 | 6.3185 |
| 20.3519 | 69.0 | 69 | 6.1380 |
| 20.3519 | 70.0 | 70 | 5.9656 |
| 20.3519 | 71.0 | 71 | 5.7981 |
| 20.3519 | 72.0 | 72 | 5.6359 |
| 20.3519 | 73.0 | 73 | 5.4777 |
| 20.3519 | 74.0 | 74 | 5.3250 |
| 20.3519 | 75.0 | 75 | 5.1813 |
| 20.3519 | 76.0 | 76 | 5.0366 |
| 20.3519 | 77.0 | 77 | 4.9050 |
| 20.3519 | 78.0 | 78 | 4.7747 |
| 20.3519 | 79.0 | 79 | 4.6546 |
| 20.3519 | 80.0 | 80 | 4.5398 |
| 20.3519 | 81.0 | 81 | 4.4281 |
| 20.3519 | 82.0 | 82 | 4.3260 |
| 20.3519 | 83.0 | 83 | 4.2277 |
| 20.3519 | 84.0 | 84 | 4.1377 |
| 20.3519 | 85.0 | 85 | 4.0487 |
| 20.3519 | 86.0 | 86 | 3.9703 |
| 20.3519 | 87.0 | 87 | 3.8984 |
| 20.3519 | 88.0 | 88 | 3.8298 |
| 20.3519 | 89.0 | 89 | 3.7664 |
| 20.3519 | 90.0 | 90 | 3.7087 |
| 20.3519 | 91.0 | 91 | 3.6577 |
| 20.3519 | 92.0 | 92 | 3.6108 |
| 20.3519 | 93.0 | 93 | 3.5709 |
| 20.3519 | 94.0 | 94 | 3.5354 |
| 20.3519 | 95.0 | 95 | 3.5040 |
| 20.3519 | 96.0 | 96 | 3.4798 |
| 20.3519 | 97.0 | 97 | 3.4595 |
| 20.3519 | 98.0 | 98 | 3.4442 |
| 20.3519 | 99.0 | 99 | 3.4354 |
| 10.6394 | 100.0 | 100 | 3.4315 |

### Framework versions

- PEFT 0.11.1
- Transformers 4.41.2
- Pytorch 2.1.2
- Datasets 2.19.2
- Tokenizers 0.19.1
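
To reproduce this environment, the listed versions can be pinned directly; this is a suggested install command, not one recorded on the card (the PyTorch package is published as `torch`).

```bash
pip install peft==0.11.1 transformers==4.41.2 torch==2.1.2 datasets==2.19.2 tokenizers==0.19.1
```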