
final_llama

This model is a PEFT adapter fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct; the training dataset is not specified. It achieves the following results on the evaluation set:

  • Loss: 1.0641
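
Because this repository contains a PEFT adapter rather than full model weights, it must be loaded on top of the base model. Below is a minimal loading-and-generation sketch, assuming a standard transformers + peft setup; the adapter path "your-username/final_llama" is a placeholder for this repository's actual Hub id.

```python
# Minimal sketch: load the PEFT adapter on top of the base model and generate.
# "your-username/final_llama" is a placeholder for this adapter's actual Hub id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, "your-username/final_llama")

inputs = tokenizer("Hello, how can I", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```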

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
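
The values above map directly onto transformers TrainingArguments. A minimal sketch, assuming the standard Trainer API was used; the output directory is a placeholder, and the Adam betas and epsilon shown are also the Trainer defaults:

```python
from transformers import TrainingArguments

# Sketch of the hyperparameters listed above as TrainingArguments.
# output_dir is a placeholder, not taken from the card.
training_args = TrainingArguments(
    output_dir="final_llama",
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100,
)
```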

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.767         | 1.0   | 1    | 2.7479          |
| 2.767         | 2.0   | 2    | 2.7247          |
| 2.767         | 3.0   | 3    | 2.7010          |
| 2.767         | 4.0   | 4    | 2.6766          |
| 2.767         | 5.0   | 5    | 2.6526          |
| 2.767         | 6.0   | 6    | 2.6286          |
| 2.767         | 7.0   | 7    | 2.6044          |
| 2.767         | 8.0   | 8    | 2.5795          |
| 2.767         | 9.0   | 9    | 2.5543          |
| 2.767         | 10.0  | 10   | 2.5293          |
| 2.767         | 11.0  | 11   | 2.5038          |
| 2.767         | 12.0  | 12   | 2.4779          |
| 2.767         | 13.0  | 13   | 2.4519          |
| 2.767         | 14.0  | 14   | 2.4254          |
| 2.767         | 15.0  | 15   | 2.3991          |
| 2.767         | 16.0  | 16   | 2.3726          |
| 2.767         | 17.0  | 17   | 2.3458          |
| 2.767         | 18.0  | 18   | 2.3199          |
| 2.767         | 19.0  | 19   | 2.2934          |
| 2.767         | 20.0  | 20   | 2.2677          |
| 2.767         | 21.0  | 21   | 2.2425          |
| 2.767         | 22.0  | 22   | 2.2178          |
| 2.767         | 23.0  | 23   | 2.1940          |
| 2.767         | 24.0  | 24   | 2.1701          |
| 2.767         | 25.0  | 25   | 2.1468          |
| 2.767         | 26.0  | 26   | 2.1236          |
| 2.767         | 27.0  | 27   | 2.1001          |
| 2.767         | 28.0  | 28   | 2.0772          |
| 2.767         | 29.0  | 29   | 2.0543          |
| 2.767         | 30.0  | 30   | 2.0314          |
| 2.767         | 31.0  | 31   | 2.0088          |
| 2.767         | 32.0  | 32   | 1.9860          |
| 2.767         | 33.0  | 33   | 1.9644          |
| 2.767         | 34.0  | 34   | 1.9425          |
| 2.767         | 35.0  | 35   | 1.9207          |
| 2.767         | 36.0  | 36   | 1.8995          |
| 2.767         | 37.0  | 37   | 1.8785          |
| 2.767         | 38.0  | 38   | 1.8575          |
| 2.767         | 39.0  | 39   | 1.8370          |
| 2.767         | 40.0  | 40   | 1.8163          |
| 2.767         | 41.0  | 41   | 1.7959          |
| 2.767         | 42.0  | 42   | 1.7752          |
| 2.767         | 43.0  | 43   | 1.7550          |
| 2.767         | 44.0  | 44   | 1.7349          |
| 2.767         | 45.0  | 45   | 1.7146          |
| 2.767         | 46.0  | 46   | 1.6944          |
| 2.767         | 47.0  | 47   | 1.6746          |
| 2.767         | 48.0  | 48   | 1.6544          |
| 2.767         | 49.0  | 49   | 1.6346          |
| 2.767         | 50.0  | 50   | 1.6150          |
| 2.767         | 51.0  | 51   | 1.5955          |
| 2.767         | 52.0  | 52   | 1.5760          |
| 2.767         | 53.0  | 53   | 1.5566          |
| 2.767         | 54.0  | 54   | 1.5377          |
| 2.767         | 55.0  | 55   | 1.5191          |
| 2.767         | 56.0  | 56   | 1.5005          |
| 2.767         | 57.0  | 57   | 1.4819          |
| 2.767         | 58.0  | 58   | 1.4646          |
| 2.767         | 59.0  | 59   | 1.4469          |
| 2.767         | 60.0  | 60   | 1.4297          |
| 2.767         | 61.0  | 61   | 1.4130          |
| 2.767         | 62.0  | 62   | 1.3963          |
| 2.767         | 63.0  | 63   | 1.3803          |
| 2.767         | 64.0  | 64   | 1.3645          |
| 2.767         | 65.0  | 65   | 1.3488          |
| 2.767         | 66.0  | 66   | 1.3336          |
| 2.767         | 67.0  | 67   | 1.3183          |
| 2.767         | 68.0  | 68   | 1.3041          |
| 2.767         | 69.0  | 69   | 1.2896          |
| 2.767         | 70.0  | 70   | 1.2756          |
| 2.767         | 71.0  | 71   | 1.2623          |
| 2.767         | 72.0  | 72   | 1.2494          |
| 2.767         | 73.0  | 73   | 1.2368          |
| 2.767         | 74.0  | 74   | 1.2244          |
| 2.767         | 75.0  | 75   | 1.2128          |
| 2.767         | 76.0  | 76   | 1.2019          |
| 2.767         | 77.0  | 77   | 1.1909          |
| 2.767         | 78.0  | 78   | 1.1804          |
| 2.767         | 79.0  | 79   | 1.1706          |
| 2.767         | 80.0  | 80   | 1.1607          |
| 2.767         | 81.0  | 81   | 1.1516          |
| 2.767         | 82.0  | 82   | 1.1430          |
| 2.767         | 83.0  | 83   | 1.1347          |
| 2.767         | 84.0  | 84   | 1.1268          |
| 2.767         | 85.0  | 85   | 1.1196          |
| 2.767         | 86.0  | 86   | 1.1125          |
| 2.767         | 87.0  | 87   | 1.1058          |
| 2.767         | 88.0  | 88   | 1.0998          |
| 2.767         | 89.0  | 89   | 1.0939          |
| 2.767         | 90.0  | 90   | 1.0889          |
| 2.767         | 91.0  | 91   | 1.0843          |
| 2.767         | 92.0  | 92   | 1.0799          |
| 2.767         | 93.0  | 93   | 1.0766          |
| 2.767         | 94.0  | 94   | 1.0734          |
| 2.767         | 95.0  | 95   | 1.0707          |
| 2.767         | 96.0  | 96   | 1.0682          |
| 2.767         | 97.0  | 97   | 1.0669          |
| 2.767         | 98.0  | 98   | 1.0655          |
| 2.767         | 99.0  | 99   | 1.0646          |
| 1.7143        | 100.0 | 100  | 1.0641          |

Framework versions

  • PEFT 0.11.1
  • Transformers 4.41.2
  • Pytorch 2.1.2
  • Datasets 2.19.2
  • Tokenizers 0.19.1