Llama3-20240602

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the generator dataset. It achieves the following results on the evaluation set:

Loss: 1.4100 (validation loss at the final training step, 960)
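The card does not say how the weights are packaged. Since PEFT appears in the framework versions, one plausible reading is that the repository holds an adapter trained on top of the base model. The sketch below shows how loading could look under that assumption. The adapter id `Nhut/Llama3-20240602` and the helper name `load_model` are illustrative, and the base model is gated, so access must be granted before the download works.

```python
# Minimal loading sketch. ASSUMPTION: the repository contains a PEFT adapter
# for the base model named in this card. If it holds merged weights instead,
# load it directly with AutoModelForCausalLM.from_pretrained(ADAPTER_ID).
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_ID = "Nhut/Llama3-20240602"  # assumed adapter location


def load_model():
    """Load the tokenizer and the base model with the adapter attached."""
    # Imported lazily so this file can be read without the libraries installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base_model = AutoModelForCausalLM.from_pretrained(
        BASE_MODEL_ID, torch_dtype="auto"
    )
    model = PeftModel.from_pretrained(base_model, ADAPTER_ID)
    return tokenizer, model
```

Calling `load_model()` downloads roughly 8B parameters of base weights, so it is left uncalled here.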

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0002
train_batch_size: 4
eval_batch_size: 4
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: constant
lr_scheduler_warmup_steps: 0.03
training_steps: 960
mixed_precision_training: Native AMP
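For anyone trying to reproduce the run, the list above can be mapped onto `transformers.TrainingArguments` roughly as follows. This is a sketch, not the original training script: `output_dir` is a hypothetical path, and `fp16=True` is a guess because "Native AMP" does not say whether fp16 or bf16 was used. The reported warmup value of 0.03 is left out, since it is not a whole step count and the `constant` schedule applies no warmup.

```python
# Values copied from the hyperparameter list in this card.
CARD_HYPERPARAMETERS = {
    "learning_rate": 2e-4,
    "per_device_train_batch_size": 4,  # train_batch_size
    "per_device_eval_batch_size": 4,   # eval_batch_size
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "constant",
    "max_steps": 960,                  # training_steps
}

# ASSUMPTIONS: none of these values are stated in the card.
ASSUMED_SETTINGS = {
    "output_dir": "llama3-20240602-repro",  # hypothetical output path
    "fp16": True,  # "Native AMP" could also mean bf16; needs a GPU
}


def build_training_arguments():
    """Build TrainingArguments from the card values plus the assumptions."""
    # Imported lazily so the mapping can be inspected without transformers.
    from transformers import TrainingArguments

    return TrainingArguments(**CARD_HYPERPARAMETERS, **ASSUMED_SETTINGS)
```

The card reports the per-device batch size and no total batch size, which suggests a single device without gradient accumulation, but that is an inference rather than a stated fact.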

Training results

Training Loss  Epoch   Step  Validation Loss
No log         0.1356  40    1.3411
No log         0.2712  80    1.3121
1.335          0.4068  120   1.2957
1.335          0.5424  160   1.2854
1.258          0.6780  200   1.2772
1.258          0.8136  240   1.2706
1.258          0.9492  280   1.2642
1.2379         1.0847  320   1.2746
1.2379         1.2203  360   1.2682
1.1301         1.3559  400   1.2697
1.1301         1.4915  440   1.2713
1.1301         1.6271  480   1.2671
1.1256         1.7627  520   1.2633
1.1256         1.8983  560   1.2620
1.0987         2.0339  600   1.2888
1.0987         2.1695  640   1.3127
1.0987         2.3051  680   1.3148
0.9445         2.4407  720   1.3093
0.9445         2.5763  760   1.3086
0.9553         2.7119  800   1.3095
0.9553         2.8475  840   1.3029
0.9553         2.9831  880   1.3066
0.9298         3.1186  920   1.4147
0.9298         3.2542  960   1.4100
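The evaluation log can be checked for the best checkpoint with a few lines of Python. The sketch below copies the step, epoch, and validation loss columns from the table. The helper names `best_checkpoint` and `steps_per_epoch` are illustrative and not part of any library.

```python
# (step, epoch, validation_loss) rows copied from the results table.
EVAL_LOG = [
    (40, 0.1356, 1.3411), (80, 0.2712, 1.3121), (120, 0.4068, 1.2957),
    (160, 0.5424, 1.2854), (200, 0.6780, 1.2772), (240, 0.8136, 1.2706),
    (280, 0.9492, 1.2642), (320, 1.0847, 1.2746), (360, 1.2203, 1.2682),
    (400, 1.3559, 1.2697), (440, 1.4915, 1.2713), (480, 1.6271, 1.2671),
    (520, 1.7627, 1.2633), (560, 1.8983, 1.2620), (600, 2.0339, 1.2888),
    (640, 2.1695, 1.3127), (680, 2.3051, 1.3148), (720, 2.4407, 1.3093),
    (760, 2.5763, 1.3086), (800, 2.7119, 1.3095), (840, 2.8475, 1.3029),
    (880, 2.9831, 1.3066), (920, 3.1186, 1.4147), (960, 3.2542, 1.4100),
]


def best_checkpoint(log):
    """Return the (step, epoch, loss) row with the lowest validation loss."""
    return min(log, key=lambda row: row[2])


def steps_per_epoch(log):
    """Estimate optimizer steps per epoch from the last logged row."""
    step, epoch, _ = log[-1]
    return step / epoch


best_step, best_epoch, best_loss = best_checkpoint(EVAL_LOG)
final_step, final_epoch, final_loss = EVAL_LOG[-1]

print(f"best:  step {best_step}, epoch {best_epoch}, loss {best_loss:.4f}")
print(f"final: step {final_step}, epoch {final_epoch}, loss {final_loss:.4f}")
print(f"steps per epoch: about {steps_per_epoch(EVAL_LOG):.0f}")
```

The lowest validation loss is 1.2620 at step 560, while training loss keeps falling to 0.9298 and validation loss ends at 1.4100. That pattern is consistent with overfitting after roughly two epochs, so the step 560 checkpoint may be a better choice than the final one if it was saved. One epoch is about 295 steps. With a batch size of 4 and assuming a single device without gradient accumulation, that corresponds to roughly 1,180 training examples.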

Framework versions

PEFT 0.11.1
Transformers 4.41.2
PyTorch 2.3.0+cu121
Datasets 2.19.1
Tokenizers 0.19.1
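To compare a local environment against these versions, a small check using only the standard library is one option. The sketch below assumes the usual PyPI distribution names (`peft`, `transformers`, `torch`, `datasets`, `tokenizers`). The helper names are illustrative.

```python
from importlib import metadata

# Versions listed in this card, keyed by the assumed PyPI distribution name.
EXPECTED_VERSIONS = {
    "peft": "0.11.1",
    "transformers": "4.41.2",
    "torch": "2.3.0+cu121",
    "datasets": "2.19.1",
    "tokenizers": "0.19.1",
}


def installed_version(package):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_report(expected=EXPECTED_VERSIONS):
    """Map each package to (expected, installed, matches)."""
    report = {}
    for package, wanted in expected.items():
        found = installed_version(package)
        report[package] = (wanted, found, found == wanted)
    return report


if __name__ == "__main__":
    for package, (wanted, found, ok) in version_report().items():
        status = "ok" if ok else "differs"
        print(f"{package}: expected {wanted}, found {found} ({status})")
```

A mismatch does not necessarily break loading. The check only flags differences from the environment the card reports.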

Nhut/Llama3-20240602
