SneakyLemon/results

	---
	license: llama3
	base_model: meta-llama/Meta-Llama-3-8B
	tags:
	- generated_from_trainer
	metrics:
	- f1
	model-index:
	- name: results
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# results

	This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) on an unknown dataset.
	It achieves the following results on the evaluation set (final logged evaluation, step 336):
	- Loss: 0.4487
	- F1: 0.8063
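For readers unfamiliar with the metric, the sketch below shows how a binary F1 score is derived from raw counts. The card does not say which averaging mode (binary, micro, macro, weighted) produced the reported value, so this is only an illustration of the formula, and the counts in it are hypothetical.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Binary F1 from raw counts: the harmonic mean of precision and recall."""
    if tp == 0:
        # No true positives means precision and recall are both 0 (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, chosen only to illustrate the formula;
# they are not taken from this model's evaluation set.
example = f1_score(tp=80, fp=20, fn=20)
print(f"F1 = {example:.4f}")
```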

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 128
	- eval_batch_size: 64
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_steps: 35
	- num_epochs: 3
	- mixed_precision_training: Native AMP
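As a rough illustration of what `lr_scheduler_type: linear` with 35 warmup steps implies, the sketch below computes the learning rate at a given optimizer step in plain Python. It follows the usual linear warmup then linear decay shape (the one `get_linear_schedule_with_warmup` in Transformers implements). The total of 351 steps is an assumption inferred from the step and epoch columns of the training results (about 117 steps per epoch over 3 epochs); the card does not state it.

```python
BASE_LR = 2e-5       # learning_rate from the card
WARMUP_STEPS = 35    # lr_scheduler_warmup_steps from the card
TOTAL_STEPS = 351    # assumption: 117 steps/epoch x 3 epochs, inferred from the logged steps


def linear_lr(step: int) -> float:
    """Learning rate at a given step: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        # Ramp up from 0 to BASE_LR over the warmup steps.
        return BASE_LR * step / WARMUP_STEPS
    # Decay linearly from BASE_LR at the end of warmup to 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


for step in (0, 16, 35, 193, 351):
    print(step, linear_lr(step))
```

Under these assumptions the peak rate of 2e-05 is reached at step 35, so every logged evaluation after step 32 falls in the decay phase.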

	### Training results

	| Training Loss | Epoch  | Step | Validation Loss | F1     |
	|:-------------:|:------:|:----:|:---------------:|:------:|
	| 0.8917        | 0.1368 | 16   | 0.9228          | 0.5606 |
	| 0.8219        | 0.2735 | 32   | 0.7617          | 0.6112 |
	| 0.7154        | 0.4103 | 48   | 0.6455          | 0.6687 |
	| 0.6278        | 0.5470 | 64   | 0.5976          | 0.6955 |
	| 0.5923        | 0.6838 | 80   | 0.5443          | 0.7327 |
	| 0.5417        | 0.8205 | 96   | 0.5212          | 0.7479 |
	| 0.5094        | 0.9573 | 112  | 0.5087          | 0.7586 |
	| 0.4866        | 1.0940 | 128  | 0.4835          | 0.7719 |
	| 0.4743        | 1.2308 | 144  | 0.5172          | 0.7609 |
	| 0.4887        | 1.3675 | 160  | 0.4905          | 0.7718 |
	| 0.452         | 1.5043 | 176  | 0.4706          | 0.7817 |
	| 0.4592        | 1.6410 | 192  | 0.4658          | 0.7795 |
	| 0.4372        | 1.7778 | 208  | 0.4726          | 0.7782 |
	| 0.4387        | 1.9145 | 224  | 0.4769          | 0.7775 |
	| 0.4242        | 2.0513 | 240  | 0.4526          | 0.7929 |
	| 0.3881        | 2.1880 | 256  | 0.4541          | 0.7975 |
	| 0.4081        | 2.3248 | 272  | 0.4524          | 0.8002 |
	| 0.3768        | 2.4615 | 288  | 0.4609          | 0.7931 |
	| 0.3838        | 2.5983 | 304  | 0.4511          | 0.8037 |
	| 0.3888        | 2.7350 | 320  | 0.4483          | 0.8011 |
	| 0.3791        | 2.8718 | 336  | 0.4487          | 0.8063 |
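The log above can be inspected programmatically. The sketch below copies the epoch, step, validation loss, and F1 columns into a list and picks out the evaluation with the highest F1 and the one with the lowest validation loss; it also estimates the number of optimizer steps per epoch from the last row. The variable names are illustrative and not part of the original training code.

```python
# (epoch, step, validation_loss, f1), copied from the training results table.
LOG = [
    (0.1368, 16, 0.9228, 0.5606), (0.2735, 32, 0.7617, 0.6112),
    (0.4103, 48, 0.6455, 0.6687), (0.5470, 64, 0.5976, 0.6955),
    (0.6838, 80, 0.5443, 0.7327), (0.8205, 96, 0.5212, 0.7479),
    (0.9573, 112, 0.5087, 0.7586), (1.0940, 128, 0.4835, 0.7719),
    (1.2308, 144, 0.5172, 0.7609), (1.3675, 160, 0.4905, 0.7718),
    (1.5043, 176, 0.4706, 0.7817), (1.6410, 192, 0.4658, 0.7795),
    (1.7778, 208, 0.4726, 0.7782), (1.9145, 224, 0.4769, 0.7775),
    (2.0513, 240, 0.4526, 0.7929), (2.1880, 256, 0.4541, 0.7975),
    (2.3248, 272, 0.4524, 0.8002), (2.4615, 288, 0.4609, 0.7931),
    (2.5983, 304, 0.4511, 0.8037), (2.7350, 320, 0.4483, 0.8011),
    (2.8718, 336, 0.4487, 0.8063),
]

best_f1 = max(LOG, key=lambda row: row[3])      # evaluation with the highest F1
best_loss = min(LOG, key=lambda row: row[2])    # evaluation with the lowest validation loss
steps_per_epoch = round(LOG[-1][1] / LOG[-1][0])  # estimate from the last logged row

print("best F1:", best_f1)
print("lowest validation loss:", best_loss)
print("estimated steps per epoch:", steps_per_epoch)
```

The two criteria select different evaluations here (step 336 for F1, step 320 for validation loss), though the gap between them is small on both metrics.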


	### Framework versions

	- Transformers 4.41.2
	- PyTorch 2.3.1+cu121
	- Datasets 2.20.0
	- Tokenizers 0.19.1
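To check whether a local environment matches these versions, something like the following sketch could be used. It relies only on the standard library's `importlib.metadata`; the helper name `compare_versions` is hypothetical, and the distribution names are the usual PyPI ones (`torch` for PyTorch).

```python
from importlib import metadata

# Versions listed in the card, keyed by PyPI distribution name.
PINNED = {
    "transformers": "4.41.2",
    "torch": "2.3.1+cu121",
    "datasets": "2.20.0",
    "tokenizers": "0.19.1",
}


def compare_versions(pinned: dict) -> dict:
    """Map each package to (expected, installed_or_None, matches)."""
    report = {}
    for package, expected in pinned.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (expected, installed, installed == expected)
    return report


for package, (expected, installed, ok) in compare_versions(PINNED).items():
    status = "ok" if ok else "differs"
    print(f"{package}: expected {expected}, installed {installed} ({status})")
```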