wookidoki/autofix10k

	---
	license: llama2
	base_model: codellama/CodeLlama-7b-hf
	tags:
	- generated_from_trainer
	model-index:
	- name: autofix10k
	  results: []
	library_name: peft
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# autofix10k

	This model is a fine-tuned version of [codellama/CodeLlama-7b-hf](https://huggingface.co/codellama/CodeLlama-7b-hf) on an unknown dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.4372
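
As a starting point for trying the model, the sketch below attaches the fine-tuned weights to the base model with PEFT. It assumes the `wookidoki/autofix10k` repository holds PEFT adapter weights (as `library_name: peft` in the metadata suggests), that `transformers`, `peft`, `bitsandbytes` and `accelerate` are installed, and that a CUDA GPU is available. The helper name `load_autofix10k` and the choice to load in 8-bit are illustrative assumptions; this card does not document a prompt format.

```python
BASE_MODEL_ID = "codellama/CodeLlama-7b-hf"  # base model named in this card
ADAPTER_ID = "wookidoki/autofix10k"          # assumed to contain PEFT adapter weights


def load_autofix10k(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Hypothetical helper: load the base model in 8-bit and attach the adapter."""
    # Imported lazily so the constants above are usable without the heavy libraries.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(
        base_id,
        quantization_config=BitsAndBytesConfig(load_in_8bit=True),
        device_map="auto",
    )
    # Wrap the quantized base model with the fine-tuned adapter weights.
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling `tokenizer, model = load_autofix10k()` downloads both repositories; text generation then goes through `model.generate(...)` as with any causal language model.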

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure


	The following `bitsandbytes` quantization config was used during training:
	- quant_method: QuantizationMethod.BITS_AND_BYTES
	- _load_in_8bit: True
	- _load_in_4bit: False
	- llm_int8_threshold: 6.0
	- llm_int8_skip_modules: None
	- llm_int8_enable_fp32_cpu_offload: False
	- llm_int8_has_fp16_weight: False
	- bnb_4bit_quant_type: fp4
	- bnb_4bit_use_double_quant: False
	- bnb_4bit_compute_dtype: float32
	- bnb_4bit_quant_storage: uint8
	- load_in_4bit: False
	- load_in_8bit: True
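
The list above can be approximated in code as follows. This is a minimal sketch, not the original training script: it copies only the 8-bit settings, since `load_in_4bit` is `False` and the `bnb_4bit_*` values are therefore inactive. The names `QUANT_KWARGS` and `build_quant_config` are hypothetical, and the function assumes `transformers` and `bitsandbytes` are installed.

```python
# 8-bit settings copied from the list above. The underscore-prefixed
# duplicates, quant_method, and the inactive 4-bit fields are left out.
QUANT_KWARGS = {
    "load_in_8bit": True,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
}


def build_quant_config():
    """Hypothetical helper: build a BitsAndBytesConfig from the settings above."""
    # Imported lazily so QUANT_KWARGS can be inspected without transformers.
    from transformers import BitsAndBytesConfig

    return BitsAndBytesConfig(**QUANT_KWARGS)
```

The resulting object would typically be passed as `quantization_config=` to `AutoModelForCausalLM.from_pretrained` when loading `codellama/CodeLlama-7b-hf`.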

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0003
	- train_batch_size: 1
	- eval_batch_size: 8
	- seed: 42
	- gradient_accumulation_steps: 8
	- total_train_batch_size: 8
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 10
	- mixed_precision_training: Native AMP
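
To show how these values fit together, the sketch below recomputes the total batch size and traces a linear learning-rate decay. It assumes a single training device and no warmup, since the card lists neither a device count nor warmup steps; `NUM_DEVICES`, `effective_batch_size` and `linear_lr` are illustrative names, and `TOTAL_STEPS` is taken from the last row of the training results table.

```python
LEARNING_RATE = 3e-4   # learning_rate
TRAIN_BATCH_SIZE = 1   # train_batch_size (per device)
GRAD_ACCUM_STEPS = 8   # gradient_accumulation_steps
TOTAL_STEPS = 1000     # final step in the training results table
NUM_DEVICES = 1        # assumption: the card does not state the device count


def effective_batch_size(per_device=TRAIN_BATCH_SIZE,
                         accum=GRAD_ACCUM_STEPS,
                         devices=NUM_DEVICES):
    """Number of examples contributing to one optimizer step."""
    return per_device * accum * devices


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(effective_batch_size())  # matches total_train_batch_size: 8
for step in (0, 500, 1000):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate starts at 0.0003, is halved by step 500, and reaches zero at step 1000.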

	### Training results

	| Training Loss | Epoch | Step | Validation Loss |
	|:-------------:|:-----:|:----:|:---------------:|
	| 0.7922 | 0.2 | 20 | 0.5237 |
	| 0.5053 | 0.4 | 40 | 0.4857 |
	| 0.4071 | 0.6 | 60 | 0.4356 |
	| 0.4297 | 0.8 | 80 | 0.4154 |
	| 0.5313 | 1.0 | 100 | 0.3827 |
	| 0.4814 | 1.2 | 120 | 0.3785 |
	| 0.3739 | 1.4 | 140 | 0.3774 |
	| 0.3279 | 1.6 | 160 | 0.3761 |
	| 0.3149 | 1.8 | 180 | 0.3732 |
	| 0.4086 | 2.0 | 200 | 0.3658 |
	| 0.3724 | 2.2 | 220 | 0.3664 |
	| 0.3691 | 2.4 | 240 | 0.3644 |
	| 0.3065 | 2.6 | 260 | 0.3679 |
	| 0.2688 | 2.8 | 280 | 0.3767 |
	| 0.3431 | 3.0 | 300 | 0.3633 |
	| 0.333 | 3.2 | 320 | 0.3641 |
	| 0.3052 | 3.4 | 340 | 0.3597 |
	| 0.2444 | 3.6 | 360 | 0.3779 |
	| 0.2455 | 3.8 | 380 | 0.3712 |
	| 0.3078 | 4.0 | 400 | 0.3578 |
	| 0.2877 | 4.2 | 420 | 0.3650 |
	| 0.2659 | 4.4 | 440 | 0.3731 |
	| 0.2496 | 4.6 | 460 | 0.3764 |
	| 0.218 | 4.8 | 480 | 0.3781 |
	| 0.219 | 5.0 | 500 | 0.3742 |
	| 0.2119 | 5.2 | 520 | 0.3808 |
	| 0.2435 | 5.4 | 540 | 0.3871 |
	| 0.2331 | 5.6 | 560 | 0.3818 |
	| 0.1738 | 5.8 | 580 | 0.3758 |
	| 0.1772 | 6.0 | 600 | 0.3731 |
	| 0.1607 | 6.2 | 620 | 0.4121 |
	| 0.1942 | 6.4 | 640 | 0.3943 |
	| 0.2312 | 6.6 | 660 | 0.3867 |
	| 0.1528 | 6.8 | 680 | 0.4160 |
	| 0.1155 | 7.0 | 700 | 0.4100 |
	| 0.1495 | 7.2 | 720 | 0.4081 |
	| 0.1674 | 7.4 | 740 | 0.4015 |
	| 0.1849 | 7.6 | 760 | 0.4075 |
	| 0.1231 | 7.8 | 780 | 0.4238 |
	| 0.0905 | 8.0 | 800 | 0.4128 |
	| 0.1156 | 8.2 | 820 | 0.4278 |
	| 0.1628 | 8.4 | 840 | 0.4203 |
	| 0.1545 | 8.6 | 860 | 0.4219 |
	| 0.1236 | 8.8 | 880 | 0.4294 |
	| 0.0799 | 9.0 | 900 | 0.4224 |
	| 0.0991 | 9.2 | 920 | 0.4399 |
	| 0.1176 | 9.4 | 940 | 0.4350 |
	| 0.1711 | 9.6 | 960 | 0.4362 |
	| 0.1106 | 9.8 | 980 | 0.4414 |
	| 0.0582 | 10.0 | 1000 | 0.4372 |
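
The table suggests that validation loss bottoms out well before the end of training while training loss keeps falling. The short sketch below locates the lowest validation loss using the values from the table; `VAL_LOSS`, `EVAL_EVERY`, `STEPS_PER_EPOCH` and `best_checkpoint` are illustrative names, and whether a checkpoint from that step was saved is not stated in this card.

```python
# Validation losses from the table above, in step order (one evaluation every 20 steps).
VAL_LOSS = [
    0.5237, 0.4857, 0.4356, 0.4154, 0.3827, 0.3785, 0.3774, 0.3761, 0.3732, 0.3658,
    0.3664, 0.3644, 0.3679, 0.3767, 0.3633, 0.3641, 0.3597, 0.3779, 0.3712, 0.3578,
    0.3650, 0.3731, 0.3764, 0.3781, 0.3742, 0.3808, 0.3871, 0.3818, 0.3758, 0.3731,
    0.4121, 0.3943, 0.3867, 0.4160, 0.4100, 0.4081, 0.4015, 0.4075, 0.4238, 0.4128,
    0.4278, 0.4203, 0.4219, 0.4294, 0.4224, 0.4399, 0.4350, 0.4362, 0.4414, 0.4372,
]
EVAL_EVERY = 20        # steps between evaluations, per the table
STEPS_PER_EPOCH = 100  # step 100 corresponds to epoch 1.0 in the table


def best_checkpoint(losses=VAL_LOSS, eval_every=EVAL_EVERY):
    """Return (step, epoch, loss) for the lowest validation loss."""
    idx = min(range(len(losses)), key=losses.__getitem__)
    step = (idx + 1) * eval_every
    return step, step / STEPS_PER_EPOCH, losses[idx]


step, epoch, loss = best_checkpoint()
print(f"best validation loss {loss} at step {step} (epoch {epoch})")
print(f"final validation loss {VAL_LOSS[-1]}")
```

On this data the minimum is 0.3578 at step 400 (epoch 4.0), compared with 0.4372 at the final step, which is the loss reported at the top of this card.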


	### Framework versions

	- PEFT 0.4.0
	- Transformers 4.40.0.dev0
	- Pytorch 2.2.0+cu121
	- Datasets 2.17.1
	- Tokenizers 0.15.2