---
license: llama2
datasets:
- vicgalle/alpaca-gpt4
pipeline_tag: text-generation
language:
- en
tags:
- llama-2
---
## Fine-tuning
- Base Model: [NousResearch/Llama-2-7b-hf](https://huggingface.co/NousResearch/Llama-2-7b-hf)
- Dataset for fine-tuning: [vicgalle/alpaca-gpt4](https://huggingface.co/datasets/vicgalle/alpaca-gpt4)
- Training
- BitsAndBytesConfig
```
BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=False,
)
```
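
  Not shown in the original card: a minimal sketch of how this quantization config might be passed when loading the base model with the standard `transformers` API (variable names are illustrative):
  ```
  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

  bnb_config = BitsAndBytesConfig(
      load_in_4bit=True,
      bnb_4bit_quant_type="nf4",
      bnb_4bit_compute_dtype=torch.bfloat16,
      bnb_4bit_use_double_quant=False,
  )

  base_model = "NousResearch/Llama-2-7b-hf"
  model = AutoModelForCausalLM.from_pretrained(
      base_model,
      quantization_config=bnb_config,
      device_map="auto",
  )
  tokenizer = AutoTokenizer.from_pretrained(base_model)
  tokenizer.pad_token = tokenizer.eos_token  # Llama 2 has no pad token by default
  ```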
- LoRA Config
```
LoraConfig(
    r=16,
    lora_alpha=8,  # note: the common heuristic is alpha = 2 * r; here alpha = r / 2
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj"],
)
```
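
  Likewise, a hedged sketch of preparing the quantized model and attaching the LoRA adapters with `peft` (assumes the `model` loaded above):
  ```
  from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

  peft_config = LoraConfig(
      r=16,
      lora_alpha=8,
      lora_dropout=0.1,
      bias="none",
      task_type="CAUSAL_LM",
      target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj"],
  )

  # Prepare the 4-bit model for k-bit training, then wrap it with LoRA adapters.
  model = prepare_model_for_kbit_training(model)
  model = get_peft_model(model, peft_config)
  model.print_trainable_parameters()
  ```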
- Training Arguments
```
TrainingArguments(
    output_dir="./results",
    num_train_epochs=1,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,
    optim="paged_adamw_8bit",
    save_steps=1000,
    logging_steps=30,
    learning_rate=2e-4,
    weight_decay=0.001,
    fp16=False,
    bf16=False,
    max_grad_norm=0.3,
    max_steps=-1,
    warmup_ratio=0.3,
    group_by_length=True,
    lr_scheduler_type="linear",
    report_to="wandb",
)
```
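
  The card does not show the trainer call itself. A possible sketch using `trl`'s `SFTTrainer` (older, pre-0.12 argument names; this is an assumption, not confirmed by the card), reusing the `model` and `tokenizer` from above and a `training_args` instance of the `TrainingArguments` shown here:
  ```
  from datasets import load_dataset
  from trl import SFTTrainer

  # Alpaca-GPT4 ships a pre-formatted "text" column combining instruction, input, and output.
  dataset = load_dataset("vicgalle/alpaca-gpt4", split="train")

  trainer = SFTTrainer(
      model=model,                 # quantized base model with LoRA adapters (see above)
      train_dataset=dataset,
      dataset_text_field="text",
      max_seq_length=512,          # assumed value; not stated in the card
      tokenizer=tokenizer,
      args=training_args,          # the TrainingArguments shown above
  )
  trainer.train()
  ```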