Phi-2 gives me a CUDA out-of-memory error but Mistral-7B works fine.
#12
opened by dpasch01
I don't understand why the following command throws a CUDA OOM error, while the same command with ehartford/dolphin-2.2.1-mistral-7b works just fine.
!autotrain llm --train \
--project-name "phi-2-test" \
--model "cognitivecomputations/dolphin-2_6-phi-2" \
--data-path ./ \
--train-split "train" \
--valid-split "test" \
--text_column "prompt" \
--lr 2e-5 \
--batch-size 2 \
--gradient-accumulation 5 \
--epochs 2 \
--merge_adapter \
--model_max_length 512 \
--trainer sft \
--use-peft \
--quantization int4 \
--mixed-precision fp16 \
--optimizer paged_adamw_32bit \
--add_eos_token \
--lora_r 16 \
--lora_alpha 32 \
--lora_dropout 0.05 \
--target_modules "q_proj,k_proj,v_proj,dense,fc1,fc2"
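Not an answer to the autotrain question itself, but a quick way to sanity-check where the memory is going: Phi-2's int4 weights should be far smaller than Mistral-7B's, so if Phi-2 OOMs where Mistral-7B doesn't, the extra usage is more likely activations or optimizer state than the weights themselves. A rough back-of-envelope sketch (the parameter counts of roughly 2.7B for Phi-2 and 7.2B for Mistral-7B are approximations, not exact figures):

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory footprint of quantized model weights in GiB."""
    return n_params * bits_per_weight / 8 / 1024**3

# Assumed parameter counts: Phi-2 ~2.7B, Mistral-7B ~7.2B, both in int4.
phi2_gib = weight_memory_gib(2.7e9, 4)
mistral_gib = weight_memory_gib(7.2e9, 4)
print(f"Phi-2 int4 weights:      ~{phi2_gib:.2f} GiB")
print(f"Mistral-7B int4 weights: ~{mistral_gib:.2f} GiB")

def activation_memory_gib(batch: int, seq_len: int, hidden: int,
                          n_layers: int, bytes_per_value: int = 2) -> float:
    """Very crude fp16 activation estimate: one hidden-state tensor per
    layer, ignoring attention scores and intermediate MLP buffers."""
    return batch * seq_len * hidden * n_layers * bytes_per_value / 1024**3
```

If the Phi-2 run OOMs despite the smaller weight footprint, it may be worth checking whether gradient checkpointing is actually being applied to Phi-2's custom modeling code, since unsaved activations scale with batch size, sequence length, and layer count.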
You will need to ask the "autotrain" team; I don't know anything about that project.
ehartford changed discussion status to closed