nferruz/ProtGPT2 · Model fine-tuning does not work well

Hello,

I am trying to fine-tune ProtGPT2 on a training dataset of about 10,000 sequences. However, both the training loss and the validation loss stay around 3 throughout training. I have tried changing the dataset size and the learning rate, but neither made a difference. Has anyone else run into this?

python run_clm.py --model_name_or_path nferruz/ProtGPT2 --train_file /home/dell/train.txt --tokenizer_name nferruz/ProtGPT2 --do_train --do_eval --output_dir /home/dell/result --learning_rate 1e-06 --num_train_epochs 30 --gradient_accumulation_steps=4 --per_device_train_batch_size=8 --overwrite_output_dir --gradient_checkpointing=True --fp16=True --logging_steps 1 --evaluation_strategy epoch --validation_split_percentage 10
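
For context, here is a rough sketch of what the numbers above work out to. It assumes a single GPU (the device count is an assumption, it is not fixed by the command), and the helper names are only for illustration. It computes the effective batch size from the batch-size and accumulation flags, and converts a cross-entropy loss of about 3 into perplexity, which is exp(loss).

```python
import math


def effective_batch_size(per_device_batch: int, grad_accum_steps: int, n_devices: int = 1) -> int:
    """Number of examples that contribute to one optimizer step."""
    return per_device_batch * grad_accum_steps * n_devices


def perplexity(loss: float) -> float:
    """Perplexity corresponding to a mean cross-entropy loss given in nats."""
    return math.exp(loss)


# Values taken from the command above; n_devices=1 is an assumption.
batch = effective_batch_size(per_device_batch=8, grad_accum_steps=4, n_devices=1)
ppl = perplexity(3.0)

print(f"effective batch size: {batch}")
print(f"perplexity at loss 3.0: {ppl:.2f}")
```

So with these settings each optimizer step sees 32 examples, and a loss of 3 corresponds to a perplexity of roughly 20.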

Thanks in advance!
rqh