How to fine-tune Phi-2 using RoPE scaling and QLoRA for long-text summarization?

#102
by parikshit1619

I want to increase Phi-2's context length from 2048 to ~5k tokens so that I can fine-tune the model on my custom dataset (approx. 5000 tokens per sample) using QLoRA.
I have heard about RoPE scaling, but I couldn't find any documentation or code for increasing the context length via fine-tuning.

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    torch_dtype="auto",
    device_map=device_map,
    rope_scaling={"type": "linear", "factor": 3},  # Is this enough?
    use_cache=True,
    use_flash_attention_2=False,
)

Also, how do I verify that my rotary embedding has actually changed?
Please help.

That's not correct; use this instead:
# Inspect the config before the change (rope_scaling is None by default)
print(model.config)
# Apply linear RoPE scaling with factor 3
model.config.rope_scaling = {"type": "linear", "factor": 3}
# The rope_scaling entry should now appear in the printed config
print(model.config)
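
For a stronger check than printing the config, you can inspect the rotary embedding module itself. A minimal sketch, assuming the transformers-native Phi-2 implementation (around v4.37), where each attention layer exposes a rotary_emb submodule; the exact attribute path varies across transformers versions, so check print(model) first:

# The attribute path below is an assumption for the native Phi implementation;
# verify it against print(model) for your transformers version.
rotary = model.model.layers[0].self_attn.rotary_emb
print(type(rotary).__name__)                    # e.g. PhiLinearScalingRotaryEmbedding
print(getattr(rotary, "scaling_factor", None))  # e.g. 3.0 once linear scaling is active

Note that in many versions the rotary modules are constructed at load time, so passing rope_scaling to from_pretrained (or setting it on the config before loading) is the safer route; inspecting the module confirms the scaling actually took effect.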

Did @ramkrish120595's suggestion work for you, @parikshit1619? I'm exploring the possibility of fine-tuning a Phi-2 model myself, but without extending the context length well beyond 2k it's useless to me. Did you successfully fine-tune Phi-2 using RoPE scaling? What context length did you reach?

Hi, I am using the dynamic RoPE scaling technique, and it is working successfully for me:

model.config.rope_scaling = {"type": "dynamic", "factor": 8.0}  # extends the context length up to 16k (2048 * 8)

If you want to extend the context length for fine-tuning, you can use the linear RoPE scaling technique instead:

model.config.rope_scaling = {"type": "linear", "factor": 8.0}  # extends the context length up to 16k (2048 * 8)
