Could you please share more details about fine-tuning LLaMA-2-7B into LLaMA-2-7B-32K, such as the number of fine-tuning steps and the batch size? Thanks!

#32
by Mooler

Hi! I've read the original PI (Position Interpolation) paper. It says they fine-tune for only about 1,000 steps to extend the context window. Did you fine-tune for the same number of steps (i.e., 1,000) as the original paper? Thanks!
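
For reference, my current understanding of a PI-style setup (linear RoPE scaling by 32k / 4k = 8) with Hugging Face transformers looks roughly like the sketch below. The training hyperparameters (batch size, accumulation, learning rate) are placeholders I'm assuming, not your confirmed values, so please correct anything that differs from your actual recipe:

```python
# Minimal sketch (not the confirmed LLaMA-2-7B-32K recipe) of PI-style
# context extension via linear RoPE scaling in Hugging Face transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments

base_model = "meta-llama/Llama-2-7b-hf"  # 4k-context base model

tokenizer = AutoTokenizer.from_pretrained(base_model)

# Position Interpolation: scale RoPE positions linearly by 32768 / 4096 = 8
# so that 32k positions are interpolated back into the original 4k range.
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    rope_scaling={"type": "linear", "factor": 8.0},
)

# Placeholder hyperparameters: the ~1,000 steps mirrors the PI paper; the
# batch size and learning rate below are assumptions, not confirmed values.
args = TrainingArguments(
    output_dir="llama2-7b-32k-pi",
    max_steps=1000,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=64,  # tune to reach the desired global batch size
    learning_rate=2e-5,
)
# A Trainer with a long-context (32k-token) dataset would follow here.
```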
