Is this model the instruct version?

by shi-zheng-qxhs - opened

Hi, I checked the tokenizer and found that both gemma-7b-bnb-4bit and gemma-7b-it-bnb-4bit share the same tokenizer. Are both models the fine-tuned instruct version?

Unsloth AI org

@shi-zheng-qxhs Oh no, the "it" one is the instruct one. I manually edited the tokenizer to expose the <start_of_turn> and <end_of_turn> tokens. Interestingly, both the instruct and base models have these tokens.

Thanks, just wanted to make sure. :)

Just a few follow-up questions:

  1. Is there any specific reason padding_side is set to right?
  2. Can I use unsloth for custom training, i.e., without any of the trainer classes, using a native PyTorch training loop, for example?

Unsloth AI org
  1. @shi-zheng-qxhs padding_side = "right" is for training purposes only. Change it to "left" for inference.
  2. Yes it should work!
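To make the difference between the two padding sides concrete, here is a minimal pure-Python sketch. The helper `pad_batch` is hypothetical and only mimics the layout a tokenizer produces when it pads a batch; it is not Unsloth's or Hugging Face's implementation.

```python
from typing import List, Tuple


def pad_batch(
    seqs: List[List[int]], pad_id: int, side: str
) -> Tuple[List[List[int]], List[List[int]]]:
    """Pad token-id lists to equal length; return (input_ids, attention_mask)."""
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    width = max(len(s) for s in seqs)
    ids, mask = [], []
    for s in seqs:
        gap = width - len(s)
        if side == "right":
            # Real tokens first, padding at the end (the training layout).
            ids.append(list(s) + [pad_id] * gap)
            mask.append([1] * len(s) + [0] * gap)
        else:
            # Padding first, so every row ends on a real token (the generation layout).
            ids.append([pad_id] * gap + list(s))
            mask.append([0] * gap + [1] * len(s))
    return ids, mask


batch = [[5, 6, 7], [8, 9]]  # made-up token ids
right_ids, right_mask = pad_batch(batch, pad_id=0, side="right")
left_ids, left_mask = pad_batch(batch, pad_id=0, side="left")
print(right_ids)  # [[5, 6, 7], [8, 9, 0]]
print(left_ids)   # [[5, 6, 7], [0, 8, 9]]
```

With left padding the last column of every row is a real token, so a decoder-only model continues from the actual end of each prompt during generation. With right padding the shorter row ends in a pad token, which is fine for training (those positions are masked out) but not a good place to start generating from.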

For fine-tuning and inference with flash attn:
padding_side = "left"


Could you explain padding_side = "right" in more detail?

Unsloth AI org

@NickyNicky You can use left padding, but it makes training slower, so I don't advise it. Unsloth itself requires right padding for training.

Yes, after training, simply set tokenizer.padding_side = "left" before calling model.generate
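One way to avoid forgetting to switch back when you alternate between training and generation is a small context manager. This is only a sketch: `use_padding_side` is a hypothetical helper, and the `SimpleNamespace` object stands in for a real tokenizer. The only thing it relies on is the `padding_side` attribute mentioned above.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def use_padding_side(tokenizer, side):
    """Temporarily set tokenizer.padding_side, restoring the old value on exit."""
    previous = tokenizer.padding_side
    tokenizer.padding_side = side
    try:
        yield tokenizer
    finally:
        tokenizer.padding_side = previous


# Stand-in for a real tokenizer object, used here so the sketch runs on its own.
fake_tokenizer = SimpleNamespace(padding_side="right")

with use_padding_side(fake_tokenizer, "left") as tok:
    # With a real tokenizer and model, tokenizing the prompts and calling
    # model.generate would go here.
    side_during_generation = tok.padding_side

side_after = fake_tokenizer.padding_side
print(side_during_generation, side_after)  # left right
```

Setting the attribute directly, as described in the reply above, works just as well for a one-off inference script.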

But wouldn't padding_side = "right" trigger warnings from flash attn?

now I'm confused haha

Unsloth AI org

@NickyNicky Oh, if you're simply using HF, just use whatever they provide. Unsloth itself uses right padding.

Oh, interesting. I see that after the merge the library removes 'padding_side', but when the weights are saved as "lora", 'padding_side' is still there.

example merge:


fine tune unsloth peft


So after training it's time to add padding_side

shi-zheng-qxhs changed discussion status to closed
