Please Consider Adding A Chat Template To The Model Tokenizer

#5
by The0

See here: https://huggingface.co/docs/transformers/v4.35.1/en/chat_templating#introduction

As the repo is currently set up, code like the snippet below will silently fall back to the wrong (default) chat template:

import torch
from transformers import pipeline

pipe = pipeline("text-generation", model="NousResearch/Nous-Capybara-34B", trust_remote_code=True, torch_dtype=torch.bfloat16, device_map="auto")

# conversation_history is any list of {"role": ..., "content": ...} dicts
conversation_history = [{"role": "user", "content": "Hello!"}]

prompt = pipe.tokenizer.apply_chat_template(conversation_history, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt)
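
For reference, here's a minimal sketch of how a template could be set and saved so it ends up in tokenizer_config.json. The USER:/ASSISTANT: tags and the trailing </s> below are my guesses based on the model card, not a confirmed spec:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("NousResearch/Nous-Capybara-34B")

# Assumed USER:/ASSISTANT: format -- the exact tags, separators, and the
# </s> terminator are guesses from the model card, not a confirmed spec.
tokenizer.chat_template = (
    "{% for message in messages %}"
    "{% if message['role'] == 'user' %}USER: {{ message['content'] }}\n"
    "{% elif message['role'] == 'assistant' %}ASSISTANT: {{ message['content'] }}</s>\n"
    "{% endif %}"
    "{% endfor %}"
    "{% if add_generation_prompt %}ASSISTANT:{% endif %}"
)

# save_pretrained persists the chat_template field into tokenizer_config.json
tokenizer.save_pretrained("./Nous-Capybara-34B-with-template")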

Thank you! Will consider.

What is the proper chat format anyway? The info on the model card is not helpful at all; the actual expected prompt, written out as a string for a few example messages, would be. It also seems off, as </s> is not among the tokenizer's special tokens...
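
A quick way to check that last point is to inspect the tokenizer's registered special tokens directly; this is generic transformers usage, not anything specific to this model:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("NousResearch/Nous-Capybara-34B")

# Print the registered special tokens (bos/eos/unk/pad, plus any extras)
print(tokenizer.special_tokens_map)
# Shows whether "</s>" is actually the configured EOS token
print(tokenizer.eos_token)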
