
Inconsistencies with end_of_turn generation

#3
by JohanAR - opened

The model seems a bit inconsistent about generating <|end_of_turn|> at the end of responses; sometimes I get "|<|end_of_turn|>" or "|end_of_turn]" etc., especially when asking about subjects that trigger the censoring, e.g. "where can I buy drugs?"

I've copied the oobabooga settings from https://huggingface.co/Open-Orca/OpenOrcaxOpenChat-Preview2-13B, but I don't know whether I'm doing something wrong, whether it's a bug in text-generation-webui/llama.cpp, or whether this is expected behaviour from this model.
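
In case it helps to sanity-check the stop condition outside the webui, here's roughly what stopping on the <|end_of_turn|> string boils down to when driving generation from Python with transformers directly. This is only a minimal sketch; the repo id, prompt format, and generation settings below are my assumptions, not anything verified against the webui:

```python
import torch
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          StoppingCriteria, StoppingCriteriaList)

model_id = "Open-Orca/OpenOrcaxOpenChat-Preview2-13B"  # assumption: the repo linked above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

class StopOnString(StoppingCriteria):
    """Stop generation once a given string appears in the decoded output (illustrative helper)."""
    def __init__(self, tokenizer, stop_string):
        self.tokenizer = tokenizer
        self.stop_string = stop_string

    def __call__(self, input_ids, scores, **kwargs):
        # Decode everything generated so far and look for the marker verbatim.
        text = self.tokenizer.decode(input_ids[0], skip_special_tokens=False)
        return self.stop_string in text

prompt = "User: Hello<|end_of_turn|>Assistant:"  # assumption: check the model card for the real format
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
stopping = StoppingCriteriaList([StopOnString(tokenizer, "<|end_of_turn|>")])

output_ids = model.generate(**inputs, max_new_tokens=256, stopping_criteria=stopping)
# Print only the newly generated part, keeping special tokens visible for inspection.
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=False))
```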

Same here. I see inconsistent tokens for end of turn. The model responds well, but it outputs various versions of <|end_of_turn|>, often with a few preceding or trailing characters missing.

It looks like this model has <|end_of_turn|> added as a special token, and I have recently learned that this might not be supported by exllama and llama.cpp. I don't have a copy of this model around, but if you're using ooba's webui, try using one of the model loaders with the _HF suffix.
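
One quick way to confirm that (a rough sketch, assuming the tokenizer from the repo linked above and a working transformers install) is to check whether <|end_of_turn|> encodes to a single added-token id. A backend that ignores the added special token will instead see it split into several sub-word pieces, which would line up with the mangled endings described above:

```python
from transformers import AutoTokenizer

# assumption: checking against the Preview2 repo's tokenizer; swap in whichever checkpoint you load
tokenizer = AutoTokenizer.from_pretrained("Open-Orca/OpenOrcaxOpenChat-Preview2-13B")

marker = "<|end_of_turn|>"
ids = tokenizer.encode(marker, add_special_tokens=False)

print("token ids:", ids)                                # a single id if the special token is registered
print("pieces:", tokenizer.convert_ids_to_tokens(ids))  # several sub-word pieces if it is not
print("registered extras:", tokenizer.additional_special_tokens)
```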
