About the BOS token

#4 by coconut00

I'm getting this warning while testing the model on Colab. What should I do?

This is my code:

from llama_cpp import Llama

def load_model(repo_id, filename):
    model = Llama.from_pretrained(
        repo_id=repo_id,
        filename=filename,
        n_gpu_layers=-1,
        chat_format='llama-3',
    )
    return model

model = load_model('bartowski/Llama-3-8B-Instruct-Gradient-1048k-GGUF', filename='Llama-3-8B-Instruct-Gradient-1048k-Q4_K_M.gguf')

output = model.create_chat_completion(
    messages=[
        {"role": "system", "content": 'you are helpful assistant'},
        {"role": "user", "content": 'hello'},
    ]
)

==> llama_tokenize_internal: Added a BOS token to the prompt as specified by the model but the prompt also starts with a BOS token. So now the final prompt starts with 2 BOS tokens. Are you sure this is what you want?

I really can't figure out why this is happening.
Where in this code should I fix it? Thanks in advance!
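
For context, here is a minimal sketch of one possible workaround, under the assumption that the built-in 'llama-3' chat format handler already writes <|begin_of_text|> into the rendered prompt while the tokenizer adds its own BOS on top. Omitting chat_format should let recent llama-cpp-python versions fall back to the chat template stored in the GGUF metadata; this is a sketch, not a confirmed fix.

from llama_cpp import Llama

# Sketch only (assumption): with chat_format omitted, llama-cpp-python is expected
# to pick up the chat template embedded in the GGUF metadata, so the BOS token
# should be added once rather than by both the Python-side formatter and the
# tokenizer.
model = Llama.from_pretrained(
    repo_id='bartowski/Llama-3-8B-Instruct-Gradient-1048k-GGUF',
    filename='Llama-3-8B-Instruct-Gradient-1048k-Q4_K_M.gguf',
    n_gpu_layers=-1,
)

output = model.create_chat_completion(
    messages=[
        {"role": "system", "content": 'you are helpful assistant'},
        {"role": "user", "content": 'hello'},
    ]
)
print(output["choices"][0]["message"]["content"])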
