About tokens used in this model.

#8
by icoicqico - opened

Hello, I would like to know what token format was used to train this model. Is it the Llama3 token format? Did you train it with instruction-prompt and padding tokens such as <|start_header_id|>system<|end_header_id|> and <|eot_id|>? Knowing this would help me design the prompt when fine-tuning or using the model. Thanks!

NVIDIA org

Hi,
We do not use special tokens such as <|start_header_id|>, <|end_header_id|>, and <|eot_id|>. You can follow the format in the sample code we provide; that is the format we used to train our model. You can also refer to https://huggingface.co/datasets/nvidia/ChatQA-Training-Data, where we provide the training recipe.
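
For anyone designing a prompt from this answer, here is a rough sketch of what a plain-text prompt without Llama3 header tokens might look like. The `build_prompt` helper, the "System:" / "User:" / "Assistant:" role labels, and the blank-line separators are all assumptions for illustration, not something confirmed in this thread, so please check them against the sample code on the model card before relying on them.

```python
# Hypothetical helper: role labels and separators are assumptions,
# not confirmed by this thread. Verify against the model card's sample code.
SPECIAL_TOKENS = ("<|start_header_id|>", "<|end_header_id|>", "<|eot_id|>")


def build_prompt(system, context, messages):
    """Join a system line, a context passage, and chat turns as plain text."""
    turns = []
    for message in messages:
        label = "User" if message["role"] == "user" else "Assistant"
        turns.append(f"{label}: {message['content']}")
    # End with an open "Assistant:" turn so the model continues from there.
    conversation = "\n\n".join(turns) + "\n\nAssistant:"
    return f"System: {system}\n\n{context}\n\n{conversation}"


def uses_special_tokens(prompt):
    """Return True if any Llama3 header/end-of-turn token appears in the prompt."""
    return any(token in prompt for token in SPECIAL_TOKENS)


prompt = build_prompt(
    system="This is a chat between a user and an AI assistant.",
    context="The quarterly report lists revenue of 10 million dollars.",
    messages=[{"role": "user", "content": "What was the revenue?"}],
)
print(prompt)
print(uses_special_tokens(prompt))
```

The `uses_special_tokens` check is only a quick sanity test that none of the Llama3 header tokens slipped into a fine-tuning or inference prompt.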
