Pad_token_id of MPT-7B
#49
by Trung-Dung - opened
I want to use MPT-7B with the text-generation pipeline. To do batch processing, I need to set the pad_token_id. However, the tokenizer doesn't define pad, eos, or bos tokens. What value should I set in this case?
Hi @Trung-Dung , we use the GPT NeoX tokenizer which should have an EOS token id. I think you can safely reuse the EOS token id as the PAD token id at inference time.
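A minimal sketch of what reusing the EOS id as the PAD id looks like at inference, in pure Python so it runs without downloading the model. The id value `0` is an assumption for illustration (GPT NeoX's `<|endoftext|>` token); the key point is that padded positions are zeroed in the attention mask, so the model never attends to them:

```python
EOS_ID = 0        # assumed eos_token_id of the GPT NeoX tokenizer (illustrative)
PAD_ID = EOS_ID   # safe to reuse at inference: pad positions are masked out anyway

def pad_batch(sequences, pad_id):
    """Right-pad variable-length token-id lists to a common length.

    Returns (input_ids, attention_mask); the mask is 0 on pad positions
    so the model ignores them regardless of which id is used for padding.
    """
    max_len = max(len(s) for s in sequences)
    input_ids, attention_mask = [], []
    for s in sequences:
        n_pad = max_len - len(s)
        input_ids.append(s + [pad_id] * n_pad)
        attention_mask.append([1] * len(s) + [0] * n_pad)
    return input_ids, attention_mask

ids, mask = pad_batch([[5, 6, 7], [8, 9]], PAD_ID)
# ids  -> [[5, 6, 7], [8, 9, 0]]
# mask -> [[1, 1, 1], [1, 1, 0]]
```

With the transformers library, the equivalent one-liner is typically `tokenizer.pad_token = tokenizer.eos_token` (or `pipe.tokenizer.pad_token_id = pipe.tokenizer.eos_token_id` when using a pipeline).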
sam-mosaic changed discussion status to closed
As a follow-up to this discussion. When using the EOS as the PAD token, is there any recommendation for the padding side?
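The common recommendation for decoder-only models like MPT is left padding at inference: generation continues from the last position of the batch, so the prompt's final real token should sit there rather than a run of pad tokens. A sketch of the difference, again in pure Python (the pad id `0` is an assumption, as above):

```python
PAD_ID = 0  # assumed: EOS id reused as pad

def left_pad_batch(sequences, pad_id):
    """Left-pad token-id lists so every prompt ends at the same (last) position.

    This way the model generates immediately after real tokens instead of
    after padding, which is what batched generation with a decoder-only
    model needs.
    """
    max_len = max(len(s) for s in sequences)
    input_ids, attention_mask = [], []
    for s in sequences:
        n_pad = max_len - len(s)
        input_ids.append([pad_id] * n_pad + s)
        attention_mask.append([0] * n_pad + [1] * len(s))
    return input_ids, attention_mask

ids, mask = left_pad_batch([[5, 6, 7], [8, 9]], PAD_ID)
# ids  -> [[5, 6, 7], [0, 8, 9]]
# mask -> [[1, 1, 1], [0, 1, 1]]
```

With the transformers tokenizer this is usually just `tokenizer.padding_side = "left"` before calling the tokenizer on the batch.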