About max sequence length
#14
by jorisfu - opened
Was this model trained on a corpus with sequences of 32k characters, and how does it perform on inputs of that length?
"Using this model for inputs longer than 4096 tokens is not recommended."
Is it fair to use the tiktoken library with the cl100k_base encoding for estimation purposes?
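
Something like this is what I had in mind for the estimate (just a sketch; cl100k_base is my assumption and may not match this model's actual tokenizer):

```python
# Rough token-count estimate using tiktoken's cl100k_base encoding.
# Note: this may over- or under-count relative to the model's real vocabulary.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def estimate_tokens(text: str) -> int:
    """Return an approximate token count for `text`."""
    return len(enc.encode(text))

sample = "Some document I want to check against the 4096-token limit."
print(estimate_tokens(sample))
```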