Context Length? #4
by lazyDataScientist - opened
I'm guessing the context length is 4k tokens, but llama.cpp suggests 2k. Just wanted to be sure which it is.
It should be native 4k, yes!
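For anyone else seeing the 2k figure: llama.cpp just uses whatever context size you pass it, so you can set 4k explicitly at load time. Here's a minimal sketch using llama-cpp-python; the model path is a placeholder, not an actual file from this repo.

```python
from llama_cpp import Llama

# Load the GGUF model with an explicit 4k context window.
# llama.cpp falls back to a smaller default n_ctx if you don't set it,
# which is likely where the 2k suggestion comes from.
llm = Llama(
    model_path="./model.q4_K_M.gguf",  # placeholder path -- point at your local file
    n_ctx=4096,                        # native context length, per the reply above
)

output = llm("Once upon a time,", max_tokens=64)
print(output["choices"][0]["text"])
```

If you're using the llama.cpp CLI directly instead, passing `-c 4096` (or `--ctx-size 4096`) does the same thing.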