What would be its context length?

#2
by Jenish-23 - opened

Congratulations! Kudos to your efforts.
I want to build a RAG application with it. What would be its context size?

If you look in config.json and search for "max_position_embeddings", you'll see that it has a context window of 2048 tokens.
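If you'd rather check this programmatically than read the JSON, here's a minimal sketch using `AutoConfig`; the repo id is a placeholder, not this model's actual name:

```python
from transformers import AutoConfig

# Placeholder repo id; substitute the actual model repository.
config = AutoConfig.from_pretrained("your-org/your-model")

# For Llama-style models, max_position_embeddings is the pretrained
# context window (here: 2048 tokens).
print(config.max_position_embeddings)
```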

Currently it's 2048 tokens, but I know there are several methods for extending the context length of Llama-based models. One of them is YaRN, which may be worth applying here just to see how well it works.
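As a rough illustration, here's a hedged sketch of what applying YaRN at load time might look like, assuming a recent transformers release whose Llama implementation accepts a `rope_scaling` dict with `rope_type="yarn"`; the repo id and the scaling factor are illustrative placeholders, not tested settings for this model:

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "your-org/your-model",  # placeholder repo id
    rope_scaling={
        "rope_type": "yarn",
        "factor": 4.0,  # illustrative: would stretch 2048 tokens to ~8192
        "original_max_position_embeddings": 2048,
    },
)
```

Keep in mind that rope scaling applied purely at inference time usually costs some quality; the YaRN paper reports the best results when the extension is paired with a short fine-tune on longer sequences.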

Jenish-23 changed discussion status to closed
