Context Length

#2 by FineMist

Such a good model, but I notice every Llama 3 model starts to develop Shakespearean language once it gets to a context length of 16k-32k. Is it unavoidable? My settings follow whatever the models suggest.

Unavoidable; they aren't trained to go past 8k context, after all.
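
For what it's worth, you can sometimes stretch past the trained window with RoPE scaling rather than just accepting the drift. A minimal sketch, assuming you load the model through Hugging Face transformers (the checkpoint id is a placeholder, and quality past 8k still degrades, just less abruptly):

```python
# Minimal sketch: dynamic RoPE scaling to stretch a Llama 3 model's
# usable context past its 8k training window. The checkpoint id is a
# placeholder; swap in the model you're actually running. This reduces
# the degradation past 8k, it doesn't remove it.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    rope_scaling={"type": "dynamic", "factor": 2.0},  # ~2x the trained window
)
```

Most local frontends expose the same knob under their RoPE or alpha scaling settings.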

Bummer. πŸ˜• Still, this model is amazing. I like it even more than Stheno.
