Context extension?

#4
by droussis - opened

Hi, and congratulations on the great model! A European LLM was long overdue!

I observed that the base model has 4k max_position_embeddings, while the Instruct model has 8k.
Did you follow a specific methodology to increase its context length? I didn't see anything in the paper.
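For anyone who wants to reproduce the observation, here is a minimal sketch of how the two values can be compared with transformers; the repo IDs below are hypothetical placeholders, not the actual checkpoint names.

```python
# Minimal sketch: compare max_position_embeddings across the two configs.
# "org/model-base" and "org/model-instruct" are hypothetical placeholders.
from transformers import AutoConfig

base_cfg = AutoConfig.from_pretrained("org/model-base")
instruct_cfg = AutoConfig.from_pretrained("org/model-instruct")

print(base_cfg.max_position_embeddings)      # 4096 for the base model
print(instruct_cfg.max_position_embeddings)  # 8192 reported for the Instruct model
```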

Thank you in advance and please correct me if I am wrong.

P.S. I also wanted to ask whether you plan to extend the context length of your models in the future.

UTTER - Unified Transcription and Translation for Extended Reality org

Hi. Thank you.

This is actually a typo. I'll fix it.
But we're planning to experiment with increasing the context length for the bigger models that we're training.
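In case it helps others reading this thread: this is not necessarily what the team plans to do, but a common way to experiment with longer contexts on a RoPE-based model is position interpolation, which transformers exposes through the rope_scaling config field. A sketch, assuming the checkpoint uses rotary position embeddings and using a hypothetical repo ID:

```python
# Hedged sketch of RoPE position interpolation, not a confirmed plan
# from the UTTER team. Assumes the model uses rotary position
# embeddings; "org/model-base" is a hypothetical placeholder.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "org/model-base",
    rope_scaling={"type": "linear", "factor": 2.0},  # ~2x the trained context
)
```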

phmartins changed discussion status to closed
