Is there a limit on input token length for the Helsinki-NLP/opus-mt-en-ROMANCE model?


Hello experts,

I am running a proof of concept (POC) with the Helsinki-NLP/opus-mt-en-ROMANCE model for translation, and I see that the output starts getting truncated once the input exceeds roughly 190 tokens. Is there a max_length parameter (or something similar) I can set to push the model beyond that point? Is there any information on the maximum sequence length the model saw during training? Would you recommend splitting input paragraphs into sentences on delimiters such as the period (.) and translating one sentence at a time? Any other suggestions for working around the token limit would be much appreciated.
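For reference, here is roughly what I am trying. This is only a minimal sketch assuming the standard transformers MarianMT API; the `>>fr<<` target-language prefix, the 512/256 values for `max_length`/`max_new_tokens`, and the naive period-based splitting are my own guesses, not values taken from the model card:

```python
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-en-ROMANCE"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

# Placeholder input; in my POC this is a long English paragraph (190+ tokens).
long_english_paragraph = "This is a long paragraph that I want to translate. " * 20

# This multilingual checkpoint expects a target-language token such as >>fr<<.
text = ">>fr<< " + long_english_paragraph

# Option 1: set the tokenizer/generation limits explicitly (values are assumptions).
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

# Option 2: split the paragraph on periods and translate one sentence at a time.
sentences = [s.strip() for s in long_english_paragraph.split(".") if s.strip()]
batch = tokenizer([">>fr<< " + s for s in sentences], return_tensors="pt", padding=True)
translated = model.generate(**batch, max_new_tokens=256)
print([tokenizer.decode(t, skip_special_tokens=True) for t in translated])
```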
