Does nllb-200 take context into account when splitting into sentences?

#2
by Vladislav951 - opened

I need to translate a large text (more than 10,000 characters).

I split the text into sentences and pass the resulting list to the pipeline:

from transformers import pipeline

# Placeholder: the list of sentences obtained by splitting the source text
sentences = [...]

# Russian -> English translation pipeline on GPU 0
pipe = pipeline('translation', model="facebook/nllb-200-distilled-1.3B", src_lang='rus_Cyrl', tgt_lang='eng_Latn', device=0)
result = pipe(sentences, max_length=400, batch_size=64)
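
For reference, the splitting step could look roughly like the following. This is only a minimal sketch, assuming a naive regex split on sentence-final punctuation. `split_into_sentences` is a hypothetical helper, not part of transformers, and it will mis-split abbreviations and initials, so a dedicated sentence segmenter may be a better fit for real text.

```python
import re


def split_into_sentences(text: str) -> list[str]:
    """Naively split text after '.', '!' or '?' when followed by whitespace.

    Hypothetical helper for illustration only; not part of transformers.
    """
    # The lookbehind keeps the punctuation attached to the preceding sentence.
    parts = re.split(r'(?<=[.!?])\s+', text.strip())
    # Drop empty strings (e.g. when the input text is empty).
    return [p for p in parts if p]


sentences = split_into_sentences("Привет, мир! Как дела? Всё хорошо.")
print(sentences)
```

Each element of the returned list is then one input item for the pipeline call.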

Does the model take the context from neighboring sentences into account? If not, is there a way to make it do so?
