Does nllb-200 take context into account when splitting into sentences?
#2, opened by Vladislav951
I need to translate a large text (more than 10,000 characters).
I split the text into sentences and pass the list of sentences to the pipeline:
from transformers import pipeline

sentences = ...  # placeholder: the list of sentences split from the source text
# Russian -> English translation pipeline running on GPU 0
pipe = pipeline('translation', model="facebook/nllb-200-distilled-1.3B", src_lang='rus_Cyrl', tgt_lang='eng_Latn', device=0)
result = pipe(sentences, max_length=400, batch_size=64)
Does the model take the context of neighboring sentences into account? If not, is there a way to make it do so?
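For anyone who wants to experiment: each element of the list passed to the pipeline is encoded and decoded on its own, so one way to give the model some surrounding context is to join neighboring sentences into larger chunks before translating. The snippet below is only a minimal sketch of that idea. The helper `group_sentences` and its `max_chars` budget are hypothetical names made up for illustration, not part of `transformers`, and the value 40 is chosen only to keep the example small.

```python
from typing import List


def group_sentences(sentences: List[str], max_chars: int = 500) -> List[str]:
    """Join neighboring sentences into chunks of at most max_chars characters.

    Hypothetical helper for illustration. A single sentence longer than
    max_chars is kept as its own (oversized) chunk rather than being cut.
    """
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        sentence = sentence.strip()
        if not sentence:
            continue  # skip empty entries
        # Try to append the sentence to the chunk being built.
        candidate = f"{current} {sentence}" if current else sentence
        if current and len(candidate) > max_chars:
            # Budget exceeded: close the current chunk, start a new one.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


example = ["Первое предложение.", "Второе предложение.", "Третье предложение."]
chunks = group_sentences(example, max_chars=40)
print(chunks)
```

The resulting `chunks` list could then be passed to the pipeline in place of `sentences`, for example `pipe(chunks, max_length=400, batch_size=64)`. Two caveats: `max_length` is counted in tokens, not characters, so the character budget has to be small enough for each chunk's translation to fit, and NLLB was trained mostly on sentence-level data, so multi-sentence inputs may lower quality or lose sentences. It is worth comparing the output against sentence-by-sentence translation on a sample of the text.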