What's the best way to get the most accurate translations?

#5
opened by acraber

I noticed that, at least for short translations, setting num_beams=1 in the model.generate call seems to be the way to go, but the translations still leave something to be desired... Any other suggestions?
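For context, this is roughly what my current call looks like (the checkpoint name below is just a placeholder, not the exact model I'm loading):

    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    # placeholder checkpoint -- swap in the model this thread is actually about
    model_name = "Helsinki-NLP/opus-mt-en-de"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    inputs = tokenizer("A short sentence to translate.", return_tensors="pt")
    # num_beams=1 disables beam search, i.e. plain greedy decoding
    outputs = model.generate(**inputs, num_beams=1, max_length=50)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))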

Well, there is a tutorial for this, and these are the settings its author used. I'm not sure, but maybe you could reach out to him.

Or just test them out and run an A/B comparison against your current config (see the rough sketch below the snippet).

    # note: `tokenizer` and `model` are assumed to be loaded earlier in the tutorial
    inputs = tokenizer.encode(text, return_tensors="pt")
    # beam search over 4 candidates, capped at 50 tokens; early_stopping ends the search once the beams finish
    outputs = model.generate(inputs, num_beams=4, max_length=50, early_stopping=True)
    translated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)

ref: https://drlee.io/translate-text-from-any-language-to-any-language-with-hugging-face-transformers-and-google-colab-272876150a93
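If it helps, here is a rough sketch of how you could A/B the two settings side by side; the checkpoint name and sample sentences are just placeholders for whatever model and texts you're actually working with:

    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    # placeholder checkpoint -- swap in the model you're actually using
    model_name = "Helsinki-NLP/opus-mt-en-de"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    def translate(text, **gen_kwargs):
        inputs = tokenizer(text, return_tensors="pt")
        outputs = model.generate(**inputs, **gen_kwargs)
        return tokenizer.decode(outputs[0], skip_special_tokens=True)

    samples = ["Hello, how are you?", "The weather is nice today."]
    configs = {
        "greedy (num_beams=1)": dict(num_beams=1, max_length=50),
        "beam search (num_beams=4)": dict(num_beams=4, max_length=50, early_stopping=True),
    }

    # print each sample under both configs so you can eyeball the differences
    for text in samples:
        print(f"source: {text}")
        for name, kwargs in configs.items():
            print(f"  {name}: {translate(text, **kwargs)}")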
