[Question] How to keep the model from translating unknown tokens?

#8
by Fransferdy - opened

For example, I have a text in which I want to preserve person names. Sometimes the model translates John as João for Portuguese/Spanish, and I would rather keep it as John. With Google Translate, Bing, or IBM Watson, I can replace known names with absurd tokens such as itaquabucetuba555, and they are usually preserved during translation. However, when I tried this with the Facebook model, it still tries to change the absurd tokens into something else.
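
For reference, this is roughly the substitution I do around the translation call. It is only a minimal sketch: `mask_terms`, `unmask_terms`, and the `XQZ0XQZ` placeholder format are names I made up for illustration, and `fake_translate` is a stand-in for the real model call, not any library's API. It only works if the translator leaves the placeholders untouched, which is exactly what does not happen reliably with this model.

```python
import re


def mask_terms(text, terms):
    """Replace each protected term with a unique placeholder token."""
    mapping = {}
    masked = text
    for i, term in enumerate(terms):
        placeholder = f"XQZ{i}XQZ"  # hypothetical placeholder format
        # Match the term only as a whole word.
        pattern = r"\b" + re.escape(term) + r"\b"
        masked, count = re.subn(pattern, placeholder, masked)
        if count:
            mapping[placeholder] = term
    return masked, mapping


def unmask_terms(text, mapping):
    """Put the original terms back in place of their placeholders."""
    for placeholder, term in mapping.items():
        text = text.replace(placeholder, term)
    return text


def fake_translate(text):
    # Stand-in for the real translation call (assumption for this sketch).
    # It rewrites one phrase and leaves everything else alone.
    return text.replace("lives in", "mora em")


masked, mapping = mask_terms("John lives in Lisbon with Mary.", ["John", "Mary"])
translated = fake_translate(masked)
restored = unmask_terms(translated, mapping)
print(restored)
```

If a placeholder comes back altered, `unmask_terms` silently leaves it in the output, so checking that every key of `mapping` still appears in the translated text would be a sensible extra step.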

Is there a way to prevent the model from changing specific words?

How about wrapping these specific words in special tokens, such as "$$ word $$"?
