It works for Chinese and English. Is it possible to use it for other languages, such as French or Korean?

#2
by sk2mm2 - opened

It works for Chinese and English. Is it possible to use it for other languages, such as French or Korean?

Moka HR SaSS org

Which languages we can support depends mainly on two factors: whether the pretrained encoder model we use supports the language, and whether our training set includes samples in that language. There are many good multilingual pretrained models on HuggingFace, but we do not have a multilingual training dataset. If you can provide a suitable dataset, we would be happy to train an embedding model that supports more languages.

A training sample has the following format:

    from dataclasses import dataclass

    @dataclass(slots=True)
    class PairRecord:
        text: str
        text_pos: str

MokaHR changed discussion status to closed
