Language Detection Model

This repository contains a fine-tuned BertForSequenceClassification model, initialized from a BERT checkpoint pretrained on multilingual text.
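A minimal inference sketch, assuming the checkpoint is published on the Hugging Face Hub as `spolivin/lang-recogn-model` and loads through the standard `transformers` text-classification pipeline. The helper names `load_detector` and `detect_language` are hypothetical, and the label strings returned depend on the model's configuration (they may be language names or generic `LABEL_n` identifiers).

```python
def load_detector(model_id="spolivin/lang-recogn-model"):
    """Build a text-classification pipeline for the given Hub model id.

    The import is done lazily so the helper below can be used and tested
    without transformers installed. The model id is an assumption taken
    from the repository name.
    """
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def detect_language(text, classifier):
    """Return (label, score) for the top prediction of `classifier`.

    `classifier` is any callable that, like a transformers
    text-classification pipeline, returns a list of dicts with
    "label" and "score" keys for a single input string.
    """
    predictions = classifier(text)
    best = predictions[0]  # the pipeline returns only the top class by default
    return best["label"], float(best["score"])
```

Calling `detect_language("Bonjour, comment allez-vous ?", load_detector())` would download the weights on first use and return the predicted label together with its score.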

Training/fine-tuning

The model was fine-tuned on the Language Detection dataset from Kaggle. The dataset analysis and a complete description of the training procedure are available in one of my Kaggle notebooks, which was used to speed up training on a GPU.

Supported languages

The model has been fine-tuned to detect one of the following 17 languages:

Arabic
Danish
Dutch
English
French
German
Greek
Hindi
Italian
Kannada
Malayalam
Portuguese
Russian
Spanish
Swedish
Tamil
Turkish
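For fine-tuning, the 17 languages above correspond to 17 output classes. The following is a rough sketch of how a classification head for them could be configured, not the exact procedure from the Kaggle notebook. The alphabetical label order, the helper names, and the base checkpoint name `bert-base-multilingual-uncased` are assumptions.

```python
# The 17 supported languages; the index order here is an assumption
# and may differ from the mapping stored in the published model.
LANGUAGES = [
    "Arabic", "Danish", "Dutch", "English", "French", "German",
    "Greek", "Hindi", "Italian", "Kannada", "Malayalam", "Portuguese",
    "Russian", "Spanish", "Swedish", "Tamil", "Turkish",
]


def build_label_maps(languages):
    """Return (id2label, label2id) dictionaries for a list of class names."""
    id2label = dict(enumerate(languages))
    label2id = {name: idx for idx, name in id2label.items()}
    return id2label, label2id


def load_model_for_finetuning(checkpoint="bert-base-multilingual-uncased"):
    """Load a multilingual BERT with a freshly initialized 17-class head.

    The checkpoint name is an assumption based on the referenced
    "BERT multilingual base model (uncased)".
    """
    from transformers import BertForSequenceClassification

    id2label, label2id = build_label_maps(LANGUAGES)
    return BertForSequenceClassification.from_pretrained(
        checkpoint,
        num_labels=len(LANGUAGES),
        id2label=id2label,
        label2id=label2id,
    )
```

Storing `id2label` and `label2id` in the model configuration lets inference return language names instead of generic class indices.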

References

BERT multilingual base model (uncased)
Language Detection Dataset
