
Language Detection Model

The model in this repository is a fine-tuned version of BertForSequenceClassification, starting from a BERT checkpoint pretrained on multilingual text.
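
As a minimal usage sketch, a fine-tuned sequence-classification checkpoint like this one could be loaded through the Hugging Face `transformers` text-classification pipeline. The repository id below is a placeholder (this card does not state it), and the helper names `build_classifier` and `detect_language` are hypothetical. The label strings returned depend on the `id2label` mapping stored in the model's config.

```python
from typing import Callable, Dict, List

# Placeholder, not a real repository id: replace it with this model's Hub id.
MODEL_ID = "<user>/<this-model>"


def build_classifier(model_id: str = MODEL_ID):
    """Create a text-classification pipeline.

    Requires the `transformers` package and a backend such as PyTorch.
    """
    # Imported lazily so that detect_language() can be used without transformers.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def detect_language(text: str, classifier: Callable[[str], List[Dict]]) -> str:
    """Return the label of the highest-scoring prediction for `text`."""
    # For a single string, the pipeline returns a list of
    # {"label": ..., "score": ...} dictionaries.
    predictions = classifier(text)
    best = max(predictions, key=lambda p: p["score"])
    return best["label"]
```

With a real repository id filled in, a call such as `detect_language("Bonjour tout le monde", build_classifier())` would return whichever label the model scores highest.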

Training/fine-tuning

The model was fine-tuned on the Language Detection dataset from Kaggle. The dataset analysis and a complete description of the training procedure are available in one of my Kaggle notebooks, which I used to speed up training on a GPU.

Supported languages

The model has been fine-tuned to detect one of the following 17 languages:

  • Arabic
  • Danish
  • Dutch
  • English
  • French
  • German
  • Greek
  • Hindi
  • Italian
  • Kannada
  • Malayalam
  • Portuguese
  • Russian
  • Spanish
  • Swedish
  • Tamil
  • Turkish
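
To illustrate how a 17-way classification head maps to the list above, the sketch below converts a vector of raw output scores (logits) into a language name and a probability using a softmax. The index order of `LANGUAGES` is an assumption (alphabetical, as listed here); the model's actual mapping is stored in its config as `id2label` and may differ. The function names are hypothetical.

```python
import math
from typing import List, Tuple

# The 17 supported languages. The index order is an ASSUMPTION (alphabetical);
# check the model config's `id2label` for the real mapping.
LANGUAGES = [
    "Arabic", "Danish", "Dutch", "English", "French", "German",
    "Greek", "Hindi", "Italian", "Kannada", "Malayalam", "Portuguese",
    "Russian", "Spanish", "Swedish", "Tamil", "Turkish",
]


def softmax(logits: List[float]) -> List[float]:
    """Convert raw scores to probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_language(logits: List[float]) -> Tuple[str, float]:
    """Return the most probable language and its probability."""
    if len(logits) != len(LANGUAGES):
        raise ValueError(f"expected {len(LANGUAGES)} logits, got {len(logits)}")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LANGUAGES[best], probs[best]
```

Returning the probability alongside the label lets a caller reject low-confidence predictions, for example on text written in a language outside the supported set.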

References

  1. BERT multilingual base model (uncased)
  2. Language Detection Dataset

Model size: 167M params (Safetensors, tensor type F32)