HingBERT-LID

HingBERT-LID is a BERT model for language identification in Hindi-English code-mixed text. It is a HingBERT model fine-tuned on the L3Cube-HingLID dataset.
[dataset link](https://github.com/l3cube-pune/code-mixed-nlp)

More details on the dataset, models, and baseline results can be found in our [paper](https://arxiv.org/abs/2204.08398).
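
A minimal usage sketch with the `transformers` library follows. It rests on several assumptions that this card does not state: that the model is published on the Hub as `l3cube-pune/hing-bert-lid`, that it is a token-classification checkpoint (one language tag per token), and that WordPiece continuations are marked with `##`. The helper `merge_subwords` and the example sentence are illustrative only, and the label names depend on the checkpoint's own configuration.

```python
from typing import Dict, List, Tuple

# Assumed Hub identifier; it is not stated in this card, so verify it before use.
MODEL_ID = "l3cube-pune/hing-bert-lid"


def merge_subwords(predictions: List[Dict]) -> List[Tuple[str, str]]:
    """Join WordPiece continuations ("##...") into words.

    Each merged word keeps the label predicted for its first piece.
    """
    words: List[Tuple[str, str]] = []
    for pred in predictions:
        piece, label = pred["word"], pred["entity"]
        if piece.startswith("##") and words:
            prev_word, prev_label = words[-1]
            words[-1] = (prev_word + piece[2:], prev_label)
        else:
            words.append((piece, label))
    return words


def identify_languages(text: str) -> List[Tuple[str, str]]:
    """Tag each word of `text` with the model's predicted language label."""
    # Imported lazily so merge_subwords can be used without transformers installed.
    from transformers import pipeline

    # Assumes the checkpoint has a token-classification head.
    tagger = pipeline("token-classification", model=MODEL_ID)
    return merge_subwords(tagger(text))


if __name__ == "__main__":
    # Hypothetical code-mixed input; the labels printed come from the model config.
    for word, label in identify_languages("mujhe yeh movie bahut pasand hai"):
        print(word, label)
```

Keeping the label of the first subword is one common convention for word-level tagging; a majority vote over the pieces would be a reasonable alternative.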

Other models from the HingBERT family:
- HingBERT
- HingMBERT
- HingBERT-Mixed
- HingBERT-Mixed-v2
- HingRoBERTa
- HingRoBERTa-Mixed
- HingGPT
- HingGPT-Devanagari
- HingBERT-LID

@inproceedings{nayak-joshi-2022-l3cube,
    title = "{L}3{C}ube-{H}ing{C}orpus and {H}ing{BERT}: A Code Mixed {H}indi-{E}nglish Dataset and {BERT} Language Models",
    author = "Nayak, Ravindra  and Joshi, Raviraj",
    booktitle = "Proceedings of the WILDRE-6 Workshop within the 13th Language Resources and Evaluation Conference",
    month = jun,
    year = "2022",
    address = "Marseille, France",
    publisher = "European Language Resources Association",
    url = "https://aclanthology.org/2022.wildre-1.2",
    pages = "7--12",
}