---
language:
- ne
thumbnail: null
tags:
- roberta
- nepali-language-model
license: mit
datasets:
- cc100
widget:
- text: कस्तो <mask> छ।
---
# nepbert

## Model description
A RoBERTa model trained from scratch on the Nepali subset of the CC-100 dataset, which contains 12 million sentences.
## Intended uses & limitations

### How to use
```python
from transformers import pipeline

pipe = pipeline(
    "fill-mask",
    model="amitness/nepbert",
    tokenizer="amitness/nepbert",
)
print(pipe("कस्तो <mask> छ।"))
```
### Limitations and bias

The model was trained only on web-crawled text from the CC-100 corpus, so it can reproduce biases and factual errors present in that data. Evaluate its behavior on your own domain before using it in production.
## Training data

The data was taken from the Nepali subset of the CC-100 dataset.
## Training procedure

The model was trained on Google Colab using a single Tesla V100 GPU.
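The model card does not include the training script itself, but the core of RoBERTa-style pretraining is the masked-language-modeling objective with dynamic masking. The sketch below illustrates the standard scheme (15% of positions selected; of those, 80% replaced by the mask token, 10% by a random token, 10% left unchanged) in plain Python. It is an illustration of the objective, not the actual code used to train this model; the token IDs, mask ID, and vocabulary size are made up for the example.

```python
import random

def mask_tokens(token_ids, mask_id, vocab_size, rng, mlm_prob=0.15):
    """Apply RoBERTa-style dynamic masking to a list of token IDs.

    Returns (inputs, labels): labels are -100 (ignored by the loss)
    everywhere except at the positions selected for prediction.
    """
    inputs = list(token_ids)
    labels = [-100] * len(token_ids)
    for i, tok in enumerate(token_ids):
        if rng.random() < mlm_prob:      # select ~15% of positions
            labels[i] = tok              # the model must predict the original
            roll = rng.random()
            if roll < 0.8:
                inputs[i] = mask_id      # 80%: replace with <mask>
            elif roll < 0.9:
                inputs[i] = rng.randrange(vocab_size)  # 10%: random token
            # remaining 10%: keep the token unchanged
    return inputs, labels

rng = random.Random(0)                   # fixed seed for a repeatable demo
ids = list(range(100))                   # hypothetical token IDs
inputs, labels = mask_tokens(ids, mask_id=4, vocab_size=50000, rng=rng)
selected = sum(1 for lab in labels if lab != -100)
print(selected)                          # roughly 15 of 100 positions selected
```

Because the masking is drawn fresh each time a sentence is seen ("dynamic" masking), the model encounters different masked views of the same sentence across epochs, which is one of RoBERTa's changes relative to the original BERT recipe.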