---
language:
- ne
thumbnail:
tags:
- roberta
- nepali-language-model
license: mit
datasets:
- cc100
---

# nepbert

## Model description

RoBERTa trained from scratch on the Nepali subset of the CC-100 dataset, which contains 12 million sentences.

## Intended uses & limitations

#### How to use

```python
from transformers import pipeline

pipe = pipeline(
    "fill-mask",
    model="amitness/nepbert",
    tokenizer="amitness/nepbert"
)
# The fill-mask pipeline expects a mask token in the input,
# so the verb in "कस्तो छ।" ("How is it?") is masked out here.
print(pipe(f"कस्तो {pipe.tokenizer.mask_token}।"))
```

#### Limitations and bias

The model was trained on web-crawled text, so its predictions may reflect biases present in the CC-100 corpus. It has not been systematically evaluated for such biases.

## Training data

The data was taken from the Nepali language subset of the CC-100 dataset. A sketch of how this subset can be loaded is given under the training procedure below.

## Training procedure

The model was trained on Google Colab using `1x Tesla V100`.
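
The exact training script is not part of this card. As a rough sketch, a comparable masked-language-modelling run over the CC-100 Nepali subset could be set up as below, using the Hugging Face `datasets` and `transformers` libraries. The hyperparameters shown (sequence length, batch size, epochs) are illustrative assumptions rather than the settings actually used for nepbert, and the published tokenizer is reused here instead of training one from scratch.

```python
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    RobertaConfig,
    RobertaForMaskedLM,
    Trainer,
    TrainingArguments,
)

# Nepali subset of CC-100, the dataset named in this card.
dataset = load_dataset("cc100", lang="ne", split="train")

# Reuse the published tokenizer; the original from-scratch run
# would have trained its own tokenizer on the same corpus.
tokenizer = AutoTokenizer.from_pretrained("amitness/nepbert")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(
    tokenize, batched=True, remove_columns=dataset.column_names
)

# Illustrative RoBERTa configuration; the actual nepbert
# hyperparameters are not documented in this card.
config = RobertaConfig(
    vocab_size=tokenizer.vocab_size,
    max_position_embeddings=514,
)
model = RobertaForMaskedLM(config)

# Dynamic masking for the MLM objective.
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="nepbert-scratch",
        per_device_train_batch_size=32,  # assumption; sized for one V100
        num_train_epochs=1,              # assumption
        fp16=True,                       # assumption; V100 supports fp16
    ),
    data_collator=collator,
    train_dataset=tokenized,
)
trainer.train()
```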
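
A few notes on the sketch: `mlm_probability=0.15` is the standard RoBERTa masking rate, and `fp16=True` is a reasonable choice on a V100 given its fast half-precision support. For a true from-scratch setup, a byte-level BPE tokenizer would first be trained on the same corpus before tokenizing.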