RoBERTa trained from scratch on the Nepali subset of the CC-100 dataset (12 million sentences).
```python
from transformers import pipeline

pipe = pipeline(
    "fill-mask",
    model="amitness/nepbert",
    tokenizer="amitness/nepbert",
)
print(pipe("तिमीलाई कस्तो <mask>?"))
```
The training data was taken from the Nepali language subset of the CC-100 dataset.
The model was trained on Google Colab using 1x Tesla V100.