---
license: bsd-3-clause-clear
language:
- ne
metrics:
- perplexity
library_name: transformers
pipeline_tag: text-generation
---

# NepaliGPT: Nepali Language Generative Pretrained Transformer Model

This is an experiment in developing a language generation model for the Nepali language: a causal language model that predicts the next possible tokens given a context in Nepali.

# Dataset Used

A corpus of 9.3 GB was collected from different sources on the internet, including:

- Nepali books found online.
- Nepali news articles from Nepali news portals.
- Nepali text collected from different open-source Nepali NLP datasets.

# Hyperparameters Used

Learning rate -> 2e-5 \
Weight decay -> 0.01 \
Number of training epochs -> 5 \
bf16 -> True \
Base model architecture -> GPT-2

## Training Results

The model achieves the following results on the evaluation set:

| Training Loss | Validation Loss | Perplexity |
|:-------------:|:---------------:|:----------:|
| 3.3968 | 3.2705 | 26.3245 |
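
The reported perplexity is simply the exponential of the validation cross-entropy loss, which is the standard way these two metrics relate for causal language models. A minimal sketch of that conversion (using the loss value from the table above):

```python
import math

# Perplexity of a causal language model is exp(cross-entropy loss).
validation_loss = 3.2705          # validation loss from the table above
perplexity = math.exp(validation_loss)

# Matches the reported perplexity of ~26.3245.
print(f"Perplexity: {perplexity:.4f}")
```

This is a useful sanity check when comparing checkpoints: any drop in validation loss translates directly into a lower perplexity.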