---
license: mit
tags:
- pytorch
- gpt2
model-index:
- name: sinhala-gpt2
  results: []
widget:
- text: මහ
- text: සංවිධ
- text: දුර්ලභ
- text: තනිවීලා
- text: ඔබ
# inference:
#   parameters:
#     do_sample: false
#     temperature: 0.2
#     max_new_tokens: 30
language:
- si
---

# sinhala-gpt2

This model was fine-tuned from the [gpt2](https://huggingface.co/gpt2) checkpoint on a dataset of Sinhala news articles collected from various sources.

## Training procedure

The model was trained for over 12 hours on Kaggle GPUs.

## Usage Details

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

tokenizer = AutoTokenizer.from_pretrained("Ransaka/sinhala-gpt2")
model = AutoModelForCausalLM.from_pretrained("Ransaka/sinhala-gpt2")

# Build a text-generation pipeline from the loaded model and tokenizer
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
generator("දුර")
```

Or clone the repository with git:

```bash
git lfs install
git clone https://huggingface.co/Ransaka/sinhala-gpt2
```

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3

A minimal sketch of how these settings might map onto `transformers.TrainingArguments` is given at the end of this card.

### Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 2.0233        | 1.0   | 15323 | 2.3348          |
| 1.6938        | 2.0   | 30646 | 1.8377          |
| 1.4938        | 3.0   | 45969 | 1.6498          |

### Framework versions

- Transformers 4.26.1
- Pytorch 1.13.0
- Datasets 2.1.0
- Tokenizers 0.13.2
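
### Training configuration (sketch)

The training script is not part of this card. As a rough, non-authoritative sketch, the hyperparameters listed above might correspond to a `transformers.TrainingArguments` configuration along the following lines; the output directory is assumed, and dataset preparation plus the `Trainer` wiring are omitted:

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the reported hyperparameters;
# the actual training setup may have differed.
training_args = TrainingArguments(
    output_dir="sinhala-gpt2",          # assumed output path
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=3,
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 match the Trainer's
    # default AdamW settings, so they need not be set explicitly.
)
```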