	---
	license: mit
	tags:
	- pytorch
	- gpt2
	model-index:
	- name: sinhala-gpt2
	  results: []
	widget:
	- text: මහ
	- text: සංවිධ
	- text: දුර්ලභ
	- text: තනිවීලා
	- text: ඔබ
	# inference:
	#   parameters:
	#     do_sample: false
	#     temperature: 0.2
	#     max_new_tokens: 30
	language:
	- si
	---

	# sinhala-gpt2

	This model was fine-tuned based on the [gpt2](https://huggingface.co/gpt2) architecture using a dataset of Sinhala news articles collected from various sources.
	Although the training setup is quite simple, the model can generate text that closely resembles real news articles. Consider the following samples (some of them are hilarious, though :D):
	- "ඔබ විසින් මෙම විරෝධතාව සංවිධානය කර තිබුණේ නැහැ කියලා හිටපු ජනාධිපති මහ"
	- "දුර්ලභ ගණයේ විශ්වවිද්යාල ප්රතිපාදන කොමිෂන් සභාවේ සභාපති මහාචාර්ය ජී එල්"

	⚠️ Since the dataset used for this model is mostly composed of news articles, it is heavily biased toward generating news content. This bias may become apparent during the generation process.

	## Training procedure
	The model was trained for 12+ hours on Kaggle GPUs.

	## Usage Details

	```python
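	from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

	# Load the tokenizer and model weights from the Hugging Face Hub
	tokenizer = AutoTokenizer.from_pretrained("Ransaka/sinhala-gpt2")
	model = AutoModelForCausalLM.from_pretrained("Ransaka/sinhala-gpt2")

	# Wrap them in a text-generation pipeline and complete a short prompt
	generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
	generator("දුර") # e.g. දුර ඈත පාසැල් වියේ පසුවූයේ මෙම සිද්ධිය සම්බන්ධයෙන් විමර්ශන සිදුකරන බවයි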
	```
	Alternatively, clone the repository with git:
	```bash
	git lfs install
	git clone https://huggingface.co/Ransaka/sinhala-gpt2
	```
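
Generation can also be controlled explicitly. The sketch below passes greedy decoding and a 30-token limit to the pipeline; these values mirror the commented-out `inference` parameters in this card's front matter and are not confirmed as the author's recommended settings. The `generate` helper and `GENERATION_KWARGS` are hypothetical names introduced for illustration. The commented-out `temperature` value is left out because temperature only affects sampling, and `do_sample` is false here.

```python
# Assumed settings, mirroring the commented-out front-matter values
GENERATION_KWARGS = {"do_sample": False, "max_new_tokens": 30}


def generate(prompt, model_id="Ransaka/sinhala-gpt2"):
    """Return one greedy completion for `prompt` (hypothetical helper)."""
    # Imported lazily so the settings above can be inspected without transformers installed
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_id)
    outputs = generator(prompt, **GENERATION_KWARGS)
    # The pipeline returns a list of dicts, one per generated sequence
    return outputs[0]["generated_text"]


if __name__ == "__main__":
    print(generate("දුර"))
```

With `do_sample=False` the output is deterministic for a given prompt, which makes it easier to compare results across runs.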

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 16
	- eval_batch_size: 16
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 3
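
As a rough illustration of what `lr_scheduler_type: linear` implies, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (no warmup setting is listed here) and takes the total step count of 45969 from the final row of the training results; `linear_lr` is a hypothetical helper, not part of the training script.

```python
BASE_LR = 2e-05      # learning_rate from the list above
TOTAL_STEPS = 45969  # final step reported after epoch 3


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at `step`: linear warmup, then linear decay to zero.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if warmup_steps and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Learning rate at the start and at the end of each epoch
    for step in (0, 15323, 30646, 45969):
        print(step, linear_lr(step))
```

Under these assumptions the rate falls to two thirds of its initial value after the first epoch and reaches zero at the last step.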

	### Training results

	| Training Loss | Epoch | Step  | Validation Loss |
	|:-------------:|:-----:|:-----:|:---------------:|
	| 2.0233        | 1.0   | 15323 | 2.3348          |
	| 1.6938        | 2.0   | 30646 | 1.8377          |
	| 1.4938        | 3.0   | 45969 | 1.6498          |
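
To put these numbers in context, the sketch below converts each validation loss to perplexity and estimates the number of training sequences per epoch. It assumes the reported loss is the mean token-level cross-entropy in nats, and that training ran on a single device without gradient accumulation; neither is stated in this card.

```python
import math

# (epoch, step, validation_loss) rows copied from the results table
RESULTS = [(1, 15323, 2.3348), (2, 30646, 1.8377), (3, 45969, 1.6498)]
TRAIN_BATCH_SIZE = 16  # train_batch_size from the hyperparameter list


def perplexity(loss):
    """Perplexity for a mean cross-entropy loss given in nats."""
    return math.exp(loss)


# Steps in one epoch times the batch size gives an upper bound on the
# number of training sequences (the last batch may be partial).
steps_per_epoch = RESULTS[0][1]
approx_train_sequences = steps_per_epoch * TRAIN_BATCH_SIZE

if __name__ == "__main__":
    for epoch, step, val_loss in RESULTS:
        print(f"epoch {epoch}: validation perplexity {perplexity(val_loss):.2f}")
    print("approximate training sequences per epoch:", approx_train_sequences)
```

Under these assumptions, the validation perplexity drops from roughly 10.3 after the first epoch to roughly 5.2 after the third.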


	### Framework versions

	- Transformers 4.26.1
	- PyTorch 1.13.0
	- Datasets 2.1.0
	- Tokenizers 0.13.2