bookbot/gpt2-indo-medium-kids-stories

GPT-2 Indonesian Medium Kids Stories

GPT-2 Indonesian Medium Kids Stories is a causal language model based on OpenAI's GPT-2 architecture. It starts from the pre-trained GPT2 Medium Indonesian model and was fine-tuned on Indonesian kids' stories from Room To Read and Let's Read.

10% of the dataset was held out for evaluation. On this held-out set, the fine-tuned model achieved an evaluation loss of 3.579 and an evaluation perplexity of 35.84.
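The two evaluation figures are linked: perplexity is the exponential of the mean cross-entropy loss. The sketch below recomputes it from the epoch 3 validation loss in the results table, assuming the loss is reported in nats (natural logarithm), which this card does not state explicitly. The variable names are illustrative only.

```python
import math

# Validation loss after the final epoch, copied from the results table.
# Assumption: the loss is a mean cross-entropy measured in nats.
eval_loss = 3.578999

# Perplexity is exp(mean cross-entropy).
perplexity = math.exp(eval_loss)

print(f"evaluation loss: {eval_loss:.3f}")
print(f"perplexity:      {perplexity:.2f}")
```

The result should agree with the reported perplexity of 35.84 up to rounding.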

The model was trained with Hugging Face's Trainer class from the Transformers library. PyTorch was the backend framework during training, but the model remains compatible with other frameworks.

Model

Model	#params	Arch.	Training/Validation data (text)
`gpt2-indo-medium-kids-stories`	345M	GPT2 Medium	Indonesian Kids' Stories (860 KB)

Evaluation Results

The model was fine-tuned for 3 epochs.

Epoch	Training Loss	Validation Loss
1	3.909100	3.627678
2	3.375300	3.562854
3	3.113300	3.578999
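As a reading aid for the table, the short sketch below finds the epoch with the lowest validation loss and the gap between validation and training loss at each epoch. The numbers are copied from the table; the variable names are illustrative, and nothing here is taken from the original training script.

```python
# (epoch, training loss, validation loss), copied from the table above
history = [
    (1, 3.909100, 3.627678),
    (2, 3.375300, 3.562854),
    (3, 3.113300, 3.578999),
]

# Epoch with the lowest validation loss
best_epoch, _, best_val_loss = min(history, key=lambda row: row[2])

# Validation minus training loss per epoch; a growing positive gap
# is a common sign that the model is starting to overfit.
gaps = {epoch: val - train for epoch, train, val in history}

print(f"lowest validation loss: {best_val_loss:.6f} at epoch {best_epoch}")
for epoch, gap in gaps.items():
    print(f"epoch {epoch}: validation - training = {gap:+.6f}")
```

In these numbers the validation loss is lowest after epoch 2, while the training loss keeps falling through epoch 3, which is worth keeping in mind on a dataset of this size (860 KB).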

How to Use (PyTorch)

As Causal Language Model

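from transformers import pipeline

pretrained_name = "bookbot/gpt2-indo-medium-kids-stories"

# Build a text-generation pipeline from the fine-tuned checkpoint
nlp = pipeline(
    "text-generation",
    model=pretrained_name,
    tokenizer=pretrained_name
)

# The pipeline returns a list of dicts, one per generated sequence
results = nlp("Archie sedang mengendarai roket ke planet Mars.")
print(results[0]["generated_text"])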

Feature Extraction in PyTorch

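import torch
from transformers import GPT2Model, GPT2TokenizerFast

pretrained_name = "bookbot/gpt2-indo-medium-kids-stories"
# GPT2Model loads the transformer without the language-modeling head,
# so the output holds hidden states rather than vocabulary logits
model = GPT2Model.from_pretrained(pretrained_name)
tokenizer = GPT2TokenizerFast.from_pretrained(pretrained_name)

prompt = "Archie sedang mengendarai roket ke planet Mars."
encoded_input = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():  # inference only, no gradients needed
    output = model(**encoded_input)

# One feature vector per input token: (batch, sequence_length, hidden_size)
features = output.last_hidden_state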

Disclaimer

Consider the biases carried over from both the pre-trained GPT-2 model and the Indonesian Kids' Stories dataset, which may surface in this model's output.

Author

GPT-2 Indonesian Medium Kids Stories was trained and evaluated by Wilson Wongso. All computation and development were done on Google Colaboratory using its free GPU access.