GPT-2 Indonesian Medium Kids Stories is a causal language model based on the OpenAI GPT-2 model. The model was originally the pre-trained GPT2 Medium Indonesian model, which was then fine-tuned on Indonesian kids' stories from Room To Read and Let's Read.
10% of the dataset was kept for evaluation purposes. The pre-trained model was fine-tuned and achieved an evaluation loss of 3.579 and an evaluation perplexity of 35.84.
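As a quick sanity check, perplexity is the exponential of the cross-entropy loss, so the two reported evaluation numbers are consistent with each other:

```python
import math

eval_loss = 3.579                  # reported evaluation loss
perplexity = math.exp(eval_loss)   # perplexity = exp(cross-entropy loss)
print(round(perplexity, 2))        # → 35.84, matching the reported value
```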
The `Trainer` class from the Transformers library was used to train the model. PyTorch served as the backend framework during training, but the resulting model remains compatible with other frameworks supported by Transformers.
| Model                           | #params | Arch.       | Training/Validation data (text)   |
| ------------------------------- | ------- | ----------- | --------------------------------- |
| `gpt2-indo-medium-kids-stories` | 345M    | GPT2 Medium | Indonesian Kids' Stories (860 KB) |
The model was fine-tuned for 3 epochs.
| Epoch | Training Loss | Validation Loss |
| ----- | ------------- | --------------- |
```python
from transformers import pipeline

pretrained_name = "bookbot/gpt2-indo-medium-kids-stories"

nlp = pipeline(
    "text-generation",
    model=pretrained_name,
    tokenizer=pretrained_name
)

nlp("Archie sedang mengendarai roket ke planet Mars.")
```
```python
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

pretrained_name = "bookbot/gpt2-indo-medium-kids-stories"

model = GPT2LMHeadModel.from_pretrained(pretrained_name)
tokenizer = GPT2TokenizerFast.from_pretrained(pretrained_name)

prompt = "Archie sedang mengendarai roket ke planet Mars."
encoded_input = tokenizer(prompt, return_tensors="pt")
output = model(**encoded_input)
```
Do consider the biases that come from both the pre-trained GPT-2 model and the Indonesian Kids' Stories dataset, as they may be carried over into this model's outputs.
GPT-2 Indonesian Medium Kids Stories was trained and evaluated by Wilson Wongso. All computation and development were done on Google Colaboratory using their free GPU access.