GPT-2 Base Thai is a causal language model based on the OpenAI GPT-2 architecture. It was trained from scratch on the `unshuffled_deduplicated_th` subset of the OSCAR dataset, reaching an evaluation loss of 1.708 and an evaluation perplexity of 5.516.
The model was trained using Hugging Face's Flax framework as part of the JAX/Flax Community Week organized by Hugging Face. All training was done on a TPUv3-8 VM sponsored by the Google Cloud team.
|Model|#params|Arch.|Training/Validation data (text)|
|---|---|---|---|
|`gpt2-base-thai`|124M|GPT-2|`unshuffled_deduplicated_th` subset of the OSCAR dataset|
The model was trained for 3 epochs; the final results at the end of training are shown below.
|train loss|valid loss|valid PPL|total time|
|---|---|---|---|
| |1.708|5.516| |
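As a sanity check, the perplexity of a causal language model is the exponential of its cross-entropy loss, so the two reported numbers should agree with each other. A minimal check (the small gap comes from the loss being rounded to three decimals):

```python
import math

# Perplexity of a causal language model is exp(cross-entropy loss).
eval_loss = 1.708  # reported evaluation loss
perplexity = math.exp(eval_loss)
print(round(perplexity, 3))  # close to the reported 5.516, up to rounding of the loss
```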
The model can be used directly with a text-generation pipeline:

```python
from transformers import pipeline

pretrained_name = "flax-community/gpt2-base-thai"

nlp = pipeline(
    "text-generation",
    model=pretrained_name,
    tokenizer=pretrained_name
)

nlp("สวัสดีตอนเช้า")  # "Good morning"
```
To extract hidden-state features from the raw model in PyTorch:

```python
from transformers import GPT2Model, GPT2TokenizerFast

pretrained_name = "flax-community/gpt2-base-thai"
model = GPT2Model.from_pretrained(pretrained_name)
tokenizer = GPT2TokenizerFast.from_pretrained(pretrained_name)

prompt = "สวัสดีตอนเช้า"  # "Good morning"
encoded_input = tokenizer(prompt, return_tensors='pt')
output = model(**encoded_input)
```