Updated README with the latest results
README.md
## Javanese GPT-2 Small IMDB

Javanese GPT-2 Small IMDB is a causal language model based on the [GPT-2 model](https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf). It was trained on Javanese IMDB movie reviews.

The model was originally the pretrained [Javanese GPT-2 Small model](https://huggingface.co/w11wo/javanese-gpt2-small) and was later fine-tuned on the Javanese IMDB movie review dataset. It achieved a perplexity of 60.54 on the validation dataset. Many of the techniques used are based on a Hugging Face tutorial [notebook](https://github.com/huggingface/notebooks/blob/master/examples/language_modeling.ipynb) written by [Sylvain Gugger](https://github.com/sgugger).

Hugging Face's `Trainer` class from the [Transformers](https://huggingface.co/transformers) library was used to train the model. PyTorch was used as the backend framework during training, but the model remains compatible with TensorFlow nonetheless.
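The tutorial notebook linked above prepares data for causal language modeling by tokenizing the texts, concatenating them, and splitting the result into fixed-length blocks. A minimal sketch of that chunking step (the function name and block size follow the notebook; whether this exact preprocessing was used for this model is an assumption):

```python
def group_texts(examples, block_size=128):
    """Concatenate tokenized texts and split them into fixed-size blocks."""
    # Flatten all token-id lists into one long sequence.
    concatenated = sum(examples["input_ids"], [])
    # Drop the remainder so every block has exactly `block_size` tokens.
    total_length = (len(concatenated) // block_size) * block_size
    input_ids = [
        concatenated[i : i + block_size]
        for i in range(0, total_length, block_size)
    ]
    # For causal language modeling the labels are the inputs themselves;
    # the model shifts them one position internally when computing the loss.
    return {"input_ids": input_ids, "labels": [ids[:] for ids in input_ids]}


# Toy example with already-tokenized ids:
batch = group_texts({"input_ids": [[1, 2, 3], [4, 5, 6, 7]]}, block_size=2)
print(batch["input_ids"])  # [[1, 2], [3, 4], [5, 6]]
```

Note that the trailing token (`7`) is dropped rather than padded, which keeps every training block full-length.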
The model was trained for 5 epochs and the following is the final result once the training ended.

| train loss | valid loss | perplexity | total time |
|------------|------------|------------|------------|
| 4.135      | 4.103      | 60.54      | 6:22:40    |
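Perplexity is simply the exponential of the cross-entropy validation loss, so the two right-hand columns of the table can be sanity-checked against each other:

```python
import math

valid_loss = 4.103             # validation loss from the table above
perplexity = math.exp(valid_loss)
print(f"{perplexity:.2f}")     # ≈ 60.5, consistent with the reported 60.54
```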
## How to Use (PyTorch)
### As Causal Language Model
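A minimal sketch of loading the model for generation with the Transformers `pipeline` API. The repository ID below is an assumption based on the base model's naming scheme; check the Hugging Face Hub for the exact name:

```python
from transformers import pipeline

# NOTE: the model ID is assumed from the base model's naming scheme.
generator = pipeline(
    "text-generation",
    model="w11wo/javanese-gpt2-small-imdb",
)

# Generate a continuation of a short Javanese prompt.
print(generator("Film iki", max_new_tokens=30)[0]["generated_text"])
```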