Updated README with the latest results
README.md
## Javanese GPT-2 Small IMDB

Javanese GPT-2 Small IMDB is a causal language model based on the [GPT-2 model](https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf). It was trained on Javanese IMDB movie reviews.

The model was originally the pretrained [Javanese GPT-2 Small model](https://huggingface.co/w11wo/javanese-gpt2-small) and was later fine-tuned on the Javanese IMDB movie review dataset. It achieved a perplexity of 60.54 on the validation dataset. Many of the techniques used are based on a Hugging Face tutorial [notebook](https://github.com/huggingface/notebooks/blob/master/examples/language_modeling.ipynb) written by [Sylvain Gugger](https://github.com/sgugger).

Hugging Face's `Trainer` class from the [Transformers](https://huggingface.co/transformers) library was used to train the model. PyTorch was used as the backend framework during training, but the model remains compatible with TensorFlow nonetheless.
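The tutorial notebook linked above prepares data for causal language modeling by tokenizing the texts, concatenating them, and splitting the result into fixed-length blocks. A minimal sketch of that chunking step (the function name and block size follow the notebook; whether this exact preprocessing was used for this model is an assumption):

```python
def group_texts(examples, block_size=128):
    """Concatenate tokenized texts and split them into fixed-size blocks."""
    # Flatten all token-id lists into one long sequence.
    concatenated = sum(examples["input_ids"], [])
    # Drop the remainder so every block has exactly `block_size` tokens.
    total_length = (len(concatenated) // block_size) * block_size
    input_ids = [
        concatenated[i : i + block_size]
        for i in range(0, total_length, block_size)
    ]
    # For causal language modeling the labels are the inputs themselves;
    # the model shifts them one position internally when computing the loss.
    return {"input_ids": input_ids, "labels": [ids[:] for ids in input_ids]}


# Toy example with already-tokenized ids:
batch = group_texts({"input_ids": [[1, 2, 3], [4, 5, 6, 7]]}, block_size=2)
print(batch["input_ids"])  # [[1, 2], [3, 4], [5, 6]]
```

Note that the trailing token (`7`) is dropped rather than padded, which keeps every training block full-length.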
The model was trained for 5 epochs and the following is the final result once the training ended.

| train loss | valid loss | perplexity | total time |
|------------|------------|------------|------------|
| 4.135      | 4.103      | 60.54      | 6:22:40    |
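Perplexity is simply the exponential of the cross-entropy validation loss, so the two right-hand columns of the table can be sanity-checked against each other:

```python
import math

valid_loss = 4.103             # validation loss from the table above
perplexity = math.exp(valid_loss)
print(f"{perplexity:.2f}")     # ≈ 60.5, consistent with the reported 60.54
```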
## How to Use (PyTorch)
### As Causal Language Model
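A minimal sketch of loading the model for generation with the Transformers `pipeline` API. The repository ID below is an assumption based on the base model's naming scheme; check the Hugging Face Hub for the exact name:

```python
from transformers import pipeline

# NOTE: the model ID is assumed from the base model's naming scheme.
generator = pipeline(
    "text-generation",
    model="w11wo/javanese-gpt2-small-imdb",
)

# Generate a continuation of a short Javanese prompt.
print(generator("Film iki", max_new_tokens=30)[0]["generated_text"])
```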