w11wo committed
Commit: e408ae3
1 parent: eba22c1

Updated README with the latest results

Files changed (1):
README.md (+2 -2)
README.md CHANGED
@@ -12,7 +12,7 @@ widget:
 ## Javanese GPT-2 Small IMDB
 Javanese GPT-2 Small IMDB is a causal language model based on the [GPT-2 model](https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf). It was trained on Javanese IMDB movie reviews.
 
-The model was originally the pretrained [Javanese GPT-2 Small model](https://huggingface.co/w11wo/javanese-gpt2-small) and is later fine-tuned on the Javanese IMDB movie review dataset. It achieved a perplexity of 55.09 on the validation dataset. Many of the techniques used are based on a Hugging Face tutorial [notebook](https://github.com/huggingface/notebooks/blob/master/examples/language_modeling.ipynb) written by [Sylvain Gugger](https://github.com/sgugger).
+The model was originally the pretrained [Javanese GPT-2 Small model](https://huggingface.co/w11wo/javanese-gpt2-small) and was later fine-tuned on the Javanese IMDB movie review dataset. It achieved a perplexity of 60.54 on the validation dataset. Many of the techniques used are based on a Hugging Face tutorial [notebook](https://github.com/huggingface/notebooks/blob/master/examples/language_modeling.ipynb) written by [Sylvain Gugger](https://github.com/sgugger).
 
 Hugging Face's `Trainer` class from the [Transformers](https://huggingface.co/transformers) library was used to train the model. PyTorch was used as the backend framework during training, but the model remains compatible with TensorFlow.
 
@@ -26,7 +26,7 @@ The model was trained for 5 epochs and the following is the final result once th
 
 | train loss | valid loss | perplexity | total time |
 |------------|------------|------------|------------|
-| 3.789 | 4.008 | 55.09 | 1:9:57 |
+| 4.135 | 4.103 | 60.54 | 6:22:40 |
 
 ## How to Use (PyTorch)
 ### As Causal Language Model
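
A note on the numbers: the perplexity column in both versions of the table is the exponential of the validation loss, so the old and new rows are each internally consistent. A quick check (illustrative, not part of the commit):

```python
import math

# Perplexity of a causal language model is exp(cross-entropy loss).
print(math.exp(4.008))  # ~55.04, matching the old row's 55.09 (the loss is shown rounded)
print(math.exp(4.103))  # ~60.52, matching the new row's 60.54 (the loss is shown rounded)
```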
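
The README credits Hugging Face's `Trainer` and the linked tutorial notebook for the training setup. The sketch below is a minimal reconstruction of that recipe, not the author's actual script: the data file name and block size are placeholders, and only the 5-epoch count comes from the README.

```python
from datasets import load_dataset
from transformers import GPT2LMHeadModel, GPT2TokenizerFast, Trainer, TrainingArguments

# Start from the pretrained checkpoint that the README says was fine-tuned.
checkpoint = "w11wo/javanese-gpt2-small"
tokenizer = GPT2TokenizerFast.from_pretrained(checkpoint)
model = GPT2LMHeadModel.from_pretrained(checkpoint)

# "javanese_imdb_train.txt" is a placeholder; the review files are not part of this commit.
raw = load_dataset("text", data_files={"train": "javanese_imdb_train.txt"})
tokenized = raw.map(lambda b: tokenizer(b["text"]), batched=True, remove_columns=["text"])

block_size = 128  # illustrative; the commit does not state the block size

def group_texts(examples):
    # Concatenate all tokens, then split into fixed-size blocks
    # (the grouping trick used in the tutorial notebook the README links).
    ids = sum(examples["input_ids"], [])
    total = (len(ids) // block_size) * block_size
    blocks = [ids[i : i + block_size] for i in range(0, total, block_size)]
    # For causal LM training, labels are the input ids; the model shifts them internally.
    return {"input_ids": blocks, "labels": [list(b) for b in blocks]}

lm_data = tokenized.map(group_texts, batched=True, remove_columns=tokenized["train"].column_names)

args = TrainingArguments(output_dir="javanese-gpt2-small-imdb", num_train_epochs=5)  # 5 epochs per the README
Trainer(model=model, args=args, train_dataset=lm_data["train"]).train()
```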
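
The diff stops at the "How to Use (PyTorch)" heading, so the usage snippet itself is not shown in this hunk. A minimal causal-generation sketch, assuming the model is published under the Hub id `w11wo/javanese-gpt2-small-imdb` (inferred from the author and model name, not stated in the commit); the prompt is an illustrative Javanese phrase:

```python
from transformers import pipeline

# Assumed Hub id for this fine-tuned model; not stated in the commit itself.
generator = pipeline("text-generation", model="w11wo/javanese-gpt2-small-imdb")

# "Film iki apik banget" ~ "This film is very good" (illustrative prompt).
print(generator("Film iki apik banget", max_length=40)[0]["generated_text"])
```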