Commit af43d88 (1 parent: 0b8284a), committed by w11wo

Updated README according to new model's results

Files changed (1): README.md (+3 -3)
README.md CHANGED
@@ -12,7 +12,7 @@ widget:
 ## Javanese BERT Small
 Javanese BERT Small is a masked language model based on the [BERT model](https://arxiv.org/abs/1810.04805). It was trained on the latest (late December 2020) Javanese Wikipedia articles.
 
-The model was originally HuggingFace's pretrained [English BERT model](https://huggingface.co/bert-base-uncased) and is later fine-tuned on the Javanese dataset. It achieved a perplexity of 49.43 on the validation dataset (20% of the articles). Many of the techniques used are based on a Hugging Face tutorial [notebook](https://github.com/huggingface/notebooks/blob/master/examples/language_modeling.ipynb) written by [Sylvain Gugger](https://github.com/sgugger), and [fine-tuning tutorial notebook](https://github.com/piegu/fastai-projects/blob/master/finetuning-English-GPT2-any-language-Portuguese-HuggingFace-fastaiv2.ipynb) written by [Pierre Guillou](https://huggingface.co/pierreguillou).
+The model was originally HuggingFace's pretrained [English BERT model](https://huggingface.co/bert-base-uncased) and is later fine-tuned on the Javanese dataset. It achieved a perplexity of 22.00 on the validation dataset (20% of the articles). Many of the techniques used are based on a Hugging Face tutorial [notebook](https://github.com/huggingface/notebooks/blob/master/examples/language_modeling.ipynb) written by [Sylvain Gugger](https://github.com/sgugger), and [fine-tuning tutorial notebook](https://github.com/piegu/fastai-projects/blob/master/finetuning-English-GPT2-any-language-Portuguese-HuggingFace-fastaiv2.ipynb) written by [Pierre Guillou](https://huggingface.co/pierreguillou).
 
 Hugging Face's [Transformers]((https://huggingface.co/transformers)) library was used to train the model -- utilizing the base BERT model and their `Trainer` class. PyTorch was used as the backend framework during training, but the model remains compatible with TensorFlow nonetheless.
 
@@ -22,11 +22,11 @@ Hugging Face's [Transformers]((https://huggingface.co/transformers)) library was
 | `javanese-bert-small` | 110M | BERT Small | Javanese Wikipedia (319 MB of text) |
 
 ## Evaluation Results
-The model was trained for 15 epochs and the following is the final result once the training ended.
+The model was trained for 5 epochs and the following is the final result once the training ended.
 
 | train loss | valid loss | perplexity | total time |
 |------------|------------|------------|------------|
-| 3.918 | 3.900 | 49.43 | 5:19:36 |
+| 3.116 | 3.091 | 22.00 | 2:7:42 |
 
 ## How to Use
 ### As Masked Language Model
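
The diff describes the training setup only in prose: the `bert-base-uncased` checkpoint fine-tuned as a masked language model on a Javanese Wikipedia dump, using the Transformers `Trainer` class with a PyTorch backend. Below is a minimal sketch of that recipe, not the author's actual training script; the dataset path, tokenization settings, and batch size are assumptions, while the 5 epochs and the 80/20 split come from the README text.

```python
# Sketch of the MLM fine-tuning recipe described in the README diff.
# Hypothetical data file and hyperparameters except where noted.
from datasets import load_dataset
from transformers import (
    BertTokenizerFast,
    BertForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

# Assumed local text dump of Javanese Wikipedia, split 80/20 as in the README.
raw = load_dataset("text", data_files={"train": "jv_wiki.txt"})["train"]
raw = raw.train_test_split(test_size=0.2)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])

# Standard 15% token masking for BERT-style MLM training.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="javanese-bert-small",
    num_train_epochs=5,               # per the updated README
    per_device_train_batch_size=16,   # assumed
    evaluation_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    data_collator=collator,
)
trainer.train()
```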
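The README also notes that the PyTorch-trained model "remains compatible with TensorFlow". In Transformers that amounts to loading the PyTorch weights into the corresponding TF model class; a minimal sketch, assuming the checkpoint is published under the `w11wo/javanese-bert-small` Hub id (an assumption, not stated in the diff):

```python
from transformers import TFBertForMaskedLM

# Assumed Hub id; from_pt=True converts the PyTorch weights on load and can
# be dropped if TensorFlow weights were also pushed to the Hub.
tf_model = TFBertForMaskedLM.from_pretrained("w11wo/javanese-bert-small", from_pt=True)
```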
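The updated evaluation table is internally consistent: for `Trainer`-based MLM runs, perplexity is conventionally reported as the exponential of the validation loss, and that relation reproduces both the old and the new figures up to rounding of the reported losses.

```python
import math

# Perplexity as exp(validation loss); matches the table up to rounding.
print(math.exp(3.900))  # ~49.4, the old result (reported as 49.43)
print(math.exp(3.091))  # ~22.0, the new result (reported as 22.00)
```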
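The diff cuts off right after the "As Masked Language Model" heading, so the README's actual usage snippet is not shown here. A typical fill-mask example for a BERT MLM checkpoint would look like the sketch below; the Hub id and the Javanese prompt are illustrative assumptions.

```python
from transformers import pipeline

# Assumed Hub id for the checkpoint described in the diff.
fill_mask = pipeline("fill-mask", model="w11wo/javanese-bert-small")

# Illustrative Javanese prompt; [MASK] is BERT's mask token.
for prediction in fill_mask("Aku lagi mangan sega ing [MASK]."):
    print(prediction["token_str"], prediction["score"])
```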