Updated README according to new model's results
README.md
CHANGED
@@ -12,7 +12,7 @@ widget:
## Javanese BERT Small
Javanese BERT Small is a masked language model based on the [BERT model](https://arxiv.org/abs/1810.04805). It was trained on the latest (late December 2020) Javanese Wikipedia articles.

-The model was originally Hugging Face's pretrained [English BERT model](https://huggingface.co/bert-base-uncased) and was later fine-tuned on the Javanese dataset. It achieved a perplexity of

Hugging Face's [Transformers](https://huggingface.co/transformers) library was used to train the model, using the base BERT model and the `Trainer` class. PyTorch was used as the backend framework during training, but the model remains compatible with TensorFlow.

@@ -22,11 +22,11 @@ Hugging Face's [Transformers]((https://huggingface.co/transformers)) library was
| `javanese-bert-small` | 110M | BERT Small | Javanese Wikipedia (319 MB of text) |

## Evaluation Results
-The model was trained for

| train loss | valid loss | perplexity | total time |
|------------|------------|------------|------------|
-| 3.

## How to Use
### As Masked Language Model

## Javanese BERT Small
Javanese BERT Small is a masked language model based on the [BERT model](https://arxiv.org/abs/1810.04805). It was trained on the latest (late December 2020) Javanese Wikipedia articles.

+The model was originally Hugging Face's pretrained [English BERT model](https://huggingface.co/bert-base-uncased) and was later fine-tuned on the Javanese dataset. It achieved a perplexity of 22.00 on the validation dataset (20% of the articles). Many of the techniques used are based on a Hugging Face tutorial [notebook](https://github.com/huggingface/notebooks/blob/master/examples/language_modeling.ipynb) written by [Sylvain Gugger](https://github.com/sgugger), and a [fine-tuning tutorial notebook](https://github.com/piegu/fastai-projects/blob/master/finetuning-English-GPT2-any-language-Portuguese-HuggingFace-fastaiv2.ipynb) written by [Pierre Guillou](https://huggingface.co/pierreguillou).

Hugging Face's [Transformers](https://huggingface.co/transformers) library was used to train the model, using the base BERT model and the `Trainer` class. PyTorch was used as the backend framework during training, but the model remains compatible with TensorFlow.

| `javanese-bert-small` | 110M | BERT Small | Javanese Wikipedia (319 MB of text) |

## Evaluation Results
+The model was trained for 5 epochs. The table below shows the final results at the end of training.

| train loss | valid loss | perplexity | total time |
|------------|------------|------------|------------|
+| 3.116 | 3.091 | 22.00 | 2:07:42 |
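
The perplexity in the results row is consistent with the exponential of the validation loss. The following is a minimal sketch of that relationship, assuming the reported loss is a mean per-token cross-entropy in nats (the usual PyTorch convention). The helper name `perplexity_from_loss` is hypothetical and is not part of the README or of any library.

```python
import math


def perplexity_from_loss(cross_entropy_loss: float) -> float:
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(cross_entropy_loss)


# Validation loss taken from the results table above.
valid_loss = 3.091
print(f"perplexity = {perplexity_from_loss(valid_loss):.2f}")
```

Under that assumption, a validation loss of 3.091 rounds to the perplexity of 22.00 reported in the table.
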

## How to Use
### As Masked Language Model