Galuh committed on
Commit 79140ed
1 Parent(s): 311c294

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -60,17 +60,17 @@ The training data used for this model has not been released as a dataset one can
  The model was trained on a combined dataset of [OSCAR](https://oscar-corpus.com/) and [mc4](https://huggingface.co/datasets/mc4) for the Indonesian language, with 29GB of data in total. The mc4 dataset was cleaned using [this script](https://github.com/Wikidepia/indonesian_datasets/blob/master/dump/mc4/cleanup.py) and we also only included links that were cited by IDWiki.
 
  ## Training procedure
- The model was trained on a TPUv3-8 VM provided by the Google Cloud team. The training duration was `4d 14h 50m 47s`.
+ The model was trained on a TPUv3-8 VM provided by the Google Cloud team. The training duration was `6d 3h 7m 26s`.
 
  ### Evaluation results
  The model achieves the following results without any fine-tuning (zero-shot):
 
  | dataset | train loss | eval loss | eval perplexity |
  | ---------- | ---------- | -------------- | ---------- |
- | ID OSCAR+mc4 (29GB) | 3.046 | 2.926 | 18.66 |
+ | ID OSCAR+mc4 (29GB) | 2.79 | 2.696 | 14.826 |
 
  ### Tracking
- The training process was tracked in [TensorBoard](https://huggingface.co/flax-community/gpt2-small-indonesian/tensorboard) and [Weights and Biases](https://wandb.ai/wandb/hf-flax-gpt2-indonesian?workspace=user-cahya).
+ The training process was tracked in [TensorBoard](https://huggingface.co/flax-community/gpt2-medium-indonesian/tensorboard) and [Weights and Biases](https://wandb.ai/wandb/hf-flax-gpt2-indonesian?workspace=user-cahya).
 
  ## Team members
  - Akmal ([@Wikidepia](https://huggingface.co/Wikidepia))
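A note on the cleaning step quoted above: besides the linked Wikidepia cleanup script, the README says mc4 was restricted to links cited by IDWiki. A minimal sketch of what such a URL filter could look like, assuming a precomputed list of cited URLs; the file name and the streaming `load_dataset` call here are illustrative, not taken from the actual pipeline:

```python
from datasets import load_dataset

# Hypothetical input: URLs cited as references on Indonesian Wikipedia (IDWiki),
# one per line. The real list and filter live in the project's cleanup tooling.
with open("idwiki_cited_urls.txt") as f:
    idwiki_cited_urls = {line.strip() for line in f}

# mc4 documents carry "text", "timestamp", and "url" fields; streaming avoids
# downloading the full split up front.
mc4_id = load_dataset("mc4", "id", split="train", streaming=True)

# Keep only documents whose source URL was cited by IDWiki.
filtered = (doc for doc in mc4_id if doc["url"] in idwiki_cited_urls)
```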
 
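The updated metrics are internally consistent: for both the old and the new row of the results table, the eval perplexity is the exponential of the eval loss. A quick check:

```python
import math

# Eval losses from the old and new table rows.
for eval_loss in (2.926, 2.696):
    # Perplexity is conventionally exp(mean cross-entropy loss).
    print(round(math.exp(eval_loss), 3))
# Prints about 18.65 and 14.82, matching the table's 18.66 and 14.826 up to rounding.
```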