Galuh committed on
Commit 79140ed
1 Parent(s): 311c294

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -60,17 +60,17 @@ The training data used for this model has not been released as a dataset one can
  The model was trained on a combined dataset of [OSCAR](https://oscar-corpus.com/) and [mc4](https://huggingface.co/datasets/mc4) for the Indonesian language, with 29GB of data in total. The mc4 dataset was cleaned using [this script](https://github.com/Wikidepia/indonesian_datasets/blob/master/dump/mc4/cleanup.py) and we also only included links that were cited by IDWiki.
 
  ## Training procedure
- The model was trained on a TPUv3-8 VM provided by the Google Cloud team. The training duration was `4d 14h 50m 47s`.
+ The model was trained on a TPUv3-8 VM provided by the Google Cloud team. The training duration was `6d 3h 7m 26s`.
 
  ### Evaluation results
  The model achieves the following results without any fine-tuning (zero-shot):
 
  | dataset | train loss | eval loss | eval perplexity |
  | ---------- | ---------- | -------------- | ---------- |
- | ID OSCAR+mc4 (29GB) | 3.046 | 2.926 | 18.66 |
+ | ID OSCAR+mc4 (29GB) | 2.79 | 2.696 | 14.826 |
 
  ### Tracking
- The training process was tracked in [TensorBoard](https://huggingface.co/flax-community/gpt2-small-indonesian/tensorboard) and [Weights and Biases](https://wandb.ai/wandb/hf-flax-gpt2-indonesian?workspace=user-cahya).
+ The training process was tracked in [TensorBoard](https://huggingface.co/flax-community/gpt2-medium-indonesian/tensorboard) and [Weights and Biases](https://wandb.ai/wandb/hf-flax-gpt2-indonesian?workspace=user-cahya).
 
  ## Team members
  - Akmal ([@Wikidepia](https://huggingface.co/Wikidepia))
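A note on the cleaning step quoted above: besides the linked Wikidepia cleanup script, the README says mc4 was restricted to links cited by IDWiki. A minimal sketch of what such a URL filter could look like, assuming a precomputed list of cited URLs; the file name and the streaming `load_dataset` call here are illustrative, not taken from the actual pipeline:

```python
from datasets import load_dataset

# Hypothetical input: URLs cited as references on Indonesian Wikipedia (IDWiki),
# one per line. The real list and filter live in the project's cleanup tooling.
with open("idwiki_cited_urls.txt") as f:
    idwiki_cited_urls = {line.strip() for line in f}

# mc4 documents carry "text", "timestamp", and "url" fields; streaming avoids
# downloading the full split up front.
mc4_id = load_dataset("mc4", "id", split="train", streaming=True)

# Keep only documents whose source URL was cited by IDWiki.
filtered = (doc for doc in mc4_id if doc["url"] in idwiki_cited_urls)
```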
 
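The updated metrics are internally consistent: for both the old and the new row of the results table, the eval perplexity is the exponential of the eval loss. A quick check:

```python
import math

# Eval losses from the old and new table rows.
for eval_loss in (2.926, 2.696):
    # Perplexity is conventionally exp(mean cross-entropy loss).
    print(round(math.exp(eval_loss), 3))
# Prints about 18.65 and 14.82, matching the table's 18.66 and 14.826 up to rounding.
```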