nhanv committed on
Commit
eae29b9
1 Parent(s): b3416f2

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -35,7 +35,7 @@ A 12-layer, 768-hidden-size transformer-based language model.
  # Training
  The model was trained on Vietnamese Oscar dataset (32 GB) to optimize a traditional language modelling objective on v3-8 TPU for around 6 days. It reaches around 13.4 perplexity on a chosen validation set from Oscar.
 
- ### GPT-2 Fineturning
+ ### GPT-2 Finetuning
 
  The following example fine-tunes GPT-2 on WikiText-2. We're using the raw WikiText-2 (no tokens were replaced before
  the tokenization). The loss here is that of causal language modeling.
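
The README excerpt above introduces a fine-tuning example but the diff context ends before the example itself. As a rough illustration only, here is a minimal sketch of such a run using the Hugging Face `transformers` Trainer on raw WikiText-2 with the causal language-modeling loss; the checkpoint name, hyperparameters, and output directory are placeholders, not values taken from the README.

```python
# Hedged sketch: fine-tune a GPT-2 checkpoint on raw WikiText-2 with a causal LM objective.
# All names and hyperparameters below are illustrative assumptions.
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForCausalLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "gpt2"  # assumption: substitute the Vietnamese GPT-2 checkpoint here
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Raw WikiText-2: no tokens replaced before tokenization.
raw = load_dataset("wikitext", "wikitext-2-raw-v1")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])
# Drop empty lines so every example has at least one token.
tokenized = tokenized.filter(lambda ex: len(ex["input_ids"]) > 0)

# mlm=False yields the causal language-modeling loss described above.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="gpt2-wikitext2",       # placeholder output path
    per_device_train_batch_size=8,
    num_train_epochs=3,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=collator,
)
trainer.train()
```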