nhanv committed
Commit b3416f2
1 Parent(s): b8b43df

Update README.md

Files changed (1)
  1. README.md +19 -0
README.md CHANGED
@@ -34,3 +34,22 @@ A 12-layer, 768-hidden-size transformer-based language model.

# Training
The model was trained on the Vietnamese OSCAR dataset (32 GB) to optimize a traditional language-modelling objective on a v3-8 TPU for around 6 days. It reaches around 13.4 perplexity on a validation set drawn from OSCAR.
+
+ ### GPT-2 Fine-tuning
+
+ The following example fine-tunes GPT-2 on WikiText-2. We use the raw WikiText-2 (no tokens were replaced before
+ tokenization), and the loss is that of causal language modeling.
+
+ The training script is available [here](https://github.com/huggingface/transformers/blob/main/examples/pytorch/language-modeling/run_clm.py).
+
+ ```bash
+ python run_clm.py \
+     --model_name_or_path NlpHUST/gpt2-vietnamese \
+     --dataset_name wikitext \
+     --dataset_config_name wikitext-2-raw-v1 \
+     --per_device_train_batch_size 8 \
+     --per_device_eval_batch_size 8 \
+     --do_train \
+     --do_eval \
+     --output_dir /tmp/test-clm
+ ```
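
With `--do_eval`, `run_clm.py` reports an evaluation loss, and perplexity is `exp(loss)`, so the 13.4 perplexity quoted above corresponds to a cross-entropy loss of about 2.6. For plain inference with the pretrained checkpoint, here is a minimal sketch; the model id `NlpHUST/gpt2-vietnamese` is taken from the command above, while the prompt and generation settings are illustrative and not from the original README:

```python
# Minimal inference sketch (illustrative; not from the original README).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NlpHUST/gpt2-vietnamese"  # model id from the fine-tuning command above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Illustrative Vietnamese prompt; any text works.
prompt = "Việt Nam là quốc gia"
inputs = tokenizer(prompt, return_tensors="pt")

# Sample a short continuation; the sampling hyperparameters are
# illustrative defaults, not values from the model card.
outputs = model.generate(
    **inputs,
    max_new_tokens=40,
    do_sample=True,
    top_k=50,
    top_p=0.95,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```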