nhanv committed on
Commit
27f0101
1 Parent(s): 0fda306

Update README.md

Files changed (1):
  README.md +3 -0
README.md CHANGED
@@ -31,3 +31,6 @@ model = AutoModelForCausalLM.from_pretrained("nhanv/vi-gpt2")
 
 # Model architecture
 A 12-layer, 768-hidden-size transformer-based language model.
+
+# Training
+The model was trained on the Vietnamese portion of the OSCAR dataset (32 GB) with a standard causal language modelling objective on a TPU v3-8 for around 6 days. It reaches a perplexity of around 13.4 on a held-out validation set drawn from OSCAR.
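For context on the reported number: perplexity is simply the exponential of the mean cross-entropy loss (in nats) on the validation set. A minimal sketch of that relationship, using only the standard library (the helper name and the loss value here are illustrative, not taken from the model card):

```python
import math

def perplexity(mean_nll: float) -> float:
    """Convert a mean negative log-likelihood (nats/token) to perplexity."""
    return math.exp(mean_nll)

# A perplexity of ~13.4 corresponds to a mean loss of ln(13.4) ≈ 2.60 nats/token.
mean_loss = math.log(13.4)
print(round(perplexity(mean_loss), 1))  # → 13.4
```

In practice the mean loss would come from evaluating the model over the validation set; the conversion above is why a lower training loss directly translates to a lower reported perplexity.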