First version of the chinese_roberta_L-12_H-128 model and tokenizer.
README.md CHANGED
@@ -115,7 +115,7 @@ python3 preprocess.py --corpus_path corpora/cluecorpus.txt \
 ```
 ```
 python3 pretrain.py --dataset_path cluecorpus_seq512_dataset.pt \
-                    --pretrained_model_path
+                    --pretrained_model_path models/cluecorpus_roberta_l12h128_seq512_model.bin-1000000 \
                     --vocab_path models/google_zh_vocab.txt \
                     --config_path models/bert_l12h128_config.json \
                     --output_model_path models/cluecorpus_roberta_l12h128_seq512_model.bin \
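With the hunk applied, the seq512 pretraining run resumes from a saved checkpoint instead of passing an empty `--pretrained_model_path`. Assembled from the lines visible in this diff, the corrected command reads as below; the `-1000000` suffix appears to be the training-step suffix UER-py appends when saving checkpoints (an assumption based on the path shown, not stated in the diff), and any flags following `--output_model_path` are elided in the hunk and not reproduced here:

```shell
# Second-stage (seq512) pretraining, resuming from the step-1000000 checkpoint.
# Sketch reconstructed from the diff; further trailing flags are elided there.
python3 pretrain.py --dataset_path cluecorpus_seq512_dataset.pt \
                    --pretrained_model_path models/cluecorpus_roberta_l12h128_seq512_model.bin-1000000 \
                    --vocab_path models/google_zh_vocab.txt \
                    --config_path models/bert_l12h128_config.json \
                    --output_model_path models/cluecorpus_roberta_l12h128_seq512_model.bin
```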