Update README.md
README.md CHANGED

@@ -5,9 +5,9 @@ widget:
 - text: "最近一趟去北京的[MASK]几点发车"
 
 
-
 ---
 
+
 # Chinese word-based RoBERTa Miniatures
 
 ## Model description
@@ -144,7 +144,7 @@ python3 preprocess.py --corpus_path corpora/cluecorpussmall.txt \
                       --spm_model_path models/cluecorpussmall_spm.model \
                       --dataset_path cluecorpussmall_word_seq128_dataset.pt \
                       --processes_num 32 --seq_length 128 \
-                      --dynamic_masking --target mlm
+                      --dynamic_masking --data_processor mlm
 ```
 
 ```
@@ -155,7 +155,7 @@ python3 pretrain.py --dataset_path cluecorpussmall_word_seq128_dataset.pt \
                     --world_size 8 --gpu_ranks 0 1 2 3 4 5 6 7 \
                     --total_steps 1000000 --save_checkpoint_steps 100000 --report_steps 50000 \
                     --learning_rate 1e-4 --batch_size 64 \
-                    --embedding word_pos_seg --encoder transformer --mask fully_visible --target mlm
+                    --data_processor mlm --target mlm
 ```
 
 Stage2:
@@ -165,19 +165,19 @@ python3 preprocess.py --corpus_path corpora/cluecorpussmall.txt \
                       --spm_model_path models/cluecorpussmall_spm.model \
                       --dataset_path cluecorpussmall_word_seq512_dataset.pt \
                       --processes_num 32 --seq_length 512 \
-                      --dynamic_masking --target mlm
+                      --dynamic_masking --data_processor mlm
 ```
 
 ```
 python3 pretrain.py --dataset_path cluecorpussmall_word_seq512_dataset.pt \
-                    --pretrained_model_path models/cluecorpussmall_word_roberta_medium_seq128_model.bin-1000000 \
                     --spm_model_path models/cluecorpussmall_spm.model \
+                    --pretrained_model_path models/cluecorpussmall_word_roberta_medium_seq128_model.bin-1000000 \
                     --config_path models/bert/medium_config.json \
                     --output_model_path models/cluecorpussmall_word_roberta_medium_seq512_model.bin \
                     --world_size 8 --gpu_ranks 0 1 2 3 4 5 6 7 \
                     --total_steps 250000 --save_checkpoint_steps 50000 --report_steps 10000 \
                     --learning_rate 5e-5 --batch_size 16 \
-                    --embedding word_pos_seg --encoder transformer --mask fully_visible --target mlm
+                    --data_processor mlm --target mlm
 ```
 
 Finally, we convert the pre-trained model into Huggingface's format:
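
The conversion command itself falls outside the hunk context above. A minimal sketch of how this step is typically invoked with UER-py's conversion script; the checkpoint name and `--layers_num 8` (the medium config uses 8 layers) are inferred from the commands above, not shown in the diff:

```
# A sketch, assuming UER-py's repo layout; the input checkpoint name and
# --layers_num 8 are assumptions inferred from the seq512 stage above.
python3 scripts/convert_bert_from_uer_to_huggingface.py \
        --input_model_path models/cluecorpussmall_word_roberta_medium_seq512_model.bin-250000 \
        --output_model_path pytorch_model.bin \
        --layers_num 8 --type mlm
```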
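
After conversion, the widget sentence from the front matter ("What time does the next [MASK] to Beijing leave?") can double as a quick smoke test. A minimal sketch, assuming the published model id `uer/roberta-medium-word-chinese-cluecorpussmall`, which does not appear in the diff:

```
# Hypothetical smoke test: run the widget's example sentence through a
# fill-mask pipeline; the model id is an assumption, not from the diff.
python3 - <<'EOF'
from transformers import pipeline

# "What time does the next [MASK] to Beijing leave?"
unmasker = pipeline("fill-mask", model="uer/roberta-medium-word-chinese-cluecorpussmall")
for candidate in unmasker("最近一趟去北京的[MASK]几点发车"):
    print(candidate["token_str"], candidate["score"])
EOF
```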