uer committed
Commit 9c4c488
1 Parent(s): c98e41f

Update README.md

Files changed (1)
  1. README.md +25 -25
README.md CHANGED
@@ -140,51 +140,51 @@ Taking the case of word-based RoBERTa-Medium
  Stage1:
 
  ```
- python3 preprocess.py --corpus_path corpora/cluecorpussmall.txt \\\\
- --spm_model_path models/cluecorpussmall_spm.model \\\\
- --dataset_path cluecorpussmall_word_seq128_dataset.pt \\\\
- --processes_num 32 --seq_length 128 \\\\
+ python3 preprocess.py --corpus_path corpora/cluecorpussmall.txt \
+ --spm_model_path models/cluecorpussmall_spm.model \
+ --dataset_path cluecorpussmall_word_seq128_dataset.pt \
+ --processes_num 32 --seq_length 128 \
  --dynamic_masking --target mlm
  ```
 
  ```
- python3 pretrain.py --dataset_path cluecorpussmall_word_seq128_dataset.pt \\\\
- --spm_model_path models/cluecorpussmall_spm.model \\\\
- --config_path models/bert/medium_config.json \\\\
- --output_model_path models/cluecorpussmall_word_roberta_medium_seq128_model.bin \\\\
- --world_size 8 --gpu_ranks 0 1 2 3 4 5 6 7 \\\\
- --total_steps 1000000 --save_checkpoint_steps 100000 --report_steps 50000 \\\\
- --learning_rate 1e-4 --batch_size 64 \\\\
+ python3 pretrain.py --dataset_path cluecorpussmall_word_seq128_dataset.pt \
+ --spm_model_path models/cluecorpussmall_spm.model \
+ --config_path models/bert/medium_config.json \
+ --output_model_path models/cluecorpussmall_word_roberta_medium_seq128_model.bin \
+ --world_size 8 --gpu_ranks 0 1 2 3 4 5 6 7 \
+ --total_steps 1000000 --save_checkpoint_steps 100000 --report_steps 50000 \
+ --learning_rate 1e-4 --batch_size 64 \
  --embedding word_pos_seg --encoder transformer --mask fully_visible --target mlm --tie_weights
  ```
 
  Stage2:
 
  ```
- python3 preprocess.py --corpus_path corpora/cluecorpussmall.txt \\\\
- --spm_model_path models/cluecorpussmall_spm.model \\\\
- --dataset_path cluecorpussmall_word_seq512_dataset.pt \\\\
- --processes_num 32 --seq_length 512 \\\\
+ python3 preprocess.py --corpus_path corpora/cluecorpussmall.txt \
+ --spm_model_path models/cluecorpussmall_spm.model \
+ --dataset_path cluecorpussmall_word_seq512_dataset.pt \
+ --processes_num 32 --seq_length 512 \
  --dynamic_masking --target mlm
  ```
 
  ```
- python3 pretrain.py --dataset_path cluecorpussmall_word_seq512_dataset.pt \\\\
- --pretrained_model_path models/cluecorpussmall_word_roberta_medium_seq128_model.bin-1000000 \\\\
- --spm_model_path models/cluecorpussmall_spm.model \\\\
- --config_path models/bert/medium_config.json \\\\
- --output_model_path models/cluecorpussmall_word_roberta_medium_seq512_model.bin \\\\
- --world_size 8 --gpu_ranks 0 1 2 3 4 5 6 7 \\\\
- --total_steps 250000 --save_checkpoint_steps 50000 --report_steps 10000 \\\\
- --learning_rate 5e-5 --batch_size 16 \\\\
+ python3 pretrain.py --dataset_path cluecorpussmall_word_seq512_dataset.pt \
+ --pretrained_model_path models/cluecorpussmall_word_roberta_medium_seq128_model.bin-1000000 \
+ --spm_model_path models/cluecorpussmall_spm.model \
+ --config_path models/bert/medium_config.json \
+ --output_model_path models/cluecorpussmall_word_roberta_medium_seq512_model.bin \
+ --world_size 8 --gpu_ranks 0 1 2 3 4 5 6 7 \
+ --total_steps 250000 --save_checkpoint_steps 50000 --report_steps 10000 \
+ --learning_rate 5e-5 --batch_size 16 \
  --embedding word_pos_seg --encoder transformer --mask fully_visible --target mlm --tie_weights
  ```
 
  Finally, we convert the pre-trained model into Huggingface's format:
 
  ```
- python3 scripts/convert_bert_from_uer_to_huggingface.py --input_model_path models/cluecorpussmall_word_roberta_medium_seq128_model.bin-250000 \\\\
- --output_model_path pytorch_model.bin \\\\
+ python3 scripts/convert_bert_from_uer_to_huggingface.py --input_model_path models/cluecorpussmall_word_roberta_medium_seq128_model.bin-250000 \
+ --output_model_path pytorch_model.bin \
  --layers_num 8 --target mlm
  ```
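
For context (not part of this commit): after the conversion step above, the checkpoint can be loaded from the transformers library. The snippet below is a minimal sketch under stated assumptions — the local directory name is hypothetical, the converted pytorch_model.bin is assumed to sit next to a config.json for the 8-layer medium model, and a SentencePiece-based tokenizer class (AlbertTokenizer here) is assumed to be an acceptable wrapper for cluecorpussmall_spm.model.

```
# Minimal sketch (not part of this commit): load the converted checkpoint and
# run a fill-mask sanity check with Hugging Face transformers.
# Assumptions: ./cluecorpussmall_word_roberta_medium/ is a hypothetical local
# directory holding the converted pytorch_model.bin plus a config.json for the
# 8-layer medium model; models/cluecorpussmall_spm.model is the SentencePiece
# model used in the commands above.
from transformers import AlbertTokenizer, BertForMaskedLM, pipeline

# AlbertTokenizer is chosen only because it wraps a SentencePiece vocabulary.
tokenizer = AlbertTokenizer(vocab_file="models/cluecorpussmall_spm.model")

# convert_bert_from_uer_to_huggingface.py writes a BERT-style state dict with
# the MLM head kept (--target mlm), so BertForMaskedLM matches the weights.
model = BertForMaskedLM.from_pretrained("./cluecorpussmall_word_roberta_medium")

unmasker = pipeline("fill-mask", model=model, tokenizer=tokenizer)
print(unmasker("[MASK]的首都是北京。"))
```

If the tokenizer's default special tokens differ from those used during pre-training, the predictions will be off; in that case pass the matching special tokens explicitly when constructing the tokenizer.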