uer committed
Commit d084870
1 Parent(s): a9d5d43

Update README.md

Files changed (1): README.md (+6, -11)
README.md CHANGED
@@ -7,6 +7,7 @@ widget:
 
 ---
 
+
 # Chinese T5
 
 ## Model description
@@ -54,7 +55,7 @@ python3 preprocess.py --corpus_path corpora/cluecorpussmall.txt \
                       --vocab_path models/google_zh_with_sentinel_vocab.txt \
                       --dataset_path cluecorpussmall_t5_seq128_dataset.pt \
                       --processes_num 32 --seq_length 128 \
-                      --dynamic_masking --target t5
+                      --dynamic_masking --data_processor t5
 ```
 
 ```
@@ -65,10 +66,7 @@ python3 pretrain.py --dataset_path cluecorpussmall_t5_seq128_dataset.pt \
                     --world_size 8 --gpu_ranks 0 1 2 3 4 5 6 7 \
                     --total_steps 1000000 --save_checkpoint_steps 100000 --report_steps 50000 \
                     --learning_rate 1e-3 --batch_size 64 \
-                    --span_masking --span_geo_prob 0.3 --span_max_length 5 \
-                    --embedding word --relative_position_embedding --remove_embedding_layernorm --tgt_embedding word \
-                    --encoder transformer --mask fully_visible --layernorm_positioning pre --decoder transformer \
-                    --target t5 --tie_weights
+                    --span_masking --span_geo_prob 0.3 --span_max_length 5
 
 ```
 
@@ -79,22 +77,19 @@ python3 preprocess.py --corpus_path corpora/cluecorpussmall.txt \
                       --vocab_path models/google_zh_with_sentinel_vocab.txt \
                       --dataset_path cluecorpussmall_t5_small_seq512_dataset.pt \
                       --processes_num 32 --seq_length 512 \
-                      --dynamic_masking --target t5
+                      --dynamic_masking --data_processor t5
 ```
 
 ```
 python3 pretrain.py --dataset_path cluecorpussmall_t5_seq512_dataset.pt \
-                    --pretrained_model_path models/cluecorpussmall_t5_small_seq128_model.bin-1000000 \
                     --vocab_path models/google_zh_with_sentinel_vocab.txt \
+                    --pretrained_model_path models/cluecorpussmall_t5_small_seq128_model.bin-1000000 \
                     --config_path models/t5/small_config.json \
                     --output_model_path models/cluecorpussmall_t5_small_seq512_model.bin \
                     --world_size 8 --gpu_ranks 0 1 2 3 4 5 6 7 \
                     --total_steps 250000 --save_checkpoint_steps 50000 --report_steps 10000 \
                     --learning_rate 5e-4 --batch_size 16 \
-                    --span_masking --span_geo_prob 0.3 --span_max_length 5 \
-                    --embedding word --relative_position_embedding --remove_embedding_layernorm --tgt_embedding word \
-                    --encoder transformer --mask fully_visible --layernorm_positioning pre --decoder transformer \
-                    --target t5 --tie_weights
+                    --span_masking --span_geo_prob 0.3 --span_max_length 5
 ```
 
 Finally, we convert the pre-trained model into Huggingface's format:
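The diff ends at the sentence introducing the conversion step, so the conversion command itself is outside this commit. For reference, UER-py ships conversion scripts for this step; a minimal sketch of the invocation follows, where the script name, the input checkpoint file, and --layers_num 6 (the small configuration) are assumptions rather than part of the commit:

```
python3 scripts/convert_t5_from_uer_to_huggingface.py --input_model_path models/cluecorpussmall_t5_small_seq512_model.bin-250000 \
                                                      --output_model_path pytorch_model.bin \
                                                      --layers_num 6 \
                                                      --type t5
```

The resulting pytorch_model.bin can then be loaded with Huggingface's transformers library.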