uer committed on
Commit
761550d
1 Parent(s): 12fafe3

Update README.md

Files changed (1)
  1. README.md +5 -11
README.md CHANGED
@@ -5,9 +5,9 @@ widget:
  - text: "作为电子extra0的平台,京东绝对是领先者。如今的刘强extra1已经是身价过extra2的老板。"
 
 
-
  ---
 
+
  # Chinese T5 Version 1.1
 
  ## Model description
@@ -61,7 +61,7 @@ python3 preprocess.py --corpus_path corpora/cluecorpussmall.txt \
  --vocab_path models/google_zh_with_sentinel_vocab.txt \
  --dataset_path cluecorpussmall_t5-v1_1_seq128_dataset.pt \
  --processes_num 32 --seq_length 128 \
- --dynamic_masking --target t5
+ --dynamic_masking --data_processor t5
  ```
 
  ```
@@ -72,10 +72,7 @@ python3 pretrain.py --dataset_path cluecorpussmall_t5-v1_1_seq128_dataset.pt \
  --world_size 8 --gpu_ranks 0 1 2 3 4 5 6 7 \
  --total_steps 1000000 --save_checkpoint_steps 100000 --report_steps 50000 \
  --learning_rate 1e-3 --batch_size 64 \
- --span_masking --span_geo_prob 0.3 --span_max_length 5 \
- --embedding word --relative_position_embedding --remove_embedding_layernorm --tgt_embedding word \
- --encoder transformer --mask fully_visible --layernorm_positioning pre \
- --feed_forward gated --decoder transformer --target t5
+ --span_masking --span_geo_prob 0.3 --span_max_length 5
  ```
 
  Stage2:
@@ -85,7 +82,7 @@ python3 preprocess.py --corpus_path corpora/cluecorpussmall.txt \
  --vocab_path models/google_zh_with_sentinel_vocab.txt \
  --dataset_path cluecorpussmall_t5-v1_1_seq512_dataset.pt \
  --processes_num 32 --seq_length 512 \
- --dynamic_masking --target t5
+ --dynamic_masking --data_processor t5
  ```
 
  ```
@@ -97,10 +94,7 @@ python3 pretrain.py --dataset_path cluecorpussmall_t5-v1_1_seq512_dataset.pt \
  --world_size 8 --gpu_ranks 0 1 2 3 4 5 6 7 \
  --total_steps 250000 --save_checkpoint_steps 50000 --report_steps 10000 \
  --learning_rate 5e-4 --batch_size 16 \
- --span_masking --span_geo_prob 0.3 --span_max_length 5 \
- --embedding word --relative_position_embedding --remove_embedding_layernorm --tgt_embedding word \
- --encoder transformer --mask fully_visible --layernorm_positioning pre \
- --feed_forward gated --decoder transformer --target t5
+ --span_masking --span_geo_prob 0.3 --span_max_length 5
  ```
 
  Finally, we convert the pre-trained model into Huggingface's format:
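
Note on the flag rename: both `preprocess.py` hunks in this commit replace `--target t5` with `--data_processor t5`. The snippet below is a minimal sketch for updating saved command lines, assuming the rename is a plain one-to-one substitution. The helper name `migrate_preprocess_flags` is hypothetical and not part of the repository. It should not be applied to `pretrain.py` commands, where this commit drops `--target t5` along with the architecture flags instead of renaming it.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not part of UER-py):
# rewrite an old-style preprocess.py command line so that it uses the flag
# name introduced by this commit.
migrate_preprocess_flags() {
  # Replace "--target <value>" with "--data_processor <value>".
  printf '%s\n' "$1" | sed -E 's/--target[[:space:]]+/--data_processor /g'
}

# Example: the last line of the old Stage1 preprocessing command.
migrate_preprocess_flags "python3 preprocess.py --dynamic_masking --target t5"
```

The substitution only matches `--target` followed by whitespace, so flags such as `--tgt_embedding` are left untouched.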