uer committed on
Commit e3ff021 (1 parent: b5fe765)

Update README.md

Files changed (1)
  1. README.md +3 -5
README.md CHANGED
@@ -72,9 +72,8 @@ python3 pretrain.py --dataset_path cluecorpussmall_t5-v1_1_seq128_dataset.pt \
  --learning_rate 1e-3 --batch_size 64 \
  --span_masking --span_geo_prob 0.3 --span_max_length 5 \
  --embedding word --relative_position_embedding --remove_embedding_layernorm --tgt_embedding word \
- --encoder transformer --mask fully_visible --layernorm_positioning pre \
- --feed_forward gated --decoder transformer \
- --target t5 --tie_weights
+ --encoder transformer --mask fully_visible --layernorm_positioning pre \
+ --feed_forward gated --decoder transformer --target t5
 
  ```
 
@@ -100,8 +99,7 @@ python3 pretrain.py --dataset_path cluecorpussmall_t5-v1_1_seq512_dataset.pt \
  --span_masking --span_geo_prob 0.3 --span_max_length 5 \
  --embedding word --relative_position_embedding --remove_embedding_layernorm --tgt_embedding word \
  --encoder transformer --mask fully_visible --layernorm_positioning pre \
- --feed_forward gated --decoder transformer \
- --target t5 --tie_weights
+ --feed_forward gated --decoder transformer --target t5
  ```
 
  Finally, we convert the pre-trained model into Huggingface's format: