Update README.md
README.md
@@ -72,9 +72,8 @@ python3 pretrain.py --dataset_path cluecorpussmall_t5-v1_1_seq128_dataset.pt \
 --learning_rate 1e-3 --batch_size 64 \
 --span_masking --span_geo_prob 0.3 --span_max_length 5 \
 --embedding word --relative_position_embedding --remove_embedding_layernorm --tgt_embedding word \
---encoder transformer --mask fully_visible --layernorm_positioning pre \
---feed_forward gated --decoder transformer
---target t5 --tie_weights
+--encoder transformer --mask fully_visible --layernorm_positioning pre \
+--feed_forward gated --decoder transformer --target t5
 
 ```
 
@@ -100,8 +99,7 @@ python3 pretrain.py --dataset_path cluecorpussmall_t5-v1_1_seq512_dataset.pt \
 --span_masking --span_geo_prob 0.3 --span_max_length 5 \
 --embedding word --relative_position_embedding --remove_embedding_layernorm --tgt_embedding word \
 --encoder transformer --mask fully_visible --layernorm_positioning pre \
---feed_forward gated --decoder transformer
---target t5 --tie_weights
+--feed_forward gated --decoder transformer --target t5
 ```
 
 Finally, we convert the pre-trained model into Huggingface's format: