Update README.md
README.md CHANGED
@@ -16,7 +16,7 @@ widget:
 ---
 # Randeng-T5-784M, one model of [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM).
-Based on mt5-large, Randeng-T5-784M only retains the vocabulary and embeddings corresponding to Chinese and English, and continues training on a 180 GB general Chinese pre-training corpus. The pre-training objective is span corruption. We pre-train the model with our [fengshen framework](https://github.com/IDEA-CCNL/Fengshenbang-LM/tree/main/fengshen), using 16 A100 GPUs for 98 hours.
+Based on mt5-large, Randeng-T5-784M only retains the vocabulary and embeddings corresponding to Chinese and English, and continues training on a 180 GB general Chinese pre-training corpus. Because pre-training continues from mt5-large, the tokenizer is the T5 tokenizer (SentencePiece). The pre-training objective is span corruption. We pre-train the model with our [fengshen framework](https://github.com/IDEA-CCNL/Fengshenbang-LM/tree/main/fengshen), using 16 A100 GPUs for 98 hours.
 ## Usage
 ```python
 from transformers import T5ForConditionalGeneration, AutoTokenizer
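For reference, a minimal usage sketch following the imports shown in the diff. It assumes the model is published on the Hugging Face Hub under the repository id `IDEA-CCNL/Randeng-T5-784M` (an assumption, not stated in this diff) and selects the slow SentencePiece-based T5 tokenizer, as the added sentence describes; the input text and generation settings are only illustrative.

```python
# Sketch only: load Randeng-T5-784M and fill a masked span, T5 span-corruption style.
# The Hub id below is an assumption; replace it if the actual repository name differs.
import torch
from transformers import T5ForConditionalGeneration, AutoTokenizer

model_id = "IDEA-CCNL/Randeng-T5-784M"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False)  # slow T5/SentencePiece tokenizer
model = T5ForConditionalGeneration.from_pretrained(model_id)

# <extra_id_0> is the T5 sentinel token marking the span the model should fill in.
inputs = tokenizer("北京是中国的<extra_id_0>。", return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```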