uer/t5-small-chinese-cluecorpussmall

uer committed on Mar 19, 2021

Commit b140588 • 1 Parent(s): 72ca583

Update README.md

Files changed (1)

README.md +8 -2

README.md CHANGED

@@ -16,8 +16,10 @@ The Text-to-Text Transfer Transformer (T5) leveraged a unified text-to-text form
 |          |           Link           |
 | -------- | :-----------------------: |
-| **Small**  | [**2/128 (Tiny)**][2_128] |
-| **Base**  | [**4/256 (Mini)**][4_256] |
 ## How to use
@@ -101,6 +103,8 @@ python3 scripts/convert_t5_from_uer_to_huggingface.py --input_model_path cluecor
                                                       --type t5
 ```
 ### BibTeX entry and citation info
 ```
@@ -113,3 +117,5 @@ python3 scripts/convert_t5_from_uer_to_huggingface.py --input_model_path cluecor
 }
 ```

 |          |           Link           |
 | -------- | :-----------------------: |
+| **Small**  | [**Small**][small] |
+| **Base**  | [**Base**][base] |
+In T5, spans of the input sequence are masked by so-called sentinel tokens. Each sentinel token represents a unique mask token for the input sequence, numbered from <extra_id_0>, <extra_id_1>, … up to <extra_id_199>. However, <extra_id_xxx> is split into multiple pieces by Hugging Face's Hosted inference API. Therefore, we replace <extra_id_xxx> with extraxxx in the vocabulary, and BertTokenizer treats extraxxx as a single sentinel token.
 ## How to use
                                                       --type t5
 ```
+Notice that
 ### BibTeX entry and citation info
 ```
 }
 ```
+[small]:https://huggingface.co/uer/t5-small-chinese-cluecorpussmall
+[base]:https://huggingface.co/uer/t5-base-chinese-cluecorpussmall
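
The sentinel-token note added to the README in this commit can be illustrated with a small sketch. It assumes only what the note states: `<extra_id_N>` is written as `extraN` in this model's vocabulary. The helper names `to_vocab_sentinels` and `mask_spans` are hypothetical and are not part of the repository or of any library; the masking routine is a simplified illustration of span masking, not the pre-training code.

```python
import re

# Matches the standard T5 sentinel spelling, e.g. <extra_id_0>.
SENTINEL_PATTERN = re.compile(r"<extra_id_(\d+)>")


def to_vocab_sentinels(text):
    """Rewrite <extra_id_N> as extraN, the spelling described in the README note.

    Hypothetical helper for illustration only.
    """
    return SENTINEL_PATTERN.sub(r"extra\1", text)


def mask_spans(tokens, spans):
    """Replace each (start, end) span with one sentinel token.

    Returns (inputs, targets): inputs has every span collapsed to a sentinel,
    targets lists each sentinel followed by the tokens it hides.
    Simplified sketch; assumes the spans do not overlap.
    """
    inputs, targets = [], []
    cursor = 0
    for i, (start, end) in enumerate(sorted(spans)):
        sentinel = f"extra{i}"
        inputs.extend(tokens[cursor:start])  # copy unmasked tokens before the span
        inputs.append(sentinel)              # one sentinel stands in for the span
        targets.append(sentinel)
        targets.extend(tokens[start:end])    # the hidden tokens go to the target
        cursor = end
    inputs.extend(tokens[cursor:])           # copy whatever follows the last span
    return inputs, targets


if __name__ == "__main__":
    print(to_vocab_sentinels("中国的首都是<extra_id_0>京"))
    print(mask_spans(list("abcdef"), [(1, 3), (4, 5)]))
```

Text written with the standard T5 spelling can be passed through `to_vocab_sentinels` before tokenization, so that each sentinel reaches the tokenizer in the single-token form the note describes.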