Update README.md
README.md
CHANGED
@@ -18,8 +18,8 @@ The Text-to-Text Transfer Transformer (T5) leveraged a unified text-to-text form
 
 | | Link |
 | -------- | :-----------------------: |
-| **Small** | [**Small**][small] |
-| **Base** | [**Base**][base] |
+| **T5-Small** | [**L=6/H=512 (Small)**][small] |
+| **T5-Base** | [**L=12/H=768 (Base)**][base] |
 
 In T5, spans of the input sequence are masked by so-called sentinel token. Each sentinel token represents a unique mask token for the input sequence and should start with <extra_id_0>, <extra_id_1>, … up to <extra_id_199>. However, <extra_id_xxx> is separated into multiple parts in Huggingface's Hosted inference API. Therefore, we replace <extra_id_xxx> with extraxxx in vocabulary and BertTokenizer regards extraxxx as one sentinel token.
 
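The sentinel renaming described in the README paragraph can be illustrated with a small sketch. It assumes only what the text states: the standard T5 sentinels look like `<extra_id_N>` and the vocabulary uses `extraN` instead. The helper name `to_hosted_api_sentinels` and the sample sentence are hypothetical and are not part of the repository.

```python
import re

# Matches standard T5 sentinels such as <extra_id_0> ... <extra_id_199>.
SENTINEL_PATTERN = re.compile(r"<extra_id_(\d+)>")


def to_hosted_api_sentinels(text: str) -> str:
    """Rewrite each <extra_id_N> as extraN (hypothetical helper).

    This mirrors the README's description of the vocabulary change;
    it does not call any tokenizer.
    """
    return SENTINEL_PATTERN.sub(lambda m: f"extra{m.group(1)}", text)


# Illustrative input only; any text with <extra_id_N> markers works.
masked = "The capital of <extra_id_0> is <extra_id_1>."
converted = to_hosted_api_sentinels(masked)
print(converted)
```

Text converted this way contains no angle brackets or underscores in its sentinels, which is the property the README relies on so that each sentinel stays a single token.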