lintang committed
Commit
51118ee
1 Parent(s): 39902d8

Update README.md

Files changed (1): README.md (+9 −6)
README.md CHANGED
@@ -6,10 +6,11 @@ language:
 pipeline_tag: text2text-generation
 tags:
 - t5x
-- encode-decoder
+- encoder-decoder
 ---
 
 Pile-T5 XL is an Encoder-Decoder model trained on [the Pile](https://pile.eleuther.ai/) using the [T5x](https://github.com/google-research/t5x) library. The model was trained for 2 million steps, or roughly 2 trillion tokens, using an MLM objective similar to that of the original T5 model.
+The HF version of Pile-T5 XL borrows UMT5's model implementation, as it uses the scalable model implementation from T5x, and it uses `LlamaTokenizer`.
 
 ### Model Details
 
@@ -30,7 +31,7 @@ ai](mailto:contact@eleuther.ai).
 
 | Hyperparameter | Value |
 | -------------------------- | ----------- |
-| n<sub>parameters</sub> | |
+| n<sub>parameters</sub> | 2849804288 |
 | n<sub>encoder layers</sub> | 24 |
 | n<sub>decoder layers</sub> | 24 |
 | d<sub>model</sub> | 5120 |
@@ -133,16 +134,18 @@ checkpoints that can be used for finetuning with the T5x library, refer to [here
 
 ### Evaluations
 
-TBD
+Pile-T5 XL was evaluated on SuperGLUE and CodeXGLUE. A Flan-finetuned version was evaluated on Flan Held-In tasks, MMLU, and BBH.
+Results can be seen in the [blog post](https://blog.eleuther.ai/pile-t5/).
 
 ### BibTeX
 
 ```
-@article{2024t5v2,
+@misc{2024PileT5,
 author = {Lintang Sutawika and Aran Komatsuzaki and Colin Raffel},
-title = {Pile T5, an update of T5},
+title = {Pile-T5},
 year = {2024},
-url = {}
+url = {https://blog.eleuther.ai/pile-t5/},
+note = {Blog post},
 }
 ```
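The added line about borrowing UMT5's implementation suggests the checkpoint loads through the standard `transformers` UMT5 classes. A minimal sketch of what that usage might look like, assuming the checkpoint is published under the repo id `EleutherAI/pile-t5-xl` (not stated in this diff) and keeps T5-style `<extra_id_0>` sentinel tokens for the MLM objective:

```python
# Minimal loading sketch for the HF port described above.
# Assumptions: the repo id "EleutherAI/pile-t5-xl" and the T5-style
# <extra_id_0> sentinel convention are inferred, not confirmed by this diff.
from transformers import AutoTokenizer, UMT5ForConditionalGeneration

# AutoTokenizer should resolve to LlamaTokenizer, per the README line above.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pile-t5-xl")
model = UMT5ForConditionalGeneration.from_pretrained("EleutherAI/pile-t5-xl")

# MLM-style infilling: the model was trained to fill in corrupted spans.
inputs = tokenizer(
    "The Pile is a diverse <extra_id_0> for language modeling.",
    return_tensors="pt",
)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```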