lintang committed
Commit
51118ee
1 Parent(s): 39902d8

Update README.md

Files changed (1): README.md (+9 −6)
README.md CHANGED
@@ -6,10 +6,11 @@ language:
 pipeline_tag: text2text-generation
 tags:
 - t5x
-- encode-decoder
+- encoder-decoder
 ---
 
 Pile-T5 XL is an Encoder-Decoder model trained on [the Pile](https://pile.eleuther.ai/) using the [T5x](https://github.com/google-research/t5x) library. The model was trained for 2 million steps, or roughly 2 trillion tokens, using an MLM objective similar to that of the original T5 model.
+The HF version of Pile-T5 XL borrows UMT5's model implementation, as it uses the scalable model implementation from T5x, and it uses `LlamaTokenizer`.
 
 ### Model Details
 
@@ -30,7 +31,7 @@ ai](mailto:contact@eleuther.ai).
 
 | Hyperparameter | Value |
 | -------------------------- | ----------- |
-| n<sub>parameters</sub> | |
+| n<sub>parameters</sub> | 2849804288 |
 | n<sub>encoder layers</sub> | 24 |
 | n<sub>decoder layers</sub> | 24 |
 | d<sub>model</sub> | 5120 |
@@ -133,16 +134,18 @@ checkpoints that can be used for finetuning with the T5x library, refer to [here
 
 ### Evaluations
 
-TBD
+Pile-T5 XL was evaluated on SuperGLUE and CodeXGLUE. A Flan-finetuned version was evaluated on Flan Held-In tasks, MMLU, and BBH.
+Results can be seen in the [blog post](https://blog.eleuther.ai/pile-t5/).
 
 ### BibTeX
 
 ```
-@article{2024t5v2,
+@misc{2024PileT5,
 author = {Lintang Sutawika and Aran Komatsuzaki and Colin Raffel},
-title = {Pile T5, an update of T5},
+title = {Pile-T5},
 year = {2024},
-url = {}
+url = {https://blog.eleuther.ai/pile-t5/},
+note = {Blog post},
 }
 ```
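The added line about borrowing UMT5's implementation suggests the checkpoint loads through the standard `transformers` UMT5 classes. A minimal sketch of what that usage might look like, assuming the checkpoint is published under the repo id `EleutherAI/pile-t5-xl` (not stated in this diff) and keeps T5-style `<extra_id_0>` sentinel tokens for the MLM objective:

```python
# Minimal loading sketch for the HF port described above.
# Assumptions: the repo id "EleutherAI/pile-t5-xl" and the T5-style
# <extra_id_0> sentinel convention are inferred, not confirmed by this diff.
from transformers import AutoTokenizer, UMT5ForConditionalGeneration

# AutoTokenizer should resolve to LlamaTokenizer, per the README line above.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pile-t5-xl")
model = UMT5ForConditionalGeneration.from_pretrained("EleutherAI/pile-t5-xl")

# MLM-style infilling: the model was trained to fill in corrupted spans.
inputs = tokenizer(
    "The Pile is a diverse <extra_id_0> for language modeling.",
    return_tensors="pt",
)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```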