pszemraj committed
Commit 6615769 (1 parent: f6037b3)

Update README.md

Files changed (1): README.md (+3 -3)
README.md CHANGED
@@ -12,13 +12,13 @@ library_name: transformers
 # tFINE-900m-e16-d32-1024ctx


-Pretrained T5 model with nanoT5:
+Pretrained T5 model with [nanoT5](https://github.com/pszemraj/nanoT5/tree/fineweb-edu-test):

 - ~900m parameters, 16 layers in encoder, 32 layers in decoder
 - sentencepiece tokenizer with 48k vocab & byte-pair fallback
-- handles whitespaces etc correctly (unlike standard T5 tokenizer)
+- handles whitespaces etc correctly (_unlike original T5 tokenizer_)
 - 1024 ctx during pretrain
-- `relative_attention_num_buckets` increased to 48 from standard 32 for context length upscaling
+- `relative_attention_num_buckets` increased to 48 from 32 for context length upscaling

 ## Experiment logs
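The architecture bullets above translate roughly into a `transformers` `T5Config`. A minimal sketch, not part of the commit: only the commented values come from the README; all other hyperparameters (`d_model`, `num_heads`, `d_ff`, ...) are library defaults and may not match the actual checkpoint.

```python
# Hedged sketch: the README's architecture bullets expressed as a T5Config.
# Only values marked "from README" come from the diff above; the rest are
# transformers defaults and may differ from the real ~900m checkpoint.
from transformers import T5Config

config = T5Config(
    vocab_size=48000,                   # 48k sentencepiece vocab (from README)
    num_layers=16,                      # encoder layers (from README)
    num_decoder_layers=32,              # decoder layers (from README)
    relative_attention_num_buckets=48,  # raised from T5's default of 32 (from README)
)
print(config)
```

More buckets let the relative position bias distinguish longer distances than the default 32 buckets cover, which is presumably what the "context length upscaling" bullet refers to.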
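The whitespace claim can be sanity-checked with a decode round trip. A minimal sketch, assuming the tokenizer ships in the model repo and that the repo id matches the model name in the README header (neither is stated in the diff):

```python
# Round-trip check for whitespace handling. The original T5 sentencepiece
# tokenizer collapses newlines and runs of spaces on decode; a tokenizer
# with byte-pair fallback should reproduce the input verbatim.
from transformers import AutoTokenizer

# repo id inferred from the model name above (assumption, not from the diff)
tok = AutoTokenizer.from_pretrained("pszemraj/tFINE-900m-e16-d32-1024ctx")

text = "def f():\n    return 1  # note the newline and indentation"
ids = tok(text).input_ids
print(tok.decode(ids, skip_special_tokens=True))
# matches `text` exactly if whitespace survives tokenization
```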