pubmed_gpt_tokenizer / tokenizer_config.json

Commit History

50k vocab, prefix_space=false, trained on PubMed Abstracts
39545d2

J38 committed on

add of |endoftext|
7a0b15c

J38 committed on

use lowercase normalizer
514166f

J38 committed on

does bert normalizer work
7fd39d4

J38 committed on

change lowercase key
5940b13

J38 committed on

add normalizer
15bb604

J38 committed on

do lower case
4bea54c

J38 committed on

config for tokenizer
c682350

J38 committed on