stanford-crfm/pubmed_gpt_tokenizer
Commit History
50k vocab, prefix_space=false, trained on PubMed Abstracts
39545d2
J38 committed on Sep 15, 2022
experiment with 50k vocab
9d29e9b
J38 committed on Sep 14, 2022
add of |endoftext|
19ffb18
J38 committed on Sep 9, 2022
add of |endoftext|
7a0b15c
J38 committed on Sep 9, 2022
use lowercase normalizer
514166f
J38 committed on Sep 5, 2022
does bert normalizer work
7fd39d4
J38 committed on Sep 5, 2022
change lowercase key
5940b13
J38 committed on Sep 5, 2022
add normalizer
15bb604
J38 committed on Sep 5, 2022
add lowercase normalizer
f344ee9
J38 committed on Sep 5, 2022
add merges.txt
c19b598
J38 committed on Sep 5, 2022
add vocab.json
5241c39
J38 committed on Sep 5, 2022
add vocab file
cab4e59
J38 committed on Sep 5, 2022
do lower case
4bea54c
J38 committed on Sep 5, 2022
config for tokenizer
c682350
J38 committed on Sep 5, 2022
tokenizer model
db29f29
J38 committed on Sep 5, 2022
initial commit
c2f9a73
J38 committed on Sep 5, 2022