tokenizer: add config (no accent stripping) and vocab

Files changed (2)

tokenizer_config.json ADDED

@@ -0,0 +1 @@
+ {"do_lower_case": true, "max_len": 512, "init_inputs": [], "strip_accents":false}
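For reviewers, here is a minimal sketch of what this config is presumably meant to do: lowercase the input but leave accented characters intact. It uses only the Python standard library and does not call the actual tokenizer. The `normalize` helper is hypothetical and written for illustration; it only approximates how a BERT-style basic tokenizer would treat `do_lower_case` and `strip_accents`.

```python
import json
import unicodedata

# The config line added in this commit, parsed as-is.
config = json.loads(
    '{"do_lower_case": true, "max_len": 512, "init_inputs": [], "strip_accents":false}'
)


def normalize(text: str, do_lower_case: bool, strip_accents: bool) -> str:
    """Hypothetical helper (not part of any library): lowercase the text and
    optionally drop combining accent marks."""
    if do_lower_case:
        text = text.lower()
    if strip_accents:
        # NFD decomposes an accented letter into base letter + combining mark
        # (Unicode category "Mn"); dropping those marks removes the accent.
        text = "".join(
            ch
            for ch in unicodedata.normalize("NFD", text)
            if unicodedata.category(ch) != "Mn"
        )
    return text


# With the committed settings, the accent is kept after lowercasing.
kept = normalize("Caf\u00e9", config["do_lower_case"], config["strip_accents"])

# For comparison only: the same input with accent stripping switched on.
stripped = normalize("Caf\u00e9", config["do_lower_case"], True)
```

Under these assumptions `kept` retains the accented character while `stripped` does not, which is why vocab.txt needs entries for the accented forms.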

vocab.txt ADDED

The diff for this file is too large to render.