Commit ec600bb by nouamanetazi
1 parent: b659e5f

add tokenizer

.gitattributes CHANGED
File without changes
.gitignore CHANGED
File without changes
config.json CHANGED
File without changes
merges.txt ADDED
The diff for this file is too large to render. See raw diff
pytorch_model.bin CHANGED
File without changes
special_tokens_map.json ADDED
@@ -0,0 +1 @@
+ {"bos_token": "<|endoftext|>", "eos_token": "<|endoftext|>", "unk_token": "<|endoftext|>"}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
tokenizer_config.json ADDED
@@ -0,0 +1 @@
+ {"unk_token": "<|endoftext|>", "bos_token": "<|endoftext|>", "eos_token": "<|endoftext|>", "add_prefix_space": false, "model_max_length": 1024, "special_tokens_map_file": null, "name_or_path": "distilgpt2", "tokenizer_class": "GPT2Tokenizer"}
training_args.bin CHANGED
File without changes
vocab.json ADDED
The diff for this file is too large to render. See raw diff
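
Reviewer note: the two one-line JSON files added above both declare the special tokens, so they should agree with each other. The following is a minimal sketch of that check using only the Python standard library. The JSON strings are copied from the hunks in this commit; the helper name `special_token_mismatches` is hypothetical and not part of any library.

```python
import json

# Contents copied verbatim from the two one-line files added in this commit.
SPECIAL_TOKENS_MAP = (
    '{"bos_token": "<|endoftext|>", "eos_token": "<|endoftext|>", '
    '"unk_token": "<|endoftext|>"}'
)
TOKENIZER_CONFIG = (
    '{"unk_token": "<|endoftext|>", "bos_token": "<|endoftext|>", '
    '"eos_token": "<|endoftext|>", "add_prefix_space": false, '
    '"model_max_length": 1024, "special_tokens_map_file": null, '
    '"name_or_path": "distilgpt2", "tokenizer_class": "GPT2Tokenizer"}'
)


def special_token_mismatches(special_map: dict, config: dict) -> dict:
    """Hypothetical helper: return {token_name: (map_value, config_value)}
    for every special token on which the two files disagree."""
    return {
        name: (value, config.get(name))
        for name, value in special_map.items()
        if config.get(name) != value
    }


special_map = json.loads(SPECIAL_TOKENS_MAP)
config = json.loads(TOKENIZER_CONFIG)
mismatches = special_token_mismatches(special_map, config)

# An empty dict means special_tokens_map.json and tokenizer_config.json agree.
print("mismatches:", mismatches)
print("tokenizer class:", config["tokenizer_class"])
print("max length:", config["model_max_length"])
```

With the values in this commit, all three special tokens map to `<|endoftext|>` in both files, so the check reports no disagreement.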