pszemraj committed on
Commit 9d6dd6b, 1 parent: 7d13bea

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -38,3 +38,5 @@ GPT2TokenizerFast(name_or_path='pszemraj/claude-tokenizer-mlm', vocab_size=65000
  65003: AddedToken("<mask>", rstrip=False, lstrip=True, single_word=False, normalized=True, special=True),
  }
  ```
+
+ the `<CLS>` token is added but unused, both the CLS and BOS tokens are set to `<bos>` - see `tokenizer_config.json` for details
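
One way to check the note above locally is to read `tokenizer_config.json` and compare the two entries. The sketch below is an illustration only and is not part of this commit. The helper names `token_text` and `same_special_token` are hypothetical, the `sample` dict is made-up data standing in for the real file, and it assumes each special token is stored either as a plain string or as an object with a `content` field.

```python
import json


def token_text(entry):
    """Return the literal token string for a config entry.

    Assumption: an entry is a plain string, a dict carrying the token
    under "content", or missing (None).
    """
    if isinstance(entry, dict):
        return entry.get("content")
    return entry


def same_special_token(config, first="cls_token", second="bos_token"):
    """True when both keys are present and resolve to the same token string."""
    a = token_text(config.get(first))
    b = token_text(config.get(second))
    return a is not None and a == b


# Made-up sample standing in for the real tokenizer_config.json.
sample = json.loads(
    '{"bos_token": "<bos>", "cls_token": {"content": "<bos>", "lstrip": false}}'
)
print(same_special_token(sample))
```

To run the check against the actual file, load it with `json.load` in place of `sample`.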