sam-mosaic committed on
Commit 8cf8528
1 Parent(s): 246cc63

Update README.md

Files changed (1)
  1. README.md +8 -0
README.md CHANGED
@@ -1,3 +1,11 @@
  # Pile of Law Tokenizer
 
  This tokenizer should be a drop-in replacement for the GPT2Tokenizer. It has the same vocabulary size and special tokens, but was trained on a random 1M samples from [the pile of law](https://huggingface.co/datasets/pile-of-law/pile-of-law) train split.
+
+ Usage:
+
+ ```python
+ from transformers import AutoTokenizer
+
+ tokenizer = AutoTokenizer.from_pretrained("sam-mosaic/pile-of-law-tokenizer")
+ ```
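
The README's "drop-in replacement" claim rests on two properties: the same vocabulary size and the same special tokens as GPT-2's tokenizer. The sketch below is one way a reader might check that claim; it is not part of this commit. The helper names `same_interface` and `load_and_compare` are hypothetical, the `"gpt2"` checkpoint name is an assumption about which reference tokenizer to compare against, and the stand-in objects in the offline demo are illustrative only and are not the real tokenizers.

```python
from types import SimpleNamespace


def same_interface(candidate, reference):
    """Compare the two properties the README says match GPT-2's tokenizer.

    Both arguments only need `vocab_size` and `all_special_tokens`,
    which Hugging Face tokenizers expose as attributes.
    """
    return {
        "vocab_size": candidate.vocab_size == reference.vocab_size,
        "special_tokens": set(candidate.all_special_tokens)
        == set(reference.all_special_tokens),
    }


def load_and_compare():
    """Download both tokenizers from the Hub and compare them (needs network)."""
    # Imported here so the helper above works without transformers installed.
    from transformers import AutoTokenizer

    pile_of_law = AutoTokenizer.from_pretrained("sam-mosaic/pile-of-law-tokenizer")
    gpt2 = AutoTokenizer.from_pretrained("gpt2")  # assumed reference checkpoint
    return same_interface(pile_of_law, gpt2)


# Offline illustration with hypothetical stand-in objects; nothing is downloaded.
# The reference values are GPT-2's vocabulary size and its single special token.
reference = SimpleNamespace(vocab_size=50257, all_special_tokens=["<|endoftext|>"])
candidate = SimpleNamespace(vocab_size=50257, all_special_tokens=["<|endoftext|>"])
report = same_interface(candidate, reference)
```

Calling `load_and_compare()` runs the same check against the published tokenizers. If either entry in the returned dictionary is `False`, the tokenizer would not be a drop-in replacement in the sense the README describes.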