Kristijan committed on
Commit
744c3c2
1 Parent(s): 28880b5

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -54,11 +54,13 @@ model = GPT2LMHeadModel.from_pretrained(path_to_folder_with_checkpoint_files)
 You should first pretokenize your text using the [MosesTokenizer](https://pypi.org/project/mosestokenizer/):
 
 ```python
+from mosestokenizer import MosesTokenizer
+
 with MosesTokenizer('en') as pretokenize:
     pretokenized_text = " ".join(pretokenize(text_string))
 ```
 
-To tokenize your text for this model, you should use the [tokenizer trained on Wikitext-103](https://huggingface.co/Kristijan/wikitext-103-tokenizer_v2):
+Then, to BPE tokenize your text for this model, you should use the [tokenizer trained on Wikitext-103](https://huggingface.co/Kristijan/wikitext-103-tokenizer_v2):
 
 ```python
 from transformers import GPT2TokenizerFast