ctoraman committed on
Commit a796138 · 1 Parent(s): da3ff5e

readme updated

Files changed (1): README.md +13 -0
README.md CHANGED
@@ -18,6 +18,19 @@ Model architecture is similar to bert-medium (8 layers, 8 heads, and 512 hidden
  The details can be found at this paper:
  https://arxiv.org/...

+ The following code segment can be used to initialize the tokenizer; the example max length (514) can be changed:
+ ```
+ from transformers import PreTrainedTokenizerFast
+
+ tokenizer = PreTrainedTokenizerFast(tokenizer_file=[file_path])
+ tokenizer.mask_token = "[MASK]"
+ tokenizer.cls_token = "[CLS]"
+ tokenizer.sep_token = "[SEP]"
+ tokenizer.pad_token = "[PAD]"
+ tokenizer.unk_token = "[UNK]"
+ tokenizer.bos_token = "[CLS]"
+ tokenizer.eos_token = "[SEP]"
+ tokenizer.model_max_length = 514
+ ```
+
  ### BibTeX entry and citation info
  ```bibtex
  @article{}
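
The snippet added by this commit loads a tokenizer from a `tokenizer.json` file and then sets the special tokens by hand. As a sanity check of that pattern, the sketch below first builds a toy word-level tokenizer file (the tiny vocabulary and the file name `toy_tokenizer.json` are assumptions for illustration; a real model ships its own tokenizer file) and then initializes it exactly as the README describes:

```python
# Minimal sketch: create a toy tokenizer file, then load it the way the
# README snippet does. The vocabulary below is a made-up example.
from tokenizers import Tokenizer
from tokenizers.models import WordLevel
from tokenizers.pre_tokenizers import Whitespace
from transformers import PreTrainedTokenizerFast

vocab = {"[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3, "[MASK]": 4,
         "hello": 5, "world": 6}
tok = Tokenizer(WordLevel(vocab, unk_token="[UNK]"))
tok.pre_tokenizer = Whitespace()
tok.save("toy_tokenizer.json")  # stand-in for the model's tokenizer file

# Initialization as in the README; [file_path] is replaced by the toy file.
tokenizer = PreTrainedTokenizerFast(tokenizer_file="toy_tokenizer.json")
tokenizer.mask_token = "[MASK]"
tokenizer.cls_token = "[CLS]"
tokenizer.sep_token = "[SEP]"
tokenizer.pad_token = "[PAD]"
tokenizer.unk_token = "[UNK]"
tokenizer.bos_token = "[CLS]"
tokenizer.eos_token = "[SEP]"
tokenizer.model_max_length = 514

ids = tokenizer("hello world")["input_ids"]
print(ids)
```

Note that without a post-processor configured in the tokenizer file, no `[CLS]`/`[SEP]` tokens are inserted automatically; setting the attributes only registers which vocabulary entries play those roles.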