alayaran committed on
Commit
402944d
·
1 Parent(s): 0da77fe

Update README.md

Files changed (1)
  1. README.md +14 -0
README.md CHANGED
@@ -2,3 +2,17 @@
  license: mit
  ---
  This is a roberta based configuration model for Bodo.
+ It does not contain checkpoints for a pretrained model. It has only two things:
+ - Byte-level BPE tokenizer for Bodo
+ - RoBERTa base configuration
+
+ # Uses
+ You can use the tokenizer as follows:
+ ```
+ from transformers import AutoTokenizer
+ tokenizer = AutoTokenizer.from_pretrained('alayaran/bodo-roberta-base')
+
+ # Double quotes are needed because the text itself contains an apostrophe.
+ tokenizer("कौटि नख'राव दैनि कानेक्सन होबाय")
+
+ # {'input_ids': [310, 294, 313, 267, 503, 11, 268, 263, 277, 298, 287, 265, 267, 321, 263, 265, 272, 310, 273, 378, 295, 266, 271, 263, 269], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}
+ ```
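
The README describes the tokenizer as byte-level BPE. As a rough illustration of what "byte level" means (not the tokenizer's actual implementation), the sketch below shows the UTF-8 bytes that such a tokenizer starts from before any learned merges are applied. The helper name `utf8_byte_symbols` is hypothetical, and the merge step itself is not shown.

```python
from typing import List


def utf8_byte_symbols(text: str) -> List[int]:
    """Return the UTF-8 byte values that serve as the base symbols of byte-level BPE."""
    return list(text.encode("utf-8"))


# First word of the example sentence in the README.
sample = "कौटि"
symbols = utf8_byte_symbols(sample)

# Every base symbol is a single byte, so any input text can be represented
# without an unknown token at this level.
assert all(0 <= b < 256 for b in symbols)

# The bytes decode back to the original text, so the mapping is lossless.
assert bytes(symbols).decode("utf-8") == sample

print(len(sample), "code points ->", len(symbols), "bytes")
```

Devanagari code points each take three bytes in UTF-8, which is one reason a short Bodo sentence can map to a fairly long list of `input_ids` when few merges apply.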