Eyvaz: add tokenizer (commit e85a52c)
{"с": 0, "р": 1, "ю": 2, "ж": 3, "щ": 4, "ё": 5, "и": 6, "я": 7, "й": 8, "д": 9, "ч": 10, "ь": 12, "г": 13, "ш": 14, "е": 15, "х": 16, "т": 17, "н": 18, "ы": 19, "б": 20, "у": 21, "л": 22, "в": 23, "к": 24, "э": 25, "з": 26, "о": 27, "ф": 28, "ъ": 29, "м": 30, "ц": 31, "а": 32, "п": 33, "|": 11, "[UNK]": 34, "[PAD]": 35}
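The mapping above looks like a character-level vocabulary of the kind used with CTC speech models, where `|` stands in for the space between words. That reading is an assumption, as are the helper names `encode` and `decode` below, which are hypothetical and not part of this repository. Under those assumptions, a minimal sketch of how the file could be used to turn text into ids and back:

```python
import json

# The vocabulary exactly as it appears in the file above.
VOCAB_JSON = (
    '{"с": 0, "р": 1, "ю": 2, "ж": 3, "щ": 4, "ё": 5, "и": 6, "я": 7, '
    '"й": 8, "д": 9, "ч": 10, "ь": 12, "г": 13, "ш": 14, "е": 15, "х": 16, '
    '"т": 17, "н": 18, "ы": 19, "б": 20, "у": 21, "л": 22, "в": 23, "к": 24, '
    '"э": 25, "з": 26, "о": 27, "ф": 28, "ъ": 29, "м": 30, "ц": 31, "а": 32, '
    '"п": 33, "|": 11, "[UNK]": 34, "[PAD]": 35}'
)

vocab = json.loads(VOCAB_JSON)
id_to_token = {index: token for token, index in vocab.items()}


def encode(text, word_delimiter="|", unk_token="[UNK]"):
    """Map text to ids, one per character (hypothetical helper).

    Assumption: spaces are written as the word delimiter and any
    character missing from the vocabulary becomes the unknown token.
    """
    ids = []
    for ch in text.lower():
        token = word_delimiter if ch == " " else ch
        ids.append(vocab.get(token, vocab[unk_token]))
    return ids


def decode(ids, word_delimiter="|"):
    """Map ids back to text, turning the delimiter into spaces."""
    return "".join(id_to_token[i] for i in ids).replace(word_delimiter, " ")


if __name__ == "__main__":
    ids = encode("да нет")
    print(ids)
    print(decode(ids))
```

The sketch does no CTC-specific post-processing such as collapsing repeated ids or dropping `[PAD]`; it only shows the character-to-id lookup the file encodes.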