{"ê": 0, "ï": 1, "â": 2, "z": 3, "k": 4, "x": 5, "ö": 6, "s": 7, "l": 8, "n": 9, "è": 10, "ò": 11, "f": 12, "h": 13, "d": 14, "c": 15, "r": 16, "q": 17, "p": 18, "g": 19, "t": 20, "m": 21, "ä": 22, "u": 23, "e": 24, "b": 25, "à": 26, "o": 27, "v": 28, "ü": 29, "y": 30, "w": 32, "j": 33, "a": 34, "i": 35, "|": 31, "[UNK]": 36, "[PAD]": 37}