tokenizer-dna-mlm / README.md
license: mit

Tokenizer for masked language modeling of DNA sequences

    "vocab": {
      "[PAD]": 0,
      "[MASK]": 1,
      "[UNK]": 2,
      "a": 3,
      "c": 4,
      "g": 5,
      "t": 6
    },
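
To illustrate what this vocabulary implies, here is a minimal sketch in plain Python of a character-level lookup over the seven entries listed above. The helper names `encode` and `mask_positions` are hypothetical and are not part of this repository. The sketch assumes lowercase input and sends any other character (for example `n`, or uppercase letters) to `[UNK]`; whether the actual tokenizer normalizes case is not stated here.

```python
# Vocabulary copied from the fragment above.
VOCAB = {"[PAD]": 0, "[MASK]": 1, "[UNK]": 2, "a": 3, "c": 4, "g": 5, "t": 6}


def encode(sequence, vocab=VOCAB):
    """Map each character to its id (hypothetical helper).

    Characters that are not in the vocabulary are mapped to [UNK].
    No case normalization is applied; that is an assumption of this sketch.
    """
    unk_id = vocab["[UNK]"]
    return [vocab.get(ch, unk_id) for ch in sequence]


def mask_positions(ids, positions, vocab=VOCAB):
    """Return a copy of ids with the given positions replaced by [MASK]."""
    mask_id = vocab["[MASK]"]
    masked = list(ids)
    for pos in positions:
        masked[pos] = mask_id
    return masked


ids = encode("acgtn")
masked = mask_positions(ids, [1])
print(ids)     # [3, 4, 5, 6, 2]
print(masked)  # [3, 1, 5, 6, 2]
```

In masked language modeling, a model would then be trained to predict the original id at each masked position; the masking strategy shown here (fixed positions) is only for illustration.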