---
license: mit
tags:
  - dna
  - biology
  - genomics
---
# Tokenizer for masked language modeling of DNA sequences

```json
{
  "vocab": {
    "[PAD]": 0,
    "[MASK]": 1,
    "[UNK]": 2,
    "a": 3,
    "c": 4,
    "g": 5,
    "t": 6
  }
}
```
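
The vocabulary above suggests a character-level tokenizer: one token per nucleotide, plus three special tokens. The following is a minimal sketch in plain Python of how such a mapping could be applied. It is not the tokenizer's actual implementation. The helper names `encode` and `mask_position` are hypothetical, and the lowercasing step, the padding behavior, and the treatment of any character outside `a`/`c`/`g`/`t` as `[UNK]` are assumptions made for illustration.

```python
# Vocabulary copied from the excerpt above.
VOCAB = {"[PAD]": 0, "[MASK]": 1, "[UNK]": 2, "a": 3, "c": 4, "g": 5, "t": 6}


def encode(sequence, max_length=None):
    """Map each nucleotide to its id; unknown characters become [UNK]."""
    # Assumption: input is lowercased first, since the vocab has only lowercase keys.
    ids = [VOCAB.get(base, VOCAB["[UNK]"]) for base in sequence.lower()]
    if max_length is not None:
        # Assumption: truncate on the right, then right-pad with [PAD].
        ids = ids[:max_length]
        ids += [VOCAB["[PAD]"]] * (max_length - len(ids))
    return ids


def mask_position(ids, index):
    """Return a copy of ids with one position replaced by [MASK]."""
    masked = list(ids)
    masked[index] = VOCAB["[MASK]"]
    return masked


ids = encode("ACGTN", max_length=8)
print(ids)                    # [3, 4, 5, 6, 2, 0, 0, 0]
print(mask_position(ids, 1))  # [3, 1, 5, 6, 2, 0, 0, 0]
```

In masked language modeling, the model would be trained to predict the original id at each masked position (here, `4` for `c`).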