---
license: mit
tags:
- biology
- genomics
- dna
---
# Tokenizer for causal language modeling of DNA sequences
```json
{
  "vocab": {
    "[PAD]": 0,
    "[UNK]": 1,
    "a": 2,
    "c": 3,
    "g": 4,
    "t": 5
  }
}
```
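
The vocabulary above implies a character-level encoding: each nucleotide maps to one ID. The following is a minimal sketch of that mapping in plain Python, not the tokenizer's actual implementation. The helper names `encode` and `pad` are hypothetical, right-padding is an assumption, and this sketch does no case normalization, so uppercase input falls through to `[UNK]` (the real tokenizer may behave differently).

```python
# Vocabulary copied from the JSON excerpt above.
VOCAB = {"[PAD]": 0, "[UNK]": 1, "a": 2, "c": 3, "g": 4, "t": 5}


def encode(seq, vocab=VOCAB):
    """Map each character to its ID; characters outside the vocab become [UNK]."""
    unk = vocab["[UNK]"]
    return [vocab.get(ch, unk) for ch in seq]


def pad(ids, length, vocab=VOCAB):
    """Right-pad a list of IDs with [PAD] up to `length` (assumed padding side)."""
    return ids + [vocab["[PAD]"]] * max(0, length - len(ids))


print(encode("acgtn"))        # [2, 3, 4, 5, 1]  ('n' is not in the vocab)
print(pad(encode("acg"), 5))  # [2, 3, 4, 0, 0]
```

With only four nucleotide tokens, ambiguity codes such as `n` have no ID of their own in this vocabulary and map to `[UNK]` in the sketch.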