Implements the flop tokenizer, a sub-word tokenizer for autoregressive language modeling.


TODO:
    - Improve progress output while encoding a file and while loading / exporting?
    - Include a Python script for BPE training
    - Add timing information to log messages during encoding
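
The BPE training script mentioned in the TODO is not part of this file. As a rough illustration of what such a script might contain, here is a minimal sketch of generic byte-pair-encoding training. It assumes whitespace pre-tokenization and single-character starting symbols, which may differ from what the flop tokenizer actually does. The name `train_bpe` and its parameters are hypothetical and are not taken from this repository.

```python
from collections import Counter


def train_bpe(corpus, num_merges):
    """Learn up to `num_merges` merge rules from a list of strings.

    Hypothetical helper: a generic BPE sketch, not the flop tokenizer's code.
    """
    # Assumption: words are split on whitespace and start as single characters.
    words = Counter(tuple(word) for line in corpus for word in line.split())
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in words.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break  # Every word is a single symbol; nothing left to merge.
        # Most frequent pair wins; ties go to the lexicographically largest pair
        # so that the result is deterministic.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        merged = best[0] + best[1]
        # Rewrite every word, replacing each occurrence of the chosen pair.
        new_words = Counter()
        for symbols, freq in words.items():
            out, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    out.append(merged)
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            new_words[tuple(out)] += freq
        words = new_words
    return merges
```

For example, `train_bpe(["low low lower"], 2)` returns `[('o', 'w'), ('l', 'ow')]`: the pairs `('l', 'o')` and `('o', 'w')` are tied at three occurrences, the tie-break picks `('o', 'w')`, and the second round merges `'l'` with the new symbol `'ow'`.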