Implements the flop tokenizer, a sub-word tokenizer for autoregressive language modeling.
TODO:
- Better printing during encoding of file and loading / exporting?
- Include Python script for BPE training
- Add time to logging during encoding