ali-issa/bpe-arb-diac-tokenizer-byte-level-32768-trained-on-test-set-10000-example Updated 3 days ago
ali-issa/eng_filtered_short_sentences_less_than_5_words_training_data_for_opus_aya_xnli_with_vocab Updated 28 days ago • 142M • 416
ali-issa/eng_filtered_short_sentences_less_than_5_words_training_data_for_opus_aya_xnli Updated 29 days ago • 142M • 489
ali-issa/eng_tokenized_filtered_dataset_with_eng-bpe-tokenizer-32768 Updated Jan 20 • 142M • 104