Dataset used for testing datatrove. Each split contains the same data:

dst = [
    {"text": "hello"},
    {"text": "world"},
    {"text": "how"},
    {"text": "are"},
    {"text": "you"},
]

Depending on the split name, however, the data is sharded into n bins.
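
As a rough illustration of what sharding the five records into n bins could look like, here is a minimal sketch. The helper name `shard_into_bins` and the contiguous, near-equal split are assumptions made for illustration only; this card does not state how a split name maps to n or how records are assigned to shards.

```python
# Hypothetical helper, not part of datatrove: the real sharding scheme is
# not documented in this card, so contiguous near-equal bins are an assumption.
def shard_into_bins(records, n):
    """Split records into n contiguous bins whose sizes differ by at most one."""
    if n < 1:
        raise ValueError("n must be at least 1")
    base, extra = divmod(len(records), n)
    bins = []
    start = 0
    for i in range(n):
        # The first `extra` bins take one additional record each.
        size = base + (1 if i < extra else 0)
        bins.append(records[start:start + size])
        start += size
    return bins


dst = [
    {"text": "hello"},
    {"text": "world"},
    {"text": "how"},
    {"text": "are"},
    {"text": "you"},
]

# Example: shard the same five records into two bins.
bins = shard_into_bins(dst, 2)
for index, shard in enumerate(bins):
    print(index, [row["text"] for row in shard])
```

Under this assumed scheme, every record lands in exactly one bin and concatenating the bins in order gives back the original list, which is a convenient property to check in tests.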
