Why is the number of examples not the same as in XFUND?

#1
by BrianPY - opened

from datasets import load_dataset
dataset = load_dataset("nielsr/XFUN", "xfun.fr")

DatasetDict({
    train: Dataset({
        features: ['id', 'input_ids', 'bbox', 'labels', 'image', 'original_image', 'entities', 'relations'],
        num_rows: 202
    })
    validation: Dataset({
        features: ['id', 'input_ids', 'bbox', 'labels', 'image', 'original_image', 'entities', 'relations'],
        num_rows: 71
    })
})

Hello, why does the training split have 202 rows instead of 149, and the validation split 71 rows instead of 50?
I downloaded the French portion of XFUND and found 149 training examples and 50 validation examples.
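
To make the gap concrete, here is a minimal sketch of the comparison, using only the counts quoted in this post. The helper name `split_size_gap` is hypothetical and not part of the `datasets` library.

```python
# Row counts reported by load_dataset("nielsr/XFUN", "xfun.fr") above.
observed = {"train": 202, "validation": 71}
# Example counts found in the XFUND French download.
expected = {"train": 149, "validation": 50}


def split_size_gap(observed, expected):
    """Return observed minus expected row count for each expected split."""
    # split_size_gap is a hypothetical helper written for this post.
    return {name: observed.get(name, 0) - expected[name] for name in expected}


extra = split_size_gap(observed, expected)
print(extra)  # {'train': 53, 'validation': 21}
```

With the dataset loaded, the observed counts could instead be built as `{name: split.num_rows for name, split in dataset.items()}`.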

Thank you.
