pdf-to-page-images-dataset / dataset_card_template.py

davanstrien HF staff

create card template

662b961 3 months ago

DATASET_CARD_TEMPLATE = """
# Dataset Card for {hf_repo}

## Dataset Description

This dataset contains images converted from PDFs using the PDFs to Page Images Converter Space.

- Number of images: {num_images}
- Number of PDFs processed: {num_pdfs}
- Sample size per PDF: {sample_size}
- Created on: {creation_date}

## Dataset Creation

### Source Data

The images in this dataset were generated from user-uploaded PDF files.

### Processing Steps

1. PDF files were uploaded to the PDFs to Page Images Converter.
2. Each PDF was processed, converting selected pages to images.
3. The resulting images were saved and uploaded to this dataset.

## Dataset Structure

The dataset consists of JPEG images, each representing a single page from the source PDFs.

### Data Fields

- `images/`: A folder containing all the converted images.

### Data Splits

This dataset does not have specific splits.

## Additional Information

- Contributions: Thanks to the PDFs to Page Images Converter for creating this dataset.
"""