DATASET_CARD_TEMPLATE = """
# Dataset Card for {hf_repo}

## Dataset Description

This dataset contains images converted from PDFs using the PDFs to Page Images Converter Space.

- **Number of images:** {num_images}
- **Number of PDFs processed:** {num_pdfs}
- **Sample size per PDF:** {sample_size}
- **Created on:** {creation_date}

## Dataset Creation

### Source Data

The images in this dataset were generated from user-uploaded PDF files.

### Processing Steps

1. PDF files were uploaded to the PDFs to Page Images Converter.
2. Each PDF was processed, converting selected pages to images.
3. The resulting images were saved and uploaded to this dataset.

## Dataset Structure

The dataset consists of JPEG images, each representing a single page from the source PDFs.

### Data Fields

- `images/`: A folder containing all the converted images.

### Data Splits

This dataset does not have specific splits.

## Additional Information

- **Contributions:** Thanks to the PDFs to Page Images Converter for creating this dataset.
"""
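
# A minimal sketch of how this template might be filled in, assuming the
# placeholders are substituted with str.format. The helper name
# render_dataset_card and its parameters are hypothetical (they are not part
# of the original code); the keyword names simply mirror the placeholders
# {hf_repo}, {num_images}, {num_pdfs}, {sample_size} and {creation_date}.
from datetime import datetime, timezone


def render_dataset_card(template, hf_repo, num_images, num_pdfs, sample_size,
                        creation_date=None):
    """Return the card text with every placeholder substituted."""
    if creation_date is None:
        # Assumption: default to the current UTC time in a readable format.
        creation_date = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    # str.format raises KeyError if the template contains a placeholder that
    # is not supplied here, so a renamed field fails loudly instead of
    # producing a half-filled card.
    return template.format(
        hf_repo=hf_repo,
        num_images=num_images,
        num_pdfs=num_pdfs,
        sample_size=sample_size,
        creation_date=creation_date,
    )


# Example call (illustrative values only):
# card = render_dataset_card(DATASET_CARD_TEMPLATE, "user/my-pdf-pages", 120, 4, 30)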
|