MET Coin Scraper for LoRA Training
A Python toolkit for scraping coin images from the Metropolitan Museum of Art (MET) collection to create training datasets for LoRA (Low-Rank Adaptation) models. A LoRA trained on the resulting dataset can then generate coin images in the style of these historical references.
Overview
This project implements the data ingestion layer from the SCR Coin Generator technical specification. It scrapes coin imagery and metadata from the MET's public API, preprocesses the images, and generates training-ready captions for LoRA fine-tuning.
Architecture
How LoRA Training Works
┌─────────────────────────────────────────────────────────────────┐
│                    TRAINING (train_lora.py)                     │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────────────────┐                                    │
│  │  Stable Diffusion 1.5   │ ← Downloaded from HuggingFace      │
│  │  (FROZEN - not trained) │   ~4GB, pre-trained on billions    │
│  │                         │   of images                        │
│  └───────────┬─────────────┘                                    │
│              │                                                  │
│              ▼                                                  │
│  ┌─────────────────────────┐                                    │
│  │  LoRA Adapter Layers    │ ← THESE get trained                │
│  │  (rank=32, alpha=16)    │   on your MET coin images          │
│  └───────────┬─────────────┘                                    │
│              │                                                  │
│              ▼                                                  │
│  ┌─────────────────────────┐                                    │
│  │ scrcoin-lora.safetensors│ ← Output: just the adapter         │
│  │ (~50-200MB)             │   weights, NOT a full model        │
│  └─────────────────────────┘                                    │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│                    INFERENCE (inference.py)                     │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────────────────┐     ┌─────────────────────────┐    │
│  │  Stable Diffusion 1.5   │  +  │  Your LoRA weights      │    │
│  │  (base model)           │     │  (scrcoin-lora)         │    │
│  └───────────┬─────────────┘     └───────────┬─────────────┘    │
│              │                               │                  │
│              └───────────────┬───────────────┘                  │
│                              ▼                                  │
│                 ┌─────────────────────────┐                     │
│                 │  "roman gold coin,      │                     │
│                 │   scrcoin style"        │                     │
│                 └────────────┬────────────┘                     │
│                              ▼                                  │
│                 ┌─────────────────────────┐                     │
│                 │  Generated Coin         │                     │
│                 │  [IMAGE OUTPUT]         │                     │
│                 └─────────────────────────┘                     │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
Key Concepts
- Base Model: Stable Diffusion (SD 1.5 or SDXL) - a large pre-trained model that knows how to generate images
- LoRA: Low-Rank Adaptation - a small adapter that teaches the base model a specific style (coins)
- Style Token: scrcoin style - the trigger phrase that activates your trained style
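To make the size difference between the frozen base model and the adapter concrete, here is a minimal sketch of the low-rank update LoRA applies to a single weight matrix. The rank and alpha values mirror the diagram; the 768x768 layer size is an illustrative assumption and is not taken from this project.

```python
# Illustrative sketch: instead of updating a full weight matrix W (d_out x d_in),
# LoRA trains two small matrices B (d_out x r) and A (r x d_in) and adds
# (alpha / r) * B @ A to the frozen W. Layer dimensions below are assumed.

def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters in one LoRA adapter pair (B and A)."""
    return rank * (d_out + d_in)

def lora_scale(alpha: float, rank: int) -> float:
    """Scaling factor applied to the low-rank update B @ A."""
    return alpha / rank

d_out, d_in = 768, 768   # hypothetical attention projection size
rank, alpha = 32, 16     # values shown in the training diagram

full_params = d_out * d_in
adapter_params = lora_param_count(d_out, d_in, rank)

print(f"full layer:   {full_params} parameters")
print(f"LoRA adapter: {adapter_params} parameters")
print(f"update scale: {lora_scale(alpha, rank)}")
```

Only the adapter parameters are saved, which is why the output file is a fraction of the base model's size.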
Base Model Options
| Model | Quality | VRAM Required | Training Time | Best For |
|---|---|---|---|---|
| SD 1.5 | Good | 6-8 GB | Fast | Limited hardware, quick experiments |
| SDXL 1.0 | Excellent | 12-24 GB | Slower | Production quality |
| FLUX.1 | Best | 24+ GB | Slowest | Cutting-edge results |
Current default: runwayml/stable-diffusion-v1-5 (SD 1.5)
To change, edit training_config.yaml:
model:
  base_model: "stabilityai/stable-diffusion-xl-base-1.0"  # For SDXL
  resolution: 1024  # Change to 1024 for SDXL
Features
- Automated Scraping: Download coin images and metadata from MET's public API
- Smart Preprocessing:
  - Duplicate detection using perceptual hashing
  - Blur detection and quality filtering
  - Image normalization and resizing
  - Automatic cleanup of poor-quality images
- Caption Generation:
  - Multiple caption styles for training flexibility
  - Configurable style tokens for LoRA training
  - Metadata-driven descriptive captions
- Configurable Pipeline: JSON-based configuration for all parameters
- Progress Tracking: Detailed logging and progress bars
- Retry Logic: Automatic retry with exponential backoff for network issues
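The duplicate detection listed under Smart Preprocessing can be sketched as follows. This is an average-hash variant over a small grayscale grid, chosen for brevity; the project's preprocessor may use a different hash or a dedicated library, and the function names and distance threshold here are hypothetical.

```python
from typing import List

Grid = List[List[int]]  # small grayscale thumbnail, values 0-255

def average_hash(grid: Grid) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the mean."""
    pixels = [p for row in grid for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits

def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")

def is_duplicate(a: Grid, b: Grid, max_distance: int = 2) -> bool:
    """Treat images as duplicates when their hashes differ in few bits."""
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance

# Tiny 4x4 stand-ins for downscaled coin images.
coin = [[200, 200, 50, 50], [200, 200, 50, 50],
        [50, 50, 200, 200], [50, 50, 200, 200]]
coin_brighter = [[min(255, p + 10) for p in row] for row in coin]  # same image, brighter
other = [[50, 200, 50, 200], [200, 50, 200, 50],
         [50, 200, 50, 200], [200, 50, 200, 50]]

print(is_duplicate(coin, coin_brighter))  # True: brightness shift keeps the hash
print(is_duplicate(coin, other))          # False: different pattern
```

Because the hash compares each pixel to the image's own mean, uniform brightness changes do not affect it, which is what makes it useful for catching re-exported copies of the same photograph.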
Installation
Prerequisites
- Python 3.8 or higher
- pip package manager
Setup
- Clone the repository:
git clone <repository-url>
cd coin-scrape
- Install dependencies:
pip install -r requirements.txt
Quick Start
Test Mode (Recommended First Run)
Test the scraper with just 10 coins to ensure everything works:
python run_pipeline.py --test
Full Pipeline
Scrape all coins from the MET collection:
python run_pipeline.py
Limited Scraping
Scrape a specific number of coins:
python run_pipeline.py --limit 500
Usage
Complete Pipeline
The run_pipeline.py script orchestrates the entire process:
python run_pipeline.py [options]
Options:
- --config CONFIG: Path to configuration file (default: config.json)
- --limit N: Limit number of coins to scrape
- --test: Test mode - scrape only 10 coins
- --skip-scraping: Skip scraping if you already have data
- --skip-preprocessing: Skip image preprocessing
- --skip-captions: Skip caption generation
- --caption-style STYLE: Caption style (basic|detailed|style|template)
Examples:
# Test with 10 coins
python run_pipeline.py --test
# Scrape 100 coins
python run_pipeline.py --limit 100
# Re-generate captions only
python run_pipeline.py --skip-scraping --skip-preprocessing
# Use detailed captions
python run_pipeline.py --caption-style detailed
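The documented flags map naturally onto an argparse interface. The following is a hypothetical reconstruction from the options listed above, not the actual contents of run_pipeline.py; the helper names and the default caption style are assumptions.

```python
import argparse
from typing import Optional

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="MET coin dataset pipeline (sketch)")
    parser.add_argument("--config", default="config.json", help="Path to configuration file")
    parser.add_argument("--limit", type=int, default=None, help="Limit number of coins to scrape")
    parser.add_argument("--test", action="store_true", help="Test mode: scrape only 10 coins")
    parser.add_argument("--skip-scraping", action="store_true", help="Reuse existing data")
    parser.add_argument("--skip-preprocessing", action="store_true", help="Skip image preprocessing")
    parser.add_argument("--skip-captions", action="store_true", help="Skip caption generation")
    parser.add_argument(
        "--caption-style",
        choices=["basic", "detailed", "style", "template"],
        default="style",  # assumed default; the README only marks it as recommended
    )
    return parser

def effective_limit(args: argparse.Namespace) -> Optional[int]:
    """Test mode caps the run at 10 coins; otherwise honour --limit."""
    return 10 if args.test else args.limit

# Equivalent to: python run_pipeline.py --skip-scraping --skip-preprocessing --caption-style detailed
args = build_parser().parse_args(
    ["--skip-scraping", "--skip-preprocessing", "--caption-style", "detailed"]
)
print(args.skip_scraping, args.skip_preprocessing, args.caption_style)
```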
Individual Components
1. Scraping Only
python met_scraper.py [options]
Options:
- --config CONFIG: Path to config file
- --limit N: Maximum coins to scrape
- --test: Test mode (10 coins)
2. Preprocessing Only
python preprocessor.py [options]
Options:
- --config CONFIG: Path to config file
- --stats-only: Only generate statistics without processing
3. Caption Generation Only
python caption_generator.py [options]
Options:
- --config CONFIG: Path to config file
- --style STYLE: Caption style (basic|detailed|style|template)
- --preview: Preview captions without generating files
- --samples N: Number of samples to preview
Configuration
Edit config.json to customize the scraping behavior:
MET API Settings
{
  "met_api": {
    "base_url": "https://collectionapi.metmuseum.org/public/collection/v1",
    "search_query": "coin",
    "has_images": true,
    "rate_limit_delay": 0.5,
    "max_retries": 3,
    "timeout": 30
  }
}
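As a rough illustration of how these settings translate into requests, the sketch below builds the two URLs a scraper would need: the MET API's search endpoint (with its q and hasImages query parameters) and the per-object endpoint. The function names are hypothetical and the scraper's real request code may differ.

```python
from urllib.parse import urlencode

def build_search_url(met_api: dict) -> str:
    """URL for the search endpoint, which returns matching object IDs."""
    params = {"q": met_api["search_query"]}
    if met_api.get("has_images"):
        params["hasImages"] = "true"
    return f"{met_api['base_url']}/search?{urlencode(params)}"

def build_object_url(met_api: dict, object_id: int) -> str:
    """URL for a single object's metadata record."""
    return f"{met_api['base_url']}/objects/{object_id}"

# Mirrors the "met_api" block of config.json shown above.
met_api = {
    "base_url": "https://collectionapi.metmuseum.org/public/collection/v1",
    "search_query": "coin",
    "has_images": True,
}

print(build_search_url(met_api))
print(build_object_url(met_api, 12345))
```

The search call yields a list of object IDs; each ID is then fetched individually, which is why rate_limit_delay applies per request.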
Scraping Settings
{
  "scraping": {
    "output_dir": "./dataset",
    "images_dir": "./dataset/images",
    "metadata_dir": "./dataset/metadata",
    "logs_dir": "./logs",
    "min_image_size": 512,
    "max_image_size": 2048,
    "image_format": "png",
    "batch_size": 100
  }
}
Preprocessing Settings
{
  "preprocessing": {
    "remove_duplicates": true,
    "blur_threshold": 100,
    "detect_blur": true,
    "normalize_size": 1024,
    "background_removal": false
  }
}
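A blur_threshold of 100 is a value often used with the variance-of-Laplacian sharpness measure, so the sketch below assumes that is the metric in use; the README does not confirm it. The code works on a plain 2D list of grayscale values to stay dependency-free, whereas a real pipeline would more likely rely on OpenCV or NumPy. Function names are hypothetical.

```python
from typing import List

def laplacian_variance(gray: List[List[float]]) -> float:
    """Variance of the 4-neighbour Laplacian over interior pixels."""
    responses = []
    for y in range(1, len(gray) - 1):
        for x in range(1, len(gray[0]) - 1):
            lap = (gray[y - 1][x] + gray[y + 1][x]
                   + gray[y][x - 1] + gray[y][x + 1]
                   - 4 * gray[y][x])
            responses.append(lap)
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)

def is_blurry(gray: List[List[float]], blur_threshold: float = 100) -> bool:
    """Low variance means few sharp edges, so the image is treated as blurry."""
    return laplacian_variance(gray) < blur_threshold

# A featureless image versus one with hard edges everywhere.
flat = [[128] * 6 for _ in range(6)]
checker = [[255 if (x + y) % 2 else 0 for x in range(6)] for y in range(6)]

print(is_blurry(flat))     # True: no edges at all
print(is_blurry(checker))  # False: strong edge response
```

Under this metric, raising blur_threshold rejects more images, which matches the "higher = more strict" note in Troubleshooting.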
Caption Settings
{
  "captions": {
    "include_culture": true,
    "include_period": true,
    "include_medium": true,
    "include_dimensions": false,
    "style_token": "scrcoin style",
    "template": "{culture} {medium} coin, embossed metal, museum artifact, {style_token}"
  }
}
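The template string uses Python-style placeholders, so filling it in can be sketched with str.format. This is an assumption about how caption_generator.py works, not a copy of it; lowercasing the metadata and collapsing extra spaces are choices made here so that missing fields do not leave gaps. Only the placeholders from the template shown above are handled.

```python
def render_caption(template: str, metadata: dict, style_token: str) -> str:
    """Fill the caption template from metadata, lowercased for consistency."""
    fields = {
        "culture": (metadata.get("culture") or "").lower(),
        "medium": (metadata.get("medium") or "").lower(),
        "style_token": style_token,
    }
    caption = template.format(**fields)
    # Collapse doubled or leading spaces left behind by empty metadata fields.
    return " ".join(caption.split())

template = "{culture} {medium} coin, embossed metal, museum artifact, {style_token}"

print(render_caption(template, {"culture": "Roman", "medium": "Silver"}, "scrcoin style"))
print(render_caption(template, {"medium": "Gold"}, "scrcoin style"))
```

Keeping the style token at the end of every caption is what lets the trained LoRA associate that phrase with the coin style.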
Output Structure
After running the pipeline, your dataset will be organized as follows:
dataset/
├── images/
│   ├── 12345_0.png          # Primary image
│   ├── 12345_0.txt          # Caption for primary image
│   ├── 12345_1.png          # Additional image (if any)
│   ├── 12345_1.txt          # Caption for additional image
│   └── ...
├── metadata/
│   ├── 12345.json           # Full metadata
│   └── ...
└── summary_TIMESTAMP.json   # Scraping summary with statistics
logs/
└── scraper_TIMESTAMP.log    # Detailed logs
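Since trainers pair each image with the caption file of the same name, it can be worth checking the layout before training. The helper below is a hypothetical sanity check, not part of the project; it is demonstrated on a temporary directory that mimics dataset/images/.

```python
import tempfile
from pathlib import Path
from typing import List

def find_images_missing_captions(images_dir: Path) -> List[str]:
    """Return names of .png files that have no matching .txt caption."""
    return sorted(
        image.name
        for image in images_dir.glob("*.png")
        if not image.with_suffix(".txt").exists()
    )

# Demonstrate on a throwaway directory laid out like dataset/images/.
with tempfile.TemporaryDirectory() as tmp:
    images_dir = Path(tmp)
    (images_dir / "12345_0.png").touch()
    (images_dir / "12345_0.txt").write_text("roman silver coin, scrcoin style")
    (images_dir / "12345_1.png").touch()  # caption deliberately missing
    missing = find_images_missing_captions(images_dir)

print(missing)
```

On a real run you would pass Path("dataset/images") and expect an empty list.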
Dataset Metadata
Each coin's metadata JSON file contains:
{
  "objectID": 12345,
  "title": "Coin of Emperor Augustus",
  "culture": "Roman",
  "period": "Early Imperial",
  "medium": "Silver",
  "dimensions": "Diam. 2.5 cm",
  "date": "27 B.C.–A.D. 14",
  "primaryImage": "https://...",
  "additionalImages": ["https://...", ...],
  "local_primary_image": "dataset/images/12345_0.png",
  "local_additional_images": ["dataset/images/12345_1.png", ...],
  "tags": ["coin", "denarius", "portrait"],
  "objectURL": "https://www.metmuseum.org/art/collection/search/12345"
}
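The stored record is a trimmed view of what the MET API returns for an object. A minimal sketch of that normalization follows, assuming the raw record uses the API's objectDate field for the date and a list of {"term": ...} objects for tags; the function name, the lowercasing of tags, and the sample values are illustrative assumptions. The local_* paths are omitted because they depend on the download step.

```python
def normalize_record(raw: dict) -> dict:
    """Reduce a raw MET object record to the fields stored by the scraper."""
    tags = raw.get("tags") or []  # tags may be null when an object has none
    return {
        "objectID": raw["objectID"],
        "title": raw.get("title", ""),
        "culture": raw.get("culture", ""),
        "period": raw.get("period", ""),
        "medium": raw.get("medium", ""),
        "dimensions": raw.get("dimensions", ""),
        "date": raw.get("objectDate", ""),
        "primaryImage": raw.get("primaryImage", ""),
        "additionalImages": raw.get("additionalImages") or [],
        "tags": [t["term"].lower() for t in tags if t.get("term")],
        "objectURL": raw.get("objectURL", ""),
    }

# Illustrative input only; not a real API response.
raw = {
    "objectID": 12345,
    "title": "Coin of Emperor Augustus",
    "culture": "Roman",
    "medium": "Silver",
    "objectDate": "27 B.C.-A.D. 14",
    "tags": [{"term": "Coins"}, {"term": "Portraits"}],
}

record = normalize_record(raw)
print(record["date"], record["tags"])
```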
Caption Styles
Basic
roman silver coin, embossed metal, relief sculpture, circular, museum artifact
Detailed
roman silver coin, embossed metal, relief sculpture, circular, museum artifact, denarius, portrait of augustus
Style (Recommended for LoRA)
roman silver coin, embossed metal, relief sculpture, circular, museum artifact, scrcoin style
Template
roman silver coin, embossed metal, museum artifact, scrcoin style
Training with the Dataset
Once you have prepared your dataset, you can use it to train a LoRA model:
Using Kohya_ss
# Point to your dataset directory
--train_data_dir="./dataset/images" \
--caption_extension=".txt" \
--resolution=1024 \
--network_dim=32 \
--network_alpha=16
Using ComfyUI LoRA Training
- Load your images from dataset/images/
- Ensure corresponding .txt caption files exist
- Configure training parameters based on your base model (SD1.5 or SDXL)
API Rate Limiting
The scraper implements respectful rate limiting:
- Default delay: 0.5 seconds between requests
- Automatic retry with exponential backoff (2s, 4s, 8s)
- Maximum 3 retries per request
- Configurable timeout: 30 seconds
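The retry behaviour described above can be sketched as a small wrapper. This is not the scraper's actual code: the function name is hypothetical, the sleep function is injectable so the example runs instantly, and a real scraper would catch specific network errors rather than every exception.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def with_retries(
    request: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(); on failure wait 2s, 4s, 8s, ... before retrying."""
    for attempt in range(max_retries + 1):
        try:
            return request()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted, surface the last error
            sleep(base_delay * (2 ** attempt))
    raise RuntimeError("unreachable")

# Demo: a fake request that fails twice, recording delays instead of sleeping.
delays = []
calls = {"n": 0}

def flaky_request() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"

result = with_retries(flaky_request, sleep=delays.append)
print(result, delays)
```

The fixed rate_limit_delay between successful requests is separate from this backoff, which only applies after a failure.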
Troubleshooting
Network Access Issues
If you get 403 or connection errors:
- Check your internet connection
- Verify the MET API is accessible: https://collectionapi.metmuseum.org/
- Try increasing rate_limit_delay in config.json
- Check firewall/proxy settings
Image Quality Issues
If images are too small or blurry:
- Adjust min_image_size in config.json
- Modify blur_threshold (higher = more strict)
- Use --stats-only to preview quality before processing
Memory Issues
If running out of memory with large datasets:
- Process in batches using --limit
- Reduce normalize_size in config.json
- Disable background_removal if enabled
Data Sources
- MET Collection API: https://metmuseum.github.io/
- License: Images are from the MET's Open Access collection (Public Domain)
- Usage: Free for any purpose, including commercial use
Next Steps
After preparing your dataset:
- Review Quality: Check sample images and captions
- Train LoRA: Use Kohya_ss or ComfyUI with your dataset
- Test Generation: Experiment with different prompts using your style token
- Iterate: Adjust caption style and retrain if needed
Technical Specification
This tool implements Section 3: Data Pipeline from the SCR Coin Generator technical specification:
- ✅ MET API integration
- ✅ Metadata extraction and normalization
- ✅ Image downloading with quality control
- ✅ Duplicate detection via perceptual hashing
- ✅ Blur detection and filtering
- ✅ Automatic captioning for training
- ✅ Style token integration
Publishing to HuggingFace
You can host your trained LoRA model on HuggingFace for easy sharing and deployment.
Setup HuggingFace CLI
# Install huggingface_hub
pip install huggingface_hub
# Login to HuggingFace (get token from https://huggingface.co/settings/tokens)
huggingface-cli login
Push Your LoRA Model
Option 1: Using the CLI
# Create a new model repository
huggingface-cli repo create scrcoin-lora --type model
# Upload your trained LoRA
huggingface-cli upload your-username/scrcoin-lora ./lora_output/final
Option 2: Using Python Script
from huggingface_hub import HfApi, create_repo
# Create repository
create_repo("scrcoin-lora", repo_type="model")
# Upload files
api = HfApi()
api.upload_folder(
    folder_path="./lora_output/final",
    repo_id="your-username/scrcoin-lora",
    repo_type="model"
)
Option 3: Add to this project
We've included a helper script. After training:
python push_to_hub.py --repo your-username/scrcoin-lora --lora-path ./lora_output/final
Create a Model Card
Create a README.md in your HuggingFace repo:
---
license: mit
base_model: runwayml/stable-diffusion-v1-5
tags:
- stable-diffusion
- lora
- coin
- numismatics
---
# SCR Coin Style LoRA
A LoRA trained on historical coin images from the Metropolitan Museum of Art.
## Usage
```python
from diffusers import StableDiffusionPipeline
import torch
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16
)
pipe.load_lora_weights("your-username/scrcoin-lora")
pipe.to("cuda")
image = pipe("ancient roman gold coin, emperor portrait, scrcoin style").images[0]
```
## Trigger Token
Use scrcoin style in your prompts to activate the coin style.
Using Your Hosted Model
Once published, anyone can use your model:
```python
from diffusers import StableDiffusionPipeline
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe.load_lora_weights("your-username/scrcoin-lora")
image = pipe("greek silver coin, owl design, scrcoin style").images[0]
```
HuggingFace Inference API
After publishing, your model may be reachable through HuggingFace's hosted Inference API, depending on whether the base model and LoRA are supported there:
curl https://api-inference.huggingface.co/models/your-username/scrcoin-lora \
  -X POST \
  -H "Authorization: Bearer YOUR_HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"inputs": "roman gold coin, emperor portrait, scrcoin style"}'
Contributing
Contributions are welcome! Please feel free to submit issues or pull requests.
License
MIT License - See LICENSE file for details
Acknowledgments
- Metropolitan Museum of Art for their excellent Open Access API
- The open-source LoRA training community
- Kohya_ss and ComfyUI developers
Support
For issues, questions, or feature requests, please open an issue on GitHub.
Ready to create your own coin-generating AI model! 🪙✨