MET Coin Scraper for LoRA Training

A Python toolkit for scraping coin images from the Metropolitan Museum of Art (MET) collection to create training datasets for LoRA (Low-Rank Adaptation) models. A LoRA trained on the resulting dataset can then generate coin images in the style of these historical references.

Overview

This project implements the data ingestion layer from the SCR Coin Generator technical specification. It scrapes coin imagery and metadata from the MET's public API, preprocesses the images, and generates training-ready captions for LoRA fine-tuning.

Architecture

How LoRA Training Works

┌─────────────────────────────────────────────────────────────────┐
│                     TRAINING (train_lora.py)                    │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────────────────┐                                    │
│  │ Stable Diffusion 1.5    │  ← Downloaded from HuggingFace     │
│  │ (FROZEN - not trained)  │    ~4GB, pre-trained on billions   │
│  │                         │    of images                       │
│  └───────────┬─────────────┘                                    │
│              │                                                  │
│              ▼                                                  │
│  ┌─────────────────────────┐                                    │
│  │ LoRA Adapter Layers     │  ← THESE get trained               │
│  │ (rank=32, alpha=16)     │    on your MET coin images         │
│  └───────────┬─────────────┘                                    │
│              │                                                  │
│              ▼                                                  │
│  ┌─────────────────────────┐                                    │
│  │ scrcoin-lora.safetensors│  ← Output: just the adapter        │
│  │ (~50-200MB)             │    weights, NOT a full model       │
│  └─────────────────────────┘                                    │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│                    INFERENCE (inference.py)                     │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────────────────┐   ┌─────────────────────────┐      │
│  │ Stable Diffusion 1.5    │ + │ Your LoRA weights       │      │
│  │ (base model)            │   │ (scrcoin-lora)          │      │
│  └───────────┬─────────────┘   └───────────┬─────────────┘      │
│              │                             │                    │
│              └──────────┬──────────────────┘                    │
│                         ▼                                       │
│              ┌─────────────────────────┐                        │
│              │ "roman gold coin,       │                        │
│              │  scrcoin style"         │                        │
│              └───────────┬─────────────┘                        │
│                          ▼                                      │
│              ┌─────────────────────────┐                        │
│              │    Generated Coin       │                        │
│              │    [IMAGE OUTPUT]       │                        │
│              └─────────────────────────┘                        │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Key Concepts

Base Model: Stable Diffusion (SD 1.5 or SDXL) - a large pre-trained model that knows how to generate images
LoRA: Low-Rank Adaptation - a small adapter that teaches the base model a specific style (coins)
Style Token: scrcoin style - the trigger phrase that activates your trained style
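
To make the "small adapter" idea concrete, the sketch below counts parameters for a single weight matrix. It assumes the standard LoRA formulation, in which a frozen weight W of shape d_out x d_in receives a low-rank update scaled by alpha / rank. The 768 x 768 layer size is an illustrative assumption, not a value taken from this project; the rank and alpha match the defaults shown in the diagram above.

```python
from typing import Tuple

# Illustrative arithmetic only; the 768x768 layer size is an assumed example.

def lora_param_counts(d_out: int, d_in: int, rank: int) -> Tuple[int, int]:
    """Return (frozen_params, trainable_lora_params) for one linear layer."""
    frozen = d_out * d_in              # base weight W, never updated
    trainable = rank * (d_in + d_out)  # B is d_out x rank, A is rank x d_in
    return frozen, trainable


def lora_scale(alpha: float, rank: int) -> float:
    """Scaling applied to the update: W_effective = W + (alpha / rank) * B @ A."""
    return alpha / rank


frozen, trainable = lora_param_counts(d_out=768, d_in=768, rank=32)
scale = lora_scale(alpha=16, rank=32)
print(f"frozen={frozen}, trainable={trainable}, scale={scale}")
```

Under these assumptions the adapter trains well under a tenth of the parameters of the layer it modifies, which is why the output file contains only adapter weights rather than a full model.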

Base Model Options

Model	Quality	VRAM Required	Training Time	Best For
SD 1.5	Good	6-8 GB	Fast	Limited hardware, quick experiments
SDXL 1.0	Excellent	12-24 GB	Slower	Production quality
FLUX.1	Best	24+ GB	Slowest	Cutting-edge results

Current default: runwayml/stable-diffusion-v1-5 (SD 1.5)

To change, edit training_config.yaml:

model:
  base_model: "stabilityai/stable-diffusion-xl-base-1.0"  # For SDXL
  resolution: 1024  # SDXL is trained at 1024; SD 1.5 typically uses 512

Features

Automated Scraping: Download coin images and metadata from MET's public API
Smart Preprocessing:
- Duplicate detection using perceptual hashing
- Blur detection and quality filtering
- Image normalization and resizing
- Automatic cleanup of poor-quality images
Caption Generation:
- Multiple caption styles for training flexibility
- Configurable style tokens for LoRA training
- Metadata-driven descriptive captions
Configurable Pipeline: JSON-based configuration for all parameters
Progress Tracking: Detailed logging and progress bars
Retry Logic: Automatic retry with exponential backoff for network issues
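
The duplicate detection listed above can be illustrated with a minimal perceptual hash. This is a sketch of the general technique (an average hash compared by Hamming distance), not necessarily the hash variant that preprocessor.py uses; the function names and the `max_distance` tolerance are hypothetical. A real pipeline would first convert each image to grayscale and downscale it to a small fixed grid such as 8x8, which is replaced here by tiny hand-written grids.

```python
from typing import List

Gray = List[List[int]]  # grayscale pixels as rows of 0-255 values


def average_hash(pixels: Gray) -> int:
    """One bit per pixel: 1 when the pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_duplicate(a: Gray, b: Gray, max_distance: int = 1) -> bool:
    """Treat two images as duplicates when their hashes are nearly identical."""
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


original = [[10, 200], [220, 30]]
brighter = [[20, 210], [230, 40]]   # same picture, uniformly brighter
different = [[200, 10], [30, 220]]  # bright and dark regions swapped

print(is_duplicate(original, brighter))
print(is_duplicate(original, different))
```

Because the hash depends on each pixel's relation to the mean rather than on exact values, a uniformly brightened copy hashes identically, while a genuinely different layout does not.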

Installation

Prerequisites

Python 3.8 or higher
pip package manager

Setup

Clone the repository:

git clone <repository-url>
cd coin-scrape

Install dependencies:

pip install -r requirements.txt

Quick Start

Test Mode (Recommended First Run)

Test the scraper with just 10 coins to ensure everything works:

python run_pipeline.py --test

Full Pipeline

Scrape all coins from the MET collection:

python run_pipeline.py

Limited Scraping

Scrape a specific number of coins:

python run_pipeline.py --limit 500

Usage

Complete Pipeline

The run_pipeline.py script orchestrates the entire process:

python run_pipeline.py [options]

Options:

--config CONFIG: Path to configuration file (default: config.json)
--limit N: Limit number of coins to scrape
--test: Test mode - scrape only 10 coins
--skip-scraping: Skip scraping if you already have data
--skip-preprocessing: Skip image preprocessing
--skip-captions: Skip caption generation
--caption-style STYLE: Caption style (basic|detailed|style|template)

Examples:

# Test with 10 coins
python run_pipeline.py --test

# Scrape 100 coins
python run_pipeline.py --limit 100

# Re-generate captions only
python run_pipeline.py --skip-scraping --skip-preprocessing

# Use detailed captions
python run_pipeline.py --caption-style detailed

Individual Components

1. Scraping Only

python met_scraper.py [options]

Options:

--config CONFIG: Path to config file
--limit N: Maximum coins to scrape
--test: Test mode (10 coins)

2. Preprocessing Only

python preprocessor.py [options]

Options:

--config CONFIG: Path to config file
--stats-only: Only generate statistics without processing

3. Caption Generation Only

python caption_generator.py [options]

Options:

--config CONFIG: Path to config file
--style STYLE: Caption style (basic|detailed|style|template)
--preview: Preview captions without generating files
--samples N: Number of samples to preview

Configuration

Edit config.json to customize the scraping behavior:

MET API Settings

{
  "met_api": {
    "base_url": "https://collectionapi.metmuseum.org/public/collection/v1",
    "search_query": "coin",
    "has_images": true,
    "rate_limit_delay": 0.5,
    "max_retries": 3,
    "timeout": 30
  }
}
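
As a rough illustration of how these settings could translate into requests, the sketch below builds the search and object URLs. The `/search` and `/objects/{objectID}` endpoints and the `q` and `hasImages` query parameters come from the MET Collection API; the helper names and the mapping from config keys to parameters are assumptions about how met_scraper.py might use them.

```python
from urllib.parse import urlencode

# Subset of the "met_api" block from config.json
met_api = {
    "base_url": "https://collectionapi.metmuseum.org/public/collection/v1",
    "search_query": "coin",
    "has_images": True,
}


def build_search_url(cfg: dict) -> str:
    """URL that returns the IDs of all objects matching the search query."""
    params = {}
    if cfg.get("has_images"):
        params["hasImages"] = "true"
    params["q"] = cfg["search_query"]  # q last, as in the API's own examples
    return f"{cfg['base_url']}/search?{urlencode(params)}"


def build_object_url(cfg: dict, object_id: int) -> str:
    """URL that returns the full metadata record for one object."""
    return f"{cfg['base_url']}/objects/{object_id}"


print(build_search_url(met_api))
print(build_object_url(met_api, 12345))
```

The search response lists object IDs only, so a scraper following this pattern issues one object request per ID, which is where `rate_limit_delay` matters.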

Scraping Settings

{
  "scraping": {
    "output_dir": "./dataset",
    "images_dir": "./dataset/images",
    "metadata_dir": "./dataset/metadata",
    "logs_dir": "./logs",
    "min_image_size": 512,
    "max_image_size": 2048,
    "image_format": "png",
    "batch_size": 100
  }
}

Preprocessing Settings

{
  "preprocessing": {
    "remove_duplicates": true,
    "blur_threshold": 100,
    "detect_blur": true,
    "normalize_size": 1024,
    "background_removal": false
  }
}
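
One common way to implement a blur check like `detect_blur` is the variance of the Laplacian: sharp images have strong edges and therefore a high variance, while blurry images score low. The sketch below assumes that `blur_threshold` is compared against this score, which the project does not state explicitly. It uses plain Python lists to stay dependency-free; a real implementation would more likely use OpenCV or NumPy on the decoded image.

```python
from typing import List


def laplacian_variance(pixels: List[List[float]]) -> float:
    """Variance of the 4-neighbour Laplacian over the interior pixels."""
    height, width = len(pixels), len(pixels[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            lap = (pixels[y - 1][x] + pixels[y + 1][x]
                   + pixels[y][x - 1] + pixels[y][x + 1]
                   - 4 * pixels[y][x])
            responses.append(lap)
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def is_blurry(pixels: List[List[float]], blur_threshold: float = 100) -> bool:
    """Assumed rule: reject images whose sharpness score is below the threshold."""
    return laplacian_variance(pixels) < blur_threshold


# A checkerboard has maximal edge content; a flat grey image has none.
sharp = [[255 if (x + y) % 2 == 0 else 0 for x in range(6)] for y in range(6)]
flat = [[128] * 6 for _ in range(6)]

print(is_blurry(sharp), is_blurry(flat))
```

Under this rule, raising `blur_threshold` rejects more images, since more of them fall below the required sharpness score.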

Caption Settings

{
  "captions": {
    "include_culture": true,
    "include_period": true,
    "include_medium": true,
    "include_dimensions": false,
    "style_token": "scrcoin style",
    "template": "{culture} {medium} coin, embossed metal, museum artifact, {style_token}"
  }
}
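
The template is an ordinary format string, so filling it in can be sketched as follows. The `render_caption` helper and its handling of missing fields are hypothetical; only the template, the style token, and the metadata field names come from this README.

```python
# Subset of the "captions" block from config.json
captions_cfg = {
    "style_token": "scrcoin style",
    "template": "{culture} {medium} coin, embossed metal, museum artifact, {style_token}",
}


def render_caption(metadata: dict, cfg: dict) -> str:
    """Fill the caption template from one coin's metadata record."""
    caption = cfg["template"].format(
        culture=metadata.get("culture", ""),
        medium=metadata.get("medium", ""),
        style_token=cfg["style_token"],
    )
    # Lowercase, and collapse the extra spaces left behind by empty fields
    return " ".join(caption.lower().split())


caption = render_caption({"culture": "Roman", "medium": "Silver"}, captions_cfg)
print(caption)
```

With the sample metadata this yields the same text as the Template example under Caption Styles.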

Output Structure

After running the pipeline, your dataset will be organized as follows:

dataset/
├── images/
│   ├── 12345_0.png          # Primary image
│   ├── 12345_0.txt          # Caption for primary image
│   ├── 12345_1.png          # Additional image (if any)
│   ├── 12345_1.txt          # Caption for additional image
│   └── ...
├── metadata/
│   ├── 12345.json           # Full metadata
│   └── ...
└── summary_TIMESTAMP.json   # Scraping summary with statistics

logs/
└── scraper_TIMESTAMP.log    # Detailed logs
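
Since LoRA trainers pair each image with the caption file that shares its name, it can be worth checking the layout above before training. The following is a minimal sketch of such a check; `find_missing_captions` is a hypothetical helper, not part of the toolkit.

```python
from pathlib import Path
from typing import List


def find_missing_captions(images_dir: str, image_ext: str = ".png",
                          caption_ext: str = ".txt") -> List[str]:
    """Return the names of images that have no caption file with the same stem."""
    root = Path(images_dir)
    return sorted(
        img.name
        for img in root.glob(f"*{image_ext}")
        if not img.with_suffix(caption_ext).exists()
    )


if __name__ == "__main__":
    missing = find_missing_captions("./dataset/images")
    print(f"{len(missing)} image(s) without captions")
    for name in missing:
        print(f"  {name}")
```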

Dataset Metadata

Each coin's metadata JSON file contains:

{
  "objectID": 12345,
  "title": "Coin of Emperor Augustus",
  "culture": "Roman",
  "period": "Early Imperial",
  "medium": "Silver",
  "dimensions": "Diam. 2.5 cm",
  "date": "27 B.C.–A.D. 14",
  "primaryImage": "https://...",
  "additionalImages": ["https://...", ...],
  "local_primary_image": "dataset/images/12345_0.png",
  "local_additional_images": ["dataset/images/12345_1.png", ...],
  "tags": ["coin", "denarius", "portrait"],
  "objectURL": "https://www.metmuseum.org/art/collection/search/12345"
}

Caption Styles

Basic

roman silver coin, embossed metal, relief sculpture, circular, museum artifact

Detailed

roman silver coin, embossed metal, relief sculpture, circular, museum artifact, denarius, portrait of augustus

Style (Recommended for LoRA)

roman silver coin, embossed metal, relief sculpture, circular, museum artifact, scrcoin style

Template

roman silver coin, embossed metal, museum artifact, scrcoin style

Training with the Dataset

Once you have prepared your dataset, you can use it to train a LoRA model:

Using Kohya_ss

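# Flags to pass to the Kohya_ss LoRA training script.
# Point to your dataset directory. Kohya_ss usually expects the images in a
# "<repeats>_<name>" subfolder of train_data_dir, so adjust the layout if needed.
# resolution=1024 suits SDXL; SD 1.5 is typically trained at 512.
--train_data_dir="./dataset/images" \
--caption_extension=".txt" \
--resolution=1024 \
--network_dim=32 \
--network_alpha=16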

Using ComfyUI LoRA Training

Load your images from dataset/images/
Ensure corresponding .txt caption files exist
Configure training parameters based on your base model (SD1.5 or SDXL)

API Rate Limiting

The scraper implements respectful rate limiting:

Default delay: 0.5 seconds between requests
Automatic retry with exponential backoff (2s, 4s, 8s)
Maximum 3 retries per request
Configurable timeout: 30 seconds
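
The retry schedule above can be sketched as follows. This is a minimal illustration of exponential backoff under the listed defaults, not the scraper's actual code; the function names are hypothetical, and a real scraper would catch only network errors (for example `requests.RequestException`) rather than every exception.

```python
import time
from typing import Callable, List


def backoff_delays(max_retries: int = 3, base_delay: float = 2.0) -> List[float]:
    """Delays before each retry, doubling every time: 2s, 4s, 8s by default."""
    return [base_delay * (2 ** i) for i in range(max_retries)]


def call_with_retry(func: Callable, max_retries: int = 3, base_delay: float = 2.0,
                    sleep: Callable[[float], None] = time.sleep):
    """Call func(); on failure wait and retry, re-raising after the last retry."""
    delays = backoff_delays(max_retries, base_delay)
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception:  # broad on purpose for the sketch; narrow this in real code
            if attempt == max_retries:
                raise  # retries exhausted, surface the last error
            sleep(delays[attempt])


print(backoff_delays())
```

Passing `sleep` as a parameter keeps the helper testable without real waiting. The separate 0.5 second `rate_limit_delay` would apply between successful requests, independently of this retry schedule.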

Troubleshooting

Network Access Issues

If you get 403 or connection errors:

Check your internet connection
Verify the MET API is accessible: https://collectionapi.metmuseum.org/
Try increasing rate_limit_delay in config.json
Check firewall/proxy settings

Image Quality Issues

If images are too small or blurry:

Adjust min_image_size in config.json
Modify blur_threshold (higher = more strict)
Use --stats-only to preview quality before processing

Memory Issues

If running out of memory with large datasets:

Process in batches using --limit
Reduce normalize_size in config.json
Disable background_removal if enabled

Data Sources

MET Collection API: https://metmuseum.github.io/
License: Images designated Open Access by the MET are released into the public domain under CC0
Usage: Open Access images are free for any purpose, including commercial use

Next Steps

After preparing your dataset:

Review Quality: Check sample images and captions
Train LoRA: Use Kohya_ss or ComfyUI with your dataset
Test Generation: Experiment with different prompts using your style token
Iterate: Adjust caption style and retrain if needed

Technical Specification

This tool implements Section 3: Data Pipeline from the SCR Coin Generator technical specification:

✅ MET API integration
✅ Metadata extraction and normalization
✅ Image downloading with quality control
✅ Duplicate detection via perceptual hashing
✅ Blur detection and filtering
✅ Automatic captioning for training
✅ Style token integration

Publishing to HuggingFace

You can host your trained LoRA model on HuggingFace for easy sharing and deployment.

Setup HuggingFace CLI

# Install huggingface_hub
pip install huggingface_hub

# Login to HuggingFace (get token from https://huggingface.co/settings/tokens)
huggingface-cli login

Push Your LoRA Model

Option 1: Using the CLI

# Create a new model repository
huggingface-cli repo create scrcoin-lora --type model

# Upload your trained LoRA
huggingface-cli upload your-username/scrcoin-lora ./lora_output/final

Option 2: Using Python Script

from huggingface_hub import HfApi, create_repo

# Create the repository (exist_ok=True makes the script safe to re-run)
create_repo("your-username/scrcoin-lora", repo_type="model", exist_ok=True)

# Upload every file in the trained LoRA output folder
api = HfApi()
api.upload_folder(
    folder_path="./lora_output/final",
    repo_id="your-username/scrcoin-lora",
    repo_type="model"
)

Option 3: Using the bundled helper script

This project includes a helper script, push_to_hub.py. After training, run:

python push_to_hub.py --repo your-username/scrcoin-lora --lora-path ./lora_output/final

Create a Model Card

Create a README.md in your HuggingFace repo:

---
license: mit
base_model: runwayml/stable-diffusion-v1-5
tags:
  - stable-diffusion
  - lora
  - coin
  - numismatics
---

# SCR Coin Style LoRA

A LoRA trained on historical coin images from the Metropolitan Museum of Art.

## Usage

```python
from diffusers import StableDiffusionPipeline
import torch

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16
)
pipe.load_lora_weights("your-username/scrcoin-lora")
pipe.to("cuda")

image = pipe("ancient roman gold coin, emperor portrait, scrcoin style").images[0]
```

## Trigger Token

Use scrcoin style in your prompts to activate the coin style.

Using Your Hosted Model

Once published, anyone can use your model:

from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe.load_lora_weights("your-username/scrcoin-lora")

image = pipe("greek silver coin, owl design, scrcoin style").images[0]

HuggingFace Inference API

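After publishing, the model may be callable through Hugging Face's hosted inference service, but only if an inference provider serves it; this is not guaranteed for every LoRA repository. Where it is available, a request looks like this: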

curl https://api-inference.huggingface.co/models/your-username/scrcoin-lora \
  -X POST \
  -H "Authorization: Bearer YOUR_HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"inputs": "roman gold coin, emperor portrait, scrcoin style"}'

Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

License

MIT License - See LICENSE file for details

Acknowledgments

Metropolitan Museum of Art for their excellent Open Access API
The open-source LoRA training community
Kohya_ss and ComfyUI developers

Support

For issues, questions, or feature requests, please open an issue on GitHub.

Ready to create your own coin-generating AI model! 🪙✨
