# Train Pix2pix on your custom data
This folder contains a script to train [pix2pix](https://arxiv.org/abs/1611.07004) for conditional image generation, leveraging the [Hugging Face](https://huggingface.co/) ecosystem for processing data and pushing the model to the Hub.
The script leverages 🤗 Datasets for loading and processing data, and 🤗 Accelerate for seamlessly running on CPU, a single GPU, multiple GPUs or a TPU, with support for fp16/mixed precision.
<p align="center">
<img src="https://raw.githubusercontent.com/huggingface/community-events/main/huggan/assets/pix2pix_maps.png" alt="drawing" width="300"/>
</p>
Pix2pix trained on the [huggan/maps](https://huggingface.co/datasets/huggan/maps) dataset to translate satellite images into maps à la Google Maps. First row: input, second row: prediction, third row: ground truth.
## Launching the script
To train the model with the default parameters (200 epochs, 256x256 images, etc.) on [huggan/facades](https://huggingface.co/datasets/huggan/facades) in your environment, first run:
```bash
accelerate config
```
and answer the questions asked about your environment. Next, launch the script as follows:
```bash
accelerate launch train.py
```
This will create local `images` and `saved_models` directories, containing generated images and saved checkpoints over the course of training.
To train on another dataset available on the Hub, simply run (for instance):
```bash
accelerate launch train.py --dataset huggan/night2day
```
Make sure to pick a dataset that has `imageA` and `imageB` columns defined. You can always tweak the script in case the column names are different.
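Before launching a training run, you can quickly verify that a dataset exposes the expected columns. Below is a minimal sketch, using `huggan/night2day` purely as an example:
```python
from datasets import load_dataset

# minimal sanity check: load the training split and inspect the column names
dataset = load_dataset("huggan/night2day", split="train")
print(dataset.column_names)  # should include "imageA" and "imageB"
```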
In case you'd like to adapt the script to your needs, first fork the [community-events repo](https://github.com/huggingface/community-events) (see the button at the top right), then clone it locally:
```bash
git clone https://github.com/<your GitHub username>/community-events.git
```
and edit it to your liking.
## Training on your own data
You can of course also train on your own images. For this, you can leverage 🤗 Datasets' [ImageFolder](https://huggingface.co/docs/datasets/v2.0.0/en/image_process#imagefolder). Make sure to authenticate with the Hub first, either by running the `huggingface-cli login` command in a terminal, or by running the following in case you're working in a notebook:
```python
from huggingface_hub import notebook_login
notebook_login()
```
Next, run the following in a notebook/script:
```python
from datasets import load_dataset

# first: load dataset
# option 1: from local folder
dataset = load_dataset("imagefolder", data_dir="path_to_folder")
# option 2: from remote URL (e.g. a zip file)
dataset = load_dataset("imagefolder", data_files="URL to .zip file")

# optional: remove "label" column, in case there are no subcategories
dataset["train"] = dataset["train"].remove_columns(column_names="label")

# next: push to the Hub (assuming git-LFS is installed)
dataset.push_to_hub("huggan/my-awesome-dataset")
```
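To double-check that the push succeeded, you can reload the dataset directly from the Hub (using the same example repository name as above):
```python
from datasets import load_dataset

# reload the dataset that was just pushed and inspect its splits and columns
dataset = load_dataset("huggan/my-awesome-dataset")
print(dataset)
```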
You can then simply pass the name of the dataset to the script:
```bash
accelerate launch train.py --dataset huggan/my-awesome-dataset
```
## Pushing model to the Hub
You can push your trained generator to the Hub during training by specifying the `--push_to_hub` flag, along with a `--model_name`.
```bash
accelerate launch train.py --push_to_hub --model_name pix2pix-facades
```
This is made possible by having the generator inherit from the `PyTorchModelHubMixin` class available in the `huggingface_hub` library.
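For reference, here is a minimal sketch of what such a mixin-based model can look like; the class name and layer below are illustrative placeholders, not the actual architecture used in the script:
```python
import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin

class Generator(nn.Module, PyTorchModelHubMixin):
    # placeholder architecture; the real pix2pix generator is a U-Net
    def __init__(self, in_channels: int = 3, out_channels: int = 3):
        super().__init__()
        self.model = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)

    def forward(self, x):
        return self.model(x)

# thanks to the mixin, the trained generator can be pushed to and reloaded from the Hub:
# generator.push_to_hub("pix2pix-facades")
# generator = Generator.from_pretrained("<username>/pix2pix-facades")
```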
## Citation
This repo is entirely based on Erik Linder-Norén's [PyTorch-GAN repo](https://github.com/eriklindernoren/PyTorch-GAN), but with added Hugging Face goodies.