Upload folder using huggingface_hub
This view is limited to 50 files because it contains too many changes.
- .gitattributes +1 -0
- .github/ISSUE_TEMPLATE/bug_report.md +35 -0
- .github/workflows/update_space.yml +28 -0
- .gitignore +2 -0
- Dockerfile +22 -0
- LICENSE.txt +97 -0
- README.md +382 -7
- app.py +64 -0
- calc_metrics.py +190 -0
- dataset_tool.py +444 -0
- dnnlib/__init__.py +9 -0
- dnnlib/util.py +477 -0
- docker_run.sh +38 -0
- docs/dataset-tool-help.txt +50 -0
- docs/license.html +153 -0
- docs/stylegan2-ada-teaser-1024x252.png +0 -0
- docs/stylegan2-ada-training-curves.png +0 -0
- docs/train-help.txt +70 -0
- ffhq.pkl +3 -0
- ffhq.pkl.1 +3 -0
- fine_tuned_stylegan.pth +3 -0
- generate.py +129 -0
- legacy.py +320 -0
- metrics/__init__.py +9 -0
- metrics/frechet_inception_distance.py +41 -0
- metrics/inception_score.py +38 -0
- metrics/kernel_inception_distance.py +46 -0
- metrics/metric_main.py +152 -0
- metrics/metric_utils.py +275 -0
- metrics/perceptual_path_length.py +131 -0
- metrics/precision_recall.py +62 -0
- projector.py +212 -0
- requirements.txt +6 -0
- style_mixing.py +118 -0
- torch_utils/__init__.py +9 -0
- torch_utils/custom_ops.py +126 -0
- torch_utils/misc.py +262 -0
- torch_utils/ops/__init__.py +9 -0
- torch_utils/ops/bias_act.cpp +99 -0
- torch_utils/ops/bias_act.cu +173 -0
- torch_utils/ops/bias_act.h +38 -0
- torch_utils/ops/bias_act.py +212 -0
- torch_utils/ops/conv2d_gradfix.py +170 -0
- torch_utils/ops/conv2d_resample.py +156 -0
- torch_utils/ops/fma.py +60 -0
- torch_utils/ops/grid_sample_gradfix.py +83 -0
- torch_utils/ops/upfirdn2d.cpp +103 -0
- torch_utils/ops/upfirdn2d.cu +350 -0
- torch_utils/ops/upfirdn2d.h +59 -0
- torch_utils/ops/upfirdn2d.py +384 -0
.gitattributes
CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
|
33 |
*.zip filter=lfs diff=lfs merge=lfs -text
|
34 |
*.zst filter=lfs diff=lfs merge=lfs -text
|
35 |
*tfevents* filter=lfs diff=lfs merge=lfs -text
|
36 |
+
ffhq.pkl.1 filter=lfs diff=lfs merge=lfs -text
|
.github/ISSUE_TEMPLATE/bug_report.md
ADDED
@@ -0,0 +1,35 @@
1 |
+
---
|
2 |
+
name: Bug report
|
3 |
+
about: Create a report to help us improve
|
4 |
+
title: ''
|
5 |
+
labels: ''
|
6 |
+
assignees: ''
|
7 |
+
|
8 |
+
---
|
9 |
+
|
10 |
+
**Describe the bug**
|
11 |
+
A clear and concise description of what the bug is.
|
12 |
+
|
13 |
+
**To Reproduce**
|
14 |
+
Steps to reproduce the behavior:
|
15 |
+
1. In '...' directory, run command '...'
|
16 |
+
2. See error (copy&paste full log, including exceptions and **stacktraces**).
|
17 |
+
|
18 |
+
Please copy&paste text instead of screenshots for better searchability.
|
19 |
+
|
20 |
+
**Expected behavior**
|
21 |
+
A clear and concise description of what you expected to happen.
|
22 |
+
|
23 |
+
**Screenshots**
|
24 |
+
If applicable, add screenshots to help explain your problem.
|
25 |
+
|
26 |
+
**Desktop (please complete the following information):**
|
27 |
+
- OS: [e.g. Linux Ubuntu 20.04, Windows 10]
|
28 |
+
- PyTorch version (e.g., pytorch 1.7.1)
|
29 |
+
- CUDA toolkit version (e.g., CUDA 11.0)
|
30 |
+
- NVIDIA driver version
|
31 |
+
- GPU [e.g., Titan V, RTX 3090]
|
32 |
+
- Docker: did you use Docker? If yes, specify docker image URL (e.g., nvcr.io/nvidia/pytorch:20.12-py3)
|
33 |
+
|
34 |
+
**Additional context**
|
35 |
+
Add any other context about the problem here.
|
.github/workflows/update_space.yml
ADDED
@@ -0,0 +1,28 @@
+name: Run Python script
+
+on:
+  push:
+    branches:
+      - main
+
+jobs:
+  build:
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v2
+
+      - name: Set up Python
+        uses: actions/setup-python@v2
+        with:
+          python-version: '3.9'
+
+      - name: Install Gradio
+        run: python -m pip install gradio
+
+      - name: Log in to Hugging Face
+        run: python -c 'import huggingface_hub; huggingface_hub.login(token="${{ secrets.hf_token }}")'
+
+      - name: Deploy to Spaces
+        run: gradio deploy
|
.gitignore
ADDED
@@ -0,0 +1,2 @@
1 |
+
__pycache__/
|
2 |
+
.cache/
|
Dockerfile
ADDED
@@ -0,0 +1,22 @@
1 |
+
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
|
2 |
+
#
|
3 |
+
# NVIDIA CORPORATION and its licensors retain all intellectual property
|
4 |
+
# and proprietary rights in and to this software, related documentation
|
5 |
+
# and any modifications thereto. Any use, reproduction, disclosure or
|
6 |
+
# distribution of this software and related documentation without an express
|
7 |
+
# license agreement from NVIDIA CORPORATION is strictly prohibited.
|
8 |
+
|
9 |
+
FROM nvcr.io/nvidia/pytorch:20.12-py3
|
10 |
+
|
11 |
+
ENV PYTHONDONTWRITEBYTECODE 1
|
12 |
+
ENV PYTHONUNBUFFERED 1
|
13 |
+
|
14 |
+
RUN pip install imageio-ffmpeg==0.4.3 pyspng==0.1.0
|
15 |
+
|
16 |
+
WORKDIR /workspace
|
17 |
+
|
18 |
+
# Unset TORCH_CUDA_ARCH_LIST and exec. This makes pytorch run-time
|
19 |
+
# extension builds significantly faster as we only compile for the
|
20 |
+
# currently active GPU configuration.
|
21 |
+
RUN (printf '#!/bin/bash\nunset TORCH_CUDA_ARCH_LIST\nexec \"$@\"\n' >> /entry.sh) && chmod a+x /entry.sh
|
22 |
+
ENTRYPOINT ["/entry.sh"]
|
LICENSE.txt
ADDED
@@ -0,0 +1,97 @@
1 |
+
Copyright (c) 2021, NVIDIA Corporation. All rights reserved.
|
2 |
+
|
3 |
+
|
4 |
+
NVIDIA Source Code License for StyleGAN2 with Adaptive Discriminator Augmentation (ADA)
|
5 |
+
|
6 |
+
|
7 |
+
=======================================================================
|
8 |
+
|
9 |
+
1. Definitions
|
10 |
+
|
11 |
+
"Licensor" means any person or entity that distributes its Work.
|
12 |
+
|
13 |
+
"Software" means the original work of authorship made available under
|
14 |
+
this License.
|
15 |
+
|
16 |
+
"Work" means the Software and any additions to or derivative works of
|
17 |
+
the Software that are made available under this License.
|
18 |
+
|
19 |
+
The terms "reproduce," "reproduction," "derivative works," and
|
20 |
+
"distribution" have the meaning as provided under U.S. copyright law;
|
21 |
+
provided, however, that for the purposes of this License, derivative
|
22 |
+
works shall not include works that remain separable from, or merely
|
23 |
+
link (or bind by name) to the interfaces of, the Work.
|
24 |
+
|
25 |
+
Works, including the Software, are "made available" under this License
|
26 |
+
by including in or with the Work either (a) a copyright notice
|
27 |
+
referencing the applicability of this License to the Work, or (b) a
|
28 |
+
copy of this License.
|
29 |
+
|
30 |
+
2. License Grants
|
31 |
+
|
32 |
+
2.1 Copyright Grant. Subject to the terms and conditions of this
|
33 |
+
License, each Licensor grants to you a perpetual, worldwide,
|
34 |
+
non-exclusive, royalty-free, copyright license to reproduce,
|
35 |
+
prepare derivative works of, publicly display, publicly perform,
|
36 |
+
sublicense and distribute its Work and any resulting derivative
|
37 |
+
works in any form.
|
38 |
+
|
39 |
+
3. Limitations
|
40 |
+
|
41 |
+
3.1 Redistribution. You may reproduce or distribute the Work only
|
42 |
+
if (a) you do so under this License, (b) you include a complete
|
43 |
+
copy of this License with your distribution, and (c) you retain
|
44 |
+
without modification any copyright, patent, trademark, or
|
45 |
+
attribution notices that are present in the Work.
|
46 |
+
|
47 |
+
3.2 Derivative Works. You may specify that additional or different
|
48 |
+
terms apply to the use, reproduction, and distribution of your
|
49 |
+
derivative works of the Work ("Your Terms") only if (a) Your Terms
|
50 |
+
provide that the use limitation in Section 3.3 applies to your
|
51 |
+
derivative works, and (b) you identify the specific derivative
|
52 |
+
works that are subject to Your Terms. Notwithstanding Your Terms,
|
53 |
+
this License (including the redistribution requirements in Section
|
54 |
+
3.1) will continue to apply to the Work itself.
|
55 |
+
|
56 |
+
3.3 Use Limitation. The Work and any derivative works thereof only
|
57 |
+
may be used or intended for use non-commercially. Notwithstanding
|
58 |
+
the foregoing, NVIDIA and its affiliates may use the Work and any
|
59 |
+
derivative works commercially. As used herein, "non-commercially"
|
60 |
+
means for research or evaluation purposes only.
|
61 |
+
|
62 |
+
3.4 Patent Claims. If you bring or threaten to bring a patent claim
|
63 |
+
against any Licensor (including any claim, cross-claim or
|
64 |
+
counterclaim in a lawsuit) to enforce any patents that you allege
|
65 |
+
are infringed by any Work, then your rights under this License from
|
66 |
+
such Licensor (including the grant in Section 2.1) will terminate
|
67 |
+
immediately.
|
68 |
+
|
69 |
+
3.5 Trademarks. This License does not grant any rights to use any
|
70 |
+
Licensor’s or its affiliates’ names, logos, or trademarks, except
|
71 |
+
as necessary to reproduce the notices described in this License.
|
72 |
+
|
73 |
+
3.6 Termination. If you violate any term of this License, then your
|
74 |
+
rights under this License (including the grant in Section 2.1) will
|
75 |
+
terminate immediately.
|
76 |
+
|
77 |
+
4. Disclaimer of Warranty.
|
78 |
+
|
79 |
+
THE WORK IS PROVIDED "AS IS" WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
80 |
+
KIND, EITHER EXPRESS OR IMPLIED, INCLUDING WARRANTIES OR CONDITIONS OF
|
81 |
+
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE OR
|
82 |
+
NON-INFRINGEMENT. YOU BEAR THE RISK OF UNDERTAKING ANY ACTIVITIES UNDER
|
83 |
+
THIS LICENSE.
|
84 |
+
|
85 |
+
5. Limitation of Liability.
|
86 |
+
|
87 |
+
EXCEPT AS PROHIBITED BY APPLICABLE LAW, IN NO EVENT AND UNDER NO LEGAL
|
88 |
+
THEORY, WHETHER IN TORT (INCLUDING NEGLIGENCE), CONTRACT, OR OTHERWISE
|
89 |
+
SHALL ANY LICENSOR BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY DIRECT,
|
90 |
+
INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF
|
91 |
+
OR RELATED TO THIS LICENSE, THE USE OR INABILITY TO USE THE WORK
|
92 |
+
(INCLUDING BUT NOT LIMITED TO LOSS OF GOODWILL, BUSINESS INTERRUPTION,
|
93 |
+
LOST PROFITS OR DATA, COMPUTER FAILURE OR MALFUNCTION, OR ANY OTHER
|
94 |
+
COMMERCIAL DAMAGES OR LOSSES), EVEN IF THE LICENSOR HAS BEEN ADVISED OF
|
95 |
+
THE POSSIBILITY OF SUCH DAMAGES.
|
96 |
+
|
97 |
+
=======================================================================
|
README.md
CHANGED
@@ -1,12 +1,387 @@
|
|
1 |
---
|
2 |
-
title:
|
3 |
-
|
4 |
-
colorFrom: indigo
|
5 |
-
colorTo: pink
|
6 |
sdk: gradio
|
7 |
sdk_version: 4.40.0
|
8 |
-
app_file: app.py
|
9 |
-
pinned: false
|
10 |
---
|
11 |
|
12 |
-
|
|
|
1 |
---
|
2 |
+
title: My_StyleGAN2-ADA_Image_Generator
|
3 |
+
app_file: app.py
|
|
|
|
|
4 |
sdk: gradio
|
5 |
sdk_version: 4.40.0
|
|
|
|
|
6 |
---
|
7 |
+
## StyleGAN2-ADA — Official PyTorch implementation
|
8 |
+
|
9 |
+
![Teaser image](./docs/stylegan2-ada-teaser-1024x252.png)
|
10 |
+
|
11 |
+
**Training Generative Adversarial Networks with Limited Data**<br>
|
12 |
+
Tero Karras, Miika Aittala, Janne Hellsten, Samuli Laine, Jaakko Lehtinen, Timo Aila<br>
|
13 |
+
https://arxiv.org/abs/2006.06676<br>
|
14 |
+
|
15 |
+
Abstract: *Training generative adversarial networks (GAN) using too little data typically leads to discriminator overfitting, causing training to diverge. We propose an adaptive discriminator augmentation mechanism that significantly stabilizes training in limited data regimes. The approach does not require changes to loss functions or network architectures, and is applicable both when training from scratch and when fine-tuning an existing GAN on another dataset. We demonstrate, on several datasets, that good results are now possible using only a few thousand training images, often matching StyleGAN2 results with an order of magnitude fewer images. We expect this to open up new application domains for GANs. We also find that the widely used CIFAR-10 is, in fact, a limited data benchmark, and improve the record FID from 5.59 to 2.42.*
|
16 |
+
|
17 |
+
For business inquiries, please visit our website and submit the form: [NVIDIA Research Licensing](https://www.nvidia.com/en-us/research/inquiries/)
|
18 |
+
|
19 |
+
## Release notes
|
20 |
+
|
21 |
+
This repository is a faithful reimplementation of [StyleGAN2-ADA](https://github.com/NVlabs/stylegan2-ada/) in PyTorch, focusing on correctness, performance, and compatibility.
|
22 |
+
|
23 |
+
**Correctness**
|
24 |
+
* Full support for all primary training configurations.
|
25 |
+
* Extensive verification of image quality, training curves, and quality metrics against the TensorFlow version.
|
26 |
+
* Results are expected to match in all cases, excluding the effects of pseudo-random numbers and floating-point arithmetic.
|
27 |
+
|
28 |
+
**Performance**
|
29 |
+
* Training is typically 5%–30% faster compared to the TensorFlow version on NVIDIA Tesla V100 GPUs.
|
30 |
+
* Inference is up to 35% faster in high resolutions, but it may be slightly slower in low resolutions.
|
31 |
+
* GPU memory usage is comparable to the TensorFlow version.
|
32 |
+
* Faster startup time when training new networks (<50s), and also when using pre-trained networks (<4s).
|
33 |
+
* New command line options for tweaking the training performance.
|
34 |
+
|
35 |
+
**Compatibility**
|
36 |
+
* Compatible with old network pickles created using the TensorFlow version.
|
37 |
+
* New ZIP/PNG based dataset format for maximal interoperability with existing 3rd party tools.
|
38 |
+
* TFRecords datasets are no longer supported — they need to be converted to the new format.
|
39 |
+
* New JSON-based format for logs, metrics, and training curves.
|
40 |
+
* Training curves are also exported in the old TFEvents format if TensorBoard is installed.
|
41 |
+
* Command line syntax is mostly unchanged, with a few exceptions (e.g., `dataset_tool.py`).
|
42 |
+
* Comparison methods are not supported (`--cmethod`, `--dcap`, `--cfg=cifarbaseline`, `--aug=adarv`)
|
43 |
+
* **Truncation is now disabled by default.**
|
44 |
+
|
45 |
+
## Data repository
|
46 |
+
|
47 |
+
| Path | Description
|
48 |
+
| :--- | :----------
|
49 |
+
| [stylegan2-ada-pytorch](https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/) | Main directory hosted on Amazon S3
|
50 |
+
|   ├ [ada-paper.pdf](https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/ada-paper.pdf) | Paper PDF
|
51 |
+
|   ├ [images](https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/images/) | Curated example images produced using the pre-trained models
|
52 |
+
|   ├ [videos](https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/videos/) | Curated example interpolation videos
|
53 |
+
|   └ [pretrained](https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/) | Pre-trained models
|
54 |
+
|     ├ ffhq.pkl | FFHQ at 1024x1024, trained using original StyleGAN2
|
55 |
+
|     ├ metfaces.pkl | MetFaces at 1024x1024, transfer learning from FFHQ using ADA
|
56 |
+
|     ├ afhqcat.pkl | AFHQ Cat at 512x512, trained from scratch using ADA
|
57 |
+
|     ├ afhqdog.pkl | AFHQ Dog at 512x512, trained from scratch using ADA
|
58 |
+
|     ├ afhqwild.pkl | AFHQ Wild at 512x512, trained from scratch using ADA
|
59 |
+
|     ├ cifar10.pkl | Class-conditional CIFAR-10 at 32x32
|
60 |
+
|     ├ brecahad.pkl | BreCaHAD at 512x512, trained from scratch using ADA
|
61 |
+
|     ├ [paper-fig7c-training-set-sweeps](https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/paper-fig7c-training-set-sweeps/) | Models used in Fig.7c (sweep over training set size)
|
62 |
+
|     ├ [paper-fig11a-small-datasets](https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/paper-fig11a-small-datasets/) | Models used in Fig.11a (small datasets & transfer learning)
|
63 |
+
|     ├ [paper-fig11b-cifar10](https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/paper-fig11b-cifar10/) | Models used in Fig.11b (CIFAR-10)
|
64 |
+
|     ├ [transfer-learning-source-nets](https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/transfer-learning-source-nets/) | Models used as starting point for transfer learning
|
65 |
+
|     └ [metrics](https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/metrics/) | Feature detectors used by the quality metrics
|
66 |
+
|
67 |
+
## Requirements
|
68 |
+
|
69 |
+
* Linux and Windows are supported, but we recommend Linux for performance and compatibility reasons.
|
70 |
+
* 1–8 high-end NVIDIA GPUs with at least 12 GB of memory. We have done all testing and development using NVIDIA DGX-1 with 8 Tesla V100 GPUs.
|
71 |
+
* 64-bit Python 3.7 and PyTorch 1.7.1. See [https://pytorch.org/](https://pytorch.org/) for PyTorch install instructions.
|
72 |
+
* CUDA toolkit 11.0 or later. Use at least version 11.1 if running on RTX 3090. (Why is a separate CUDA toolkit installation required? See comments in [#2](https://github.com/NVlabs/stylegan2-ada-pytorch/issues/2#issuecomment-779457121).)
|
73 |
+
* Python libraries: `pip install click requests tqdm pyspng ninja imageio-ffmpeg==0.4.3`. We use the Anaconda3 2020.11 distribution which installs most of these by default.
|
74 |
+
* Docker users: use the [provided Dockerfile](./Dockerfile) to build an image with the required library dependencies.
|
75 |
+
|
76 |
+
The code relies heavily on custom PyTorch extensions that are compiled on the fly using NVCC. On Windows, the compilation requires Microsoft Visual Studio. We recommend installing [Visual Studio Community Edition](https://visualstudio.microsoft.com/vs/) and adding it into `PATH` using `"C:\Program Files (x86)\Microsoft Visual Studio\<VERSION>\Community\VC\Auxiliary\Build\vcvars64.bat"`.
|
77 |
+
|
78 |
+
## Getting started
|
79 |
+
|
80 |
+
Pre-trained networks are stored as `*.pkl` files that can be referenced using local filenames or URLs:
|
81 |
+
|
82 |
+
```.bash
|
83 |
+
# Generate curated MetFaces images without truncation (Fig.10 left)
|
84 |
+
python generate.py --outdir=out --trunc=1 --seeds=85,265,297,849 \
|
85 |
+
--network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/metfaces.pkl
|
86 |
+
|
87 |
+
# Generate uncurated MetFaces images with truncation (Fig.12 upper left)
|
88 |
+
python generate.py --outdir=out --trunc=0.7 --seeds=600-605 \
|
89 |
+
--network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/metfaces.pkl
|
90 |
+
|
91 |
+
# Generate class conditional CIFAR-10 images (Fig.17 left, Car)
|
92 |
+
python generate.py --outdir=out --seeds=0-35 --class=1 \
|
93 |
+
--network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/cifar10.pkl
|
94 |
+
|
95 |
+
# Style mixing example
|
96 |
+
python style_mixing.py --outdir=out --rows=85,100,75,458,1500 --cols=55,821,1789,293 \
|
97 |
+
--network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/metfaces.pkl
|
98 |
+
```
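The `--seeds` values above come in two shapes: a comma-separated list (`85,265,297,849`) and a dash range (`600-605`). The following is a minimal sketch of how such a string could be expanded into a list of integers. `parse_seeds` is a hypothetical helper written for illustration, and treating the range as inclusive at both ends is an assumption.

```python
import re
from typing import List


def parse_seeds(spec: str) -> List[int]:
    """Expand 'a,b,c' or 'a-b' into a list of ints (range assumed inclusive)."""
    match = re.fullmatch(r"(\d+)-(\d+)", spec.strip())
    if match:
        first, last = int(match.group(1)), int(match.group(2))
        return list(range(first, last + 1))
    # Otherwise treat the string as a comma-separated list.
    return [int(part) for part in spec.split(",")]


print(parse_seeds("85,265,297,849"))  # comma-separated form
print(parse_seeds("600-605"))         # dash-range form
```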
|
99 |
+
|
100 |
+
Outputs from the above commands are placed under `out/*.png`, controlled by `--outdir`. Downloaded network pickles are cached under `$HOME/.cache/dnnlib`, which can be overridden by setting the `DNNLIB_CACHE_DIR` environment variable. The default PyTorch extension build directory is `$HOME/.cache/torch_extensions`, which can be overridden by setting `TORCH_EXTENSIONS_DIR`.
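The override behavior described above (an environment variable wins, otherwise a default under `$HOME/.cache`) can be sketched in a few lines. `resolve_cache_dir` is a hypothetical helper for illustration only; the actual lookup inside `dnnlib` and PyTorch may differ in detail.

```python
import os


def resolve_cache_dir(env_var, default_name, environ=None):
    """Return the directory named by env_var if set, else <home>/.cache/<default_name>."""
    env = os.environ if environ is None else environ
    override = env.get(env_var)
    if override:
        return override
    # Fall back to the home directory when no override is given.
    home = env.get("HOME") or os.path.expanduser("~")
    return os.path.join(home, ".cache", default_name)


print(resolve_cache_dir("DNNLIB_CACHE_DIR", "dnnlib"))
print(resolve_cache_dir("TORCH_EXTENSIONS_DIR", "torch_extensions"))
```

Passing an explicit `environ` mapping makes the behavior easy to check without touching the real process environment.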
|
101 |
+
|
102 |
+
**Docker**: You can run the above curated image example using Docker as follows:
|
103 |
+
|
104 |
+
```.bash
|
105 |
+
docker build --tag sg2ada:latest .
|
106 |
+
./docker_run.sh python3 generate.py --outdir=out --trunc=1 --seeds=85,265,297,849 \
|
107 |
+
--network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/metfaces.pkl
|
108 |
+
```
|
109 |
+
|
110 |
+
Note: The Docker image requires NVIDIA driver release `r455.23` or later.
|
111 |
+
|
112 |
+
**Legacy networks**: The above commands can load most of the network pickles created using the previous TensorFlow versions of StyleGAN2 and StyleGAN2-ADA. However, for future compatibility, we recommend converting such legacy pickles into the new format used by the PyTorch version:
|
113 |
+
|
114 |
+
```.bash
|
115 |
+
python legacy.py \
|
116 |
+
--source=https://nvlabs-fi-cdn.nvidia.com/stylegan2/networks/stylegan2-cat-config-f.pkl \
|
117 |
+
--dest=stylegan2-cat-config-f.pkl
|
118 |
+
```
|
119 |
+
|
120 |
+
## Projecting images to latent space
|
121 |
+
|
122 |
+
To find the matching latent vector for a given image file, run:
|
123 |
+
|
124 |
+
```.bash
|
125 |
+
python projector.py --outdir=out --target=~/mytargetimg.png \
|
126 |
+
--network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/ffhq.pkl
|
127 |
+
```
|
128 |
+
|
129 |
+
For optimal results, the target image should be cropped and aligned similar to the [FFHQ dataset](https://github.com/NVlabs/ffhq-dataset). The above command saves the projection target `out/target.png`, result `out/proj.png`, latent vector `out/projected_w.npz`, and progression video `out/proj.mp4`. You can render the resulting latent vector by specifying `--projected_w` for `generate.py`:
|
130 |
+
|
131 |
+
```.bash
|
132 |
+
python generate.py --outdir=out --projected_w=out/projected_w.npz \
|
133 |
+
--network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/ffhq.pkl
|
134 |
+
```
|
135 |
+
|
136 |
+
## Using networks from Python
|
137 |
+
|
138 |
+
You can use pre-trained networks in your own Python code as follows:
|
139 |
+
|
140 |
+
```.python
|
141 |
+
with open('ffhq.pkl', 'rb') as f:
|
142 |
+
    G = pickle.load(f)['G_ema'].cuda() # torch.nn.Module
|
143 |
+
z = torch.randn([1, G.z_dim]).cuda() # latent codes
|
144 |
+
c = None # class labels (not used in this example)
|
145 |
+
img = G(z, c) # NCHW, float32, dynamic range [-1, +1]
|
146 |
+
```
|
147 |
+
|
148 |
+
The above code requires `torch_utils` and `dnnlib` to be accessible via `PYTHONPATH`. It does not need source code for the networks themselves — their class definitions are loaded from the pickle via `torch_utils.persistence`.
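The generator output above is a float tensor with dynamic range [-1, +1], so it has to be rescaled before it can be stored as an 8-bit image. The sketch below shows one common affine mapping on plain Python floats. The helper name `to_uint8` and the exact constants are assumptions for illustration, not necessarily what the repository scripts use.

```python
def to_uint8(value: float) -> int:
    """Map a float in [-1, +1] to an integer in [0, 255], clamping out-of-range input."""
    scaled = value * 127.5 + 128.0
    # Clamp first, then truncate toward zero.
    return int(min(255.0, max(0.0, scaled)))


print([to_uint8(v) for v in (-1.0, 0.0, 1.0)])
```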
|
149 |
+
|
150 |
+
The pickle contains three networks. `'G'` and `'D'` are instantaneous snapshots taken during training, and `'G_ema'` represents a moving average of the generator weights over several training steps. The networks are regular instances of `torch.nn.Module`, with all of their parameters and buffers placed on the CPU at import and gradient computation disabled by default.
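To make the "moving average of the generator weights" concrete, here is a small sketch of an exponential moving average over plain Python lists. The function name `ema_update` and the decay value `beta` are assumptions for illustration; the training code may compute its average differently.

```python
def ema_update(ema_weights, weights, beta=0.999):
    """Blend current weights into the running average: ema = beta*ema + (1-beta)*w."""
    return [beta * e + (1.0 - beta) * w for e, w in zip(ema_weights, weights)]


# Toy example: the average drifts toward the (constant) current weights.
ema = [0.0, 0.0]
for step_weights in ([1.0, 2.0], [1.0, 2.0], [1.0, 2.0]):
    ema = ema_update(ema, step_weights, beta=0.5)
print(ema)
```

A larger `beta` gives a smoother average that reacts more slowly to each training step.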
|
151 |
+
|
152 |
+
The generator consists of two submodules, `G.mapping` and `G.synthesis`, that can be executed separately. They also support various additional options:
|
153 |
+
|
154 |
+
```.python
|
155 |
+
w = G.mapping(z, c, truncation_psi=0.5, truncation_cutoff=8)
|
156 |
+
img = G.synthesis(w, noise_mode='const', force_fp32=True)
|
157 |
+
```
|
158 |
+
|
159 |
+
Please refer to [`generate.py`](./generate.py), [`style_mixing.py`](./style_mixing.py), and [`projector.py`](./projector.py) for further examples.
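The `truncation_psi` option shown above is commonly described as interpolating each latent toward an average latent. The sketch below expresses that idea on plain Python lists. `truncate` and the `w_avg` values are hypothetical, and `truncation_cutoff` is not modeled here.

```python
def truncate(w, w_avg, psi):
    """Interpolate w toward w_avg: psi=1 leaves w unchanged, psi=0 returns w_avg."""
    return [avg + psi * (value - avg) for value, avg in zip(w, w_avg)]


w_avg = [0.0, 0.0]   # invented average latent
w = [2.0, -4.0]      # invented sample latent
print(truncate(w, w_avg, psi=0.5))
print(truncate(w, w_avg, psi=1.0))
```

Under this reading, `psi=1` corresponds to generating without truncation, as in the `--trunc=1` examples earlier.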
|
160 |
+
|
161 |
+
## Preparing datasets
|
162 |
+
|
163 |
+
Datasets are stored as uncompressed ZIP archives containing uncompressed PNG files and a metadata file `dataset.json` for labels.
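As a rough illustration of that layout, the sketch below writes already-encoded PNG bytes into an uncompressed ZIP archive together with a `dataset.json` file. `write_dataset_zip` is a hypothetical helper, and the `{"labels": [[path, label], ...]}` structure is an assumption; `dataset_tool.py` remains the supported way to build datasets.

```python
import io
import json
import zipfile


def write_dataset_zip(dest, images, labels=None):
    """Store PNG bytes in an uncompressed ZIP, plus a dataset.json metadata file.

    images: dict mapping archive path -> encoded PNG bytes.
    labels: optional dict mapping archive path -> integer class label.
    """
    with zipfile.ZipFile(dest, mode="w", compression=zipfile.ZIP_STORED) as archive:
        for name in sorted(images):
            archive.writestr(name, images[name])
        # Assumed metadata layout; "labels" is null for unlabeled datasets.
        pairs = None if labels is None else [[name, labels[name]] for name in sorted(images)]
        archive.writestr("dataset.json", json.dumps({"labels": pairs}))


# Demo with placeholder bytes standing in for real PNG data.
buffer = io.BytesIO()
write_dataset_zip(buffer, {"00000/img00000000.png": b"<png bytes>"}, {"00000/img00000000.png": 6})
print(zipfile.ZipFile(io.BytesIO(buffer.getvalue())).namelist())
```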
|
164 |
+
|
165 |
+
Custom datasets can be created from a folder containing images; see [`python dataset_tool.py --help`](./docs/dataset-tool-help.txt) for more information. Alternatively, the folder can also be used directly as a dataset, without running it through `dataset_tool.py` first, but doing so may lead to suboptimal performance.
|
166 |
+
|
167 |
+
Legacy TFRecords datasets are not supported — see below for instructions on how to convert them.
|
168 |
+
|
169 |
+
**FFHQ**:
|
170 |
+
|
171 |
+
Step 1: Download the [Flickr-Faces-HQ dataset](https://github.com/NVlabs/ffhq-dataset) as TFRecords.
|
172 |
+
|
173 |
+
Step 2: Extract images from TFRecords using `dataset_tool.py` from the [TensorFlow version of StyleGAN2-ADA](https://github.com/NVlabs/stylegan2-ada/):
|
174 |
+
|
175 |
+
```.bash
|
176 |
+
# Using dataset_tool.py from TensorFlow version at
|
177 |
+
# https://github.com/NVlabs/stylegan2-ada/
|
178 |
+
python ../stylegan2-ada/dataset_tool.py unpack \
|
179 |
+
--tfrecord_dir=~/ffhq-dataset/tfrecords/ffhq --output_dir=/tmp/ffhq-unpacked
|
180 |
+
```
|
181 |
+
|
182 |
+
Step 3: Create ZIP archive using `dataset_tool.py` from this repository:
|
183 |
+
|
184 |
+
```.bash
|
185 |
+
# Original 1024x1024 resolution.
|
186 |
+
python dataset_tool.py --source=/tmp/ffhq-unpacked --dest=~/datasets/ffhq.zip
|
187 |
+
|
188 |
+
# Scaled down 256x256 resolution.
|
189 |
+
#
|
190 |
+
# Note: --resize-filter=box is required to reproduce FID scores shown in the
|
191 |
+
# paper. If you don't need to match exactly, it's better to leave this out
|
192 |
+
# and default to Lanczos. See https://github.com/NVlabs/stylegan2-ada-pytorch/issues/283#issuecomment-1731217782
|
193 |
+
python dataset_tool.py --source=/tmp/ffhq-unpacked --dest=~/datasets/ffhq256x256.zip \
|
194 |
+
--width=256 --height=256 --resize-filter=box
|
195 |
+
```
|
196 |
+
|
197 |
+
**MetFaces**: Download the [MetFaces dataset](https://github.com/NVlabs/metfaces-dataset) and create ZIP archive:
|
198 |
+
|
199 |
+
```.bash
|
200 |
+
python dataset_tool.py --source=~/downloads/metfaces/images --dest=~/datasets/metfaces.zip
|
201 |
+
```
|
202 |
+
|
203 |
+
**AFHQ**: Download the [AFHQ dataset](https://github.com/clovaai/stargan-v2/blob/master/README.md#animal-faces-hq-dataset-afhq) and create ZIP archive:
|
204 |
+
|
205 |
+
```.bash
|
206 |
+
python dataset_tool.py --source=~/downloads/afhq/train/cat --dest=~/datasets/afhqcat.zip
|
207 |
+
python dataset_tool.py --source=~/downloads/afhq/train/dog --dest=~/datasets/afhqdog.zip
|
208 |
+
python dataset_tool.py --source=~/downloads/afhq/train/wild --dest=~/datasets/afhqwild.zip
|
209 |
+
```
|
210 |
+
|
211 |
+
**CIFAR-10**: Download the [CIFAR-10 python version](https://www.cs.toronto.edu/~kriz/cifar.html) and convert to ZIP archive:
|
212 |
+
|
213 |
+
```.bash
|
214 |
+
python dataset_tool.py --source=~/downloads/cifar-10-python.tar.gz --dest=~/datasets/cifar10.zip
|
215 |
+
```
|
216 |
+
|
217 |
+
**LSUN**: Download the desired categories from the [LSUN project page](https://www.yf.io/p/lsun/) and convert to ZIP archive:
|
218 |
+
|
219 |
+
```.bash
|
220 |
+
python dataset_tool.py --source=~/downloads/lsun/raw/cat_lmdb --dest=~/datasets/lsuncat200k.zip \
|
221 |
+
--transform=center-crop --width=256 --height=256 --max_images=200000
|
222 |
+
|
223 |
+
python dataset_tool.py --source=~/downloads/lsun/raw/car_lmdb --dest=~/datasets/lsuncar200k.zip \
|
224 |
+
--transform=center-crop-wide --width=512 --height=384 --max_images=200000
|
225 |
+
```
|
226 |
+
|
227 |
+
**BreCaHAD**:
|
228 |
+
|
229 |
+
Step 1: Download the [BreCaHAD dataset](https://figshare.com/articles/BreCaHAD_A_Dataset_for_Breast_Cancer_Histopathological_Annotation_and_Diagnosis/7379186).
|
230 |
+
|
231 |
+
Step 2: Extract 512x512 resolution crops using `dataset_tool.py` from the [TensorFlow version of StyleGAN2-ADA](https://github.com/NVlabs/stylegan2-ada/):
|
232 |
+
|
233 |
+
```.bash
|
234 |
+
# Using dataset_tool.py from TensorFlow version at
|
235 |
+
# https://github.com/NVlabs/stylegan2-ada/
|
236 |
+
python ../stylegan2-ada/dataset_tool.py extract_brecahad_crops --cropsize=512 \
|
237 |
+
--output_dir=/tmp/brecahad-crops --brecahad_dir=~/downloads/brecahad/images
|
238 |
+
```
|
239 |
+
|
240 |
+
Step 3: Create ZIP archive using `dataset_tool.py` from this repository:
|
241 |
+
|
242 |
+
```.bash
|
243 |
+
python dataset_tool.py --source=/tmp/brecahad-crops --dest=~/datasets/brecahad.zip
|
244 |
+
```
|
245 |
+
|
246 |
+
## Training new networks
|
247 |
+
|
248 |
+
In its most basic form, training new networks boils down to:
|
249 |
+
|
250 |
+
```.bash
|
251 |
+
python train.py --outdir=~/training-runs --data=~/mydataset.zip --gpus=1 --dry-run
|
252 |
+
python train.py --outdir=~/training-runs --data=~/mydataset.zip --gpus=1
|
253 |
+
```
|
254 |
+
|
255 |
+
The first command is optional; it validates the arguments, prints out the training configuration, and exits. The second command kicks off the actual training.
|
256 |
+
|
257 |
+
In this example, the results are saved to a newly created directory `~/training-runs/<ID>-mydataset-auto1`, controlled by `--outdir`. The training exports network pickles (`network-snapshot-<INT>.pkl`) and example images (`fakes<INT>.png`) at regular intervals (controlled by `--snap`). For each pickle, it also evaluates FID (controlled by `--metrics`) and logs the resulting scores in `metric-fid50k_full.jsonl` (as well as TFEvents if TensorBoard is installed).
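Because the scores are logged one JSON object per line, picking the best snapshot afterwards is straightforward. The sketch below assumes each record carries a `results` mapping keyed by metric name and a `snapshot_pkl` field. Both field names and the sample records are assumptions for illustration, so check an actual `metric-fid50k_full.jsonl` before relying on them.

```python
import json


def best_snapshot(jsonl_text, metric="fid50k_full"):
    """Return (snapshot_name, score) for the lowest score, or None if there are no records."""
    best = None
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        score = record["results"][metric]
        if best is None or score < best[1]:
            best = (record.get("snapshot_pkl"), score)
    return best


# Invented sample records; real logs may carry additional fields.
sample = "\n".join([
    json.dumps({"results": {"fid50k_full": 12.5}, "snapshot_pkl": "network-snapshot-000000.pkl"}),
    json.dumps({"results": {"fid50k_full": 7.25}, "snapshot_pkl": "network-snapshot-000200.pkl"}),
])
print(best_snapshot(sample))
```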
|
258 |
+
|
259 |
+

The name of the output directory reflects the training configuration. For example, `00000-mydataset-auto1` indicates that the *base configuration* was `auto1`, meaning that the hyperparameters were selected automatically for training on one GPU. The base configuration is controlled by `--cfg`:

| Base config           | Description
| :-------------------- | :----------
| `auto` (default)      | Automatically select reasonable defaults based on resolution and GPU count. Serves as a good starting point for new datasets but does not necessarily lead to optimal results.
| `stylegan2`           | Reproduce results for StyleGAN2 config F at 1024x1024 using 1, 2, 4, or 8 GPUs.
| `paper256`            | Reproduce results for FFHQ and LSUN Cat at 256x256 using 1, 2, 4, or 8 GPUs.
| `paper512`            | Reproduce results for BreCaHAD and AFHQ at 512x512 using 1, 2, 4, or 8 GPUs.
| `paper1024`           | Reproduce results for MetFaces at 1024x1024 using 1, 2, 4, or 8 GPUs.
| `cifar`               | Reproduce results for CIFAR-10 (tuned configuration) using 1 or 2 GPUs.

The training configuration can be further customized with additional command line options:

* `--aug=noaug` disables ADA.
* `--cond=1` enables class-conditional training (requires a dataset with labels).
* `--mirror=1` amplifies the dataset with x-flips. Often beneficial, even with ADA.
* `--resume=ffhq1024 --snap=10` performs transfer learning from FFHQ trained at 1024x1024.
* `--resume=~/training-runs/<NAME>/network-snapshot-<INT>.pkl` resumes a previous training run.
* `--gamma=10` overrides R1 gamma. We recommend trying a couple of different values for each new dataset.
* `--aug=ada --target=0.7` adjusts ADA target value (default: 0.6).
* `--augpipe=blit` enables pixel blitting but disables all other augmentations.
* `--augpipe=bgcfnc` enables all available augmentations (blit, geom, color, filter, noise, cutout).

Please refer to [`python train.py --help`](./docs/train-help.txt) for the full list.

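When sweeping over several of these options (for instance a few `--gamma` values), it may help to assemble the command line in a script. The sketch below is illustrative only: `build_train_command` is a hypothetical helper, and it uses only flags that appear in the list above.

```python
from typing import List

def build_train_command(outdir: str, data: str, gpus: int = 1, **options: object) -> List[str]:
    """Assemble a train.py argument list; keyword options become --name=value flags."""
    cmd = ["python", "train.py", f"--outdir={outdir}", f"--data={data}", f"--gpus={gpus}"]
    for name, value in options.items():
        cmd.append(f"--{name}={value}")
    return cmd

# Example: the paper256 base config with x-flips and an explicit R1 gamma.
cmd = build_train_command("~/training-runs", "~/mydataset.zip", gpus=1,
                          cfg="paper256", mirror=1, gamma=10)
print(" ".join(cmd))
```

Returning a list rather than a single string keeps the result usable with `subprocess.run(cmd)`. In that case `~` is not expanded by a shell, so paths would need `os.path.expanduser` first.
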

## Expected training time

The total training time depends heavily on resolution, number of GPUs, dataset, desired quality, and hyperparameters. The following table lists expected wallclock times to reach different points in the training, measured in thousands of real images shown to the discriminator ("kimg"):

| Resolution | GPUs | 1000 kimg | 25000 kimg | sec/kimg          | GPU mem | CPU mem
| :--------: | :--: | :-------: | :--------: | :---------------: | :-----: | :-----:
| 128x128    | 1    | 4h 05m    | 4d 06h     | 12.8–13.7         | 7.2 GB  | 3.9 GB
| 128x128    | 2    | 2h 06m    | 2d 04h     | 6.5–6.8           | 7.4 GB  | 7.9 GB
| 128x128    | 4    | 1h 20m    | 1d 09h     | 4.1–4.6           | 4.2 GB  | 16.3 GB
| 128x128    | 8    | 1h 13m    | 1d 06h     | 3.9–4.9           | 2.6 GB  | 31.9 GB
| 256x256    | 1    | 6h 36m    | 6d 21h     | 21.6–24.2         | 5.0 GB  | 4.5 GB
| 256x256    | 2    | 3h 27m    | 3d 14h     | 11.2–11.8         | 5.2 GB  | 9.0 GB
| 256x256    | 4    | 1h 45m    | 1d 20h     | 5.6–5.9           | 5.2 GB  | 17.8 GB
| 256x256    | 8    | 1h 24m    | 1d 11h     | 4.4–5.5           | 3.2 GB  | 34.7 GB
| 512x512    | 1    | 21h 03m   | 21d 22h    | 72.5–74.9         | 7.6 GB  | 5.0 GB
| 512x512    | 2    | 10h 59m   | 11d 10h    | 37.7–40.0         | 7.8 GB  | 9.8 GB
| 512x512    | 4    | 5h 29m    | 5d 17h     | 18.7–19.1         | 7.9 GB  | 17.7 GB
| 512x512    | 8    | 2h 48m    | 2d 22h     | 9.5–9.7           | 7.8 GB  | 38.2 GB
| 1024x1024  | 1    | 1d 20h    | 46d 03h    | 154.3–161.6       | 8.1 GB  | 5.3 GB
| 1024x1024  | 2    | 23h 09m   | 24d 02h    | 80.6–86.2         | 8.6 GB  | 11.9 GB
| 1024x1024  | 4    | 11h 36m   | 12d 02h    | 40.1–40.8         | 8.4 GB  | 21.9 GB
| 1024x1024  | 8    | 5h 54m    | 6d 03h     | 20.2–20.6         | 8.3 GB  | 44.7 GB

The above measurements were done using NVIDIA Tesla V100 GPUs with default settings (`--cfg=auto --aug=ada --metrics=fid50k_full`). "sec/kimg" shows the expected range of variation in raw training performance, as reported in `log.txt`. "GPU mem" and "CPU mem" show the highest observed memory consumption, excluding the peak at the beginning caused by `torch.backends.cudnn.benchmark`.

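For hardware or settings not covered by the table, a rough lower bound on the wallclock time can be derived from the "sec/kimg" value in `log.txt`. The helper below is a hypothetical sketch of that arithmetic. It covers raw training only, so the totals in the table come out somewhat higher, likely because they also include per-snapshot work such as FID evaluation.

```python
def estimate_training_time(total_kimg: float, sec_per_kimg: float) -> str:
    """Convert a kimg budget and a sec/kimg rate into a 'Xd YYh ZZm' string."""
    total_minutes = round(total_kimg * sec_per_kimg / 60)
    days, rem = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(rem, 60)
    return f"{days}d {hours:02d}h {minutes:02d}m"

# 1024x1024 on 8 GPUs, using the 20.2-20.6 sec/kimg range from the table.
print(estimate_training_time(25000, 20.2))  # 5d 20h 17m
print(estimate_training_time(25000, 20.6))  # 5d 23h 03m
```

Both estimates fall slightly below the 6d 03h listed for that row, which is consistent with the table including overhead beyond the raw training loop.
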

In typical cases, 25000 kimg or more is needed to reach convergence, but the results are already quite reasonable around 5000 kimg. 1000 kimg is often enough for transfer learning, which tends to converge significantly faster. The following figure shows example convergence curves for different datasets as a function of wallclock time, using the same settings as above:

![Training curves](./docs/stylegan2-ada-training-curves.png)

Note: `--cfg=auto` serves as a reasonable first guess for the hyperparameters but it does not necessarily lead to optimal results for a given dataset. For example, `--cfg=stylegan2` yields considerably better FID for FFHQ-140k at 1024x1024 than illustrated above. We recommend trying out at least a few different values of `--gamma` for each new dataset.

## Quality metrics

By default, `train.py` automatically computes FID for each network pickle exported during training. We recommend inspecting `metric-fid50k_full.jsonl` (or TensorBoard) at regular intervals to monitor the training progress. When desired, the automatic computation can be disabled with `--metrics=none` to speed up the training slightly (3%–9%).

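Inspecting the JSONL file can be automated. The sketch below picks the snapshot with the lowest score. It assumes, without confirmation from this README, that every line is a JSON object with a `results` dictionary keyed by metric name and a `snapshot_pkl` field; `best_snapshot` is a hypothetical helper, so check the keys against an actual file before relying on it.

```python
import json
from typing import Iterable, Optional, Tuple

def best_snapshot(jsonl_lines: Iterable[str],
                  metric: str = "fid50k_full") -> Optional[Tuple[str, float]]:
    """Return (snapshot_pkl, score) for the lowest score, or None if no line has the metric."""
    best = None
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        # Assumed layout: {"results": {"<metric>": <float>}, "snapshot_pkl": "<name>", ...}
        score = record.get("results", {}).get(metric)
        if score is None:
            continue
        if best is None or score < best[1]:
            best = (record.get("snapshot_pkl"), score)
    return best
```

A file object can be passed directly, for example `with open("metric-fid50k_full.jsonl") as f: print(best_snapshot(f))`. Lower is better for FID and KID, so the same helper would not be appropriate for metrics where higher is better.
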

Additional quality metrics can also be computed after the training:

```.bash
# Previous training run: look up options automatically, save result to JSONL file.
python calc_metrics.py --metrics=pr50k3_full \
    --network=~/training-runs/00000-ffhq10k-res64-auto1/network-snapshot-000000.pkl

# Pre-trained network pickle: specify dataset explicitly, print result to stdout.
python calc_metrics.py --metrics=fid50k_full --data=~/datasets/ffhq.zip --mirror=1 \
    --network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/ffhq.pkl
```

The first example looks up the training configuration and performs the same operation as if `--metrics=pr50k3_full` had been specified during training. The second example downloads a pre-trained network pickle, in which case the values of `--mirror` and `--data` must be specified explicitly.

Note that many of the metrics have a significant one-off cost when calculating them for the first time for a new dataset (up to 30min). Also note that the evaluation is done using a different random seed each time, so the results will vary if the same metric is computed multiple times.

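Because repeated evaluations differ, one way to compare two snapshots is to evaluate each several times and look at the mean and spread. The helper below is a hypothetical sketch using only the standard library. The numbers in the example are placeholders chosen for illustration, not measured scores.

```python
import statistics
from typing import Sequence, Tuple

def summarize_runs(scores: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) of repeated metric evaluations."""
    if len(scores) < 2:
        raise ValueError("need at least two evaluations to estimate the spread")
    return statistics.mean(scores), statistics.stdev(scores)

# Placeholder values standing in for three evaluations of the same snapshot.
mean, spread = summarize_runs([1.0, 2.0, 3.0])
print(f"mean={mean:.2f}, stdev={spread:.2f}")
```

If the difference between two snapshots is small compared with this spread, it is hard to tell them apart from a single evaluation.
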

We employ the following metrics in the ADA paper. Execution time and GPU memory usage are reported for one NVIDIA Tesla V100 GPU at 1024x1024 resolution:

| Metric        | Time   | GPU mem | Description |
| :-----        | :----: | :-----: | :---------- |
| `fid50k_full` | 13 min | 1.8 GB  | Fréchet inception distance<sup>[1]</sup> against the full dataset
| `kid50k_full` | 13 min | 1.8 GB  | Kernel inception distance<sup>[2]</sup> against the full dataset
| `pr50k3_full` | 13 min | 4.1 GB  | Precision and recall<sup>[3]</sup> against the full dataset
| `is50k`       | 13 min | 1.8 GB  | Inception score<sup>[4]</sup> for CIFAR-10

In addition, the following metrics from the [StyleGAN](https://github.com/NVlabs/stylegan) and [StyleGAN2](https://github.com/NVlabs/stylegan2) papers are also supported:

| Metric        | Time   | GPU mem | Description |
| :------------ | :----: | :-----: | :---------- |
| `fid50k`      | 13 min | 1.8 GB  | Fréchet inception distance against 50k real images
| `kid50k`      | 13 min | 1.8 GB  | Kernel inception distance against 50k real images
| `pr50k3`      | 13 min | 4.1 GB  | Precision and recall against 50k real images
| `ppl2_wend`   | 36 min | 2.4 GB  | Perceptual path length<sup>[5]</sup> in W, endpoints, full image
| `ppl_zfull`   | 36 min | 2.4 GB  | Perceptual path length in Z, full paths, cropped image
| `ppl_wfull`   | 36 min | 2.4 GB  | Perceptual path length in W, full paths, cropped image
| `ppl_zend`    | 36 min | 2.4 GB  | Perceptual path length in Z, endpoints, cropped image
| `ppl_wend`    | 36 min | 2.4 GB  | Perceptual path length in W, endpoints, cropped image

References:
1. [GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium](https://arxiv.org/abs/1706.08500), Heusel et al. 2017
2. [Demystifying MMD GANs](https://arxiv.org/abs/1801.01401), Bińkowski et al. 2018
3. [Improved Precision and Recall Metric for Assessing Generative Models](https://arxiv.org/abs/1904.06991), Kynkäänniemi et al. 2019
4. [Improved Techniques for Training GANs](https://arxiv.org/abs/1606.03498), Salimans et al. 2016
5. [A Style-Based Generator Architecture for Generative Adversarial Networks](https://arxiv.org/abs/1812.04948), Karras et al. 2018

## License

Copyright © 2021, NVIDIA Corporation. All rights reserved.

This work is made available under the [Nvidia Source Code License](https://nvlabs.github.io/stylegan2-ada-pytorch/license.html).

## Citation

```
@inproceedings{Karras2020ada,
  title     = {Training Generative Adversarial Networks with Limited Data},
  author    = {Tero Karras and Miika Aittala and Janne Hellsten and Samuli Laine and Jaakko Lehtinen and Timo Aila},
  booktitle = {Proc. NeurIPS},
  year      = {2020}
}
```

## Development

This is a research reference implementation and is treated as a one-time code drop. As such, we do not accept outside code contributions in the form of pull requests.

## Acknowledgements

We thank David Luebke for helpful comments; Tero Kuosmanen and Sabu Nadarajan for their support with compute infrastructure; and Edgar Schönfeld for guidance on setting up unconditional BigGAN.

app.py
ADDED
@@ -0,0 +1,64 @@
import torch
import torchvision.transforms as transforms
from PIL import Image
import gradio as gr
from tqdm import tqdm

import dnnlib
import legacy

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# NOTE: the original script referenced `G` without ever defining it.
# Assumption: the generator is the `G_ema` entry of the `ffhq.pkl` checkpoint
# in the repository root; change the path if another checkpoint is intended.
with dnnlib.util.open_url('ffhq.pkl') as f:
    G = legacy.load_network_pkl(f)['G_ema'].eval().requires_grad_(False).to(device)

def optimize_latent_vector(G, target_image, num_iterations=1000):
    """Fit a latent vector z so that G(z) approximates target_image (pixel-wise MSE)."""
    # Force three channels so RGBA or grayscale uploads match an RGB generator.
    target_image = target_image.convert('RGB')
    target_image = transforms.Resize((G.img_resolution, G.img_resolution))(target_image)
    target_tensor = transforms.ToTensor()(target_image).unsqueeze(0).to(device)
    target_tensor = (target_tensor * 2) - 1  # Normalize to [-1, 1]

    latent_vector = torch.randn((1, G.z_dim), device=device, requires_grad=True)
    optimizer = torch.optim.Adam([latent_vector], lr=0.1)

    for i in tqdm(range(num_iterations), desc="Optimizing latent vector"):
        optimizer.zero_grad()

        generated_image = G(latent_vector, None)
        loss = torch.nn.functional.mse_loss(generated_image, target_tensor)

        loss.backward()
        optimizer.step()

        if (i + 1) % 100 == 0:
            print(f'Iteration {i+1}/{num_iterations}, Loss: {loss.item()}')

    return latent_vector.detach()

def generate_from_upload(uploaded_image):
    # Optimize latent vector for the uploaded image
    optimized_z = optimize_latent_vector(G, uploaded_image)

    # Perturb the optimized latent; (1, z_dim) broadcasts against (num_variations, z_dim)
    num_variations = 4
    variation_strength = 0.1
    varied_z = optimized_z + torch.randn((num_variations, G.z_dim), device=device) * variation_strength

    # Generate the variations
    with torch.no_grad():
        imgs = G(varied_z, c=None, truncation_psi=0.7, noise_mode='const')

    # Map [-1, 1] floats to uint8 and reorder NCHW -> NHWC
    imgs = (imgs * 127.5 + 128).clamp(0, 255).to(torch.uint8)
    imgs = imgs.permute(0, 2, 3, 1).cpu().numpy()

    # Convert the generated image tensors to PIL Images
    generated_images = [Image.fromarray(img) for img in imgs]

    # Return the images separately
    return generated_images[0], generated_images[1], generated_images[2], generated_images[3]

# Create the Gradio interface
iface = gr.Interface(
    fn=generate_from_upload,
    inputs=gr.Image(type="pil"),
    outputs=[gr.Image(type="pil") for _ in range(4)],
    title="StyleGAN Image Variation Generator"
)

# Launch the Gradio interface
iface.launch(share=True, debug=True)

calc_metrics.py
ADDED
@@ -0,0 +1,190 @@
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto. Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.

"""Calculate quality metrics for previous training run or pretrained network pickle."""

import os
import click
import json
import tempfile
import copy
import torch
import dnnlib

import legacy
from metrics import metric_main
from metrics import metric_utils
from torch_utils import training_stats
from torch_utils import custom_ops
from torch_utils import misc

#----------------------------------------------------------------------------

def subprocess_fn(rank, args, temp_dir):
    dnnlib.util.Logger(should_flush=True)

    # Init torch.distributed.
    if args.num_gpus > 1:
        init_file = os.path.abspath(os.path.join(temp_dir, '.torch_distributed_init'))
        if os.name == 'nt':
            init_method = 'file:///' + init_file.replace('\\', '/')
            torch.distributed.init_process_group(backend='gloo', init_method=init_method, rank=rank, world_size=args.num_gpus)
        else:
            init_method = f'file://{init_file}'
            torch.distributed.init_process_group(backend='nccl', init_method=init_method, rank=rank, world_size=args.num_gpus)

    # Init torch_utils.
    sync_device = torch.device('cuda', rank) if args.num_gpus > 1 else None
    training_stats.init_multiprocessing(rank=rank, sync_device=sync_device)
    if rank != 0 or not args.verbose:
        custom_ops.verbosity = 'none'

    # Print network summary.
    device = torch.device('cuda', rank)
    torch.backends.cudnn.benchmark = True
    torch.backends.cuda.matmul.allow_tf32 = False
    torch.backends.cudnn.allow_tf32 = False
    G = copy.deepcopy(args.G).eval().requires_grad_(False).to(device)
    if rank == 0 and args.verbose:
        z = torch.empty([1, G.z_dim], device=device)
        c = torch.empty([1, G.c_dim], device=device)
        misc.print_module_summary(G, [z, c])

    # Calculate each metric.
    for metric in args.metrics:
        if rank == 0 and args.verbose:
            print(f'Calculating {metric}...')
        progress = metric_utils.ProgressMonitor(verbose=args.verbose)
        result_dict = metric_main.calc_metric(metric=metric, G=G, dataset_kwargs=args.dataset_kwargs,
            num_gpus=args.num_gpus, rank=rank, device=device, progress=progress)
        if rank == 0:
            metric_main.report_metric(result_dict, run_dir=args.run_dir, snapshot_pkl=args.network_pkl)
        if rank == 0 and args.verbose:
            print()

    # Done.
    if rank == 0 and args.verbose:
        print('Exiting...')

#----------------------------------------------------------------------------

class CommaSeparatedList(click.ParamType):
    name = 'list'

    def convert(self, value, param, ctx):
        _ = param, ctx
        if value is None or value.lower() == 'none' or value == '':
            return []
        return value.split(',')

#----------------------------------------------------------------------------

@click.command()
@click.pass_context
@click.option('network_pkl', '--network', help='Network pickle filename or URL', metavar='PATH', required=True)
@click.option('--metrics', help='Comma-separated list or "none"', type=CommaSeparatedList(), default='fid50k_full', show_default=True)
@click.option('--data', help='Dataset to evaluate metrics against (directory or zip) [default: same as training data]', metavar='PATH')
@click.option('--mirror', help='Whether the dataset was augmented with x-flips during training [default: look up]', type=bool, metavar='BOOL')
@click.option('--gpus', help='Number of GPUs to use', type=int, default=1, metavar='INT', show_default=True)
@click.option('--verbose', help='Print optional information', type=bool, default=True, metavar='BOOL', show_default=True)

def calc_metrics(ctx, network_pkl, metrics, data, mirror, gpus, verbose):
    """Calculate quality metrics for previous training run or pretrained network pickle.

    Examples:

    \b
    # Previous training run: look up options automatically, save result to JSONL file.
    python calc_metrics.py --metrics=pr50k3_full \\
        --network=~/training-runs/00000-ffhq10k-res64-auto1/network-snapshot-000000.pkl

    \b
    # Pre-trained network pickle: specify dataset explicitly, print result to stdout.
    python calc_metrics.py --metrics=fid50k_full --data=~/datasets/ffhq.zip --mirror=1 \\
        --network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/ffhq.pkl

    Available metrics:

    \b
      ADA paper:
        fid50k_full  Frechet inception distance against the full dataset.
        kid50k_full  Kernel inception distance against the full dataset.
        pr50k3_full  Precision and recall against the full dataset.
        is50k        Inception score for CIFAR-10.

    \b
      StyleGAN and StyleGAN2 papers:
        fid50k       Frechet inception distance against 50k real images.
        kid50k       Kernel inception distance against 50k real images.
        pr50k3       Precision and recall against 50k real images.
        ppl2_wend    Perceptual path length in W at path endpoints against full image.
        ppl_zfull    Perceptual path length in Z for full paths against cropped image.
        ppl_wfull    Perceptual path length in W for full paths against cropped image.
        ppl_zend     Perceptual path length in Z at path endpoints against cropped image.
        ppl_wend     Perceptual path length in W at path endpoints against cropped image.
    """
    dnnlib.util.Logger(should_flush=True)

    # Validate arguments.
    args = dnnlib.EasyDict(metrics=metrics, num_gpus=gpus, network_pkl=network_pkl, verbose=verbose)
    if not all(metric_main.is_valid_metric(metric) for metric in args.metrics):
        ctx.fail('\n'.join(['--metrics can only contain the following values:'] + metric_main.list_valid_metrics()))
    if not args.num_gpus >= 1:
        ctx.fail('--gpus must be at least 1')

    # Load network.
    if not dnnlib.util.is_url(network_pkl, allow_file_urls=True) and not os.path.isfile(network_pkl):
        ctx.fail('--network must point to a file or URL')
    if args.verbose:
        print(f'Loading network from "{network_pkl}"...')
    with dnnlib.util.open_url(network_pkl, verbose=args.verbose) as f:
        network_dict = legacy.load_network_pkl(f)
        args.G = network_dict['G_ema'] # subclass of torch.nn.Module

    # Initialize dataset options.
    if data is not None:
        args.dataset_kwargs = dnnlib.EasyDict(class_name='training.dataset.ImageFolderDataset', path=data)
    elif network_dict['training_set_kwargs'] is not None:
        args.dataset_kwargs = dnnlib.EasyDict(network_dict['training_set_kwargs'])
    else:
        ctx.fail('Could not look up dataset options; please specify --data')

    # Finalize dataset options.
    args.dataset_kwargs.resolution = args.G.img_resolution
    args.dataset_kwargs.use_labels = (args.G.c_dim != 0)
    if mirror is not None:
        args.dataset_kwargs.xflip = mirror

    # Print dataset options.
    if args.verbose:
        print('Dataset options:')
        print(json.dumps(args.dataset_kwargs, indent=2))

    # Locate run dir.
    args.run_dir = None
    if os.path.isfile(network_pkl):
        pkl_dir = os.path.dirname(network_pkl)
        if os.path.isfile(os.path.join(pkl_dir, 'training_options.json')):
            args.run_dir = pkl_dir

    # Launch processes.
    if args.verbose:
        print('Launching processes...')
    torch.multiprocessing.set_start_method('spawn')
    with tempfile.TemporaryDirectory() as temp_dir:
        if args.num_gpus == 1:
            subprocess_fn(rank=0, args=args, temp_dir=temp_dir)
        else:
            torch.multiprocessing.spawn(fn=subprocess_fn, args=(args, temp_dir), nprocs=args.num_gpus)

#----------------------------------------------------------------------------

if __name__ == "__main__":
    calc_metrics() # pylint: disable=no-value-for-parameter

#----------------------------------------------------------------------------

dataset_tool.py
ADDED
@@ -0,0 +1,444 @@
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto. Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.

import functools
import io
import json
import os
import pickle
import sys
import tarfile
import gzip
import zipfile
from pathlib import Path
from typing import Callable, Optional, Tuple, Union

import click
import numpy as np
import PIL.Image
from tqdm import tqdm

#----------------------------------------------------------------------------

def error(msg):
    print('Error: ' + msg)
    sys.exit(1)

#----------------------------------------------------------------------------

def maybe_min(a: int, b: Optional[int]) -> int:
    if b is not None:
        return min(a, b)
    return a

#----------------------------------------------------------------------------

def file_ext(name: Union[str, Path]) -> str:
    return str(name).split('.')[-1]

#----------------------------------------------------------------------------

def is_image_ext(fname: Union[str, Path]) -> bool:
    ext = file_ext(fname).lower()
    return f'.{ext}' in PIL.Image.EXTENSION # type: ignore

#----------------------------------------------------------------------------

def open_image_folder(source_dir, *, max_images: Optional[int]):
    input_images = [str(f) for f in sorted(Path(source_dir).rglob('*')) if is_image_ext(f) and os.path.isfile(f)]

    # Load labels.
    labels = {}
    meta_fname = os.path.join(source_dir, 'dataset.json')
    if os.path.isfile(meta_fname):
        with open(meta_fname, 'r') as file:
            labels = json.load(file)['labels']
            if labels is not None:
                labels = { x[0]: x[1] for x in labels }
            else:
                labels = {}

    max_idx = maybe_min(len(input_images), max_images)

    def iterate_images():
        for idx, fname in enumerate(input_images):
            arch_fname = os.path.relpath(fname, source_dir)
            arch_fname = arch_fname.replace('\\', '/')
            img = np.array(PIL.Image.open(fname))
            yield dict(img=img, label=labels.get(arch_fname))
            if idx >= max_idx-1:
                break
    return max_idx, iterate_images()

#----------------------------------------------------------------------------

def open_image_zip(source, *, max_images: Optional[int]):
    with zipfile.ZipFile(source, mode='r') as z:
        input_images = [str(f) for f in sorted(z.namelist()) if is_image_ext(f)]

        # Load labels.
        labels = {}
        if 'dataset.json' in z.namelist():
            with z.open('dataset.json', 'r') as file:
                labels = json.load(file)['labels']
                if labels is not None:
                    labels = { x[0]: x[1] for x in labels }
                else:
                    labels = {}

    max_idx = maybe_min(len(input_images), max_images)

    def iterate_images():
        with zipfile.ZipFile(source, mode='r') as z:
            for idx, fname in enumerate(input_images):
                with z.open(fname, 'r') as file:
                    img = PIL.Image.open(file) # type: ignore
                    img = np.array(img)
                yield dict(img=img, label=labels.get(fname))
                if idx >= max_idx-1:
                    break
    return max_idx, iterate_images()

#----------------------------------------------------------------------------

def open_lmdb(lmdb_dir: str, *, max_images: Optional[int]):
    import cv2  # pip install opencv-python
    import lmdb  # pip install lmdb # pylint: disable=import-error

    with lmdb.open(lmdb_dir, readonly=True, lock=False).begin(write=False) as txn:
        max_idx = maybe_min(txn.stat()['entries'], max_images)

    def iterate_images():
        with lmdb.open(lmdb_dir, readonly=True, lock=False).begin(write=False) as txn:
            for idx, (_key, value) in enumerate(txn.cursor()):
                try:
                    try:
                        img = cv2.imdecode(np.frombuffer(value, dtype=np.uint8), 1)
                        if img is None:
                            raise IOError('cv2.imdecode failed')
                        img = img[:, :, ::-1] # BGR => RGB
                    except IOError:
                        img = np.array(PIL.Image.open(io.BytesIO(value)))
                    yield dict(img=img, label=None)
                    if idx >= max_idx-1:
                        break
                except:
                    print(sys.exc_info()[1])

    return max_idx, iterate_images()

#----------------------------------------------------------------------------

def open_cifar10(tarball: str, *, max_images: Optional[int]):
    images = []
    labels = []

    with tarfile.open(tarball, 'r:gz') as tar:
        for batch in range(1, 6):
            member = tar.getmember(f'cifar-10-batches-py/data_batch_{batch}')
            with tar.extractfile(member) as file:
                data = pickle.load(file, encoding='latin1')
            images.append(data['data'].reshape(-1, 3, 32, 32))
            labels.append(data['labels'])

    images = np.concatenate(images)
    labels = np.concatenate(labels)
    images = images.transpose([0, 2, 3, 1]) # NCHW -> NHWC
    assert images.shape == (50000, 32, 32, 3) and images.dtype == np.uint8
    assert labels.shape == (50000,) and labels.dtype in [np.int32, np.int64]
    assert np.min(images) == 0 and np.max(images) == 255
    assert np.min(labels) == 0 and np.max(labels) == 9

    max_idx = maybe_min(len(images), max_images)

    def iterate_images():
        for idx, img in enumerate(images):
            yield dict(img=img, label=int(labels[idx]))
            if idx >= max_idx-1:
                break

    return max_idx, iterate_images()

#----------------------------------------------------------------------------

def open_mnist(images_gz: str, *, max_images: Optional[int]):
    labels_gz = images_gz.replace('-images-idx3-ubyte.gz', '-labels-idx1-ubyte.gz')
    assert labels_gz != images_gz
    images = []
    labels = []

    with gzip.open(images_gz, 'rb') as f:
        images = np.frombuffer(f.read(), np.uint8, offset=16)
    with gzip.open(labels_gz, 'rb') as f:
        labels = np.frombuffer(f.read(), np.uint8, offset=8)

    images = images.reshape(-1, 28, 28)
    images = np.pad(images, [(0,0), (2,2), (2,2)], 'constant', constant_values=0)
    assert images.shape == (60000, 32, 32) and images.dtype == np.uint8
    assert labels.shape == (60000,) and labels.dtype == np.uint8
    assert np.min(images) == 0 and np.max(images) == 255
    assert np.min(labels) == 0 and np.max(labels) == 9

    max_idx = maybe_min(len(images), max_images)

    def iterate_images():
        for idx, img in enumerate(images):
            yield dict(img=img, label=int(labels[idx]))
            if idx >= max_idx-1:
                break

    return max_idx, iterate_images()

#----------------------------------------------------------------------------
|
198 |
+
|
199 |
+
def make_transform(
|
200 |
+
transform: Optional[str],
|
201 |
+
output_width: Optional[int],
|
202 |
+
output_height: Optional[int],
|
203 |
+
resize_filter: str
|
204 |
+
) -> Callable[[np.ndarray], Optional[np.ndarray]]:
|
205 |
+
resample = { 'box': PIL.Image.BOX, 'lanczos': PIL.Image.LANCZOS }[resize_filter]
|
206 |
+
def scale(width, height, img):
|
207 |
+
w = img.shape[1]
|
208 |
+
h = img.shape[0]
|
209 |
+
if width == w and height == h:
|
210 |
+
return img
|
211 |
+
img = PIL.Image.fromarray(img)
|
212 |
+
ww = width if width is not None else w
|
213 |
+
hh = height if height is not None else h
|
214 |
+
img = img.resize((ww, hh), resample)
|
215 |
+
return np.array(img)
|
216 |
+
|
217 |
+
def center_crop(width, height, img):
|
218 |
+
crop = np.min(img.shape[:2])
|
219 |
+
img = img[(img.shape[0] - crop) // 2 : (img.shape[0] + crop) // 2, (img.shape[1] - crop) // 2 : (img.shape[1] + crop) // 2]
|
220 |
+
img = PIL.Image.fromarray(img, 'RGB')
|
221 |
+
img = img.resize((width, height), resample)
|
222 |
+
return np.array(img)
|
223 |
+
|
224 |
+
def center_crop_wide(width, height, img):
|
225 |
+
ch = int(np.round(width * img.shape[0] / img.shape[1]))
|
226 |
+
if img.shape[1] < width or ch < height:
|
227 |
+
return None
|
228 |
+
|
229 |
+
img = img[(img.shape[0] - ch) // 2 : (img.shape[0] + ch) // 2]
|
230 |
+
img = PIL.Image.fromarray(img, 'RGB')
|
231 |
+
img = img.resize((width, height), resample)
|
232 |
+
img = np.array(img)
|
233 |
+
|
234 |
+
canvas = np.zeros([width, width, 3], dtype=np.uint8)
|
235 |
+
canvas[(width - height) // 2 : (width + height) // 2, :] = img
|
236 |
+
return canvas
|
237 |
+
|
238 |
+
if transform is None:
|
239 |
+
return functools.partial(scale, output_width, output_height)
|
240 |
+
if transform == 'center-crop':
|
241 |
+
if (output_width is None) or (output_height is None):
|
242 |
+
error ('must specify --width and --height when using ' + transform + ' transform')
|
243 |
+
return functools.partial(center_crop, output_width, output_height)
|
244 |
+
if transform == 'center-crop-wide':
|
245 |
+
if (output_width is None) or (output_height is None):
|
246 |
+
error ('must specify --width and --height when using ' + transform + ' transform')
|
247 |
+
return functools.partial(center_crop_wide, output_width, output_height)
|
248 |
+
assert False, 'unknown transform'
|
249 |
+
|
250 |
+
#----------------------------------------------------------------------------
|
251 |
+
|
252 |
+
def open_dataset(source, *, max_images: Optional[int]):
|
253 |
+
if os.path.isdir(source):
|
254 |
+
if source.rstrip('/').endswith('_lmdb'):
|
255 |
+
return open_lmdb(source, max_images=max_images)
|
256 |
+
else:
|
257 |
+
return open_image_folder(source, max_images=max_images)
|
258 |
+
elif os.path.isfile(source):
|
259 |
+
if os.path.basename(source) == 'cifar-10-python.tar.gz':
|
260 |
+
return open_cifar10(source, max_images=max_images)
|
261 |
+
elif os.path.basename(source) == 'train-images-idx3-ubyte.gz':
|
262 |
+
return open_mnist(source, max_images=max_images)
|
263 |
+
elif file_ext(source) == 'zip':
|
264 |
+
return open_image_zip(source, max_images=max_images)
|
265 |
+
else:
|
266 |
+
assert False, 'unknown archive type'
|
267 |
+
else:
|
268 |
+
error(f'Missing input file or directory: {source}')
|
269 |
+
|
270 |
+
#----------------------------------------------------------------------------
|
271 |
+
|
272 |
+
def open_dest(dest: str) -> Tuple[str, Callable[[str, Union[bytes, str]], None], Callable[[], None]]:
|
273 |
+
dest_ext = file_ext(dest)
|
274 |
+
|
275 |
+
if dest_ext == 'zip':
|
276 |
+
if os.path.dirname(dest) != '':
|
277 |
+
os.makedirs(os.path.dirname(dest), exist_ok=True)
|
278 |
+
zf = zipfile.ZipFile(file=dest, mode='w', compression=zipfile.ZIP_STORED)
|
279 |
+
def zip_write_bytes(fname: str, data: Union[bytes, str]):
|
280 |
+
zf.writestr(fname, data)
|
281 |
+
return '', zip_write_bytes, zf.close
|
282 |
+
else:
|
283 |
+
# If the output folder already exists, check that it is
|
284 |
+
# empty.
|
285 |
+
#
|
286 |
+
# Note: creating the output directory is not strictly
|
287 |
+
# necessary as folder_write_bytes() also mkdirs, but it's better
|
288 |
+
# to give an error message earlier in case the dest folder
|
289 |
+
# somehow cannot be created.
|
290 |
+
if os.path.isdir(dest) and len(os.listdir(dest)) != 0:
|
291 |
+
error('--dest folder must be empty')
|
292 |
+
os.makedirs(dest, exist_ok=True)
|
293 |
+
|
294 |
+
def folder_write_bytes(fname: str, data: Union[bytes, str]):
|
295 |
+
os.makedirs(os.path.dirname(fname), exist_ok=True)
|
296 |
+
with open(fname, 'wb') as fout:
|
297 |
+
if isinstance(data, str):
|
298 |
+
data = data.encode('utf8')
|
299 |
+
fout.write(data)
|
300 |
+
return dest, folder_write_bytes, lambda: None
|
301 |
+
|
302 |
+
#----------------------------------------------------------------------------
|
303 |
+
|
304 |
+
@click.command()
|
305 |
+
@click.pass_context
|
306 |
+
@click.option('--source', help='Directory or archive name for input dataset', required=True, metavar='PATH')
|
307 |
+
@click.option('--dest', help='Output directory or archive name for output dataset', required=True, metavar='PATH')
|
308 |
+
@click.option('--max-images', help='Output only up to `max-images` images', type=int, default=None)
|
309 |
+
@click.option('--resize-filter', help='Filter to use when resizing images for output resolution', type=click.Choice(['box', 'lanczos']), default='lanczos', show_default=True)
|
310 |
+
@click.option('--transform', help='Input crop/resize mode', type=click.Choice(['center-crop', 'center-crop-wide']))
|
311 |
+
@click.option('--width', help='Output width', type=int)
|
312 |
+
@click.option('--height', help='Output height', type=int)
|
313 |
+
def convert_dataset(
|
314 |
+
ctx: click.Context,
|
315 |
+
source: str,
|
316 |
+
dest: str,
|
317 |
+
max_images: Optional[int],
|
318 |
+
transform: Optional[str],
|
319 |
+
resize_filter: str,
|
320 |
+
width: Optional[int],
|
321 |
+
height: Optional[int]
|
322 |
+
):
|
323 |
+
"""Convert an image dataset into a dataset archive usable with StyleGAN2 ADA PyTorch.
|
324 |
+
|
325 |
+
The input dataset format is guessed from the --source argument:
|
326 |
+
|
327 |
+
\b
|
328 |
+
--source *_lmdb/ Load LSUN dataset
|
329 |
+
--source cifar-10-python.tar.gz Load CIFAR-10 dataset
|
330 |
+
--source train-images-idx3-ubyte.gz Load MNIST dataset
|
331 |
+
--source path/ Recursively load all images from path/
|
332 |
+
--source dataset.zip Recursively load all images from dataset.zip
|
333 |
+
|
334 |
+
Specifying the output format and path:
|
335 |
+
|
336 |
+
\b
|
337 |
+
--dest /path/to/dir Save output files under /path/to/dir
|
338 |
+
--dest /path/to/dataset.zip Save output files into /path/to/dataset.zip
|
339 |
+
|
340 |
+
The output dataset format can be either an image folder or an uncompressed zip archive.
|
341 |
+
Zip archives make it easier to move datasets around file servers and clusters, and may
|
342 |
+
offer better training performance on network file systems.
|
343 |
+
|
344 |
+
Images within the dataset archive will be stored as uncompressed PNG.
|
345 |
+
Uncompressed PNGs can be efficiently decoded in the training loop.
|
346 |
+
|
347 |
+
Class labels are stored in a file called 'dataset.json' that is stored at the
|
348 |
+
dataset root folder. This file has the following structure:
|
349 |
+
|
350 |
+
\b
|
351 |
+
{
|
352 |
+
"labels": [
|
353 |
+
["00000/img00000000.png",6],
|
354 |
+
["00000/img00000001.png",9],
|
355 |
+
... repeated for every image in the dataset
|
356 |
+
["00049/img00049999.png",1]
|
357 |
+
]
|
358 |
+
}
|
359 |
+
|
360 |
+
If the 'dataset.json' file cannot be found, the dataset is interpreted as
|
361 |
+
not containing class labels.
|
362 |
+
|
363 |
+
Image scale/crop and resolution requirements:
|
364 |
+
|
365 |
+
Output images must be square-shaped and they must all have the same power-of-two
|
366 |
+
dimensions.
|
367 |
+
|
368 |
+
To scale arbitrary input image size to a specific width and height, use the
|
369 |
+
--width and --height options. Output resolution will be either the original
|
370 |
+
input resolution (if --width/--height was not specified) or the one specified with
|
371 |
+
--width/--height.
|
372 |
+
|
373 |
+
Use the --transform=center-crop or --transform=center-crop-wide options to apply a
|
374 |
+
center crop transform on the input image. These options should be used with the
|
375 |
+
--width and --height options. For example:
|
376 |
+
|
377 |
+
\b
|
378 |
+
python dataset_tool.py --source LSUN/raw/cat_lmdb --dest /tmp/lsun_cat \\
|
379 |
+
--transform=center-crop-wide --width 512 --height=384
|
380 |
+
"""
|
381 |
+
|
382 |
+
PIL.Image.init() # type: ignore
|
383 |
+
|
384 |
+
if dest == '':
|
385 |
+
ctx.fail('--dest output filename or directory must not be an empty string')
|
386 |
+
|
387 |
+
num_files, input_iter = open_dataset(source, max_images=max_images)
|
388 |
+
archive_root_dir, save_bytes, close_dest = open_dest(dest)
|
389 |
+
|
390 |
+
transform_image = make_transform(transform, width, height, resize_filter)
|
391 |
+
|
392 |
+
dataset_attrs = None
|
393 |
+
|
394 |
+
labels = []
|
395 |
+
for idx, image in tqdm(enumerate(input_iter), total=num_files):
|
396 |
+
idx_str = f'{idx:08d}'
|
397 |
+
archive_fname = f'{idx_str[:5]}/img{idx_str}.png'
|
398 |
+
|
399 |
+
# Apply crop and resize.
|
400 |
+
img = transform_image(image['img'])
|
401 |
+
|
402 |
+
# Transform may drop images.
|
403 |
+
if img is None:
|
404 |
+
continue
|
405 |
+
|
406 |
+
# Error check to require uniform image attributes across
|
407 |
+
# the whole dataset.
|
408 |
+
channels = img.shape[2] if img.ndim == 3 else 1
|
409 |
+
cur_image_attrs = {
|
410 |
+
'width': img.shape[1],
|
411 |
+
'height': img.shape[0],
|
412 |
+
'channels': channels
|
413 |
+
}
|
414 |
+
if dataset_attrs is None:
|
415 |
+
dataset_attrs = cur_image_attrs
|
416 |
+
width = dataset_attrs['width']
|
417 |
+
height = dataset_attrs['height']
|
418 |
+
if width != height:
|
419 |
+
error(f'Image dimensions after scale and crop are required to be square. Got {width}x{height}')
|
420 |
+
if dataset_attrs['channels'] not in [1, 3]:
|
421 |
+
error('Input images must be stored as RGB or grayscale')
|
422 |
+
if width != 2 ** int(np.floor(np.log2(width))):
|
423 |
+
error('Image width/height after scale and crop are required to be power-of-two')
|
424 |
+
elif dataset_attrs != cur_image_attrs:
|
425 |
+
err = [f' dataset {k}/cur image {k}: {dataset_attrs[k]}/{cur_image_attrs[k]}' for k in dataset_attrs.keys()]
|
426 |
+
error(f'Image {archive_fname} attributes must be equal across all images of the dataset. Got:\n' + '\n'.join(err))
|
427 |
+
|
428 |
+
# Save the image as an uncompressed PNG.
|
429 |
+
img = PIL.Image.fromarray(img, { 1: 'L', 3: 'RGB' }[channels])
|
430 |
+
image_bits = io.BytesIO()
|
431 |
+
img.save(image_bits, format='png', compress_level=0, optimize=False)
|
432 |
+
save_bytes(os.path.join(archive_root_dir, archive_fname), image_bits.getbuffer())
|
433 |
+
labels.append([archive_fname, image['label']] if image['label'] is not None else None)
|
434 |
+
|
435 |
+
metadata = {
|
436 |
+
'labels': labels if all(x is not None for x in labels) else None
|
437 |
+
}
|
438 |
+
save_bytes(os.path.join(archive_root_dir, 'dataset.json'), json.dumps(metadata))
|
439 |
+
close_dest()
|
440 |
+
|
441 |
+
#----------------------------------------------------------------------------
|
442 |
+
|
443 |
+
if __name__ == "__main__":
|
444 |
+
convert_dataset() # pylint: disable=no-value-for-parameter
|
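The docstring of `convert_dataset` describes the `dataset.json` layout, and the loop above builds it from an 8-digit image index. The following is a minimal sketch of that naming and labeling logic in isolation. The helper names `archive_fname` and `build_metadata` are hypothetical and not part of `dataset_tool.py`, and the sketch assumes no images are dropped by the transform.

```python
import json
from typing import List, Optional


def archive_fname(idx: int) -> str:
    # Mirrors the naming in convert_dataset: an 8-digit, zero-padded index,
    # grouped into folders named after its first five digits.
    idx_str = f'{idx:08d}'
    return f'{idx_str[:5]}/img{idx_str}.png'


def build_metadata(labels: List[Optional[int]]) -> dict:
    # One [filename, label] pair per image, or None when the label is missing.
    entries = [[archive_fname(i), lab] if lab is not None else None
               for i, lab in enumerate(labels)]
    # As in the tool, a single missing label disables labels for the whole dataset.
    return {'labels': entries if all(e is not None for e in entries) else None}


if __name__ == '__main__':
    # Serialize the metadata the same way the tool does before writing dataset.json.
    print(json.dumps(build_metadata([6, 9])))
```

With 1000 images per folder, index 49999 lands in folder `00049`, which matches the last entry shown in the docstring example.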
dnnlib/__init__.py
ADDED
@@ -0,0 +1,9 @@
1 |
+
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
|
2 |
+
#
|
3 |
+
# NVIDIA CORPORATION and its licensors retain all intellectual property
|
4 |
+
# and proprietary rights in and to this software, related documentation
|
5 |
+
# and any modifications thereto. Any use, reproduction, disclosure or
|
6 |
+
# distribution of this software and related documentation without an express
|
7 |
+
# license agreement from NVIDIA CORPORATION is strictly prohibited.
|
8 |
+
|
9 |
+
from .util import EasyDict, make_cache_dir_path
|
dnnlib/util.py
ADDED
@@ -0,0 +1,477 @@
1 |
+
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
|
2 |
+
#
|
3 |
+
# NVIDIA CORPORATION and its licensors retain all intellectual property
|
4 |
+
# and proprietary rights in and to this software, related documentation
|
5 |
+
# and any modifications thereto. Any use, reproduction, disclosure or
|
6 |
+
# distribution of this software and related documentation without an express
|
7 |
+
# license agreement from NVIDIA CORPORATION is strictly prohibited.
|
8 |
+
|
9 |
+
"""Miscellaneous utility classes and functions."""
|
10 |
+
|
11 |
+
import ctypes
|
12 |
+
import fnmatch
|
13 |
+
import importlib
|
14 |
+
import inspect
|
15 |
+
import numpy as np
|
16 |
+
import os
|
17 |
+
import shutil
|
18 |
+
import sys
|
19 |
+
import types
|
20 |
+
import io
|
21 |
+
import pickle
|
22 |
+
import re
|
23 |
+
import requests
|
24 |
+
import html
|
25 |
+
import hashlib
|
26 |
+
import glob
|
27 |
+
import tempfile
|
28 |
+
import urllib
|
29 |
+
import urllib.request
|
30 |
+
import uuid
|
31 |
+
|
32 |
+
from distutils.util import strtobool
|
33 |
+
from typing import Any, List, Tuple, Union
|
34 |
+
|
35 |
+
|
36 |
+
# Util classes
|
37 |
+
# ------------------------------------------------------------------------------------------
|
38 |
+
|
39 |
+
|
40 |
+
class EasyDict(dict):
|
41 |
+
"""Convenience class that behaves like a dict but allows access with the attribute syntax."""
|
42 |
+
|
43 |
+
def __getattr__(self, name: str) -> Any:
|
44 |
+
try:
|
45 |
+
return self[name]
|
46 |
+
except KeyError:
|
47 |
+
raise AttributeError(name)
|
48 |
+
|
49 |
+
def __setattr__(self, name: str, value: Any) -> None:
|
50 |
+
self[name] = value
|
51 |
+
|
52 |
+
def __delattr__(self, name: str) -> None:
|
53 |
+
del self[name]
|
54 |
+
|
55 |
+
|
56 |
+
class Logger(object):
|
57 |
+
"""Redirect stderr to stdout, optionally print stdout to a file, and optionally force flushing on both stdout and the file."""
|
58 |
+
|
59 |
+
def __init__(self, file_name: str = None, file_mode: str = "w", should_flush: bool = True):
|
60 |
+
self.file = None
|
61 |
+
|
62 |
+
if file_name is not None:
|
63 |
+
self.file = open(file_name, file_mode)
|
64 |
+
|
65 |
+
self.should_flush = should_flush
|
66 |
+
self.stdout = sys.stdout
|
67 |
+
self.stderr = sys.stderr
|
68 |
+
|
69 |
+
sys.stdout = self
|
70 |
+
sys.stderr = self
|
71 |
+
|
72 |
+
def __enter__(self) -> "Logger":
|
73 |
+
return self
|
74 |
+
|
75 |
+
def __exit__(self, exc_type: Any, exc_value: Any, traceback: Any) -> None:
|
76 |
+
self.close()
|
77 |
+
|
78 |
+
def write(self, text: Union[str, bytes]) -> None:
|
79 |
+
"""Write text to stdout (and a file) and optionally flush."""
|
80 |
+
if isinstance(text, bytes):
|
81 |
+
text = text.decode()
|
82 |
+
if len(text) == 0: # workaround for a bug in VSCode debugger: sys.stdout.write(''); sys.stdout.flush() => crash
|
83 |
+
return
|
84 |
+
|
85 |
+
if self.file is not None:
|
86 |
+
self.file.write(text)
|
87 |
+
|
88 |
+
self.stdout.write(text)
|
89 |
+
|
90 |
+
if self.should_flush:
|
91 |
+
self.flush()
|
92 |
+
|
93 |
+
def flush(self) -> None:
|
94 |
+
"""Flush written text to both stdout and a file, if open."""
|
95 |
+
if self.file is not None:
|
96 |
+
self.file.flush()
|
97 |
+
|
98 |
+
self.stdout.flush()
|
99 |
+
|
100 |
+
def close(self) -> None:
|
101 |
+
"""Flush, close possible files, and remove stdout/stderr mirroring."""
|
102 |
+
self.flush()
|
103 |
+
|
104 |
+
# if using multiple loggers, prevent closing in wrong order
|
105 |
+
if sys.stdout is self:
|
106 |
+
sys.stdout = self.stdout
|
107 |
+
if sys.stderr is self:
|
108 |
+
sys.stderr = self.stderr
|
109 |
+
|
110 |
+
if self.file is not None:
|
111 |
+
self.file.close()
|
112 |
+
self.file = None
|
113 |
+
|
114 |
+
|
115 |
+
# Cache directories
|
116 |
+
# ------------------------------------------------------------------------------------------
|
117 |
+
|
118 |
+
_dnnlib_cache_dir = None
|
119 |
+
|
120 |
+
def set_cache_dir(path: str) -> None:
|
121 |
+
global _dnnlib_cache_dir
|
122 |
+
_dnnlib_cache_dir = path
|
123 |
+
|
124 |
+
def make_cache_dir_path(*paths: str) -> str:
|
125 |
+
if _dnnlib_cache_dir is not None:
|
126 |
+
return os.path.join(_dnnlib_cache_dir, *paths)
|
127 |
+
if 'DNNLIB_CACHE_DIR' in os.environ:
|
128 |
+
return os.path.join(os.environ['DNNLIB_CACHE_DIR'], *paths)
|
129 |
+
if 'HOME' in os.environ:
|
130 |
+
return os.path.join(os.environ['HOME'], '.cache', 'dnnlib', *paths)
|
131 |
+
if 'USERPROFILE' in os.environ:
|
132 |
+
return os.path.join(os.environ['USERPROFILE'], '.cache', 'dnnlib', *paths)
|
133 |
+
return os.path.join(tempfile.gettempdir(), '.cache', 'dnnlib', *paths)
|
134 |
+
|
135 |
+
# Small util functions
|
136 |
+
# ------------------------------------------------------------------------------------------
|
137 |
+
|
138 |
+
|
139 |
+
def format_time(seconds: Union[int, float]) -> str:
|
140 |
+
"""Convert seconds to a human-readable string with days, hours, minutes and seconds."""
|
141 |
+
s = int(np.rint(seconds))
|
142 |
+
|
143 |
+
if s < 60:
|
144 |
+
return "{0}s".format(s)
|
145 |
+
elif s < 60 * 60:
|
146 |
+
return "{0}m {1:02}s".format(s // 60, s % 60)
|
147 |
+
elif s < 24 * 60 * 60:
|
148 |
+
return "{0}h {1:02}m {2:02}s".format(s // (60 * 60), (s // 60) % 60, s % 60)
|
149 |
+
else:
|
150 |
+
return "{0}d {1:02}h {2:02}m".format(s // (24 * 60 * 60), (s // (60 * 60)) % 24, (s // 60) % 60)
|
151 |
+
|
152 |
+
|
153 |
+
def ask_yes_no(question: str) -> bool:
|
154 |
+
"""Ask the user the question until the user inputs a valid answer."""
|
155 |
+
while True:
|
156 |
+
try:
|
157 |
+
print("{0} [y/n]".format(question))
|
158 |
+
return strtobool(input().lower())
|
159 |
+
except ValueError:
|
160 |
+
pass
|
161 |
+
|
162 |
+
|
163 |
+
def tuple_product(t: Tuple) -> Any:
|
164 |
+
"""Calculate the product of the tuple elements."""
|
165 |
+
result = 1
|
166 |
+
|
167 |
+
for v in t:
|
168 |
+
result *= v
|
169 |
+
|
170 |
+
return result
|
171 |
+
|
172 |
+
|
173 |
+
_str_to_ctype = {
|
174 |
+
"uint8": ctypes.c_ubyte,
|
175 |
+
"uint16": ctypes.c_uint16,
|
176 |
+
"uint32": ctypes.c_uint32,
|
177 |
+
"uint64": ctypes.c_uint64,
|
178 |
+
"int8": ctypes.c_byte,
|
179 |
+
"int16": ctypes.c_int16,
|
180 |
+
"int32": ctypes.c_int32,
|
181 |
+
"int64": ctypes.c_int64,
|
182 |
+
"float32": ctypes.c_float,
|
183 |
+
"float64": ctypes.c_double
|
184 |
+
}
|
185 |
+
|
186 |
+
|
187 |
+
def get_dtype_and_ctype(type_obj: Any) -> Tuple[np.dtype, Any]:
|
188 |
+
"""Given a type name string (or an object having a __name__ attribute), return matching Numpy and ctypes types that have the same size in bytes."""
|
189 |
+
type_str = None
|
190 |
+
|
191 |
+
if isinstance(type_obj, str):
|
192 |
+
type_str = type_obj
|
193 |
+
elif hasattr(type_obj, "__name__"):
|
194 |
+
type_str = type_obj.__name__
|
195 |
+
elif hasattr(type_obj, "name"):
|
196 |
+
type_str = type_obj.name
|
197 |
+
else:
|
198 |
+
raise RuntimeError("Cannot infer type name from input")
|
199 |
+
|
200 |
+
assert type_str in _str_to_ctype.keys()
|
201 |
+
|
202 |
+
my_dtype = np.dtype(type_str)
|
203 |
+
my_ctype = _str_to_ctype[type_str]
|
204 |
+
|
205 |
+
assert my_dtype.itemsize == ctypes.sizeof(my_ctype)
|
206 |
+
|
207 |
+
return my_dtype, my_ctype
|
208 |
+
|
209 |
+
|
210 |
+
def is_pickleable(obj: Any) -> bool:
|
211 |
+
try:
|
212 |
+
with io.BytesIO() as stream:
|
213 |
+
pickle.dump(obj, stream)
|
214 |
+
return True
|
215 |
+
except:
|
216 |
+
return False
|
217 |
+
|
218 |
+
|
219 |
+
# Functionality to import modules/objects by name, and call functions by name
|
220 |
+
# ------------------------------------------------------------------------------------------
|
221 |
+
|
222 |
+
def get_module_from_obj_name(obj_name: str) -> Tuple[types.ModuleType, str]:
|
223 |
+
"""Searches for the underlying module behind the name to some python object.
|
224 |
+
Returns the module and the object name (original name with module part removed)."""
|
225 |
+
|
226 |
+
# allow convenience shorthands, substitute them by full names
|
227 |
+
obj_name = re.sub(r"^np\.", "numpy.", obj_name)
|
228 |
+
obj_name = re.sub(r"^tf\.", "tensorflow.", obj_name)
|
229 |
+
|
230 |
+
# list alternatives for (module_name, local_obj_name)
|
231 |
+
parts = obj_name.split(".")
|
232 |
+
name_pairs = [(".".join(parts[:i]), ".".join(parts[i:])) for i in range(len(parts), 0, -1)]
|
233 |
+
|
234 |
+
# try each alternative in turn
|
235 |
+
for module_name, local_obj_name in name_pairs:
|
236 |
+
try:
|
237 |
+
module = importlib.import_module(module_name) # may raise ImportError
|
238 |
+
get_obj_from_module(module, local_obj_name) # may raise AttributeError
|
239 |
+
return module, local_obj_name
|
240 |
+
except:
|
241 |
+
pass
|
242 |
+
|
243 |
+
# maybe some of the modules themselves contain errors?
|
244 |
+
for module_name, _local_obj_name in name_pairs:
|
245 |
+
try:
|
246 |
+
importlib.import_module(module_name) # may raise ImportError
|
247 |
+
except ImportError:
|
248 |
+
if not str(sys.exc_info()[1]).startswith("No module named '" + module_name + "'"):
|
249 |
+
raise
|
250 |
+
|
251 |
+
# maybe the requested attribute is missing?
|
252 |
+
for module_name, local_obj_name in name_pairs:
|
253 |
+
try:
|
254 |
+
module = importlib.import_module(module_name) # may raise ImportError
|
255 |
+
get_obj_from_module(module, local_obj_name) # may raise AttributeError
|
256 |
+
except ImportError:
|
257 |
+
pass
|
258 |
+
|
259 |
+
# we are out of luck, but we have no idea why
|
260 |
+
raise ImportError(obj_name)
|
261 |
+
|
262 |
+
|
263 |
+
def get_obj_from_module(module: types.ModuleType, obj_name: str) -> Any:
|
264 |
+
"""Traverses the object name and returns the last (rightmost) python object."""
|
265 |
+
if obj_name == '':
|
266 |
+
return module
|
267 |
+
obj = module
|
268 |
+
for part in obj_name.split("."):
|
269 |
+
obj = getattr(obj, part)
|
270 |
+
return obj
|
271 |
+
|
272 |
+
|
273 |
+
def get_obj_by_name(name: str) -> Any:
|
274 |
+
"""Finds the python object with the given name."""
|
275 |
+
module, obj_name = get_module_from_obj_name(name)
|
276 |
+
return get_obj_from_module(module, obj_name)
|
277 |
+
|
278 |
+
|
279 |
+
def call_func_by_name(*args, func_name: str = None, **kwargs) -> Any:
|
280 |
+
"""Finds the python object with the given name and calls it as a function."""
|
281 |
+
assert func_name is not None
|
282 |
+
func_obj = get_obj_by_name(func_name)
|
283 |
+
assert callable(func_obj)
|
284 |
+
return func_obj(*args, **kwargs)
|
285 |
+
|
286 |
+
|
287 |
+
def construct_class_by_name(*args, class_name: str = None, **kwargs) -> Any:
|
288 |
+
"""Finds the python class with the given name and constructs it with the given arguments."""
|
289 |
+
return call_func_by_name(*args, func_name=class_name, **kwargs)
|
290 |
+
|
291 |
+
|
292 |
+
def get_module_dir_by_obj_name(obj_name: str) -> str:
|
293 |
+
"""Get the directory path of the module containing the given object name."""
|
294 |
+
module, _ = get_module_from_obj_name(obj_name)
|
295 |
+
return os.path.dirname(inspect.getfile(module))
|
296 |
+
|
297 |
+
|
298 |
+
def is_top_level_function(obj: Any) -> bool:
|
299 |
+
"""Determine whether the given object is a top-level function, i.e., defined at module scope using 'def'."""
|
300 |
+
return callable(obj) and obj.__name__ in sys.modules[obj.__module__].__dict__
|
301 |
+
|
302 |
+
|
303 |
+
def get_top_level_function_name(obj: Any) -> str:
|
304 |
+
"""Return the fully-qualified name of a top-level function."""
|
305 |
+
assert is_top_level_function(obj)
|
306 |
+
module = obj.__module__
|
307 |
+
if module == '__main__':
|
308 |
+
module = os.path.splitext(os.path.basename(sys.modules[module].__file__))[0]
|
309 |
+
return module + "." + obj.__name__
|
310 |
+
|
311 |
+
|
312 |
+
# File system helpers
|
313 |
+
# ------------------------------------------------------------------------------------------
|
314 |
+
|
315 |
+
def list_dir_recursively_with_ignore(dir_path: str, ignores: List[str] = None, add_base_to_relative: bool = False) -> List[Tuple[str, str]]:
|
316 |
+
"""List all files recursively in a given directory while ignoring given file and directory names.
|
317 |
+
Returns list of tuples containing both absolute and relative paths."""
|
318 |
+
assert os.path.isdir(dir_path)
|
319 |
+
base_name = os.path.basename(os.path.normpath(dir_path))
|
320 |
+
|
321 |
+
if ignores is None:
|
322 |
+
ignores = []
|
323 |
+
|
324 |
+
result = []
|
325 |
+
|
326 |
+
for root, dirs, files in os.walk(dir_path, topdown=True):
|
327 |
+
for ignore_ in ignores:
|
328 |
+
dirs_to_remove = [d for d in dirs if fnmatch.fnmatch(d, ignore_)]
|
329 |
+
|
330 |
+
# dirs need to be edited in-place
|
331 |
+
for d in dirs_to_remove:
|
332 |
+
dirs.remove(d)
|
333 |
+
|
334 |
+
files = [f for f in files if not fnmatch.fnmatch(f, ignore_)]
|
335 |
+
|
336 |
+
absolute_paths = [os.path.join(root, f) for f in files]
|
337 |
+
relative_paths = [os.path.relpath(p, dir_path) for p in absolute_paths]
|
338 |
+
|
339 |
+
if add_base_to_relative:
|
340 |
+
relative_paths = [os.path.join(base_name, p) for p in relative_paths]
|
341 |
+
|
342 |
+
assert len(absolute_paths) == len(relative_paths)
|
343 |
+
result += zip(absolute_paths, relative_paths)
|
344 |
+
|
345 |
+
return result
|
346 |
+
|
347 |
+
|
348 |
+
def copy_files_and_create_dirs(files: List[Tuple[str, str]]) -> None:
|
349 |
+
"""Takes in a list of tuples of (src, dst) paths and copies files.
|
350 |
+
Will create all necessary directories."""
|
351 |
+
for file in files:
|
352 |
+
target_dir_name = os.path.dirname(file[1])
|
353 |
+
|
354 |
+
# will create all intermediate-level directories
|
355 |
+
if not os.path.exists(target_dir_name):
|
356 |
+
os.makedirs(target_dir_name)
|
357 |
+
|
358 |
+
shutil.copyfile(file[0], file[1])
|
359 |
+
|
360 |
+
|
361 |
+
# URL helpers
|
362 |
+
# ------------------------------------------------------------------------------------------
|
363 |
+
|
364 |
+
def is_url(obj: Any, allow_file_urls: bool = False) -> bool:
|
365 |
+
"""Determine whether the given object is a valid URL string."""
|
366 |
+
if not isinstance(obj, str) or not "://" in obj:
|
367 |
+
return False
|
368 |
+
if allow_file_urls and obj.startswith('file://'):
|
369 |
+
return True
|
370 |
+
try:
|
371 |
+
res = requests.compat.urlparse(obj)
|
372 |
+
if not res.scheme or not res.netloc or not "." in res.netloc:
|
373 |
+
return False
|
374 |
+
res = requests.compat.urlparse(requests.compat.urljoin(obj, "/"))
|
375 |
+
if not res.scheme or not res.netloc or not "." in res.netloc:
|
376 |
+
return False
|
377 |
+
except:
|
378 |
+
return False
|
379 |
+
return True
|
380 |
+
|
381 |
+
|
382 |
+
def open_url(url: str, cache_dir: str = None, num_attempts: int = 10, verbose: bool = True, return_filename: bool = False, cache: bool = True) -> Any:
|
383 |
+
"""Download the given URL and return a binary-mode file object to access the data."""
|
384 |
+
assert num_attempts >= 1
|
385 |
+
assert not (return_filename and (not cache))
|
386 |
+
|
387 |
+
# Doesn't look like a URL scheme so interpret it as a local filename.
|
388 |
+
if not re.match('^[a-z]+://', url):
|
389 |
+
return url if return_filename else open(url, "rb")
|
390 |
+
|
391 |
+
# Handle file URLs. This code handles unusual file:// patterns that
|
392 |
+
# arise on Windows:
|
393 |
+
#
|
394 |
+
# file:///c:/foo.txt
|
395 |
+
#
|
396 |
+
# which would translate to a local '/c:/foo.txt' filename that's
|
397 |
+
# invalid. Drop the forward slash for such pathnames.
|
398 |
+
#
|
399 |
+
# If you touch this code path, you should test it on both Linux and
|
400 |
+
# Windows.
|
401 |
+
#
|
402 |
+
# Some internet resources suggest using urllib.request.url2pathname() but
|
403 |
+
# that converts forward slashes to backslashes and this causes
|
404 |
+
# its own set of problems.
|
405 |
+
if url.startswith('file://'):
|
406 |
+
filename = urllib.parse.urlparse(url).path
|
407 |
+
if re.match(r'^/[a-zA-Z]:', filename):
|
408 |
+
filename = filename[1:]
|
409 |
+
return filename if return_filename else open(filename, "rb")
|
410 |
+
|
411 |
+
assert is_url(url)
|
412 |
+
|
413 |
+
# Lookup from cache.
|
414 |
+
if cache_dir is None:
|
415 |
+
cache_dir = make_cache_dir_path('downloads')
|
416 |
+
|
417 |
+
url_md5 = hashlib.md5(url.encode("utf-8")).hexdigest()
|
418 |
+
if cache:
|
419 |
+
cache_files = glob.glob(os.path.join(cache_dir, url_md5 + "_*"))
|
420 |
+
if len(cache_files) == 1:
|
421 |
+
filename = cache_files[0]
|
422 |
+
return filename if return_filename else open(filename, "rb")
|
423 |
+
|
424 |
+
# Download.
|
425 |
+
url_name = None
|
426 |
+
url_data = None
|
427 |
+
with requests.Session() as session:
|
428 |
+
if verbose:
|
429 |
+
print("Downloading %s ..." % url, end="", flush=True)
|
430 |
+
for attempts_left in reversed(range(num_attempts)):
|
431 |
+
try:
|
432 |
+
with session.get(url) as res:
|
433 |
+
res.raise_for_status()
|
434 |
+
if len(res.content) == 0:
|
435 |
+
raise IOError("No data received")
|
436 |
+
|
437 |
+
if len(res.content) < 8192:
|
438 |
+
content_str = res.content.decode("utf-8")
|
439 |
+
if "download_warning" in res.headers.get("Set-Cookie", ""):
|
440 |
+
links = [html.unescape(link) for link in content_str.split('"') if "export=download" in link]
|
441 |
+
if len(links) == 1:
|
442 |
+
url = requests.compat.urljoin(url, links[0])
|
443 |
+
raise IOError("Google Drive virus checker nag")
|
444 |
+
if "Google Drive - Quota exceeded" in content_str:
|
445 |
+
raise IOError("Google Drive download quota exceeded -- please try again later")
|
446 |
+
|
447 |
+
match = re.search(r'filename="([^"]*)"', res.headers.get("Content-Disposition", ""))
|
448 |
+
url_name = match[1] if match else url
|
449 |
+
url_data = res.content
|
450 |
+
if verbose:
|
451 |
+
print(" done")
|
452 |
+
break
|
453 |
+
except KeyboardInterrupt:
|
454 |
+
raise
|
455 |
+
except:
|
456 |
+
if not attempts_left:
|
457 |
+
if verbose:
|
458 |
+
print(" failed")
|
459 |
+
raise
|
460 |
+
if verbose:
|
461 |
+
print(".", end="", flush=True)
|
462 |
+
|
463 |
+
# Save to cache.
|
464 |
+
if cache:
|
465 |
+
safe_name = re.sub(r"[^0-9a-zA-Z-._]", "_", url_name)
|
466 |
+
cache_file = os.path.join(cache_dir, url_md5 + "_" + safe_name)
|
467 |
+
temp_file = os.path.join(cache_dir, "tmp_" + uuid.uuid4().hex + "_" + url_md5 + "_" + safe_name)
|
468 |
+
os.makedirs(cache_dir, exist_ok=True)
|
469 |
+
with open(temp_file, "wb") as f:
|
470 |
+
f.write(url_data)
|
471 |
+
os.replace(temp_file, cache_file) # atomic
|
472 |
+
if return_filename:
|
473 |
+
return cache_file
|
474 |
+
|
475 |
+
# Return data as file object.
|
476 |
+
assert not return_filename
|
477 |
+
return io.BytesIO(url_data)
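The "Save to cache" step above writes the download to a uniquely named temporary file and then renames it into place, so a concurrent reader never sees a half-written cache entry. A minimal standalone sketch of that pattern follows; the helper name `cache_bytes` and its parameters are hypothetical and are not part of `dnnlib.util`.

```python
import hashlib
import os
import tempfile
import uuid


def cache_bytes(cache_dir: str, url: str, name: str, data: bytes) -> str:
    """Write `data` into `cache_dir` atomically and return the final path.

    Hypothetical helper for illustration only; it mirrors the temp-file plus
    os.replace() idea used by the cache code above.
    """
    # Key the cache entry on a hash of the URL, as the code above does.
    url_md5 = hashlib.md5(url.encode("utf-8")).hexdigest()
    cache_file = os.path.join(cache_dir, url_md5 + "_" + name)
    # A unique temp name avoids collisions between concurrent writers.
    temp_file = os.path.join(cache_dir, "tmp_" + uuid.uuid4().hex + "_" + name)
    os.makedirs(cache_dir, exist_ok=True)
    with open(temp_file, "wb") as f:
        f.write(data)
    # Renaming within one directory replaces the target in a single step.
    os.replace(temp_file, cache_file)
    return cache_file


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = cache_bytes(d, "https://example.com/model.pkl", "model.pkl", b"abc")
        print(os.path.basename(path))
```

Because the temporary file lives in the same directory as the final file, the rename stays on one filesystem, which is the case where `os.replace` can be expected to behave atomically.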
|
docker_run.sh
ADDED
@@ -0,0 +1,38 @@
1 |
+
#!/bin/bash
|
2 |
+
|
3 |
+
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
|
4 |
+
#
|
5 |
+
# NVIDIA CORPORATION and its licensors retain all intellectual property
|
6 |
+
# and proprietary rights in and to this software, related documentation
|
7 |
+
# and any modifications thereto. Any use, reproduction, disclosure or
|
8 |
+
# distribution of this software and related documentation without an express
|
9 |
+
# license agreement from NVIDIA CORPORATION is strictly prohibited.
|
10 |
+
|
11 |
+
set -e
|
12 |
+
|
13 |
+
# Wrapper script for setting up `docker run` to properly
|
14 |
+
# cache downloaded files, custom extension builds and
|
15 |
+
# mount the source directory into the container and make it
|
16 |
+
# run as non-root user.
|
17 |
+
#
|
18 |
+
# Use it like:
|
19 |
+
#
|
20 |
+
# ./docker_run.sh python generate.py --help
|
21 |
+
#
|
22 |
+
# To override the default `sg2ada:latest` image, run:
|
23 |
+
#
|
24 |
+
# IMAGE=my_image:v1.0 ./docker_run.sh python generate.py --help
|
25 |
+
#
|
26 |
+
|
27 |
+
rest=$@
|
28 |
+
|
29 |
+
IMAGE="${IMAGE:-sg2ada:latest}"
|
30 |
+
|
31 |
+
CONTAINER_ID=$(docker inspect --format="{{.Id}}" "${IMAGE}" 2> /dev/null || true)
|
32 |
+
if [[ "${CONTAINER_ID}" ]]; then
|
33 |
+
docker run --shm-size=2g --gpus all -it --rm -v `pwd`:/scratch --user $(id -u):$(id -g) \
|
34 |
+
--workdir=/scratch -e HOME=/scratch "$IMAGE" "$@"
|
35 |
+
else
|
36 |
+
echo "Unknown container image: ${IMAGE}"
|
37 |
+
exit 1
|
38 |
+
fi
|
docs/dataset-tool-help.txt
ADDED
@@ -0,0 +1,50 @@
1 |
+
Usage: dataset_tool.py [OPTIONS]
|
2 |
+
|
3 |
+
Convert an image dataset into a dataset archive usable with StyleGAN2 ADA
|
4 |
+
PyTorch.
|
5 |
+
|
6 |
+
The input dataset format is guessed from the --source argument:
|
7 |
+
|
8 |
+
--source *_lmdb/ - Load LSUN dataset
|
9 |
+
--source cifar-10-python.tar.gz - Load CIFAR-10 dataset
|
10 |
+
--source path/ - Recursively load all images from path/
|
11 |
+
--source dataset.zip - Recursively load all images from dataset.zip
|
12 |
+
|
13 |
+
The output dataset format can be either an image folder or a zip archive.
|
14 |
+
Specifying the output format and path:
|
15 |
+
|
16 |
+
--dest /path/to/dir - Save output files under /path/to/dir
|
17 |
+
--dest /path/to/dataset.zip - Save output files into /path/to/dataset.zip archive
|
18 |
+
|
19 |
+
Images within the dataset archive will be stored as uncompressed PNG.
|
20 |
+
|
21 |
+
Image scale/crop and resolution requirements:
|
22 |
+
|
23 |
+
Output images must be square-shaped and they must all have the same power-
|
24 |
+
of-two dimensions.
|
25 |
+
|
26 |
+
To scale arbitrary input image size to a specific width and height, use
|
27 |
+
the --width and --height options. Output resolution will be either the
|
28 |
+
original input resolution (if --width/--height was not specified) or the
|
29 |
+
one specified with --width/--height.
|
30 |
+
|
31 |
+
Use the --transform=center-crop or --transform=center-crop-wide options to
|
32 |
+
apply a center crop transform on the input image. These options should be
|
33 |
+
used with the --width and --height options. For example:
|
34 |
+
|
35 |
+
python dataset_tool.py --source LSUN/raw/cat_lmdb --dest /tmp/lsun_cat \
|
36 |
+
--transform=center-crop-wide --width 512 --height=384
|
37 |
+
|
38 |
+
Options:
|
39 |
+
--source PATH Directory or archive name for input dataset
|
40 |
+
[required]
|
41 |
+
--dest PATH Output directory or archive name for output
|
42 |
+
dataset [required]
|
43 |
+
--max-images INTEGER Output only up to `max-images` images
|
44 |
+
--resize-filter [box|lanczos] Filter to use when resizing images for
|
45 |
+
output resolution [default: lanczos]
|
46 |
+
--transform [center-crop|center-crop-wide]
|
47 |
+
Input crop/resize mode
|
48 |
+
--width INTEGER Output width
|
49 |
+
--height INTEGER Output height
|
50 |
+
--help Show this message and exit.
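The help text above requires square output images with power-of-two dimensions and mentions a center-crop transform. The sketch below illustrates those two ideas in plain Python. The function names `is_power_of_two` and `center_crop_box` are hypothetical and are not taken from `dataset_tool.py`; the box is the largest centered square, which is an assumption about what "center crop" means here.

```python
from typing import Tuple


def is_power_of_two(n: int) -> bool:
    """Return True when n is a positive power of two (1, 2, 4, 8, ...)."""
    return n > 0 and (n & (n - 1)) == 0


def center_crop_box(width: int, height: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


if __name__ == "__main__":
    print(center_crop_box(640, 480))  # (80, 0, 560, 480)
    print(is_power_of_two(512), is_power_of_two(384))  # True False
```

A box in this form can be passed to an image library's crop routine before resizing to the target resolution.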
|
docs/license.html
ADDED
@@ -0,0 +1,153 @@
1 |
+
<!DOCTYPE html>
|
2 |
+
<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
|
3 |
+
<head>
|
4 |
+
<meta charset="utf-8"/>
|
5 |
+
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes"/>
|
6 |
+
<title>Nvidia Source Code License-NC</title>
|
7 |
+
<link href="https://fonts.googleapis.com/css?family=Helvetica+Neue" rel="stylesheet"/>
|
8 |
+
<style type="text/css">
|
9 |
+
|
10 |
+
body {
|
11 |
+
font-family: 'Helvetica Neue', sans-serif;
|
12 |
+
color: #000000;
|
13 |
+
line-height: 1.5;
|
14 |
+
}
|
15 |
+
|
16 |
+
h1, h2, h3, h4, h5, h6 {
|
17 |
+
color: #92D050;
|
18 |
+
font-weight: normal;
|
19 |
+
}
|
20 |
+
|
21 |
+
h1 {
|
22 |
+
line-height: 1.2;
|
23 |
+
font-size: 2em;
|
24 |
+
margin-top: 1.5em;
|
25 |
+
}
|
26 |
+
|
27 |
+
p {
|
28 |
+
margin-left: 0px;
|
29 |
+
margin-right: 0px;
|
30 |
+
margin-top: 0.75em;
|
31 |
+
margin-bottom: 0.75em;
|
32 |
+
}
|
33 |
+
|
34 |
+
p.tab {
|
35 |
+
margin-left: 3em;
|
36 |
+
}
|
37 |
+
|
38 |
+
hr {
|
39 |
+
border: 0px;
|
40 |
+
height: 1px;
|
41 |
+
background: #CCCCCC;
|
42 |
+
}
|
43 |
+
|
44 |
+
@media screen and (min-width: 680px) {
|
45 |
+
.max-width {
|
46 |
+
margin: 0 100px 0 170px;
|
47 |
+
max-width: 640px;
|
48 |
+
}
|
49 |
+
}
|
50 |
+
@media screen and (min-width: 980px) {
|
51 |
+
.max-width {
|
52 |
+
margin: 0 auto;
|
53 |
+
}
|
54 |
+
}
|
55 |
+
</style>
|
56 |
+
</head>
|
57 |
+
<body class="max-width">
|
58 |
+
|
59 |
+
<h1>NVIDIA Source Code License for StyleGAN2 with Adaptive Discriminator Augmentation (ADA)</h1>
|
60 |
+
|
61 |
+
<hr/>
|
62 |
+
|
63 |
+
<h2>1. Definitions</h2>
|
64 |
+
|
65 |
+
<p>“Licensor” means any person or entity that distributes its Work.</p>
|
66 |
+
|
67 |
+
<p>“Software” means the original work of authorship made available under
|
68 |
+
this License.</p>
|
69 |
+
|
70 |
+
<p>“Work” means the Software and any additions to or derivative works of
|
71 |
+
the Software that are made available under this License.</p>
|
72 |
+
|
73 |
+
<p>The terms “reproduce,” “reproduction,” “derivative works,” and
|
74 |
+
“distribution” have the meaning as provided under U.S. copyright law;
|
75 |
+
provided, however, that for the purposes of this License, derivative
|
76 |
+
works shall not include works that remain separable from, or merely
|
77 |
+
link (or bind by name) to the interfaces of, the Work.</p>
|
78 |
+
|
79 |
+
<p>Works, including the Software, are “made available” under this License
|
80 |
+
by including in or with the Work either (a) a copyright notice
|
81 |
+
referencing the applicability of this License to the Work, or (b) a
|
82 |
+
copy of this License.</p>
|
83 |
+
|
84 |
+
<h2>2. License Grants</h2>
|
85 |
+
|
86 |
+
<p class="tab">2.1 Copyright Grant. Subject to the terms and conditions of this
|
87 |
+
License, each Licensor grants to you a perpetual, worldwide,
|
88 |
+
non-exclusive, royalty-free, copyright license to reproduce,
|
89 |
+
prepare derivative works of, publicly display, publicly perform,
|
90 |
+
sublicense and distribute its Work and any resulting derivative
|
91 |
+
works in any form.</p>
|
92 |
+
|
93 |
+
<h2>3. Limitations</h2>
|
94 |
+
|
95 |
+
<p class="tab">3.1 Redistribution. You may reproduce or distribute the Work only
|
96 |
+
if (a) you do so under this License, (b) you include a complete
|
97 |
+
copy of this License with your distribution, and (c) you retain
|
98 |
+
without modification any copyright, patent, trademark, or
|
99 |
+
attribution notices that are present in the Work.</p>
|
100 |
+
|
101 |
+
<p class="tab">3.2 Derivative Works. You may specify that additional or different
|
102 |
+
terms apply to the use, reproduction, and distribution of your
|
103 |
+
derivative works of the Work (“Your Terms”) only if (a) Your Terms
|
104 |
+
provide that the use limitation in Section 3.3 applies to your
|
105 |
+
derivative works, and (b) you identify the specific derivative
|
106 |
+
works that are subject to Your Terms. Notwithstanding Your Terms,
|
107 |
+
this License (including the redistribution requirements in Section
|
108 |
+
3.1) will continue to apply to the Work itself.</p>
|
109 |
+
|
110 |
+
<p class="tab">3.3 Use Limitation. The Work and any derivative works thereof only may be used or intended for
|
111 |
+
use non-commercially. Notwithstanding the foregoing, NVIDIA and its affiliates may use the Work
|
112 |
+
and any derivative works commercially. As used herein, “non-commercially” means for research or
|
113 |
+
evaluation purposes only.</p>
|
114 |
+
|
115 |
+
<p class="tab">3.4 Patent Claims. If you bring or threaten to bring a patent claim
|
116 |
+
against any Licensor (including any claim, cross-claim or
|
117 |
+
counterclaim in a lawsuit) to enforce any patents that you allege
|
118 |
+
are infringed by any Work, then your rights under this License from
|
119 |
+
such Licensor (including the grant in Section 2.1) will terminate immediately.</p>
|
120 |
+
|
121 |
+
<p class="tab">3.5 Trademarks. This License does not grant any rights to use any
|
122 |
+
Licensor’s or its affiliates’ names, logos, or trademarks, except
|
123 |
+
as necessary to reproduce the notices described in this License.</p>
|
124 |
+
|
125 |
+
<p class="tab">3.6 Termination. If you violate any term of this License, then your
|
126 |
+
rights under this License (including the grant in Section 2.1)
|
127 |
+
will terminate immediately.</p>
|
128 |
+
|
129 |
+
<h2>4. Disclaimer of Warranty.</h2>
|
130 |
+
|
131 |
+
<p>THE WORK IS PROVIDED “AS IS” WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
132 |
+
KIND, EITHER EXPRESS OR IMPLIED, INCLUDING WARRANTIES OR CONDITIONS OF
|
133 |
+
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE OR
|
134 |
+
NON-INFRINGEMENT. YOU BEAR THE RISK OF UNDERTAKING ANY ACTIVITIES UNDER
|
135 |
+
THIS LICENSE.</p>
|
136 |
+
|
137 |
+
<h2>5. Limitation of Liability.</h2>
|
138 |
+
|
139 |
+
<p>EXCEPT AS PROHIBITED BY APPLICABLE LAW, IN NO EVENT AND UNDER NO LEGAL
|
140 |
+
THEORY, WHETHER IN TORT (INCLUDING NEGLIGENCE), CONTRACT, OR OTHERWISE
|
141 |
+
SHALL ANY LICENSOR BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY DIRECT,
|
142 |
+
INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF
|
143 |
+
OR RELATED TO THIS LICENSE, THE USE OR INABILITY TO USE THE WORK
|
144 |
+
(INCLUDING BUT NOT LIMITED TO LOSS OF GOODWILL, BUSINESS INTERRUPTION,
|
145 |
+
LOST PROFITS OR DATA, COMPUTER FAILURE OR MALFUNCTION, OR ANY OTHER
|
146 |
+
COMMERCIAL DAMAGES OR LOSSES), EVEN IF THE LICENSOR HAS BEEN ADVISED OF
|
147 |
+
THE POSSIBILITY OF SUCH DAMAGES.</p>
|
148 |
+
|
149 |
+
<hr/>
|
150 |
+
<br/>
|
151 |
+
|
152 |
+
</body>
|
153 |
+
</html>
|
docs/stylegan2-ada-teaser-1024x252.png
ADDED
docs/stylegan2-ada-training-curves.png
ADDED
docs/train-help.txt
ADDED
@@ -0,0 +1,70 @@
1 |
+
Usage: train.py [OPTIONS]
|
2 |
+
|
3 |
+
Train a GAN using the techniques described in the paper "Training
|
4 |
+
Generative Adversarial Networks with Limited Data".
|
5 |
+
|
6 |
+
Examples:
|
7 |
+
|
8 |
+
# Train with custom images using 1 GPU.
|
9 |
+
python train.py --outdir=~/training-runs --data=~/my-image-folder
|
10 |
+
|
11 |
+
# Train class-conditional CIFAR-10 using 2 GPUs.
|
12 |
+
python train.py --outdir=~/training-runs --data=~/datasets/cifar10.zip \
|
13 |
+
--gpus=2 --cfg=cifar --cond=1
|
14 |
+
|
15 |
+
# Transfer learn MetFaces from FFHQ using 4 GPUs.
|
16 |
+
python train.py --outdir=~/training-runs --data=~/datasets/metfaces.zip \
|
17 |
+
--gpus=4 --cfg=paper1024 --mirror=1 --resume=ffhq1024 --snap=10
|
18 |
+
|
19 |
+
# Reproduce original StyleGAN2 config F.
|
20 |
+
python train.py --outdir=~/training-runs --data=~/datasets/ffhq.zip \
|
21 |
+
--gpus=8 --cfg=stylegan2 --mirror=1 --aug=noaug
|
22 |
+
|
23 |
+
Base configs (--cfg):
|
24 |
+
auto Automatically select reasonable defaults based on resolution
|
25 |
+
and GPU count. Good starting point for new datasets.
|
26 |
+
stylegan2 Reproduce results for StyleGAN2 config F at 1024x1024.
|
27 |
+
paper256 Reproduce results for FFHQ and LSUN Cat at 256x256.
|
28 |
+
paper512 Reproduce results for BreCaHAD and AFHQ at 512x512.
|
29 |
+
paper1024 Reproduce results for MetFaces at 1024x1024.
|
30 |
+
cifar Reproduce results for CIFAR-10 at 32x32.
|
31 |
+
|
32 |
+
Transfer learning source networks (--resume):
|
33 |
+
ffhq256 FFHQ trained at 256x256 resolution.
|
34 |
+
ffhq512 FFHQ trained at 512x512 resolution.
|
35 |
+
ffhq1024 FFHQ trained at 1024x1024 resolution.
|
36 |
+
celebahq256 CelebA-HQ trained at 256x256 resolution.
|
37 |
+
lsundog256 LSUN Dog trained at 256x256 resolution.
|
38 |
+
<PATH or URL> Custom network pickle.
|
39 |
+
|
40 |
+
Options:
|
41 |
+
--outdir DIR Where to save the results [required]
|
42 |
+
--gpus INT Number of GPUs to use [default: 1]
|
43 |
+
--snap INT Snapshot interval [default: 50 ticks]
|
44 |
+
--metrics LIST Comma-separated list or "none" [default:
|
45 |
+
fid50k_full]
|
46 |
+
--seed INT Random seed [default: 0]
|
47 |
+
-n, --dry-run Print training options and exit
|
48 |
+
--data PATH Training data (directory or zip) [required]
|
49 |
+
--cond BOOL Train conditional model based on dataset
|
50 |
+
labels [default: false]
|
51 |
+
--subset INT Train with only N images [default: all]
|
52 |
+
--mirror BOOL Enable dataset x-flips [default: false]
|
53 |
+
--cfg [auto|stylegan2|paper256|paper512|paper1024|cifar]
|
54 |
+
Base config [default: auto]
|
55 |
+
--gamma FLOAT Override R1 gamma
|
56 |
+
--kimg INT Override training duration
|
57 |
+
--batch INT Override batch size
|
58 |
+
--aug [noaug|ada|fixed] Augmentation mode [default: ada]
|
59 |
+
--p FLOAT Augmentation probability for --aug=fixed
|
60 |
+
--target FLOAT ADA target value for --aug=ada
|
61 |
+
--augpipe [blit|geom|color|filter|noise|cutout|bg|bgc|bgcf|bgcfn|bgcfnc]
|
62 |
+
Augmentation pipeline [default: bgc]
|
63 |
+
--resume PKL Resume training [default: noresume]
|
64 |
+
--freezed INT Freeze-D [default: 0 layers]
|
65 |
+
--fp32 BOOL Disable mixed-precision training
|
66 |
+
--nhwc BOOL Use NHWC memory format with FP16
|
67 |
+
--nobench BOOL Disable cuDNN benchmarking
|
68 |
+
--allow-tf32 BOOL Allow PyTorch to use TF32 internally
|
69 |
+
--workers INT Override number of DataLoader workers
|
70 |
+
--help Show this message and exit.
|
ffhq.pkl
ADDED
@@ -0,0 +1,3 @@
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:a205a346e86a9ddaae702e118097d014b7b8bd719491396a162cca438f2f524c
|
3 |
+
size 381624121
|
ffhq.pkl.1
ADDED
@@ -0,0 +1,3 @@
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:a205a346e86a9ddaae702e118097d014b7b8bd719491396a162cca438f2f524c
|
3 |
+
size 381624121
|
fine_tuned_stylegan.pth
ADDED
@@ -0,0 +1,3 @@
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:d430dbbbc9213fcf9919a17ffb8f9f78c370857d6bf9434f0797fba56d124e4a
|
3 |
+
size 132732094
|
generate.py
ADDED
@@ -0,0 +1,129 @@
1 |
+
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
|
2 |
+
#
|
3 |
+
# NVIDIA CORPORATION and its licensors retain all intellectual property
|
4 |
+
# and proprietary rights in and to this software, related documentation
|
5 |
+
# and any modifications thereto. Any use, reproduction, disclosure or
|
6 |
+
# distribution of this software and related documentation without an express
|
7 |
+
# license agreement from NVIDIA CORPORATION is strictly prohibited.
|
8 |
+
|
9 |
+
"""Generate images using pretrained network pickle."""
|
10 |
+
|
11 |
+
import os
|
12 |
+
import re
|
13 |
+
from typing import List, Optional
|
14 |
+
|
15 |
+
import click
|
16 |
+
import dnnlib
|
17 |
+
import numpy as np
|
18 |
+
import PIL.Image
|
19 |
+
import torch
|
20 |
+
|
21 |
+
import legacy
|
22 |
+
|
23 |
+
#----------------------------------------------------------------------------
|
24 |
+
|
25 |
+
def num_range(s: str) -> List[int]:
|
26 |
+
'''Accept either a comma separated list of numbers 'a,b,c' or a range 'a-c' and return as a list of ints.'''
|
27 |
+
|
28 |
+
range_re = re.compile(r'^(\d+)-(\d+)$')
|
29 |
+
m = range_re.match(s)
|
30 |
+
if m:
|
31 |
+
return list(range(int(m.group(1)), int(m.group(2))+1))
|
32 |
+
vals = s.split(',')
|
33 |
+
return [int(x) for x in vals]
|
34 |
+
|
35 |
+
#----------------------------------------------------------------------------
|
36 |
+
|
37 |
+
@click.command()
|
38 |
+
@click.pass_context
|
39 |
+
@click.option('--network', 'network_pkl', help='Network pickle filename', required=True)
|
40 |
+
@click.option('--seeds', type=num_range, help='List of random seeds')
|
41 |
+
@click.option('--trunc', 'truncation_psi', type=float, help='Truncation psi', default=1, show_default=True)
|
42 |
+
@click.option('--class', 'class_idx', type=int, help='Class label (unconditional if not specified)')
|
43 |
+
@click.option('--noise-mode', help='Noise mode', type=click.Choice(['const', 'random', 'none']), default='const', show_default=True)
|
44 |
+
@click.option('--projected-w', help='Projection result file', type=str, metavar='FILE')
|
45 |
+
@click.option('--outdir', help='Where to save the output images', type=str, required=True, metavar='DIR')
|
46 |
+
def generate_images(
|
47 |
+
ctx: click.Context,
|
48 |
+
network_pkl: str,
|
49 |
+
seeds: Optional[List[int]],
|
50 |
+
truncation_psi: float,
|
51 |
+
noise_mode: str,
|
52 |
+
outdir: str,
|
53 |
+
class_idx: Optional[int],
|
54 |
+
projected_w: Optional[str]
|
55 |
+
):
|
56 |
+
"""Generate images using pretrained network pickle.
|
57 |
+
|
58 |
+
Examples:
|
59 |
+
|
60 |
+
\b
|
61 |
+
# Generate curated MetFaces images without truncation (Fig.10 left)
|
62 |
+
python generate.py --outdir=out --trunc=1 --seeds=85,265,297,849 \\
|
63 |
+
--network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/metfaces.pkl
|
64 |
+
|
65 |
+
\b
|
66 |
+
# Generate uncurated MetFaces images with truncation (Fig.12 upper left)
|
67 |
+
python generate.py --outdir=out --trunc=0.7 --seeds=600-605 \\
|
68 |
+
--network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/metfaces.pkl
|
69 |
+
|
70 |
+
\b
|
71 |
+
# Generate class conditional CIFAR-10 images (Fig.17 left, Car)
|
72 |
+
python generate.py --outdir=out --seeds=0-35 --class=1 \\
|
73 |
+
--network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/cifar10.pkl
|
74 |
+
|
75 |
+
\b
|
76 |
+
# Render an image from projected W
|
77 |
+
python generate.py --outdir=out --projected-w=projected_w.npz \\
|
78 |
+
--network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/metfaces.pkl
|
79 |
+
"""
|
80 |
+
|
81 |
+
print('Loading networks from "%s"...' % network_pkl)
|
82 |
+
device = torch.device('cuda')
|
83 |
+
with dnnlib.util.open_url(network_pkl) as f:
|
84 |
+
G = legacy.load_network_pkl(f)['G_ema'].to(device) # type: ignore
|
85 |
+
|
86 |
+
os.makedirs(outdir, exist_ok=True)
|
87 |
+
|
88 |
+
# Synthesize the result of a W projection.
|
89 |
+
if projected_w is not None:
|
90 |
+
if seeds is not None:
|
91 |
+
print ('warn: --seeds is ignored when using --projected-w')
|
92 |
+
print(f'Generating images from projected W "{projected_w}"')
|
93 |
+
ws = np.load(projected_w)['w']
|
94 |
+
ws = torch.tensor(ws, device=device) # pylint: disable=not-callable
|
95 |
+
assert ws.shape[1:] == (G.num_ws, G.w_dim)
|
96 |
+
for idx, w in enumerate(ws):
|
97 |
+
img = G.synthesis(w.unsqueeze(0), noise_mode=noise_mode)
|
98 |
+
img = (img.permute(0, 2, 3, 1) * 127.5 + 128).clamp(0, 255).to(torch.uint8)
|
99 |
+
PIL.Image.fromarray(img[0].cpu().numpy(), 'RGB').save(f'{outdir}/proj{idx:02d}.png')
|
100 |
+
return
|
101 |
+
|
102 |
+
if seeds is None:
|
103 |
+
ctx.fail('--seeds option is required when not using --projected-w')
|
104 |
+
|
105 |
+
# Labels.
|
106 |
+
label = torch.zeros([1, G.c_dim], device=device)
|
107 |
+
if G.c_dim != 0:
|
108 |
+
if class_idx is None:
|
109 |
+
ctx.fail('Must specify class label with --class when using a conditional network')
|
110 |
+
label[:, class_idx] = 1
|
111 |
+
else:
|
112 |
+
if class_idx is not None:
|
113 |
+
print ('warn: --class=lbl ignored when running on an unconditional network')
|
114 |
+
|
115 |
+
# Generate images.
|
116 |
+
for seed_idx, seed in enumerate(seeds):
|
117 |
+
print('Generating image for seed %d (%d/%d) ...' % (seed, seed_idx, len(seeds)))
|
118 |
+
z = torch.from_numpy(np.random.RandomState(seed).randn(1, G.z_dim)).to(device)
|
119 |
+
img = G(z, label, truncation_psi=truncation_psi, noise_mode=noise_mode)
|
120 |
+
img = (img.permute(0, 2, 3, 1) * 127.5 + 128).clamp(0, 255).to(torch.uint8)
|
121 |
+
PIL.Image.fromarray(img[0].cpu().numpy(), 'RGB').save(f'{outdir}/seed{seed:04d}.png')
|
122 |
+
|
123 |
+
|
124 |
+
#----------------------------------------------------------------------------
|
125 |
+
|
126 |
+
if __name__ == "__main__":
|
127 |
+
generate_images() # pylint: disable=no-value-for-parameter
|
128 |
+
|
129 |
+
#----------------------------------------------------------------------------
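In `generate_images` above, generator outputs are converted to 8-bit pixels with `img * 127.5 + 128`, a clamp to [0, 255], and a cast to `uint8`. The scalar sketch below walks through that arithmetic for a single value. The helper name `to_uint8` is hypothetical, and the sketch assumes the cast truncates toward zero, as an integer cast of a non-negative float does.

```python
def to_uint8(x: float) -> int:
    """Map a generator output value (nominally in [-1, 1]) to an 8-bit pixel."""
    y = x * 127.5 + 128.0        # scale and shift: -1 -> 0.5, 0 -> 128, 1 -> 255.5
    y = max(0.0, min(255.0, y))  # clamp to the valid pixel range
    return int(y)                # truncate toward zero (assumed cast behavior)


if __name__ == "__main__":
    for value in (-1.0, 0.0, 1.0, 2.0):
        print(value, to_uint8(value))
```

The clamp matters because generator outputs are not strictly bounded; values beyond the nominal range would otherwise wrap around when cast to 8 bits.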
|
legacy.py
ADDED
@@ -0,0 +1,320 @@
1 |
+
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
|
2 |
+
#
|
3 |
+
# NVIDIA CORPORATION and its licensors retain all intellectual property
|
4 |
+
# and proprietary rights in and to this software, related documentation
|
5 |
+
# and any modifications thereto. Any use, reproduction, disclosure or
|
6 |
+
# distribution of this software and related documentation without an express
|
7 |
+
# license agreement from NVIDIA CORPORATION is strictly prohibited.
|
8 |
+
|
9 |
+
import click
|
10 |
+
import pickle
|
11 |
+
import re
|
12 |
+
import copy
|
13 |
+
import numpy as np
|
14 |
+
import torch
|
15 |
+
import dnnlib
|
16 |
+
from torch_utils import misc
|
17 |
+
|
18 |
+
#----------------------------------------------------------------------------
|
19 |
+
|
20 |
+
def load_network_pkl(f, force_fp16=False):
|
21 |
+
data = _LegacyUnpickler(f).load()
|
22 |
+
|
23 |
+
# Legacy TensorFlow pickle => convert.
|
24 |
+
if isinstance(data, tuple) and len(data) == 3 and all(isinstance(net, _TFNetworkStub) for net in data):
|
25 |
+
tf_G, tf_D, tf_Gs = data
|
26 |
+
G = convert_tf_generator(tf_G)
|
27 |
+
D = convert_tf_discriminator(tf_D)
|
28 |
+
G_ema = convert_tf_generator(tf_Gs)
|
29 |
+
data = dict(G=G, D=D, G_ema=G_ema)
|
30 |
+
|
31 |
+
# Add missing fields.
|
32 |
+
if 'training_set_kwargs' not in data:
|
33 |
+
data['training_set_kwargs'] = None
|
34 |
+
if 'augment_pipe' not in data:
|
35 |
+
data['augment_pipe'] = None
|
36 |
+
|
37 |
+
# Validate contents.
|
38 |
+
assert isinstance(data['G'], torch.nn.Module)
|
39 |
+
assert isinstance(data['D'], torch.nn.Module)
|
40 |
+
assert isinstance(data['G_ema'], torch.nn.Module)
|
41 |
+
assert isinstance(data['training_set_kwargs'], (dict, type(None)))
|
42 |
+
assert isinstance(data['augment_pipe'], (torch.nn.Module, type(None)))
|
43 |
+
|
44 |
+
# Force FP16.
|
45 |
+
if force_fp16:
|
46 |
+
for key in ['G', 'D', 'G_ema']:
|
47 |
+
old = data[key]
|
48 |
+
kwargs = copy.deepcopy(old.init_kwargs)
|
49 |
+
if key.startswith('G'):
|
50 |
+
kwargs.synthesis_kwargs = dnnlib.EasyDict(kwargs.get('synthesis_kwargs', {}))
|
51 |
+
kwargs.synthesis_kwargs.num_fp16_res = 4
|
52 |
+
kwargs.synthesis_kwargs.conv_clamp = 256
|
53 |
+
if key.startswith('D'):
|
54 |
+
kwargs.num_fp16_res = 4
|
55 |
+
kwargs.conv_clamp = 256
|
56 |
+
if kwargs != old.init_kwargs:
|
57 |
+
new = type(old)(**kwargs).eval().requires_grad_(False)
|
58 |
+
misc.copy_params_and_buffers(old, new, require_all=True)
|
59 |
+
data[key] = new
|
60 |
+
return data
|
61 |
+
|
62 |
+
#----------------------------------------------------------------------------
|
63 |
+
|
64 |
+
class _TFNetworkStub(dnnlib.EasyDict):
|
65 |
+
pass
|
66 |
+
|
67 |
+
class _LegacyUnpickler(pickle.Unpickler):
|
68 |
+
def find_class(self, module, name):
|
69 |
+
if module == 'dnnlib.tflib.network' and name == 'Network':
|
70 |
+
return _TFNetworkStub
|
71 |
+
return super().find_class(module, name)
|
72 |
+
|
73 |
+
#----------------------------------------------------------------------------
|
74 |
+
|
75 |
+
def _collect_tf_params(tf_net):
|
76 |
+
# pylint: disable=protected-access
|
77 |
+
tf_params = dict()
|
78 |
+
def recurse(prefix, tf_net):
|
79 |
+
for name, value in tf_net.variables:
|
80 |
+
tf_params[prefix + name] = value
|
81 |
+
for name, comp in tf_net.components.items():
|
82 |
+
recurse(prefix + name + '/', comp)
|
83 |
+
recurse('', tf_net)
|
84 |
+
return tf_params
|
85 |
+
|
86 |
+
#----------------------------------------------------------------------------
|
87 |
+
|
88 |
+
def _populate_module_params(module, *patterns):
|
89 |
+
for name, tensor in misc.named_params_and_buffers(module):
|
90 |
+
found = False
|
91 |
+
value = None
|
92 |
+
for pattern, value_fn in zip(patterns[0::2], patterns[1::2]):
|
93 |
+
match = re.fullmatch(pattern, name)
|
94 |
+
if match:
|
95 |
+
found = True
|
96 |
+
if value_fn is not None:
|
97 |
+
value = value_fn(*match.groups())
|
98 |
+
break
|
99 |
+
try:
|
100 |
+
assert found
|
101 |
+
if value is not None:
|
102 |
+
tensor.copy_(torch.from_numpy(np.array(value)))
|
103 |
+
except:
|
104 |
+
print(name, list(tensor.shape))
|
105 |
+
raise
|
106 |
+
|
107 |
+
#----------------------------------------------------------------------------
|
108 |
+
|
109 |
+
def convert_tf_generator(tf_G):
|
110 |
+
if tf_G.version < 4:
|
111 |
+
raise ValueError('TensorFlow pickle version too low')
|
112 |
+
|
113 |
+
# Collect kwargs.
|
114 |
+
tf_kwargs = tf_G.static_kwargs
|
115 |
+
known_kwargs = set()
|
116 |
+
def kwarg(tf_name, default=None, none=None):
|
117 |
+
known_kwargs.add(tf_name)
|
118 |
+
val = tf_kwargs.get(tf_name, default)
|
119 |
+
return val if val is not None else none
|
120 |
+
|
121 |
+
# Convert kwargs.
|
122 |
+
kwargs = dnnlib.EasyDict(
|
123 |
+
z_dim = kwarg('latent_size', 512),
|
124 |
+
c_dim = kwarg('label_size', 0),
|
125 |
+
w_dim = kwarg('dlatent_size', 512),
|
126 |
+
img_resolution = kwarg('resolution', 1024),
|
127 |
+
img_channels = kwarg('num_channels', 3),
|
128 |
+
mapping_kwargs = dnnlib.EasyDict(
|
129 |
+
num_layers = kwarg('mapping_layers', 8),
|
130 |
+
embed_features = kwarg('label_fmaps', None),
|
131 |
+
layer_features = kwarg('mapping_fmaps', None),
|
132 |
+
activation = kwarg('mapping_nonlinearity', 'lrelu'),
|
133 |
+
lr_multiplier = kwarg('mapping_lrmul', 0.01),
|
134 |
+
w_avg_beta = kwarg('w_avg_beta', 0.995, none=1),
|
135 |
+
),
|
136 |
+
synthesis_kwargs = dnnlib.EasyDict(
|
137 |
+
channel_base = kwarg('fmap_base', 16384) * 2,
|
138 |
+
channel_max = kwarg('fmap_max', 512),
|
139 |
+
num_fp16_res = kwarg('num_fp16_res', 0),
|
140 |
+
conv_clamp = kwarg('conv_clamp', None),
|
141 |
+
architecture = kwarg('architecture', 'skip'),
|
142 |
+
resample_filter = kwarg('resample_kernel', [1,3,3,1]),
|
143 |
+
use_noise = kwarg('use_noise', True),
|
144 |
+
activation = kwarg('nonlinearity', 'lrelu'),
|
145 |
+
),
|
146 |
+
)
|
147 |
+
|
148 |
+
# Check for unknown kwargs.
|
149 |
+
kwarg('truncation_psi')
|
150 |
+
kwarg('truncation_cutoff')
|
151 |
+
kwarg('style_mixing_prob')
|
152 |
+
kwarg('structure')
|
153 |
+
unknown_kwargs = list(set(tf_kwargs.keys()) - known_kwargs)
|
154 |
+
if len(unknown_kwargs) > 0:
|
155 |
+
raise ValueError('Unknown TensorFlow kwarg', unknown_kwargs[0])
|
156 |
+
|
157 |
+
# Collect params.
|
158 |
+
tf_params = _collect_tf_params(tf_G)
|
159 |
+
for name, value in list(tf_params.items()):
|
160 |
+
match = re.fullmatch(r'ToRGB_lod(\d+)/(.*)', name)
|
161 |
+
if match:
|
162 |
+
r = kwargs.img_resolution // (2 ** int(match.group(1)))
|
163 |
+
tf_params[f'{r}x{r}/ToRGB/{match.group(2)}'] = value
|
164 |
+
kwargs.synthesis_kwargs.architecture = 'orig'
|
165 |
+
#for name, value in tf_params.items(): print(f'{name:<50s}{list(value.shape)}')
|
166 |
+
|
167 |
+
# Convert params.
|
168 |
+
from training import networks
|
169 |
+
G = networks.Generator(**kwargs).eval().requires_grad_(False)
|
170 |
+
# pylint: disable=unnecessary-lambda
|
171 |
+
_populate_module_params(G,
|
172 |
+
r'mapping\.w_avg', lambda: tf_params[f'dlatent_avg'],
|
173 |
+
r'mapping\.embed\.weight', lambda: tf_params[f'mapping/LabelEmbed/weight'].transpose(),
|
174 |
+
r'mapping\.embed\.bias', lambda: tf_params[f'mapping/LabelEmbed/bias'],
|
175 |
+
r'mapping\.fc(\d+)\.weight', lambda i: tf_params[f'mapping/Dense{i}/weight'].transpose(),
|
176 |
+
r'mapping\.fc(\d+)\.bias', lambda i: tf_params[f'mapping/Dense{i}/bias'],
|
177 |
+
r'synthesis\.b4\.const', lambda: tf_params[f'synthesis/4x4/Const/const'][0],
|
178 |
+
r'synthesis\.b4\.conv1\.weight', lambda: tf_params[f'synthesis/4x4/Conv/weight'].transpose(3, 2, 0, 1),
|
179 |
+
r'synthesis\.b4\.conv1\.bias', lambda: tf_params[f'synthesis/4x4/Conv/bias'],
|
180 |
+
r'synthesis\.b4\.conv1\.noise_const', lambda: tf_params[f'synthesis/noise0'][0, 0],
|
181 |
+
r'synthesis\.b4\.conv1\.noise_strength', lambda: tf_params[f'synthesis/4x4/Conv/noise_strength'],
|
182 |
+
r'synthesis\.b4\.conv1\.affine\.weight', lambda: tf_params[f'synthesis/4x4/Conv/mod_weight'].transpose(),
|
183 |
+
r'synthesis\.b4\.conv1\.affine\.bias', lambda: tf_params[f'synthesis/4x4/Conv/mod_bias'] + 1,
|
184 |
+
r'synthesis\.b(\d+)\.conv0\.weight', lambda r: tf_params[f'synthesis/{r}x{r}/Conv0_up/weight'][::-1, ::-1].transpose(3, 2, 0, 1),
|
185 |
+
r'synthesis\.b(\d+)\.conv0\.bias', lambda r: tf_params[f'synthesis/{r}x{r}/Conv0_up/bias'],
|
186 |
+
r'synthesis\.b(\d+)\.conv0\.noise_const', lambda r: tf_params[f'synthesis/noise{int(np.log2(int(r)))*2-5}'][0, 0],
|
187 |
+
r'synthesis\.b(\d+)\.conv0\.noise_strength', lambda r: tf_params[f'synthesis/{r}x{r}/Conv0_up/noise_strength'],
|
188 |
+
r'synthesis\.b(\d+)\.conv0\.affine\.weight', lambda r: tf_params[f'synthesis/{r}x{r}/Conv0_up/mod_weight'].transpose(),
|
189 |
+
r'synthesis\.b(\d+)\.conv0\.affine\.bias', lambda r: tf_params[f'synthesis/{r}x{r}/Conv0_up/mod_bias'] + 1,
|
190 |
+
r'synthesis\.b(\d+)\.conv1\.weight', lambda r: tf_params[f'synthesis/{r}x{r}/Conv1/weight'].transpose(3, 2, 0, 1),
|
191 |
+
r'synthesis\.b(\d+)\.conv1\.bias', lambda r: tf_params[f'synthesis/{r}x{r}/Conv1/bias'],
|
192 |
+
r'synthesis\.b(\d+)\.conv1\.noise_const', lambda r: tf_params[f'synthesis/noise{int(np.log2(int(r)))*2-4}'][0, 0],
|
193 |
+
r'synthesis\.b(\d+)\.conv1\.noise_strength', lambda r: tf_params[f'synthesis/{r}x{r}/Conv1/noise_strength'],
|
194 |
+
r'synthesis\.b(\d+)\.conv1\.affine\.weight', lambda r: tf_params[f'synthesis/{r}x{r}/Conv1/mod_weight'].transpose(),
|
195 |
+
r'synthesis\.b(\d+)\.conv1\.affine\.bias', lambda r: tf_params[f'synthesis/{r}x{r}/Conv1/mod_bias'] + 1,
|
196 |
+
r'synthesis\.b(\d+)\.torgb\.weight', lambda r: tf_params[f'synthesis/{r}x{r}/ToRGB/weight'].transpose(3, 2, 0, 1),
|
197 |
+
r'synthesis\.b(\d+)\.torgb\.bias', lambda r: tf_params[f'synthesis/{r}x{r}/ToRGB/bias'],
|
198 |
+
r'synthesis\.b(\d+)\.torgb\.affine\.weight', lambda r: tf_params[f'synthesis/{r}x{r}/ToRGB/mod_weight'].transpose(),
|
199 |
+
r'synthesis\.b(\d+)\.torgb\.affine\.bias', lambda r: tf_params[f'synthesis/{r}x{r}/ToRGB/mod_bias'] + 1,
|
200 |
+
r'synthesis\.b(\d+)\.skip\.weight', lambda r: tf_params[f'synthesis/{r}x{r}/Skip/weight'][::-1, ::-1].transpose(3, 2, 0, 1),
|
201 |
+
r'.*\.resample_filter', None,
|
202 |
+
)
|
203 |
+
return G
|
204 |
+
|
205 |
+
#----------------------------------------------------------------------------

def convert_tf_discriminator(tf_D):
    if tf_D.version < 4:
        raise ValueError('TensorFlow pickle version too low')

    # Collect kwargs.
    tf_kwargs = tf_D.static_kwargs
    known_kwargs = set()
    def kwarg(tf_name, default=None):
        known_kwargs.add(tf_name)
        return tf_kwargs.get(tf_name, default)

    # Convert kwargs.
    kwargs = dnnlib.EasyDict(
        c_dim = kwarg('label_size', 0),
        img_resolution = kwarg('resolution', 1024),
        img_channels = kwarg('num_channels', 3),
        architecture = kwarg('architecture', 'resnet'),
        channel_base = kwarg('fmap_base', 16384) * 2,
        channel_max = kwarg('fmap_max', 512),
        num_fp16_res = kwarg('num_fp16_res', 0),
        conv_clamp = kwarg('conv_clamp', None),
        cmap_dim = kwarg('mapping_fmaps', None),
        block_kwargs = dnnlib.EasyDict(
            activation = kwarg('nonlinearity', 'lrelu'),
            resample_filter = kwarg('resample_kernel', [1,3,3,1]),
            freeze_layers = kwarg('freeze_layers', 0),
        ),
        mapping_kwargs = dnnlib.EasyDict(
            num_layers = kwarg('mapping_layers', 0),
            embed_features = kwarg('mapping_fmaps', None),
            layer_features = kwarg('mapping_fmaps', None),
            activation = kwarg('nonlinearity', 'lrelu'),
            lr_multiplier = kwarg('mapping_lrmul', 0.1),
        ),
        epilogue_kwargs = dnnlib.EasyDict(
            mbstd_group_size = kwarg('mbstd_group_size', None),
            mbstd_num_channels = kwarg('mbstd_num_features', 1),
            activation = kwarg('nonlinearity', 'lrelu'),
        ),
    )

    # Check for unknown kwargs.
    kwarg('structure')
    unknown_kwargs = list(set(tf_kwargs.keys()) - known_kwargs)
    if len(unknown_kwargs) > 0:
        raise ValueError('Unknown TensorFlow kwarg', unknown_kwargs[0])

    # Collect params.
    tf_params = _collect_tf_params(tf_D)
    for name, value in list(tf_params.items()):
        match = re.fullmatch(r'FromRGB_lod(\d+)/(.*)', name)
        if match:
            r = kwargs.img_resolution // (2 ** int(match.group(1)))
            tf_params[f'{r}x{r}/FromRGB/{match.group(2)}'] = value
            kwargs.architecture = 'orig'
    #for name, value in tf_params.items(): print(f'{name:<50s}{list(value.shape)}')

    # Convert params.
    from training import networks
    D = networks.Discriminator(**kwargs).eval().requires_grad_(False)
    # pylint: disable=unnecessary-lambda
    _populate_module_params(D,
        r'b(\d+)\.fromrgb\.weight', lambda r: tf_params[f'{r}x{r}/FromRGB/weight'].transpose(3, 2, 0, 1),
        r'b(\d+)\.fromrgb\.bias', lambda r: tf_params[f'{r}x{r}/FromRGB/bias'],
        r'b(\d+)\.conv(\d+)\.weight', lambda r, i: tf_params[f'{r}x{r}/Conv{i}{["","_down"][int(i)]}/weight'].transpose(3, 2, 0, 1),
        r'b(\d+)\.conv(\d+)\.bias', lambda r, i: tf_params[f'{r}x{r}/Conv{i}{["","_down"][int(i)]}/bias'],
        r'b(\d+)\.skip\.weight', lambda r: tf_params[f'{r}x{r}/Skip/weight'].transpose(3, 2, 0, 1),
        r'mapping\.embed\.weight', lambda: tf_params[f'LabelEmbed/weight'].transpose(),
        r'mapping\.embed\.bias', lambda: tf_params[f'LabelEmbed/bias'],
        r'mapping\.fc(\d+)\.weight', lambda i: tf_params[f'Mapping{i}/weight'].transpose(),
        r'mapping\.fc(\d+)\.bias', lambda i: tf_params[f'Mapping{i}/bias'],
        r'b4\.conv\.weight', lambda: tf_params[f'4x4/Conv/weight'].transpose(3, 2, 0, 1),
        r'b4\.conv\.bias', lambda: tf_params[f'4x4/Conv/bias'],
        r'b4\.fc\.weight', lambda: tf_params[f'4x4/Dense0/weight'].transpose(),
        r'b4\.fc\.bias', lambda: tf_params[f'4x4/Dense0/bias'],
        r'b4\.out\.weight', lambda: tf_params[f'Output/weight'].transpose(),
        r'b4\.out\.bias', lambda: tf_params[f'Output/bias'],
        r'.*\.resample_filter', None,
    )
    return D

#----------------------------------------------------------------------------

@click.command()
@click.option('--source', help='Input pickle', required=True, metavar='PATH')
@click.option('--dest', help='Output pickle', required=True, metavar='PATH')
@click.option('--force-fp16', help='Force the networks to use FP16', type=bool, default=False, metavar='BOOL', show_default=True)
def convert_network_pickle(source, dest, force_fp16):
    """Convert legacy network pickle into the native PyTorch format.

    The tool is able to load the main network configurations exported using the TensorFlow version of StyleGAN2 or StyleGAN2-ADA.
    It does not support e.g. StyleGAN2-ADA comparison methods, StyleGAN2 configs A-D, or StyleGAN1 networks.

    Example:

    \b
    python legacy.py \\
        --source=https://nvlabs-fi-cdn.nvidia.com/stylegan2/networks/stylegan2-cat-config-f.pkl \\
        --dest=stylegan2-cat-config-f.pkl
    """
    print(f'Loading "{source}"...')
    with dnnlib.util.open_url(source) as f:
        data = load_network_pkl(f, force_fp16=force_fp16)
    print(f'Saving "{dest}"...')
    with open(dest, 'wb') as f:
        pickle.dump(data, f)
    print('Done.')

#----------------------------------------------------------------------------

if __name__ == "__main__":
    convert_network_pickle() # pylint: disable=no-value-for-parameter

#----------------------------------------------------------------------------
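Both converters in `legacy.py` rely on the same small idiom: every lookup through `kwarg()` records the TensorFlow name it asked for, so any key that is still unaccounted for afterwards can be reported as unknown. The following is a minimal standalone sketch of that idiom, not the repository's code; `make_kwarg_reader`, the `unknown()` helper, and the sample dictionary are hypothetical names used only for illustration, and plain dicts stand in for `dnnlib.EasyDict`.

```python
def make_kwarg_reader(tf_kwargs):
    """Return (kwarg, unknown) helpers bound to one kwargs dictionary."""
    known = set()

    def kwarg(tf_name, default=None, none=None):
        # Record the name so it is not reported as unknown later.
        known.add(tf_name)
        val = tf_kwargs.get(tf_name, default)
        # A stored None is replaced by the `none` fallback, not by `default`.
        return val if val is not None else none

    def unknown():
        # Keys present in the source that no lookup ever asked for.
        return sorted(set(tf_kwargs) - known)

    return kwarg, unknown


# Hypothetical legacy settings, for illustration only.
tf_kwargs = {'resolution': 256, 'w_avg_beta': None, 'mystery_flag': True}
kwarg, unknown = make_kwarg_reader(tf_kwargs)

converted = dict(
    img_resolution=kwarg('resolution', 1024),
    img_channels=kwarg('num_channels', 3),
    w_avg_beta=kwarg('w_avg_beta', 0.995, none=1),
)
print(converted)   # {'img_resolution': 256, 'img_channels': 3, 'w_avg_beta': 1}
print(unknown())   # ['mystery_flag']
```

Calling `kwarg('structure')` without using the result, as the converters do, simply marks a key as known so that it is tolerated rather than rejected.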
|
metrics/__init__.py
ADDED
@@ -0,0 +1,9 @@
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto. Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.

# empty
|
metrics/frechet_inception_distance.py
ADDED
@@ -0,0 +1,41 @@
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto. Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.

"""Frechet Inception Distance (FID) from the paper
"GANs trained by a two time-scale update rule converge to a local Nash
equilibrium". Matches the original implementation by Heusel et al. at
https://github.com/bioinf-jku/TTUR/blob/master/fid.py"""

import numpy as np
import scipy.linalg
from . import metric_utils

#----------------------------------------------------------------------------

def compute_fid(opts, max_real, num_gen):
    # Direct TorchScript translation of http://download.tensorflow.org/models/image/imagenet/inception-2015-12-05.tgz
    detector_url = 'https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/metrics/inception-2015-12-05.pt'
    detector_kwargs = dict(return_features=True) # Return raw features before the softmax layer.

    mu_real, sigma_real = metric_utils.compute_feature_stats_for_dataset(
        opts=opts, detector_url=detector_url, detector_kwargs=detector_kwargs,
        rel_lo=0, rel_hi=0, capture_mean_cov=True, max_items=max_real).get_mean_cov()

    mu_gen, sigma_gen = metric_utils.compute_feature_stats_for_generator(
        opts=opts, detector_url=detector_url, detector_kwargs=detector_kwargs,
        rel_lo=0, rel_hi=1, capture_mean_cov=True, max_items=num_gen).get_mean_cov()

    if opts.rank != 0:
        return float('nan')

    m = np.square(mu_gen - mu_real).sum()
    s, _ = scipy.linalg.sqrtm(np.dot(sigma_gen, sigma_real), disp=False) # pylint: disable=no-member
    fid = np.real(m + np.trace(sigma_gen + sigma_real - s * 2))
    return float(fid)

#----------------------------------------------------------------------------
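For intuition, the quantity returned by `compute_fid` has a simple closed form when the features are one-dimensional: the trace and the matrix square root collapse to scalars, leaving (mu_gen - mu_real)^2 + var_gen + var_real - 2 * sqrt(var_gen * var_real). The sketch below evaluates only that scalar special case with the standard library. `mean_var` and `fid_1d` are hypothetical helpers that are not part of the repository, and the variance divides by N in the same way as `FeatureStats.get_mean_cov`.

```python
import math

def mean_var(xs):
    """Population mean and variance (divide by N) of a list of floats."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum(x * x for x in xs) / n - mean * mean
    return mean, var

def fid_1d(real, gen):
    """Frechet distance between two 1-D Gaussians fitted to the samples."""
    mu_r, var_r = mean_var(real)
    mu_g, var_g = mean_var(gen)
    # Scalar version of m + trace(sigma_gen + sigma_real - 2 * sqrtm(sigma_gen @ sigma_real)).
    return (mu_g - mu_r) ** 2 + var_g + var_r - 2.0 * math.sqrt(var_g * var_r)

real = [0.0, 2.0]   # mean 1, variance 1
gen = [3.0, 5.0]    # mean 4, variance 1
print(fid_1d(real, gen))  # 9.0
```

With equal variances the two variance terms cancel against the square-root term, so the result is just the squared difference of the means; comparing a sample with itself gives zero.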
|
metrics/inception_score.py
ADDED
@@ -0,0 +1,38 @@
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto. Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.

"""Inception Score (IS) from the paper "Improved techniques for training
GANs". Matches the original implementation by Salimans et al. at
https://github.com/openai/improved-gan/blob/master/inception_score/model.py"""

import numpy as np
from . import metric_utils

#----------------------------------------------------------------------------

def compute_is(opts, num_gen, num_splits):
    # Direct TorchScript translation of http://download.tensorflow.org/models/image/imagenet/inception-2015-12-05.tgz
    detector_url = 'https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/metrics/inception-2015-12-05.pt'
    detector_kwargs = dict(no_output_bias=True) # Match the original implementation by not applying bias in the softmax layer.

    gen_probs = metric_utils.compute_feature_stats_for_generator(
        opts=opts, detector_url=detector_url, detector_kwargs=detector_kwargs,
        capture_all=True, max_items=num_gen).get_all()

    if opts.rank != 0:
        return float('nan'), float('nan')

    scores = []
    for i in range(num_splits):
        part = gen_probs[i * num_gen // num_splits : (i + 1) * num_gen // num_splits]
        kl = part * (np.log(part) - np.log(np.mean(part, axis=0, keepdims=True)))
        kl = np.mean(np.sum(kl, axis=1))
        scores.append(np.exp(kl))
    return float(np.mean(scores)), float(np.std(scores))

#----------------------------------------------------------------------------
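The loop body in `compute_is` computes, for one split, the exponential of the mean KL divergence between each image's class distribution and the average distribution over the split. A pure-Python sketch of that single-split computation is shown here, assuming every probability is strictly positive so the logarithms are defined. `inception_score` and the two toy inputs are hypothetical and only illustrate the arithmetic; the real function additionally averages over `num_splits` splits and reports the standard deviation.

```python
import math

def inception_score(probs):
    """exp of the mean KL divergence between each row and the column-wise mean."""
    num_classes = len(probs[0])
    # Marginal class distribution: average of the per-image predictions.
    marginal = [sum(row[k] for row in probs) / len(probs) for k in range(num_classes)]
    kl_per_image = [
        sum(p * (math.log(p) - math.log(q)) for p, q in zip(row, marginal))
        for row in probs
    ]
    return math.exp(sum(kl_per_image) / len(probs))

confident = [[0.9, 0.1], [0.1, 0.9]]   # sharp predictions, balanced classes
uniform = [[0.5, 0.5], [0.5, 0.5]]     # every prediction equals the marginal
print(inception_score(uniform))    # 1.0
print(inception_score(confident))  # larger than 1.0
```

The score is 1 when every prediction equals the marginal, and it grows as predictions become sharper while the classes stay balanced.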
|
metrics/kernel_inception_distance.py
ADDED
@@ -0,0 +1,46 @@
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto. Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.

"""Kernel Inception Distance (KID) from the paper "Demystifying MMD
GANs". Matches the original implementation by Binkowski et al. at
https://github.com/mbinkowski/MMD-GAN/blob/master/gan/compute_scores.py"""

import numpy as np
from . import metric_utils

#----------------------------------------------------------------------------

def compute_kid(opts, max_real, num_gen, num_subsets, max_subset_size):
    # Direct TorchScript translation of http://download.tensorflow.org/models/image/imagenet/inception-2015-12-05.tgz
    detector_url = 'https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/metrics/inception-2015-12-05.pt'
    detector_kwargs = dict(return_features=True) # Return raw features before the softmax layer.

    real_features = metric_utils.compute_feature_stats_for_dataset(
        opts=opts, detector_url=detector_url, detector_kwargs=detector_kwargs,
        rel_lo=0, rel_hi=0, capture_all=True, max_items=max_real).get_all()

    gen_features = metric_utils.compute_feature_stats_for_generator(
        opts=opts, detector_url=detector_url, detector_kwargs=detector_kwargs,
        rel_lo=0, rel_hi=1, capture_all=True, max_items=num_gen).get_all()

    if opts.rank != 0:
        return float('nan')

    n = real_features.shape[1]
    m = min(min(real_features.shape[0], gen_features.shape[0]), max_subset_size)
    t = 0
    for _subset_idx in range(num_subsets):
        x = gen_features[np.random.choice(gen_features.shape[0], m, replace=False)]
        y = real_features[np.random.choice(real_features.shape[0], m, replace=False)]
        a = (x @ x.T / n + 1) ** 3 + (y @ y.T / n + 1) ** 3
        b = (x @ y.T / n + 1) ** 3
        t += (a.sum() - np.diag(a).sum()) / (m - 1) - b.sum() * 2 / m
    kid = t / num_subsets / m
    return float(kid)

#----------------------------------------------------------------------------
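Each iteration of the subset loop in `compute_kid` is an unbiased estimate of the squared maximum mean discrepancy under the cubic polynomial kernel (x.y / n + 1)^3, where n is the feature dimension. The sketch below reproduces one such iteration in plain Python on fixed inputs, so no random subsampling is involved. `poly_kernel`, `kid_single_subset`, and the toy features are hypothetical names chosen for illustration.

```python
def poly_kernel(a, b):
    """Cubic polynomial kernel (a.b / n + 1) ** 3, with n the feature dimension."""
    n = len(a)
    return (sum(u * v for u, v in zip(a, b)) / n + 1) ** 3

def kid_single_subset(x, y):
    """Unbiased squared-MMD estimate for two equally sized feature lists."""
    m = len(x)
    assert len(y) == m and m > 1
    # Within-set terms skip the diagonal (i == j), like a.sum() - np.diag(a).sum().
    within = sum(
        poly_kernel(x[i], x[j]) + poly_kernel(y[i], y[j])
        for i in range(m) for j in range(m) if i != j
    )
    # Cross terms use every (x, y) pair, like b.sum().
    cross = sum(poly_kernel(xi, yj) for xi in x for yj in y)
    return (within / (m - 1) - 2 * cross / m) / m

x = [[0.0], [0.0]]   # "generated" features
y = [[1.0], [1.0]]   # "real" features
print(kid_single_subset(x, y))  # 7.0
```

In this toy case the kernel values are 1 within `x`, 8 within `y`, and 1 across the sets, so the estimate is 1 + 8 - 2 * 1 = 7. The full metric averages this quantity over `num_subsets` random subsets.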
|
metrics/metric_main.py
ADDED
@@ -0,0 +1,152 @@
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto. Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.

import os
import time
import json
import torch
import dnnlib

from . import metric_utils
from . import frechet_inception_distance
from . import kernel_inception_distance
from . import precision_recall
from . import perceptual_path_length
from . import inception_score

#----------------------------------------------------------------------------

_metric_dict = dict() # name => fn

def register_metric(fn):
    assert callable(fn)
    _metric_dict[fn.__name__] = fn
    return fn

def is_valid_metric(metric):
    return metric in _metric_dict

def list_valid_metrics():
    return list(_metric_dict.keys())

#----------------------------------------------------------------------------

def calc_metric(metric, **kwargs): # See metric_utils.MetricOptions for the full list of arguments.
    assert is_valid_metric(metric)
    opts = metric_utils.MetricOptions(**kwargs)

    # Calculate.
    start_time = time.time()
    results = _metric_dict[metric](opts)
    total_time = time.time() - start_time

    # Broadcast results.
    for key, value in list(results.items()):
        if opts.num_gpus > 1:
            value = torch.as_tensor(value, dtype=torch.float64, device=opts.device)
            torch.distributed.broadcast(tensor=value, src=0)
            value = float(value.cpu())
        results[key] = value

    # Decorate with metadata.
    return dnnlib.EasyDict(
        results = dnnlib.EasyDict(results),
        metric = metric,
        total_time = total_time,
        total_time_str = dnnlib.util.format_time(total_time),
        num_gpus = opts.num_gpus,
    )

#----------------------------------------------------------------------------

def report_metric(result_dict, run_dir=None, snapshot_pkl=None):
    metric = result_dict['metric']
    assert is_valid_metric(metric)
    if run_dir is not None and snapshot_pkl is not None:
        snapshot_pkl = os.path.relpath(snapshot_pkl, run_dir)

    jsonl_line = json.dumps(dict(result_dict, snapshot_pkl=snapshot_pkl, timestamp=time.time()))
    print(jsonl_line)
    if run_dir is not None and os.path.isdir(run_dir):
        with open(os.path.join(run_dir, f'metric-{metric}.jsonl'), 'at') as f:
            f.write(jsonl_line + '\n')

#----------------------------------------------------------------------------
# Primary metrics.

@register_metric
def fid50k_full(opts):
    opts.dataset_kwargs.update(max_size=None, xflip=False)
    fid = frechet_inception_distance.compute_fid(opts, max_real=None, num_gen=50000)
    return dict(fid50k_full=fid)

@register_metric
def kid50k_full(opts):
    opts.dataset_kwargs.update(max_size=None, xflip=False)
    kid = kernel_inception_distance.compute_kid(opts, max_real=1000000, num_gen=50000, num_subsets=100, max_subset_size=1000)
    return dict(kid50k_full=kid)

@register_metric
def pr50k3_full(opts):
    opts.dataset_kwargs.update(max_size=None, xflip=False)
    precision, recall = precision_recall.compute_pr(opts, max_real=200000, num_gen=50000, nhood_size=3, row_batch_size=10000, col_batch_size=10000)
    return dict(pr50k3_full_precision=precision, pr50k3_full_recall=recall)

@register_metric
def ppl2_wend(opts):
    ppl = perceptual_path_length.compute_ppl(opts, num_samples=50000, epsilon=1e-4, space='w', sampling='end', crop=False, batch_size=2)
    return dict(ppl2_wend=ppl)

@register_metric
def is50k(opts):
    opts.dataset_kwargs.update(max_size=None, xflip=False)
    mean, std = inception_score.compute_is(opts, num_gen=50000, num_splits=10)
    return dict(is50k_mean=mean, is50k_std=std)

#----------------------------------------------------------------------------
# Legacy metrics.

@register_metric
def fid50k(opts):
    opts.dataset_kwargs.update(max_size=None)
    fid = frechet_inception_distance.compute_fid(opts, max_real=50000, num_gen=50000)
    return dict(fid50k=fid)

@register_metric
def kid50k(opts):
    opts.dataset_kwargs.update(max_size=None)
    kid = kernel_inception_distance.compute_kid(opts, max_real=50000, num_gen=50000, num_subsets=100, max_subset_size=1000)
    return dict(kid50k=kid)

@register_metric
def pr50k3(opts):
    opts.dataset_kwargs.update(max_size=None)
    precision, recall = precision_recall.compute_pr(opts, max_real=50000, num_gen=50000, nhood_size=3, row_batch_size=10000, col_batch_size=10000)
    return dict(pr50k3_precision=precision, pr50k3_recall=recall)

@register_metric
def ppl_zfull(opts):
    ppl = perceptual_path_length.compute_ppl(opts, num_samples=50000, epsilon=1e-4, space='z', sampling='full', crop=True, batch_size=2)
    return dict(ppl_zfull=ppl)

@register_metric
def ppl_wfull(opts):
    ppl = perceptual_path_length.compute_ppl(opts, num_samples=50000, epsilon=1e-4, space='w', sampling='full', crop=True, batch_size=2)
    return dict(ppl_wfull=ppl)

@register_metric
def ppl_zend(opts):
    ppl = perceptual_path_length.compute_ppl(opts, num_samples=50000, epsilon=1e-4, space='z', sampling='end', crop=True, batch_size=2)
    return dict(ppl_zend=ppl)

@register_metric
def ppl_wend(opts):
    ppl = perceptual_path_length.compute_ppl(opts, num_samples=50000, epsilon=1e-4, space='w', sampling='end', crop=True, batch_size=2)
    return dict(ppl_wend=ppl)

#----------------------------------------------------------------------------
|
metrics/metric_utils.py
ADDED
@@ -0,0 +1,275 @@
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto. Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.

import os
import time
import hashlib
import pickle
import copy
import uuid
import numpy as np
import torch
import dnnlib

#----------------------------------------------------------------------------

class MetricOptions:
    def __init__(self, G=None, G_kwargs={}, dataset_kwargs={}, num_gpus=1, rank=0, device=None, progress=None, cache=True):
        assert 0 <= rank < num_gpus
        self.G = G
        self.G_kwargs = dnnlib.EasyDict(G_kwargs)
        self.dataset_kwargs = dnnlib.EasyDict(dataset_kwargs)
        self.num_gpus = num_gpus
        self.rank = rank
        self.device = device if device is not None else torch.device('cuda', rank)
        self.progress = progress.sub() if progress is not None and rank == 0 else ProgressMonitor()
        self.cache = cache

#----------------------------------------------------------------------------

_feature_detector_cache = dict()

def get_feature_detector_name(url):
    return os.path.splitext(url.split('/')[-1])[0]

def get_feature_detector(url, device=torch.device('cpu'), num_gpus=1, rank=0, verbose=False):
    assert 0 <= rank < num_gpus
    key = (url, device)
    if key not in _feature_detector_cache:
        is_leader = (rank == 0)
        if not is_leader and num_gpus > 1:
            torch.distributed.barrier() # leader goes first
        with dnnlib.util.open_url(url, verbose=(verbose and is_leader)) as f:
            _feature_detector_cache[key] = torch.jit.load(f).eval().to(device)
        if is_leader and num_gpus > 1:
            torch.distributed.barrier() # others follow
    return _feature_detector_cache[key]

#----------------------------------------------------------------------------

class FeatureStats:
    def __init__(self, capture_all=False, capture_mean_cov=False, max_items=None):
        self.capture_all = capture_all
        self.capture_mean_cov = capture_mean_cov
        self.max_items = max_items
        self.num_items = 0
        self.num_features = None
        self.all_features = None
        self.raw_mean = None
        self.raw_cov = None

    def set_num_features(self, num_features):
        if self.num_features is not None:
            assert num_features == self.num_features
        else:
            self.num_features = num_features
            self.all_features = []
            self.raw_mean = np.zeros([num_features], dtype=np.float64)
            self.raw_cov = np.zeros([num_features, num_features], dtype=np.float64)

    def is_full(self):
        return (self.max_items is not None) and (self.num_items >= self.max_items)

    def append(self, x):
        x = np.asarray(x, dtype=np.float32)
        assert x.ndim == 2
        if (self.max_items is not None) and (self.num_items + x.shape[0] > self.max_items):
            if self.num_items >= self.max_items:
                return
            x = x[:self.max_items - self.num_items]

        self.set_num_features(x.shape[1])
        self.num_items += x.shape[0]
        if self.capture_all:
            self.all_features.append(x)
        if self.capture_mean_cov:
            x64 = x.astype(np.float64)
            self.raw_mean += x64.sum(axis=0)
            self.raw_cov += x64.T @ x64

    def append_torch(self, x, num_gpus=1, rank=0):
        assert isinstance(x, torch.Tensor) and x.ndim == 2
        assert 0 <= rank < num_gpus
        if num_gpus > 1:
            ys = []
            for src in range(num_gpus):
                y = x.clone()
                torch.distributed.broadcast(y, src=src)
                ys.append(y)
            x = torch.stack(ys, dim=1).flatten(0, 1) # interleave samples
        self.append(x.cpu().numpy())

    def get_all(self):
        assert self.capture_all
        return np.concatenate(self.all_features, axis=0)

    def get_all_torch(self):
        return torch.from_numpy(self.get_all())

    def get_mean_cov(self):
        assert self.capture_mean_cov
        mean = self.raw_mean / self.num_items
        cov = self.raw_cov / self.num_items
        cov = cov - np.outer(mean, mean)
        return mean, cov

    def save(self, pkl_file):
        with open(pkl_file, 'wb') as f:
            pickle.dump(self.__dict__, f)

    @staticmethod
    def load(pkl_file):
        with open(pkl_file, 'rb') as f:
            s = dnnlib.EasyDict(pickle.load(f))
        obj = FeatureStats(capture_all=s.capture_all, max_items=s.max_items)
        obj.__dict__.update(s)
        return obj

#----------------------------------------------------------------------------

class ProgressMonitor:
    def __init__(self, tag=None, num_items=None, flush_interval=1000, verbose=False, progress_fn=None, pfn_lo=0, pfn_hi=1000, pfn_total=1000):
        self.tag = tag
        self.num_items = num_items
        self.verbose = verbose
        self.flush_interval = flush_interval
        self.progress_fn = progress_fn
        self.pfn_lo = pfn_lo
        self.pfn_hi = pfn_hi
        self.pfn_total = pfn_total
        self.start_time = time.time()
        self.batch_time = self.start_time
        self.batch_items = 0
        if self.progress_fn is not None:
            self.progress_fn(self.pfn_lo, self.pfn_total)

    def update(self, cur_items):
        assert (self.num_items is None) or (cur_items <= self.num_items)
        if (cur_items < self.batch_items + self.flush_interval) and (self.num_items is None or cur_items < self.num_items):
            return
        cur_time = time.time()
        total_time = cur_time - self.start_time
        time_per_item = (cur_time - self.batch_time) / max(cur_items - self.batch_items, 1)
        if (self.verbose) and (self.tag is not None):
            print(f'{self.tag:<19s} items {cur_items:<7d} time {dnnlib.util.format_time(total_time):<12s} ms/item {time_per_item*1e3:.2f}')
        self.batch_time = cur_time
        self.batch_items = cur_items

        if (self.progress_fn is not None) and (self.num_items is not None):
            self.progress_fn(self.pfn_lo + (self.pfn_hi - self.pfn_lo) * (cur_items / self.num_items), self.pfn_total)

    def sub(self, tag=None, num_items=None, flush_interval=1000, rel_lo=0, rel_hi=1):
        return ProgressMonitor(
            tag = tag,
            num_items = num_items,
            flush_interval = flush_interval,
            verbose = self.verbose,
            progress_fn = self.progress_fn,
            pfn_lo = self.pfn_lo + (self.pfn_hi - self.pfn_lo) * rel_lo,
            pfn_hi = self.pfn_lo + (self.pfn_hi - self.pfn_lo) * rel_hi,
            pfn_total = self.pfn_total,
        )

#----------------------------------------------------------------------------

def compute_feature_stats_for_dataset(opts, detector_url, detector_kwargs, rel_lo=0, rel_hi=1, batch_size=64, data_loader_kwargs=None, max_items=None, **stats_kwargs):
    dataset = dnnlib.util.construct_class_by_name(**opts.dataset_kwargs)
    if data_loader_kwargs is None:
        data_loader_kwargs = dict(pin_memory=True, num_workers=3, prefetch_factor=2)

    # Try to lookup from cache.
    cache_file = None
    if opts.cache:
        # Choose cache file name.
        args = dict(dataset_kwargs=opts.dataset_kwargs, detector_url=detector_url, detector_kwargs=detector_kwargs, stats_kwargs=stats_kwargs)
        md5 = hashlib.md5(repr(sorted(args.items())).encode('utf-8'))
        cache_tag = f'{dataset.name}-{get_feature_detector_name(detector_url)}-{md5.hexdigest()}'
        cache_file = dnnlib.make_cache_dir_path('gan-metrics', cache_tag + '.pkl')

        # Check if the file exists (all processes must agree).
        flag = os.path.isfile(cache_file) if opts.rank == 0 else False
        if opts.num_gpus > 1:
            flag = torch.as_tensor(flag, dtype=torch.float32, device=opts.device)
            torch.distributed.broadcast(tensor=flag, src=0)
            flag = (float(flag.cpu()) != 0)

        # Load.
        if flag:
            return FeatureStats.load(cache_file)

    # Initialize.
    num_items = len(dataset)
    if max_items is not None:
        num_items = min(num_items, max_items)
    stats = FeatureStats(max_items=num_items, **stats_kwargs)
    progress = opts.progress.sub(tag='dataset features', num_items=num_items, rel_lo=rel_lo, rel_hi=rel_hi)
    detector = get_feature_detector(url=detector_url, device=opts.device, num_gpus=opts.num_gpus, rank=opts.rank, verbose=progress.verbose)

    # Main loop.
    item_subset = [(i * opts.num_gpus + opts.rank) % num_items for i in range((num_items - 1) // opts.num_gpus + 1)]
    for images, _labels in torch.utils.data.DataLoader(dataset=dataset, sampler=item_subset, batch_size=batch_size, **data_loader_kwargs):
        if images.shape[1] == 1:
            images = images.repeat([1, 3, 1, 1])
        features = detector(images.to(opts.device), **detector_kwargs)
        stats.append_torch(features, num_gpus=opts.num_gpus, rank=opts.rank)
        progress.update(stats.num_items)

    # Save to cache.
    if cache_file is not None and opts.rank == 0:
        os.makedirs(os.path.dirname(cache_file), exist_ok=True)
        temp_file = cache_file + '.' + uuid.uuid4().hex
        stats.save(temp_file)
        os.replace(temp_file, cache_file) # atomic
    return stats

#----------------------------------------------------------------------------

def compute_feature_stats_for_generator(opts, detector_url, detector_kwargs, rel_lo=0, rel_hi=1, batch_size=64, batch_gen=None, jit=False, **stats_kwargs):
    if batch_gen is None:
        batch_gen = min(batch_size, 4)
    assert batch_size % batch_gen == 0

    # Setup generator and load labels.
    G = copy.deepcopy(opts.G).eval().requires_grad_(False).to(opts.device)
    dataset = dnnlib.util.construct_class_by_name(**opts.dataset_kwargs)

    # Image generation func.
    def run_generator(z, c):
        img = G(z=z, c=c, **opts.G_kwargs)
        img = (img * 127.5 + 128).clamp(0, 255).to(torch.uint8)
        return img

    # JIT.
    if jit:
        z = torch.zeros([batch_gen, G.z_dim], device=opts.device)
        c = torch.zeros([batch_gen, G.c_dim], device=opts.device)
        run_generator = torch.jit.trace(run_generator, [z, c], check_trace=False)

    # Initialize.
    stats = FeatureStats(**stats_kwargs)
    assert stats.max_items is not None
    progress = opts.progress.sub(tag='generator features', num_items=stats.max_items, rel_lo=rel_lo, rel_hi=rel_hi)
    detector = get_feature_detector(url=detector_url, device=opts.device, num_gpus=opts.num_gpus, rank=opts.rank, verbose=progress.verbose)

    # Main loop.
    while not stats.is_full():
        images = []
        for _i in range(batch_size // batch_gen):
            z = torch.randn([batch_gen, G.z_dim], device=opts.device)
            c = [dataset.get_label(np.random.randint(len(dataset))) for _i in range(batch_gen)]
            c = torch.from_numpy(np.stack(c)).pin_memory().to(opts.device)
            images.append(run_generator(z, c))
        images = torch.cat(images)
        if images.shape[1] == 1:
            images = images.repeat([1, 3, 1, 1])
        features = detector(images, **detector_kwargs)
        stats.append_torch(features, num_gpus=opts.num_gpus, rank=opts.rank)
        progress.update(stats.num_items)
    return stats

#----------------------------------------------------------------------------
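Reviewer note on `FeatureStats`: the class never stores a running mean directly. `append` accumulates the raw sums `raw_mean` (sum of x) and `raw_cov` (sum of x xᵀ), and `get_mean_cov` converts them with cov = E[x xᵀ] − mean meanᵀ. The following is a minimal plain-Python sketch of that bookkeeping, not the repository's code; the helper names `accumulate` and `finalize` are hypothetical, and lists stand in for NumPy arrays.

```python
def accumulate(batches):
    """Accumulate raw sums over batches of equal-length feature rows."""
    num_items = 0
    raw_mean = None  # per-feature sum of x
    raw_cov = None   # sum of outer products x x^T
    for batch in batches:
        for x in batch:
            if raw_mean is None:
                dim = len(x)
                raw_mean = [0.0] * dim
                raw_cov = [[0.0] * dim for _ in range(dim)]
            for i, xi in enumerate(x):
                raw_mean[i] += xi
                for j, xj in enumerate(x):
                    raw_cov[i][j] += xi * xj
            num_items += 1
    return num_items, raw_mean, raw_cov


def finalize(num_items, raw_mean, raw_cov):
    """Turn raw sums into a mean vector and a (biased) covariance matrix."""
    dim = len(raw_mean)
    mean = [s / num_items for s in raw_mean]
    cov = [[raw_cov[i][j] / num_items - mean[i] * mean[j] for j in range(dim)]
           for i in range(dim)]
    return mean, cov


if __name__ == "__main__":
    batches = [[[1.0, 2.0], [3.0, 6.0]], [[5.0, 10.0]]]
    mean, cov = finalize(*accumulate(batches))
    print(mean, cov)
```

Because only sums are kept, batches can arrive in any order and from any rank, which is what lets `append_torch` interleave samples gathered from several GPUs.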
metrics/perceptual_path_length.py
ADDED
@@ -0,0 +1,131 @@
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto. Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.

"""Perceptual Path Length (PPL) from the paper "A Style-Based Generator
Architecture for Generative Adversarial Networks". Matches the original
implementation by Karras et al. at
https://github.com/NVlabs/stylegan/blob/master/metrics/perceptual_path_length.py"""

import copy
import numpy as np
import torch
import dnnlib
from . import metric_utils

#----------------------------------------------------------------------------

# Spherical interpolation of a batch of vectors.
def slerp(a, b, t):
    a = a / a.norm(dim=-1, keepdim=True)
    b = b / b.norm(dim=-1, keepdim=True)
    d = (a * b).sum(dim=-1, keepdim=True)
    p = t * torch.acos(d)
    c = b - d * a
    c = c / c.norm(dim=-1, keepdim=True)
    d = a * torch.cos(p) + c * torch.sin(p)
    d = d / d.norm(dim=-1, keepdim=True)
    return d

#----------------------------------------------------------------------------

class PPLSampler(torch.nn.Module):
    def __init__(self, G, G_kwargs, epsilon, space, sampling, crop, vgg16):
        assert space in ['z', 'w']
        assert sampling in ['full', 'end']
        super().__init__()
        self.G = copy.deepcopy(G)
        self.G_kwargs = G_kwargs
        self.epsilon = epsilon
        self.space = space
        self.sampling = sampling
        self.crop = crop
        self.vgg16 = copy.deepcopy(vgg16)

    def forward(self, c):
        # Generate random latents and interpolation t-values.
        t = torch.rand([c.shape[0]], device=c.device) * (1 if self.sampling == 'full' else 0)
        z0, z1 = torch.randn([c.shape[0] * 2, self.G.z_dim], device=c.device).chunk(2)

        # Interpolate in W or Z.
        if self.space == 'w':
            w0, w1 = self.G.mapping(z=torch.cat([z0,z1]), c=torch.cat([c,c])).chunk(2)
            wt0 = w0.lerp(w1, t.unsqueeze(1).unsqueeze(2))
            wt1 = w0.lerp(w1, t.unsqueeze(1).unsqueeze(2) + self.epsilon)
        else: # space == 'z'
            zt0 = slerp(z0, z1, t.unsqueeze(1))
            zt1 = slerp(z0, z1, t.unsqueeze(1) + self.epsilon)
            wt0, wt1 = self.G.mapping(z=torch.cat([zt0,zt1]), c=torch.cat([c,c])).chunk(2)

        # Randomize noise buffers.
        for name, buf in self.G.named_buffers():
            if name.endswith('.noise_const'):
                buf.copy_(torch.randn_like(buf))

        # Generate images.
        img = self.G.synthesis(ws=torch.cat([wt0,wt1]), noise_mode='const', force_fp32=True, **self.G_kwargs)

        # Center crop.
        if self.crop:
            assert img.shape[2] == img.shape[3]
            c = img.shape[2] // 8
            img = img[:, :, c*3 : c*7, c*2 : c*6]

        # Downsample to 256x256.
        factor = self.G.img_resolution // 256
        if factor > 1:
            img = img.reshape([-1, img.shape[1], img.shape[2] // factor, factor, img.shape[3] // factor, factor]).mean([3, 5])

        # Scale dynamic range from [-1,1] to [0,255].
        img = (img + 1) * (255 / 2)
        if self.G.img_channels == 1:
            img = img.repeat([1, 3, 1, 1])

        # Evaluate differential LPIPS.
        lpips_t0, lpips_t1 = self.vgg16(img, resize_images=False, return_lpips=True).chunk(2)
        dist = (lpips_t0 - lpips_t1).square().sum(1) / self.epsilon ** 2
        return dist

#----------------------------------------------------------------------------

def compute_ppl(opts, num_samples, epsilon, space, sampling, crop, batch_size, jit=False):
    dataset = dnnlib.util.construct_class_by_name(**opts.dataset_kwargs)
    vgg16_url = 'https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/metrics/vgg16.pt'
    vgg16 = metric_utils.get_feature_detector(vgg16_url, num_gpus=opts.num_gpus, rank=opts.rank, verbose=opts.progress.verbose)

    # Setup sampler.
    sampler = PPLSampler(G=opts.G, G_kwargs=opts.G_kwargs, epsilon=epsilon, space=space, sampling=sampling, crop=crop, vgg16=vgg16)
    sampler.eval().requires_grad_(False).to(opts.device)
    if jit:
        c = torch.zeros([batch_size, opts.G.c_dim], device=opts.device)
        sampler = torch.jit.trace(sampler, [c], check_trace=False)

    # Sampling loop.
    dist = []
    progress = opts.progress.sub(tag='ppl sampling', num_items=num_samples)
    for batch_start in range(0, num_samples, batch_size * opts.num_gpus):
        progress.update(batch_start)
        c = [dataset.get_label(np.random.randint(len(dataset))) for _i in range(batch_size)]
        c = torch.from_numpy(np.stack(c)).pin_memory().to(opts.device)
        x = sampler(c)
        for src in range(opts.num_gpus):
            y = x.clone()
            if opts.num_gpus > 1:
                torch.distributed.broadcast(y, src=src)
            dist.append(y)
    progress.update(num_samples)

    # Compute PPL.
    if opts.rank != 0:
        return float('nan')
    dist = torch.cat(dist)[:num_samples].cpu().numpy()
    lo = np.percentile(dist, 1, interpolation='lower')
    hi = np.percentile(dist, 99, interpolation='higher')
    ppl = np.extract(np.logical_and(dist >= lo, dist <= hi), dist).mean()
    return float(ppl)

#----------------------------------------------------------------------------
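Reviewer note on the last step of `compute_ppl`: the per-sample distances are not averaged directly. Values below the 1st percentile (rounded down to an existing sample) and above the 99th percentile (rounded up) are discarded first, so a few extreme path segments cannot dominate the score. A minimal stdlib sketch of that trimming follows. `trimmed_mean` is a hypothetical helper, and its index arithmetic is an assumption meant to approximate `np.percentile` with the `'lower'` and `'higher'` modes rather than reproduce NumPy exactly.

```python
import math


def trimmed_mean(values, lo_pct=1, hi_pct=99):
    """Mean of the values lying between two order-statistic cut-offs."""
    ordered = sorted(values)
    n = len(ordered)
    # 'lower' rounds the fractional rank down, 'higher' rounds it up.
    lo = ordered[math.floor((n - 1) * lo_pct / 100)]
    hi = ordered[math.ceil((n - 1) * hi_pct / 100)]
    kept = [v for v in values if lo <= v <= hi]
    return sum(kept) / len(kept)


if __name__ == "__main__":
    # 200 well-behaved distances plus one extreme outlier.
    distances = [float(v) for v in range(200)] + [1e6]
    print(trimmed_mean(distances))
```

With the outlier included, a plain mean would be in the thousands, while the trimmed mean stays near the centre of the bulk of the samples.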
metrics/precision_recall.py
ADDED
@@ -0,0 +1,62 @@
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto. Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.

"""Precision/Recall (PR) from the paper "Improved Precision and Recall
Metric for Assessing Generative Models". Matches the original implementation
by Kynkaanniemi et al. at
https://github.com/kynkaat/improved-precision-and-recall-metric/blob/master/precision_recall.py"""

import torch
from . import metric_utils

#----------------------------------------------------------------------------

def compute_distances(row_features, col_features, num_gpus, rank, col_batch_size):
    assert 0 <= rank < num_gpus
    num_cols = col_features.shape[0]
    num_batches = ((num_cols - 1) // col_batch_size // num_gpus + 1) * num_gpus
    col_batches = torch.nn.functional.pad(col_features, [0, 0, 0, -num_cols % num_batches]).chunk(num_batches)
    dist_batches = []
    for col_batch in col_batches[rank :: num_gpus]:
        dist_batch = torch.cdist(row_features.unsqueeze(0), col_batch.unsqueeze(0))[0]
        for src in range(num_gpus):
            dist_broadcast = dist_batch.clone()
            if num_gpus > 1:
                torch.distributed.broadcast(dist_broadcast, src=src)
            dist_batches.append(dist_broadcast.cpu() if rank == 0 else None)
    return torch.cat(dist_batches, dim=1)[:, :num_cols] if rank == 0 else None

#----------------------------------------------------------------------------

def compute_pr(opts, max_real, num_gen, nhood_size, row_batch_size, col_batch_size):
    detector_url = 'https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/metrics/vgg16.pt'
    detector_kwargs = dict(return_features=True)

    real_features = metric_utils.compute_feature_stats_for_dataset(
        opts=opts, detector_url=detector_url, detector_kwargs=detector_kwargs,
        rel_lo=0, rel_hi=0, capture_all=True, max_items=max_real).get_all_torch().to(torch.float16).to(opts.device)

    gen_features = metric_utils.compute_feature_stats_for_generator(
        opts=opts, detector_url=detector_url, detector_kwargs=detector_kwargs,
        rel_lo=0, rel_hi=1, capture_all=True, max_items=num_gen).get_all_torch().to(torch.float16).to(opts.device)

    results = dict()
    for name, manifold, probes in [('precision', real_features, gen_features), ('recall', gen_features, real_features)]:
        kth = []
        for manifold_batch in manifold.split(row_batch_size):
            dist = compute_distances(row_features=manifold_batch, col_features=manifold, num_gpus=opts.num_gpus, rank=opts.rank, col_batch_size=col_batch_size)
            kth.append(dist.to(torch.float32).kthvalue(nhood_size + 1).values.to(torch.float16) if opts.rank == 0 else None)
        kth = torch.cat(kth) if opts.rank == 0 else None
        pred = []
        for probes_batch in probes.split(row_batch_size):
            dist = compute_distances(row_features=probes_batch, col_features=manifold, num_gpus=opts.num_gpus, rank=opts.rank, col_batch_size=col_batch_size)
            pred.append((dist <= kth).any(dim=1) if opts.rank == 0 else None)
        results[name] = float(torch.cat(pred).to(torch.float32).mean() if opts.rank == 0 else 'nan')
    return results['precision'], results['recall']

#----------------------------------------------------------------------------
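Reviewer note on `compute_pr`: each manifold point gets a radius equal to the distance to its `nhood_size`-th nearest neighbour (`kthvalue(nhood_size + 1)`, because the point itself sits at distance 0), and a probe counts as covered if it falls inside at least one such ball. Precision uses the real features as the manifold and the generated features as probes; recall swaps the two. The sketch below illustrates that rule on 2-D points with the standard library only. `kth_nn_radii` and `coverage` are hypothetical names, and the batching, float16 casts and multi-GPU broadcast of the original are deliberately left out.

```python
import math


def kth_nn_radii(manifold, k):
    """Distance from each manifold point to its k-th nearest neighbour."""
    radii = []
    for p in manifold:
        dists = sorted(math.dist(p, q) for q in manifold)
        radii.append(dists[k])  # dists[0] is the point itself (distance 0)
    return radii


def coverage(manifold, probes, k):
    """Fraction of probes lying inside at least one manifold ball."""
    radii = kth_nn_radii(manifold, k)
    hits = 0
    for probe in probes:
        if any(math.dist(probe, m) <= r for m, r in zip(manifold, radii)):
            hits += 1
    return hits / len(probes)


if __name__ == "__main__":
    real = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    gen = [(0.5, 0.5), (5.0, 5.0)]
    print("precision:", coverage(real, gen, k=1))
    print("recall:", coverage(gen, real, k=1))
```

The toy data also shows a known quirk of the estimator: a single far-away generated sample inflates the radii of the generated manifold, so recall can look better than the samples deserve.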
projector.py
ADDED
@@ -0,0 +1,212 @@
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto. Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.

"""Project given image to the latent space of pretrained network pickle."""

import copy
import os
from time import perf_counter

import click
import imageio
import numpy as np
import PIL.Image
import torch
import torch.nn.functional as F

import dnnlib
import legacy

def project(
    G,
    target: torch.Tensor, # [C,H,W] and dynamic range [0,255], W & H must match G output resolution
    *,
    num_steps = 1000,
    w_avg_samples = 10000,
    initial_learning_rate = 0.1,
    initial_noise_factor = 0.05,
    lr_rampdown_length = 0.25,
    lr_rampup_length = 0.05,
    noise_ramp_length = 0.75,
    regularize_noise_weight = 1e5,
    verbose = False,
    device: torch.device
):
    assert target.shape == (G.img_channels, G.img_resolution, G.img_resolution)

    def logprint(*args):
        if verbose:
            print(*args)

    G = copy.deepcopy(G).eval().requires_grad_(False).to(device) # type: ignore

    # Compute w stats.
    logprint(f'Computing W midpoint and stddev using {w_avg_samples} samples...')
    z_samples = np.random.RandomState(123).randn(w_avg_samples, G.z_dim)
    w_samples = G.mapping(torch.from_numpy(z_samples).to(device), None) # [N, L, C]
    w_samples = w_samples[:, :1, :].cpu().numpy().astype(np.float32) # [N, 1, C]
    w_avg = np.mean(w_samples, axis=0, keepdims=True) # [1, 1, C]
    w_std = (np.sum((w_samples - w_avg) ** 2) / w_avg_samples) ** 0.5

    # Setup noise inputs.
    noise_bufs = { name: buf for (name, buf) in G.synthesis.named_buffers() if 'noise_const' in name }

    # Load VGG16 feature detector.
    url = 'https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/metrics/vgg16.pt'
    with dnnlib.util.open_url(url) as f:
        vgg16 = torch.jit.load(f).eval().to(device)

    # Features for target image.
    target_images = target.unsqueeze(0).to(device).to(torch.float32)
    if target_images.shape[2] > 256:
        target_images = F.interpolate(target_images, size=(256, 256), mode='area')
    target_features = vgg16(target_images, resize_images=False, return_lpips=True)

    w_opt = torch.tensor(w_avg, dtype=torch.float32, device=device, requires_grad=True) # pylint: disable=not-callable
    w_out = torch.zeros([num_steps] + list(w_opt.shape[1:]), dtype=torch.float32, device=device)
    optimizer = torch.optim.Adam([w_opt] + list(noise_bufs.values()), betas=(0.9, 0.999), lr=initial_learning_rate)

    # Init noise.
    for buf in noise_bufs.values():
        buf[:] = torch.randn_like(buf)
        buf.requires_grad = True

    for step in range(num_steps):
        # Learning rate schedule.
        t = step / num_steps
        w_noise_scale = w_std * initial_noise_factor * max(0.0, 1.0 - t / noise_ramp_length) ** 2
        lr_ramp = min(1.0, (1.0 - t) / lr_rampdown_length)
        lr_ramp = 0.5 - 0.5 * np.cos(lr_ramp * np.pi)
        lr_ramp = lr_ramp * min(1.0, t / lr_rampup_length)
        lr = initial_learning_rate * lr_ramp
        for param_group in optimizer.param_groups:
            param_group['lr'] = lr

        # Synth images from opt_w.
        w_noise = torch.randn_like(w_opt) * w_noise_scale
        ws = (w_opt + w_noise).repeat([1, G.mapping.num_ws, 1])
        synth_images = G.synthesis(ws, noise_mode='const')

        # Downsample image to 256x256 if it's larger than that. VGG was built for 224x224 images.
        synth_images = (synth_images + 1) * (255/2)
        if synth_images.shape[2] > 256:
            synth_images = F.interpolate(synth_images, size=(256, 256), mode='area')

        # Features for synth images.
        synth_features = vgg16(synth_images, resize_images=False, return_lpips=True)
        dist = (target_features - synth_features).square().sum()

        # Noise regularization.
        reg_loss = 0.0
        for v in noise_bufs.values():
            noise = v[None,None,:,:] # must be [1,1,H,W] for F.avg_pool2d()
            while True:
                reg_loss += (noise*torch.roll(noise, shifts=1, dims=3)).mean()**2
                reg_loss += (noise*torch.roll(noise, shifts=1, dims=2)).mean()**2
                if noise.shape[2] <= 8:
                    break
                noise = F.avg_pool2d(noise, kernel_size=2)
        loss = dist + reg_loss * regularize_noise_weight

        # Step
        optimizer.zero_grad(set_to_none=True)
        loss.backward()
        optimizer.step()
        logprint(f'step {step+1:>4d}/{num_steps}: dist {dist:<4.2f} loss {float(loss):<5.2f}')

        # Save projected W for each optimization step.
        w_out[step] = w_opt.detach()[0]

        # Normalize noise.
        with torch.no_grad():
            for buf in noise_bufs.values():
                buf -= buf.mean()
                buf *= buf.square().mean().rsqrt()

    return w_out.repeat([1, G.mapping.num_ws, 1])

#----------------------------------------------------------------------------

@click.command()
@click.option('--network', 'network_pkl', help='Network pickle filename', required=True)
@click.option('--target', 'target_fname', help='Target image file to project to', required=True, metavar='FILE')
@click.option('--num-steps', help='Number of optimization steps', type=int, default=1000, show_default=True)
@click.option('--seed', help='Random seed', type=int, default=303, show_default=True)
@click.option('--save-video', help='Save an mp4 video of optimization progress', type=bool, default=True, show_default=True)
@click.option('--outdir', help='Where to save the output images', required=True, metavar='DIR')
def run_projection(
    network_pkl: str,
    target_fname: str,
    outdir: str,
    save_video: bool,
    seed: int,
    num_steps: int
):
    """Project given image to the latent space of pretrained network pickle.

    Examples:

    \b
    python projector.py --outdir=out --target=~/mytargetimg.png \\
        --network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/ffhq.pkl
    """
    np.random.seed(seed)
    torch.manual_seed(seed)

    # Load networks.
    print('Loading networks from "%s"...' % network_pkl)
    device = torch.device('cuda')
    with dnnlib.util.open_url(network_pkl) as fp:
        G = legacy.load_network_pkl(fp)['G_ema'].requires_grad_(False).to(device) # type: ignore

    # Load target image.
    target_pil = PIL.Image.open(target_fname).convert('RGB')
    w, h = target_pil.size
    s = min(w, h)
    target_pil = target_pil.crop(((w - s) // 2, (h - s) // 2, (w + s) // 2, (h + s) // 2))
    target_pil = target_pil.resize((G.img_resolution, G.img_resolution), PIL.Image.LANCZOS)
    target_uint8 = np.array(target_pil, dtype=np.uint8)

    # Optimize projection.
    start_time = perf_counter()
    projected_w_steps = project(
        G,
        target=torch.tensor(target_uint8.transpose([2, 0, 1]), device=device), # pylint: disable=not-callable
        num_steps=num_steps,
        device=device,
        verbose=True
    )
    print (f'Elapsed: {(perf_counter()-start_time):.1f} s')

    # Render debug output: optional video and projected image and W vector.
    os.makedirs(outdir, exist_ok=True)
    if save_video:
        video = imageio.get_writer(f'{outdir}/proj.mp4', mode='I', fps=10, codec='libx264', bitrate='16M')
        print (f'Saving optimization progress video "{outdir}/proj.mp4"')
        for projected_w in projected_w_steps:
            synth_image = G.synthesis(projected_w.unsqueeze(0), noise_mode='const')
            synth_image = (synth_image + 1) * (255/2)
            synth_image = synth_image.permute(0, 2, 3, 1).clamp(0, 255).to(torch.uint8)[0].cpu().numpy()
            video.append_data(np.concatenate([target_uint8, synth_image], axis=1))
        video.close()

    # Save final projected frame and W vector.
    target_pil.save(f'{outdir}/target.png')
    projected_w = projected_w_steps[-1]
    synth_image = G.synthesis(projected_w.unsqueeze(0), noise_mode='const')
    synth_image = (synth_image + 1) * (255/2)
    synth_image = synth_image.permute(0, 2, 3, 1).clamp(0, 255).to(torch.uint8)[0].cpu().numpy()
    PIL.Image.fromarray(synth_image, 'RGB').save(f'{outdir}/proj.png')
    np.savez(f'{outdir}/projected_w.npz', w=projected_w.unsqueeze(0).cpu().numpy())

#----------------------------------------------------------------------------

if __name__ == "__main__":
    run_projection() # pylint: disable=no-value-for-parameter

#----------------------------------------------------------------------------
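Reviewer note on the schedule inside `project()`: the learning rate ramps up linearly over the first 5% of the steps, holds at `initial_learning_rate`, and follows a cosine ramp-down over the last 25%, while the noise added to `w_opt` decays quadratically and reaches zero at 75% of the run. The function below isolates that arithmetic so the curve can be inspected without a GPU or a network pickle. `projection_schedule` is a hypothetical helper, and `w_std=1.0` is a placeholder assumption, since the real value is estimated from sampled W vectors.

```python
import math


def projection_schedule(step, num_steps,
                        initial_learning_rate=0.1,
                        initial_noise_factor=0.05,
                        lr_rampdown_length=0.25,
                        lr_rampup_length=0.05,
                        noise_ramp_length=0.75,
                        w_std=1.0):
    """Return (learning_rate, w_noise_scale) for one optimization step."""
    t = step / num_steps
    # Quadratic decay of the latent noise, clamped at zero.
    w_noise_scale = w_std * initial_noise_factor * max(0.0, 1.0 - t / noise_ramp_length) ** 2
    # Cosine ramp-down near the end, linear ramp-up at the start.
    lr_ramp = min(1.0, (1.0 - t) / lr_rampdown_length)
    lr_ramp = 0.5 - 0.5 * math.cos(lr_ramp * math.pi)
    lr_ramp = lr_ramp * min(1.0, t / lr_rampup_length)
    return initial_learning_rate * lr_ramp, w_noise_scale


if __name__ == "__main__":
    for step in (0, 25, 50, 500, 750, 900, 999):
        lr, noise = projection_schedule(step, 1000)
        print(f"step {step:4d}  lr {lr:.5f}  noise {noise:.5f}")
```

Printing a handful of steps is usually enough to confirm that a changed `--num-steps` value still leaves a sensible plateau between the two ramps.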
requirements.txt
ADDED
@@ -0,0 +1,6 @@
gradio torch numpy pillow tqdm
gradio
torch
numpy
pillow
tqdm
style_mixing.py
ADDED
@@ -0,0 +1,118 @@
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto. Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.

"""Generate style mixing image matrix using pretrained network pickle."""

import os
import re
from typing import List

import click
import dnnlib
import numpy as np
import PIL.Image
import torch

import legacy

#----------------------------------------------------------------------------

def num_range(s: str) -> List[int]:
    '''Accept either a comma separated list of numbers 'a,b,c' or a range 'a-c' and return as a list of ints.'''

    range_re = re.compile(r'^(\d+)-(\d+)$')
    m = range_re.match(s)
    if m:
        return list(range(int(m.group(1)), int(m.group(2))+1))
    vals = s.split(',')
    return [int(x) for x in vals]

#----------------------------------------------------------------------------

@click.command()
@click.option('--network', 'network_pkl', help='Network pickle filename', required=True)
@click.option('--rows', 'row_seeds', type=num_range, help='Random seeds to use for image rows', required=True)
@click.option('--cols', 'col_seeds', type=num_range, help='Random seeds to use for image columns', required=True)
@click.option('--styles', 'col_styles', type=num_range, help='Style layer range', default='0-6', show_default=True)
@click.option('--trunc', 'truncation_psi', type=float, help='Truncation psi', default=1, show_default=True)
@click.option('--noise-mode', help='Noise mode', type=click.Choice(['const', 'random', 'none']), default='const', show_default=True)
@click.option('--outdir', type=str, required=True)
def generate_style_mix(
    network_pkl: str,
    row_seeds: List[int],
    col_seeds: List[int],
    col_styles: List[int],
    truncation_psi: float,
    noise_mode: str,
    outdir: str
):
    """Generate images using pretrained network pickle.

    Examples:

    \b
    python style_mixing.py --outdir=out --rows=85,100,75,458,1500 --cols=55,821,1789,293 \\
        --network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/metfaces.pkl
    """
    print('Loading networks from "%s"...' % network_pkl)
    device = torch.device('cuda')
    with dnnlib.util.open_url(network_pkl) as f:
        G = legacy.load_network_pkl(f)['G_ema'].to(device) # type: ignore

    os.makedirs(outdir, exist_ok=True)

    print('Generating W vectors...')
    all_seeds = list(set(row_seeds + col_seeds))
    all_z = np.stack([np.random.RandomState(seed).randn(G.z_dim) for seed in all_seeds])
    all_w = G.mapping(torch.from_numpy(all_z).to(device), None)
    w_avg = G.mapping.w_avg
    all_w = w_avg + (all_w - w_avg) * truncation_psi
    w_dict = {seed: w for seed, w in zip(all_seeds, list(all_w))}

    print('Generating images...')
    all_images = G.synthesis(all_w, noise_mode=noise_mode)
    all_images = (all_images.permute(0, 2, 3, 1) * 127.5 + 128).clamp(0, 255).to(torch.uint8).cpu().numpy()
    image_dict = {(seed, seed): image for seed, image in zip(all_seeds, list(all_images))}

    print('Generating style-mixed images...')
    for row_seed in row_seeds:
        for col_seed in col_seeds:
            w = w_dict[row_seed].clone()
            w[col_styles] = w_dict[col_seed][col_styles]
            image = G.synthesis(w[np.newaxis], noise_mode=noise_mode)
            image = (image.permute(0, 2, 3, 1) * 127.5 + 128).clamp(0, 255).to(torch.uint8)
            image_dict[(row_seed, col_seed)] = image[0].cpu().numpy()

    print('Saving images...')
    os.makedirs(outdir, exist_ok=True)
    for (row_seed, col_seed), image in image_dict.items():
        PIL.Image.fromarray(image, 'RGB').save(f'{outdir}/{row_seed}-{col_seed}.png')

    print('Saving image grid...')
    W = G.img_resolution
    H = G.img_resolution
    canvas = PIL.Image.new('RGB', (W * (len(col_seeds) + 1), H * (len(row_seeds) + 1)), 'black')
    for row_idx, row_seed in enumerate([0] + row_seeds):
|
101 |
+
for col_idx, col_seed in enumerate([0] + col_seeds):
|
102 |
+
if row_idx == 0 and col_idx == 0:
|
103 |
+
continue
|
104 |
+
key = (row_seed, col_seed)
|
105 |
+
if row_idx == 0:
|
106 |
+
key = (col_seed, col_seed)
|
107 |
+
if col_idx == 0:
|
108 |
+
key = (row_seed, row_seed)
|
109 |
+
canvas.paste(PIL.Image.fromarray(image_dict[key], 'RGB'), (W * col_idx, H * row_idx))
|
110 |
+
canvas.save(f'{outdir}/grid.png')
|
111 |
+
|
112 |
+
|
113 |
+
#----------------------------------------------------------------------------
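The two latent-space steps in `generate_style_mix` (truncation toward `w_avg`, then overwriting selected style layers) can be sketched without PyTorch. This is a minimal illustration on plain Python lists, not the script's tensor code; the helper names `truncate` and `style_mix` and the toy latents are assumptions made for the example.

```python
from typing import List, Sequence

Vector = List[float]

def truncate(w: Sequence[Vector], w_avg: Vector, psi: float) -> List[Vector]:
    """Pull every per-layer vector toward w_avg: w_avg + (w - w_avg) * psi."""
    return [[a + (x - a) * psi for x, a in zip(layer, w_avg)] for layer in w]

def style_mix(w_row: Sequence[Vector], w_col: Sequence[Vector],
              col_styles: Sequence[int]) -> List[Vector]:
    """Copy w_row, then overwrite the layers listed in col_styles with w_col's."""
    mixed = [list(layer) for layer in w_row]
    for i in col_styles:
        mixed[i] = list(w_col[i])
    return mixed

# Hypothetical toy latents: 4 layers of 2-dimensional vectors.
w_row = [[1.0, 1.0] for _ in range(4)]
w_col = [[9.0, 9.0] for _ in range(4)]
mixed = style_mix(w_row, w_col, col_styles=[0, 1])
halved = truncate([[2.0, 4.0]], w_avg=[0.0, 0.0], psi=0.5)
print(mixed)
print(halved)
```

With `psi=1` (the script's default) truncation leaves the latents unchanged; smaller values move them toward the average.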
|
114 |
+
|
115 |
+
if __name__ == "__main__":
|
116 |
+
generate_style_mix() # pylint: disable=no-value-for-parameter
|
117 |
+
|
118 |
+
#----------------------------------------------------------------------------
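The grid-assembly loop at the end of `generate_style_mix` picks an image key for every cell: the first row and first column show the unmixed source images, and the corner is left black. A small sketch of just that key selection follows; `grid_keys` is a hypothetical helper introduced for illustration and does no image work.

```python
from typing import Dict, List, Tuple

def grid_keys(row_seeds: List[int], col_seeds: List[int]) -> Dict[Tuple[int, int], Tuple[int, int]]:
    """Map each (row_idx, col_idx) grid cell to the (row_seed, col_seed) image key."""
    cells = {}
    for row_idx, row_seed in enumerate([0] + row_seeds):
        for col_idx, col_seed in enumerate([0] + col_seeds):
            if row_idx == 0 and col_idx == 0:
                continue  # top-left corner stays black
            key = (row_seed, col_seed)
            if row_idx == 0:
                key = (col_seed, col_seed)  # header row: unmixed column source
            if col_idx == 0:
                key = (row_seed, row_seed)  # header column: unmixed row source
            cells[(row_idx, col_idx)] = key
    return cells

cells = grid_keys(row_seeds=[85, 100], col_seeds=[55])
for position, key in sorted(cells.items()):
    print(position, key)
```

The leading `0` in `[0] + row_seeds` is only a placeholder for the header position; it is never used as a seed.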
|
torch_utils/__init__.py
ADDED
@@ -0,0 +1,9 @@
|
1 |
+
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
|
2 |
+
#
|
3 |
+
# NVIDIA CORPORATION and its licensors retain all intellectual property
|
4 |
+
# and proprietary rights in and to this software, related documentation
|
5 |
+
# and any modifications thereto. Any use, reproduction, disclosure or
|
6 |
+
# distribution of this software and related documentation without an express
|
7 |
+
# license agreement from NVIDIA CORPORATION is strictly prohibited.
|
8 |
+
|
9 |
+
# empty
|
torch_utils/custom_ops.py
ADDED
@@ -0,0 +1,126 @@
|
1 |
+
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
|
2 |
+
#
|
3 |
+
# NVIDIA CORPORATION and its licensors retain all intellectual property
|
4 |
+
# and proprietary rights in and to this software, related documentation
|
5 |
+
# and any modifications thereto. Any use, reproduction, disclosure or
|
6 |
+
# distribution of this software and related documentation without an express
|
7 |
+
# license agreement from NVIDIA CORPORATION is strictly prohibited.
|
8 |
+
|
9 |
+
import os
|
10 |
+
import glob
|
11 |
+
import torch
|
12 |
+
import torch.utils.cpp_extension
|
13 |
+
import importlib
|
14 |
+
import hashlib
|
15 |
+
import shutil
|
16 |
+
from pathlib import Path
|
17 |
+
|
18 |
+
from torch.utils.file_baton import FileBaton
|
19 |
+
|
20 |
+
#----------------------------------------------------------------------------
|
21 |
+
# Global options.
|
22 |
+
|
23 |
+
verbosity = 'brief' # Verbosity level: 'none', 'brief', 'full'
|
24 |
+
|
25 |
+
#----------------------------------------------------------------------------
|
26 |
+
# Internal helper funcs.
|
27 |
+
|
28 |
+
def _find_compiler_bindir():
|
29 |
+
patterns = [
|
30 |
+
'C:/Program Files (x86)/Microsoft Visual Studio/*/Professional/VC/Tools/MSVC/*/bin/Hostx64/x64',
|
31 |
+
'C:/Program Files (x86)/Microsoft Visual Studio/*/BuildTools/VC/Tools/MSVC/*/bin/Hostx64/x64',
|
32 |
+
'C:/Program Files (x86)/Microsoft Visual Studio/*/Community/VC/Tools/MSVC/*/bin/Hostx64/x64',
|
33 |
+
'C:/Program Files (x86)/Microsoft Visual Studio */vc/bin',
|
34 |
+
]
|
35 |
+
for pattern in patterns:
|
36 |
+
matches = sorted(glob.glob(pattern))
|
37 |
+
if len(matches):
|
38 |
+
return matches[-1]
|
39 |
+
return None
|
40 |
+
|
41 |
+
#----------------------------------------------------------------------------
|
42 |
+
# Main entry point for compiling and loading C++/CUDA plugins.
|
43 |
+
|
44 |
+
_cached_plugins = dict()
|
45 |
+
|
46 |
+
def get_plugin(module_name, sources, **build_kwargs):
|
47 |
+
assert verbosity in ['none', 'brief', 'full']
|
48 |
+
|
49 |
+
# Already cached?
|
50 |
+
if module_name in _cached_plugins:
|
51 |
+
return _cached_plugins[module_name]
|
52 |
+
|
53 |
+
# Print status.
|
54 |
+
if verbosity == 'full':
|
55 |
+
print(f'Setting up PyTorch plugin "{module_name}"...')
|
56 |
+
elif verbosity == 'brief':
|
57 |
+
print(f'Setting up PyTorch plugin "{module_name}"... ', end='', flush=True)
|
58 |
+
|
59 |
+
try: # pylint: disable=too-many-nested-blocks
|
60 |
+
# Make sure we can find the necessary compiler binaries.
|
61 |
+
if os.name == 'nt' and os.system("where cl.exe >nul 2>nul") != 0:
|
62 |
+
compiler_bindir = _find_compiler_bindir()
|
63 |
+
if compiler_bindir is None:
|
64 |
+
raise RuntimeError(f'Could not find MSVC/GCC/CLANG installation on this computer. Check _find_compiler_bindir() in "{__file__}".')
|
65 |
+
os.environ['PATH'] += ';' + compiler_bindir
|
66 |
+
|
67 |
+
# Compile and load.
|
68 |
+
verbose_build = (verbosity == 'full')
|
69 |
+
|
70 |
+
# Incremental build md5sum trickery. Copies all the input source files
|
71 |
+
# into a cached build directory under a combined md5 digest of the input
|
72 |
+
# source files. Copying is done only if the combined digest has changed.
|
73 |
+
# This keeps input file timestamps and filenames the same as in previous
|
74 |
+
# extension builds, allowing for fast incremental rebuilds.
|
75 |
+
#
|
76 |
+
# This optimization is done only in case all the source files reside in
|
77 |
+
# a single directory (just for simplicity) and if the TORCH_EXTENSIONS_DIR
|
78 |
+
# environment variable is set (we take this as a signal that the user
|
79 |
+
# actually cares about this.)
|
80 |
+
source_dirs_set = set(os.path.dirname(source) for source in sources)
|
81 |
+
if len(source_dirs_set) == 1 and ('TORCH_EXTENSIONS_DIR' in os.environ):
|
82 |
+
all_source_files = sorted(list(x for x in Path(list(source_dirs_set)[0]).iterdir() if x.is_file()))
|
83 |
+
|
84 |
+
# Compute a combined hash digest for all source files in the same
|
85 |
+
# custom op directory (usually .cu, .cpp, .py and .h files).
|
86 |
+
hash_md5 = hashlib.md5()
|
87 |
+
for src in all_source_files:
|
88 |
+
with open(src, 'rb') as f:
|
89 |
+
hash_md5.update(f.read())
|
90 |
+
build_dir = torch.utils.cpp_extension._get_build_directory(module_name, verbose=verbose_build) # pylint: disable=protected-access
|
91 |
+
digest_build_dir = os.path.join(build_dir, hash_md5.hexdigest())
|
92 |
+
|
93 |
+
if not os.path.isdir(digest_build_dir):
|
94 |
+
os.makedirs(digest_build_dir, exist_ok=True)
|
95 |
+
baton = FileBaton(os.path.join(digest_build_dir, 'lock'))
|
96 |
+
if baton.try_acquire():
|
97 |
+
try:
|
98 |
+
for src in all_source_files:
|
99 |
+
shutil.copyfile(src, os.path.join(digest_build_dir, os.path.basename(src)))
|
100 |
+
finally:
|
101 |
+
baton.release()
|
102 |
+
else:
|
103 |
+
# Someone else is copying source files under the digest dir,
|
104 |
+
# wait until done and continue.
|
105 |
+
baton.wait()
|
106 |
+
digest_sources = [os.path.join(digest_build_dir, os.path.basename(x)) for x in sources]
|
107 |
+
torch.utils.cpp_extension.load(name=module_name, build_directory=build_dir,
|
108 |
+
verbose=verbose_build, sources=digest_sources, **build_kwargs)
|
109 |
+
else:
|
110 |
+
torch.utils.cpp_extension.load(name=module_name, verbose=verbose_build, sources=sources, **build_kwargs)
|
111 |
+
module = importlib.import_module(module_name)
|
112 |
+
|
113 |
+
except:
|
114 |
+
if verbosity == 'brief':
|
115 |
+
print('Failed!')
|
116 |
+
raise
|
117 |
+
|
118 |
+
# Print status and add to cache.
|
119 |
+
if verbosity == 'full':
|
120 |
+
print(f'Done setting up PyTorch plugin "{module_name}".')
|
121 |
+
elif verbosity == 'brief':
|
122 |
+
print('Done.')
|
123 |
+
_cached_plugins[module_name] = module
|
124 |
+
return module
|
125 |
+
|
126 |
+
#----------------------------------------------------------------------------
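The "incremental build md5sum trickery" in `get_plugin` hinges on one idea: hash the bytes of every file in the source directory and use the digest as a subdirectory name, so an unchanged source tree maps to the same cached build directory. A minimal standard-library sketch of that digest step is shown below; `combined_digest` and the file names are assumptions for the example, and the actual copy, lock, and build steps are left out.

```python
import hashlib
import tempfile
from pathlib import Path

def combined_digest(source_dir: Path) -> str:
    """MD5 over the bytes of every regular file in source_dir, in sorted order."""
    hash_md5 = hashlib.md5()
    for src in sorted(p for p in source_dir.iterdir() if p.is_file()):
        hash_md5.update(src.read_bytes())
    return hash_md5.hexdigest()

with tempfile.TemporaryDirectory() as tmp:
    src_dir = Path(tmp)
    (src_dir / 'op.cpp').write_text('// v1\n')    # hypothetical file names
    (src_dir / 'op.cu').write_text('// kernel\n')
    before = combined_digest(src_dir)
    unchanged = combined_digest(src_dir)           # same files, same digest
    (src_dir / 'op.cpp').write_text('// v2\n')
    after = combined_digest(src_dir)               # edited file, new digest

print(before == unchanged, before == after)
```

Sorting the file list matters: without a fixed order the same set of files could hash to different digests between runs.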
|
torch_utils/misc.py
ADDED
@@ -0,0 +1,262 @@
|
1 |
+
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
|
2 |
+
#
|
3 |
+
# NVIDIA CORPORATION and its licensors retain all intellectual property
|
4 |
+
# and proprietary rights in and to this software, related documentation
|
5 |
+
# and any modifications thereto. Any use, reproduction, disclosure or
|
6 |
+
# distribution of this software and related documentation without an express
|
7 |
+
# license agreement from NVIDIA CORPORATION is strictly prohibited.
|
8 |
+
|
9 |
+
import re
|
10 |
+
import contextlib
|
11 |
+
import numpy as np
|
12 |
+
import torch
|
13 |
+
import warnings
|
14 |
+
import dnnlib
|
15 |
+
|
16 |
+
#----------------------------------------------------------------------------
|
17 |
+
# Cached construction of constant tensors. Avoids CPU=>GPU copy when the
|
18 |
+
# same constant is used multiple times.
|
19 |
+
|
20 |
+
_constant_cache = dict()
|
21 |
+
|
22 |
+
def constant(value, shape=None, dtype=None, device=None, memory_format=None):
|
23 |
+
value = np.asarray(value)
|
24 |
+
if shape is not None:
|
25 |
+
shape = tuple(shape)
|
26 |
+
if dtype is None:
|
27 |
+
dtype = torch.get_default_dtype()
|
28 |
+
if device is None:
|
29 |
+
device = torch.device('cpu')
|
30 |
+
if memory_format is None:
|
31 |
+
memory_format = torch.contiguous_format
|
32 |
+
|
33 |
+
key = (value.shape, value.dtype, value.tobytes(), shape, dtype, device, memory_format)
|
34 |
+
tensor = _constant_cache.get(key, None)
|
35 |
+
if tensor is None:
|
36 |
+
tensor = torch.as_tensor(value.copy(), dtype=dtype, device=device)
|
37 |
+
if shape is not None:
|
38 |
+
tensor, _ = torch.broadcast_tensors(tensor, torch.empty(shape))
|
39 |
+
tensor = tensor.contiguous(memory_format=memory_format)
|
40 |
+
_constant_cache[key] = tensor
|
41 |
+
return tensor
|
42 |
+
|
43 |
+
#----------------------------------------------------------------------------
|
44 |
+
# Replace NaN/Inf with specified numerical values.
|
45 |
+
|
46 |
+
try:
|
47 |
+
nan_to_num = torch.nan_to_num # 1.8.0a0
|
48 |
+
except AttributeError:
|
49 |
+
def nan_to_num(input, nan=0.0, posinf=None, neginf=None, *, out=None): # pylint: disable=redefined-builtin
|
50 |
+
assert isinstance(input, torch.Tensor)
|
51 |
+
if posinf is None:
|
52 |
+
posinf = torch.finfo(input.dtype).max
|
53 |
+
if neginf is None:
|
54 |
+
neginf = torch.finfo(input.dtype).min
|
55 |
+
assert nan == 0
|
56 |
+
return torch.clamp(input.unsqueeze(0).nansum(0), min=neginf, max=posinf, out=out)
|
57 |
+
|
58 |
+
#----------------------------------------------------------------------------
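The fallback `nan_to_num` above maps NaN to zero and clamps infinities to the largest finite values of the dtype. The same semantics can be sketched on plain Python floats; `nan_to_num_list` is a hypothetical stand-in that uses `sys.float_info.max` in place of `torch.finfo(...).max`.

```python
import math
import sys
from typing import List, Optional

def nan_to_num_list(values: List[float], nan: float = 0.0,
                    posinf: Optional[float] = None,
                    neginf: Optional[float] = None) -> List[float]:
    """Replace NaN with `nan` and clamp infinities to finite bounds."""
    if posinf is None:
        posinf = sys.float_info.max
    if neginf is None:
        neginf = -sys.float_info.max
    out = []
    for v in values:
        if math.isnan(v):
            out.append(nan)
        else:
            out.append(min(max(v, neginf), posinf))
    return out

cleaned = nan_to_num_list([1.5, float('nan'), float('inf'), float('-inf')])
print(cleaned)
```

Finite inputs pass through unchanged, which is why `check_ddp_consistency` can compare tensors through this function without altering ordinary values.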
|
59 |
+
# Symbolic assert.
|
60 |
+
|
61 |
+
try:
|
62 |
+
symbolic_assert = torch._assert # 1.8.0a0 # pylint: disable=protected-access
|
63 |
+
except AttributeError:
|
64 |
+
symbolic_assert = torch.Assert # 1.7.0
|
65 |
+
|
66 |
+
#----------------------------------------------------------------------------
|
67 |
+
# Context manager to suppress known warnings in torch.jit.trace().
|
68 |
+
|
69 |
+
class suppress_tracer_warnings(warnings.catch_warnings):
|
70 |
+
def __enter__(self):
|
71 |
+
super().__enter__()
|
72 |
+
warnings.simplefilter('ignore', category=torch.jit.TracerWarning)
|
73 |
+
return self
|
74 |
+
|
75 |
+
#----------------------------------------------------------------------------
|
76 |
+
# Assert that the shape of a tensor matches the given list of integers.
|
77 |
+
# None indicates that the size of a dimension is allowed to vary.
|
78 |
+
# Performs symbolic assertion when used in torch.jit.trace().
|
79 |
+
|
80 |
+
def assert_shape(tensor, ref_shape):
|
81 |
+
if tensor.ndim != len(ref_shape):
|
82 |
+
raise AssertionError(f'Wrong number of dimensions: got {tensor.ndim}, expected {len(ref_shape)}')
|
83 |
+
for idx, (size, ref_size) in enumerate(zip(tensor.shape, ref_shape)):
|
84 |
+
if ref_size is None:
|
85 |
+
pass
|
86 |
+
elif isinstance(ref_size, torch.Tensor):
|
87 |
+
with suppress_tracer_warnings(): # as_tensor results are registered as constants
|
88 |
+
symbolic_assert(torch.equal(torch.as_tensor(size), ref_size), f'Wrong size for dimension {idx}')
|
89 |
+
elif isinstance(size, torch.Tensor):
|
90 |
+
with suppress_tracer_warnings(): # as_tensor results are registered as constants
|
91 |
+
symbolic_assert(torch.equal(size, torch.as_tensor(ref_size)), f'Wrong size for dimension {idx}: expected {ref_size}')
|
92 |
+
elif size != ref_size:
|
93 |
+
raise AssertionError(f'Wrong size for dimension {idx}: got {size}, expected {ref_size}')
|
94 |
+
|
95 |
+
#----------------------------------------------------------------------------
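Outside of tracing, `assert_shape` reduces to a per-dimension comparison in which `None` acts as a wildcard. The sketch below shows that eager-mode logic on plain tuples; `check_shape` is a hypothetical name, and the symbolic (tensor-valued) branches are intentionally omitted.

```python
from typing import Optional, Sequence

def check_shape(shape: Sequence[int], ref_shape: Sequence[Optional[int]]) -> None:
    """Raise AssertionError unless shape matches ref_shape; None matches any size."""
    if len(shape) != len(ref_shape):
        raise AssertionError(
            f'Wrong number of dimensions: got {len(shape)}, expected {len(ref_shape)}')
    for idx, (size, ref_size) in enumerate(zip(shape, ref_shape)):
        if ref_size is not None and size != ref_size:
            raise AssertionError(
                f'Wrong size for dimension {idx}: got {size}, expected {ref_size}')

check_shape((8, 3, 256, 256), [None, 3, 256, 256])  # batch size may vary

message = ''
try:
    check_shape((8, 1, 256, 256), [None, 3, 256, 256])
except AssertionError as err:
    message = str(err)
print(message)
```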
|
96 |
+
# Function decorator that calls torch.autograd.profiler.record_function().
|
97 |
+
|
98 |
+
def profiled_function(fn):
|
99 |
+
def decorator(*args, **kwargs):
|
100 |
+
with torch.autograd.profiler.record_function(fn.__name__):
|
101 |
+
return fn(*args, **kwargs)
|
102 |
+
decorator.__name__ = fn.__name__
|
103 |
+
return decorator
|
104 |
+
|
105 |
+
#----------------------------------------------------------------------------
|
106 |
+
# Sampler for torch.utils.data.DataLoader that loops over the dataset
|
107 |
+
# indefinitely, shuffling items as it goes.
|
108 |
+
|
109 |
+
class InfiniteSampler(torch.utils.data.Sampler):
|
110 |
+
def __init__(self, dataset, rank=0, num_replicas=1, shuffle=True, seed=0, window_size=0.5):
|
111 |
+
assert len(dataset) > 0
|
112 |
+
assert num_replicas > 0
|
113 |
+
assert 0 <= rank < num_replicas
|
114 |
+
assert 0 <= window_size <= 1
|
115 |
+
super().__init__(dataset)
|
116 |
+
self.dataset = dataset
|
117 |
+
self.rank = rank
|
118 |
+
self.num_replicas = num_replicas
|
119 |
+
self.shuffle = shuffle
|
120 |
+
self.seed = seed
|
121 |
+
self.window_size = window_size
|
122 |
+
|
123 |
+
def __iter__(self):
|
124 |
+
order = np.arange(len(self.dataset))
|
125 |
+
rnd = None
|
126 |
+
window = 0
|
127 |
+
if self.shuffle:
|
128 |
+
rnd = np.random.RandomState(self.seed)
|
129 |
+
rnd.shuffle(order)
|
130 |
+
window = int(np.rint(order.size * self.window_size))
|
131 |
+
|
132 |
+
idx = 0
|
133 |
+
while True:
|
134 |
+
i = idx % order.size
|
135 |
+
if idx % self.num_replicas == self.rank:
|
136 |
+
yield order[i]
|
137 |
+
if window >= 2:
|
138 |
+
j = (i - rnd.randint(window)) % order.size
|
139 |
+
order[i], order[j] = order[j], order[i]
|
140 |
+
idx += 1
|
141 |
+
|
142 |
+
#----------------------------------------------------------------------------
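The iteration logic of `InfiniteSampler` can be tried without PyTorch or NumPy. The generator below is a sketch under two assumptions: Python's `random.Random` stands in for `np.random.RandomState` (so the actual index sequence differs from the real sampler), and `infinite_indices` is a hypothetical name. It keeps the same structure: a global step counter, round-robin assignment of steps to replicas, and a sliding-window swap that keeps reshuffling the order.

```python
import itertools
import random
from typing import Iterator

def infinite_indices(size: int, rank: int = 0, num_replicas: int = 1,
                     shuffle: bool = True, seed: int = 0,
                     window_size: float = 0.5) -> Iterator[int]:
    """Yield dataset indices forever; each replica takes every num_replicas-th step."""
    order = list(range(size))
    rnd = None
    window = 0
    if shuffle:
        rnd = random.Random(seed)  # stand-in for np.random.RandomState
        rnd.shuffle(order)
        window = int(round(size * window_size))
    idx = 0
    while True:
        i = idx % size
        if idx % num_replicas == rank:
            yield order[i]
        if window >= 2:
            # Swap the current slot with one up to window-1 positions behind it.
            j = (i - rnd.randrange(window)) % size
            order[i], order[j] = order[j], order[i]
        idx += 1

first = list(itertools.islice(
    infinite_indices(5, rank=0, num_replicas=2, shuffle=False), 6))
print(first)
```

With `shuffle=False` the replica simply takes every second step of the repeating sequence `0..4`. With shuffling enabled, the window swap means one pass over the dataset is not guaranteed to visit every index exactly once.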
|
143 |
+
# Utilities for operating with torch.nn.Module parameters and buffers.
|
144 |
+
|
145 |
+
def params_and_buffers(module):
|
146 |
+
assert isinstance(module, torch.nn.Module)
|
147 |
+
return list(module.parameters()) + list(module.buffers())
|
148 |
+
|
149 |
+
def named_params_and_buffers(module):
|
150 |
+
assert isinstance(module, torch.nn.Module)
|
151 |
+
return list(module.named_parameters()) + list(module.named_buffers())
|
152 |
+
|
153 |
+
def copy_params_and_buffers(src_module, dst_module, require_all=False):
|
154 |
+
assert isinstance(src_module, torch.nn.Module)
|
155 |
+
assert isinstance(dst_module, torch.nn.Module)
|
156 |
+
src_tensors = {name: tensor for name, tensor in named_params_and_buffers(src_module)}
|
157 |
+
for name, tensor in named_params_and_buffers(dst_module):
|
158 |
+
assert (name in src_tensors) or (not require_all)
|
159 |
+
if name in src_tensors:
|
160 |
+
tensor.copy_(src_tensors[name].detach()).requires_grad_(tensor.requires_grad)
|
161 |
+
|
162 |
+
#----------------------------------------------------------------------------
|
163 |
+
# Context manager for easily enabling/disabling DistributedDataParallel
|
164 |
+
# synchronization.
|
165 |
+
|
166 |
+
@contextlib.contextmanager
|
167 |
+
def ddp_sync(module, sync):
|
168 |
+
assert isinstance(module, torch.nn.Module)
|
169 |
+
if sync or not isinstance(module, torch.nn.parallel.DistributedDataParallel):
|
170 |
+
yield
|
171 |
+
else:
|
172 |
+
with module.no_sync():
|
173 |
+
yield
|
174 |
+
|
175 |
+
#----------------------------------------------------------------------------
|
176 |
+
# Check DistributedDataParallel consistency across processes.
|
177 |
+
|
178 |
+
def check_ddp_consistency(module, ignore_regex=None):
|
179 |
+
assert isinstance(module, torch.nn.Module)
|
180 |
+
for name, tensor in named_params_and_buffers(module):
|
181 |
+
fullname = type(module).__name__ + '.' + name
|
182 |
+
if ignore_regex is not None and re.fullmatch(ignore_regex, fullname):
|
183 |
+
continue
|
184 |
+
tensor = tensor.detach()
|
185 |
+
other = tensor.clone()
|
186 |
+
torch.distributed.broadcast(tensor=other, src=0)
|
187 |
+
assert (nan_to_num(tensor) == nan_to_num(other)).all(), fullname
|
188 |
+
|
189 |
+
#----------------------------------------------------------------------------
|
190 |
+
# Print summary table of module hierarchy.
|
191 |
+
|
192 |
+
def print_module_summary(module, inputs, max_nesting=3, skip_redundant=True):
|
193 |
+
assert isinstance(module, torch.nn.Module)
|
194 |
+
assert not isinstance(module, torch.jit.ScriptModule)
|
195 |
+
assert isinstance(inputs, (tuple, list))
|
196 |
+
|
197 |
+
# Register hooks.
|
198 |
+
entries = []
|
199 |
+
nesting = [0]
|
200 |
+
def pre_hook(_mod, _inputs):
|
201 |
+
nesting[0] += 1
|
202 |
+
def post_hook(mod, _inputs, outputs):
|
203 |
+
nesting[0] -= 1
|
204 |
+
if nesting[0] <= max_nesting:
|
205 |
+
outputs = list(outputs) if isinstance(outputs, (tuple, list)) else [outputs]
|
206 |
+
outputs = [t for t in outputs if isinstance(t, torch.Tensor)]
|
207 |
+
entries.append(dnnlib.EasyDict(mod=mod, outputs=outputs))
|
208 |
+
hooks = [mod.register_forward_pre_hook(pre_hook) for mod in module.modules()]
|
209 |
+
hooks += [mod.register_forward_hook(post_hook) for mod in module.modules()]
|
210 |
+
|
211 |
+
# Run module.
|
212 |
+
outputs = module(*inputs)
|
213 |
+
for hook in hooks:
|
214 |
+
hook.remove()
|
215 |
+
|
216 |
+
# Identify unique outputs, parameters, and buffers.
|
217 |
+
tensors_seen = set()
|
218 |
+
for e in entries:
|
219 |
+
e.unique_params = [t for t in e.mod.parameters() if id(t) not in tensors_seen]
|
220 |
+
e.unique_buffers = [t for t in e.mod.buffers() if id(t) not in tensors_seen]
|
221 |
+
e.unique_outputs = [t for t in e.outputs if id(t) not in tensors_seen]
|
222 |
+
tensors_seen |= {id(t) for t in e.unique_params + e.unique_buffers + e.unique_outputs}
|
223 |
+
|
224 |
+
# Filter out redundant entries.
|
225 |
+
if skip_redundant:
|
226 |
+
entries = [e for e in entries if len(e.unique_params) or len(e.unique_buffers) or len(e.unique_outputs)]
|
227 |
+
|
228 |
+
# Construct table.
|
229 |
+
rows = [[type(module).__name__, 'Parameters', 'Buffers', 'Output shape', 'Datatype']]
|
230 |
+
rows += [['---'] * len(rows[0])]
|
231 |
+
param_total = 0
|
232 |
+
buffer_total = 0
|
233 |
+
submodule_names = {mod: name for name, mod in module.named_modules()}
|
234 |
+
for e in entries:
|
235 |
+
name = '<top-level>' if e.mod is module else submodule_names[e.mod]
|
236 |
+
param_size = sum(t.numel() for t in e.unique_params)
|
237 |
+
buffer_size = sum(t.numel() for t in e.unique_buffers)
|
238 |
+
output_shapes = [str(list(t.shape)) for t in e.outputs]
|
239 |
+
output_dtypes = [str(t.dtype).split('.')[-1] for t in e.outputs]
|
240 |
+
rows += [[
|
241 |
+
name + (':0' if len(e.outputs) >= 2 else ''),
|
242 |
+
str(param_size) if param_size else '-',
|
243 |
+
str(buffer_size) if buffer_size else '-',
|
244 |
+
(output_shapes + ['-'])[0],
|
245 |
+
(output_dtypes + ['-'])[0],
|
246 |
+
]]
|
247 |
+
for idx in range(1, len(e.outputs)):
|
248 |
+
rows += [[name + f':{idx}', '-', '-', output_shapes[idx], output_dtypes[idx]]]
|
249 |
+
param_total += param_size
|
250 |
+
buffer_total += buffer_size
|
251 |
+
rows += [['---'] * len(rows[0])]
|
252 |
+
rows += [['Total', str(param_total), str(buffer_total), '-', '-']]
|
253 |
+
|
254 |
+
# Print table.
|
255 |
+
widths = [max(len(cell) for cell in column) for column in zip(*rows)]
|
256 |
+
print()
|
257 |
+
for row in rows:
|
258 |
+
print(' '.join(cell + ' ' * (width - len(cell)) for cell, width in zip(row, widths)))
|
259 |
+
print()
|
260 |
+
return outputs
|
261 |
+
|
262 |
+
#----------------------------------------------------------------------------
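The table printing at the end of `print_module_summary` is a small column-alignment routine: transpose the rows to find each column's widest cell, then pad every cell to that width. A standalone sketch follows; `format_table`, the two-space separator, and the sample rows are assumptions for illustration rather than output from a real network.

```python
from typing import List

def format_table(rows: List[List[str]], sep: str = '  ') -> List[str]:
    """Left-align every cell to the width of the widest cell in its column."""
    widths = [max(len(cell) for cell in column) for column in zip(*rows)]
    return [
        sep.join(cell + ' ' * (width - len(cell)) for cell, width in zip(row, widths))
        for row in rows
    ]

rows = [
    ['Generator', 'Parameters', 'Buffers'],   # hypothetical values
    ['---', '---', '---'],
    ['mapping', '2101248', '-'],
    ['Total', '2101248', '-'],
]
lines = format_table(rows)
print('\n'.join(lines))
```

`zip(*rows)` only works as a transpose when every row has the same number of cells, which the summary code ensures by building each row with five entries.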
|
torch_utils/ops/__init__.py
ADDED
@@ -0,0 +1,9 @@
|
1 |
+
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
|
2 |
+
#
|
3 |
+
# NVIDIA CORPORATION and its licensors retain all intellectual property
|
4 |
+
# and proprietary rights in and to this software, related documentation
|
5 |
+
# and any modifications thereto. Any use, reproduction, disclosure or
|
6 |
+
# distribution of this software and related documentation without an express
|
7 |
+
# license agreement from NVIDIA CORPORATION is strictly prohibited.
|
8 |
+
|
9 |
+
# empty
|
torch_utils/ops/bias_act.cpp
ADDED
@@ -0,0 +1,99 @@
|
1 |
+
// Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
|
2 |
+
//
|
3 |
+
// NVIDIA CORPORATION and its licensors retain all intellectual property
|
4 |
+
// and proprietary rights in and to this software, related documentation
|
5 |
+
// and any modifications thereto. Any use, reproduction, disclosure or
|
6 |
+
// distribution of this software and related documentation without an express
|
7 |
+
// license agreement from NVIDIA CORPORATION is strictly prohibited.
|
8 |
+
|
9 |
+
#include <torch/extension.h>
|
10 |
+
#include <ATen/cuda/CUDAContext.h>
|
11 |
+
#include <c10/cuda/CUDAGuard.h>
|
12 |
+
#include "bias_act.h"
|
13 |
+
|
14 |
+
//------------------------------------------------------------------------
|
15 |
+
|
16 |
+
static bool has_same_layout(torch::Tensor x, torch::Tensor y)
|
17 |
+
{
|
18 |
+
if (x.dim() != y.dim())
|
19 |
+
return false;
|
20 |
+
for (int64_t i = 0; i < x.dim(); i++)
|
21 |
+
{
|
22 |
+
if (x.size(i) != y.size(i))
|
23 |
+
return false;
|
24 |
+
if (x.size(i) >= 2 && x.stride(i) != y.stride(i))
|
25 |
+
return false;
|
26 |
+
}
|
27 |
+
return true;
|
28 |
+
}
|
29 |
+
|
30 |
+
//------------------------------------------------------------------------
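To make the rule in `has_same_layout` concrete, here is the same check written in Python over explicit size and stride lists. This is a sketch for illustration only; it takes sizes and strides as plain sequences rather than tensors, and the sample shapes are made up.

```python
from typing import Sequence

def has_same_layout(x_sizes: Sequence[int], x_strides: Sequence[int],
                    y_sizes: Sequence[int], y_strides: Sequence[int]) -> bool:
    """True if sizes match and strides match on every dimension of size >= 2."""
    if len(x_sizes) != len(y_sizes):
        return False
    for xs, xst, ys, yst in zip(x_sizes, x_strides, y_sizes, y_strides):
        if xs != ys:
            return False
        if xs >= 2 and xst != yst:
            return False  # strides of size-1 dimensions never affect addressing
    return True

# Shape [2, 1, 3]: the middle dimension has size 1, so its stride is ignored.
same = has_same_layout([2, 1, 3], [3, 3, 1], [2, 1, 3], [3, 1, 1])
# Same shape, but one tensor is transposed in memory.
different = has_same_layout([2, 3], [3, 1], [2, 3], [1, 2])
print(same, different)
```

Skipping size-1 dimensions matters because frameworks may report arbitrary strides for them even when two tensors share an identical memory layout.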
|
31 |
+
|
32 |
+
static torch::Tensor bias_act(torch::Tensor x, torch::Tensor b, torch::Tensor xref, torch::Tensor yref, torch::Tensor dy, int grad, int dim, int act, float alpha, float gain, float clamp)
|
33 |
+
{
|
34 |
+
// Validate arguments.
|
35 |
+
TORCH_CHECK(x.is_cuda(), "x must reside on CUDA device");
|
36 |
+
TORCH_CHECK(b.numel() == 0 || (b.dtype() == x.dtype() && b.device() == x.device()), "b must have the same dtype and device as x");
|
37 |
+
TORCH_CHECK(xref.numel() == 0 || (xref.sizes() == x.sizes() && xref.dtype() == x.dtype() && xref.device() == x.device()), "xref must have the same shape, dtype, and device as x");
|
38 |
+
TORCH_CHECK(yref.numel() == 0 || (yref.sizes() == x.sizes() && yref.dtype() == x.dtype() && yref.device() == x.device()), "yref must have the same shape, dtype, and device as x");
|
39 |
+
TORCH_CHECK(dy.numel() == 0 || (dy.sizes() == x.sizes() && dy.dtype() == x.dtype() && dy.device() == x.device()), "dy must have the same shape, dtype, and device as x");
|
40 |
+
TORCH_CHECK(x.numel() <= INT_MAX, "x is too large");
|
41 |
+
TORCH_CHECK(b.dim() == 1, "b must have rank 1");
|
42 |
+
TORCH_CHECK(b.numel() == 0 || (dim >= 0 && dim < x.dim()), "dim is out of bounds");
|
43 |
+
TORCH_CHECK(b.numel() == 0 || b.numel() == x.size(dim), "b has wrong number of elements");
|
44 |
+
TORCH_CHECK(grad >= 0, "grad must be non-negative");
|
45 |
+
|
46 |
+
// Validate layout.
|
47 |
+
TORCH_CHECK(x.is_non_overlapping_and_dense(), "x must be non-overlapping and dense");
|
48 |
+
TORCH_CHECK(b.is_contiguous(), "b must be contiguous");
|
49 |
+
TORCH_CHECK(xref.numel() == 0 || has_same_layout(xref, x), "xref must have the same layout as x");
|
50 |
+
TORCH_CHECK(yref.numel() == 0 || has_same_layout(yref, x), "yref must have the same layout as x");
|
51 |
+
TORCH_CHECK(dy.numel() == 0 || has_same_layout(dy, x), "dy must have the same layout as x");
|
52 |
+
|
53 |
+
// Create output tensor.
|
54 |
+
const at::cuda::OptionalCUDAGuard device_guard(device_of(x));
|
55 |
+
torch::Tensor y = torch::empty_like(x);
|
56 |
+
TORCH_CHECK(has_same_layout(y, x), "y must have the same layout as x");
|
57 |
+
|
58 |
+
// Initialize CUDA kernel parameters.
|
59 |
+
bias_act_kernel_params p;
|
60 |
+
p.x = x.data_ptr();
|
61 |
+
p.b = (b.numel()) ? b.data_ptr() : NULL;
|
62 |
+
p.xref = (xref.numel()) ? xref.data_ptr() : NULL;
|
63 |
+
p.yref = (yref.numel()) ? yref.data_ptr() : NULL;
|
64 |
+
p.dy = (dy.numel()) ? dy.data_ptr() : NULL;
|
65 |
+
p.y = y.data_ptr();
|
66 |
+
p.grad = grad;
|
67 |
+
p.act = act;
|
68 |
+
p.alpha = alpha;
|
69 |
+
p.gain = gain;
|
70 |
+
p.clamp = clamp;
|
71 |
+
p.sizeX = (int)x.numel();
|
72 |
+
p.sizeB = (int)b.numel();
|
73 |
+
p.stepB = (b.numel()) ? (int)x.stride(dim) : 1;
|
74 |
+
|
75 |
+
// Choose CUDA kernel.
|
76 |
+
void* kernel;
|
77 |
+
AT_DISPATCH_FLOATING_TYPES_AND_HALF(x.scalar_type(), "bias_act_cuda", [&]
|
78 |
+
{
|
79 |
+
kernel = choose_bias_act_kernel<scalar_t>(p);
|
80 |
+
});
|
81 |
+
TORCH_CHECK(kernel, "no CUDA kernel found for the specified activation func");
|
82 |
+
|
83 |
+
// Launch CUDA kernel.
|
84 |
+
p.loopX = 4;
|
85 |
+
int blockSize = 4 * 32;
|
86 |
+
int gridSize = (p.sizeX - 1) / (p.loopX * blockSize) + 1;
|
87 |
+
void* args[] = {&p};
|
88 |
+
AT_CUDA_CHECK(cudaLaunchKernel(kernel, gridSize, blockSize, args, 0, at::cuda::getCurrentCUDAStream()));
|
89 |
+
return y;
|
90 |
+
}
|
91 |
+
|
92 |
+
//------------------------------------------------------------------------
|
93 |
+
|
94 |
+
PYBIND11_MODULE(TORCH_EXTENSION_NAME, m)
|
95 |
+
{
|
96 |
+
m.def("bias_act", &bias_act);
|
97 |
+
}
|
98 |
+
|
99 |
+
//------------------------------------------------------------------------
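The launch configuration in `bias_act` (`loopX = 4`, `blockSize = 4 * 32`, and the ceiling division for `gridSize`) pairs with the element loop in the kernel, where each thread starts at `blockIdx.x * loopX * blockDim.x + threadIdx.x` and advances by `blockDim.x`. The Python sketch below simulates that indexing on the CPU to check that every element is visited exactly once. The function names are hypothetical, and it assumes `size_x >= 1`, since Python's floor division differs from C++ truncation for negative numerators.

```python
from typing import List

def grid_size(size_x: int, loop_x: int = 4, block_size: int = 4 * 32) -> int:
    """Number of blocks so that grid * loop_x * block_size covers size_x elements."""
    return (size_x - 1) // (loop_x * block_size) + 1  # assumes size_x >= 1

def thread_elements(block_idx: int, thread_idx: int, size_x: int,
                    loop_x: int = 4, block_size: int = 4 * 32) -> List[int]:
    """Element indices one thread visits, following the kernel's loop."""
    xi = block_idx * loop_x * block_size + thread_idx
    visited = []
    for _ in range(loop_x):
        if xi >= size_x:
            break
        visited.append(xi)
        xi += block_size
    return visited

size_x = 1000
blocks = grid_size(size_x)
covered = sorted(
    xi
    for b in range(blocks)
    for t in range(4 * 32)
    for xi in thread_elements(b, t, size_x)
)
print(blocks, covered == list(range(size_x)))
```

Each block therefore handles a contiguous chunk of `loopX * blockSize = 512` elements, with neighbouring threads touching neighbouring addresses on every loop iteration.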
|
torch_utils/ops/bias_act.cu
ADDED
@@ -0,0 +1,173 @@
|
1 |
+
// Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
|
2 |
+
//
|
3 |
+
// NVIDIA CORPORATION and its licensors retain all intellectual property
|
4 |
+
// and proprietary rights in and to this software, related documentation
|
5 |
+
// and any modifications thereto. Any use, reproduction, disclosure or
|
6 |
+
// distribution of this software and related documentation without an express
|
7 |
+
// license agreement from NVIDIA CORPORATION is strictly prohibited.
|
8 |
+
|
9 |
+
#include <c10/util/Half.h>
|
10 |
+
#include "bias_act.h"
|
11 |
+
|
12 |
+
//------------------------------------------------------------------------
|
13 |
+
// Helpers.
|
14 |
+
|
15 |
+
template <class T> struct InternalType;
|
16 |
+
template <> struct InternalType<double> { typedef double scalar_t; };
|
17 |
+
template <> struct InternalType<float> { typedef float scalar_t; };
|
18 |
+
template <> struct InternalType<c10::Half> { typedef float scalar_t; };
|
19 |
+
|
20 |
+
//------------------------------------------------------------------------
|
21 |
+
// CUDA kernel.
|
22 |
+
|
23 |
+
template <class T, int A>
|
24 |
+
__global__ void bias_act_kernel(bias_act_kernel_params p)
|
25 |
+
{
|
26 |
+
typedef typename InternalType<T>::scalar_t scalar_t;
|
27 |
+
int G = p.grad;
|
28 |
+
scalar_t alpha = (scalar_t)p.alpha;
|
29 |
+
scalar_t gain = (scalar_t)p.gain;
|
30 |
+
scalar_t clamp = (scalar_t)p.clamp;
|
31 |
+
scalar_t one = (scalar_t)1;
|
32 |
+
scalar_t two = (scalar_t)2;
|
33 |
+
scalar_t expRange = (scalar_t)80;
|
34 |
+
scalar_t halfExpRange = (scalar_t)40;
|
35 |
+
scalar_t seluScale = (scalar_t)1.0507009873554804934193349852946;
|
36 |
+
scalar_t seluAlpha = (scalar_t)1.6732632423543772848170429916717;
|
37 |
+
|
38 |
+
// Loop over elements.
|
39 |
+
int xi = blockIdx.x * p.loopX * blockDim.x + threadIdx.x;
|
40 |
+
for (int loopIdx = 0; loopIdx < p.loopX && xi < p.sizeX; loopIdx++, xi += blockDim.x)
|
41 |
+
{
|
42 |
+
// Load.
|
43 |
+
scalar_t x = (scalar_t)((const T*)p.x)[xi];
|
44 |
+
scalar_t b = (p.b) ? (scalar_t)((const T*)p.b)[(xi / p.stepB) % p.sizeB] : 0;
|
45 |
+
scalar_t xref = (p.xref) ? (scalar_t)((const T*)p.xref)[xi] : 0;
|
46 |
+
scalar_t yref = (p.yref) ? (scalar_t)((const T*)p.yref)[xi] : 0;
|
47 |
+
scalar_t dy = (p.dy) ? (scalar_t)((const T*)p.dy)[xi] : one;
|
48 |
+
scalar_t yy = (gain != 0) ? yref / gain : 0;
|
49 |
+
scalar_t y = 0;
|
50 |
+
|
51 |
+
// Apply bias.
|
52 |
+
((G == 0) ? x : xref) += b;
|
53 |
+
|
54 |
+
// linear
|
55 |
+
if (A == 1)
|
56 |
+
{
|
57 |
+
if (G == 0) y = x;
|
58 |
+
if (G == 1) y = x;
|
59 |
+
}
|
60 |
+
|
61 |
+
// relu
|
62 |
+
if (A == 2)
|
63 |
+
{
|
64 |
+
if (G == 0) y = (x > 0) ? x : 0;
|
65 |
+
if (G == 1) y = (yy > 0) ? x : 0;
|
66 |
+
}
|
67 |
+
|
68 |
+
// lrelu
|
69 |
+
if (A == 3)
|
70 |
+
{
|
71 |
+
if (G == 0) y = (x > 0) ? x : x * alpha;
|
72 |
+
if (G == 1) y = (yy > 0) ? x : x * alpha;
|
73 |
+
}
|
74 |
+
|
75 |
+
// tanh
|
76 |
+
if (A == 4)
|
77 |
+
{
|
78 |
+
if (G == 0) { scalar_t c = exp(x); scalar_t d = one / c; y = (x < -expRange) ? -one : (x > expRange) ? one : (c - d) / (c + d); }
|
79 |
+
if (G == 1) y = x * (one - yy * yy);
|
80 |
+
if (G == 2) y = x * (one - yy * yy) * (-two * yy);
|
81 |
+
}
|
82 |
+
|
83 |
+
// sigmoid
|
84 |
+
if (A == 5)
|
85 |
+
{
|
86 |
+
if (G == 0) y = (x < -expRange) ? 0 : one / (exp(-x) + one);
|
87 |
+
if (G == 1) y = x * yy * (one - yy);
|
88 |
+
if (G == 2) y = x * yy * (one - yy) * (one - two * yy);
|
89 |
+
}
|
90 |
+
|
91 |
+
// elu
|
92 |
+
if (A == 6)
|
93 |
+
{
|
94 |
+
if (G == 0) y = (x >= 0) ? x : exp(x) - one;
|
95 |
+
if (G == 1) y = (yy >= 0) ? x : x * (yy + one);
|
96 |
+
if (G == 2) y = (yy >= 0) ? 0 : x * (yy + one);
|
97 |
+
}
|
98 |
+
|
99 |
+
// selu
|
100 |
+
if (A == 7)
|
101 |
+
{
|
102 |
+
if (G == 0) y = (x >= 0) ? seluScale * x : (seluScale * seluAlpha) * (exp(x) - one);
|
103 |
+
if (G == 1) y = (yy >= 0) ? x * seluScale : x * (yy + seluScale * seluAlpha);
|
104 |
+
if (G == 2) y = (yy >= 0) ? 0 : x * (yy + seluScale * seluAlpha);
|
105 |
+
}
|
106 |
+
|
107 |
+
// softplus
|
108 |
+
if (A == 8)
|
109 |
+
{
|
110 |
+
if (G == 0) y = (x > expRange) ? x : log(exp(x) + one);
|
111 |
+
if (G == 1) y = x * (one - exp(-yy));
|
112 |
+
if (G == 2) { scalar_t c = exp(-yy); y = x * c * (one - c); }
|
113 |
+
}
|
114 |
+
|
115 |
+
// swish
|
116 |
+
if (A == 9)
|
117 |
+
{
|
118 |
+
if (G == 0)
|
119 |
+
y = (x < -expRange) ? 0 : x / (exp(-x) + one);
|
120 |
+
else
|
121 |
+
{
|
122 |
+
scalar_t c = exp(xref);
|
123 |
+
scalar_t d = c + one;
|
124 |
+
if (G == 1)
|
125 |
+
y = (xref > halfExpRange) ? x : x * c * (xref + d) / (d * d);
|
126 |
+
else
|
127 |
+
y = (xref > halfExpRange) ? 0 : x * c * (xref * (two - d) + two * d) / (d * d * d);
|
128 |
+
yref = (xref < -expRange) ? 0 : xref / (exp(-xref) + one) * gain;
|
129 |
+
}
|
130 |
+
}
|
131 |
+
|
132 |
+
// Apply gain.
|
133 |
+
y *= gain * dy;
|
134 |
+
|
135 |
+
// Clamp.
|
136 |
+
if (clamp >= 0)
|
137 |
+
{
|
138 |
+
if (G == 0)
|
139 |
+
y = (y > -clamp & y < clamp) ? y : (y >= 0) ? clamp : -clamp;
|
140 |
+
else
|
141 |
+
y = (yref > -clamp & yref < clamp) ? y : 0;
|
142 |
+
}
|
143 |
+
|
144 |
+
// Store.
|
145 |
+
((T*)p.y)[xi] = (T)y;
|
146 |
+
}
|
147 |
+
}
|
148 |
+
|
149 |
+
//------------------------------------------------------------------------
|
150 |
+
// CUDA kernel selection.
|
151 |
+
|
152 |
+
template <class T> void* choose_bias_act_kernel(const bias_act_kernel_params& p)
|
153 |
+
{
|
154 |
+
if (p.act == 1) return (void*)bias_act_kernel<T, 1>;
|
155 |
+
if (p.act == 2) return (void*)bias_act_kernel<T, 2>;
|
156 |
+
if (p.act == 3) return (void*)bias_act_kernel<T, 3>;
|
157 |
+
if (p.act == 4) return (void*)bias_act_kernel<T, 4>;
|
158 |
+
if (p.act == 5) return (void*)bias_act_kernel<T, 5>;
|
159 |
+
if (p.act == 6) return (void*)bias_act_kernel<T, 6>;
|
160 |
+
if (p.act == 7) return (void*)bias_act_kernel<T, 7>;
|
161 |
+
if (p.act == 8) return (void*)bias_act_kernel<T, 8>;
|
162 |
+
if (p.act == 9) return (void*)bias_act_kernel<T, 9>;
|
163 |
+
return NULL;
|
164 |
+
}
|
165 |
+
|
166 |
+
//------------------------------------------------------------------------
|
167 |
+
// Template specializations.
|
168 |
+
|
169 |
+
template void* choose_bias_act_kernel<double> (const bias_act_kernel_params& p);
|
170 |
+
template void* choose_bias_act_kernel<float> (const bias_act_kernel_params& p);
|
171 |
+
template void* choose_bias_act_kernel<c10::Half> (const bias_act_kernel_params& p);
|
172 |
+
|
173 |
+
//------------------------------------------------------------------------
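The per-element arithmetic of the kernel above can be mirrored in plain Python for one activation. The following is a minimal sketch, assuming the `A == 3` (lrelu) branch only; the function names `bias_act_forward_scalar` and `bias_act_backward_scalar` are hypothetical and do not appear in the source. It follows the `G == 0` and `G == 1` paths: bias, activation, gain, then clamp, with a negative `clamp` meaning "disabled".

```python
import math


def bias_act_forward_scalar(x, b=0.0, alpha=0.2, gain=math.sqrt(2), clamp=-1.0):
    """Hypothetical scalar mirror of the G == 0 path for lrelu (A == 3)."""
    x = x + b                          # Apply bias.
    y = x if x > 0 else x * alpha      # lrelu.
    y *= gain                          # Apply gain (dy is one in the forward pass).
    if clamp >= 0:                     # A negative clamp disables clamping.
        inside = -clamp < y < clamp
        y = y if inside else (clamp if y >= 0 else -clamp)
    return y


def bias_act_backward_scalar(dy, yref, alpha=0.2, gain=math.sqrt(2), clamp=-1.0):
    """Hypothetical scalar mirror of the G == 1 path for lrelu (A == 3).

    The kernel receives the incoming gradient in the `x` slot and decides the
    slope from the sign of the saved output `yref / gain`.
    """
    yy = yref / gain if gain != 0 else 0.0
    dx = dy if yy > 0 else dy * alpha
    dx *= gain
    if clamp >= 0 and not (-clamp < yref < clamp):
        dx = 0.0                       # Gradient is zero where the output was clamped.
    return dx
```

Note that the backward path never needs the original input for lrelu: the sign of the saved output is enough, which is why the Python wrapper lists `ref='y'` for this activation.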
|
torch_utils/ops/bias_act.h
ADDED
@@ -0,0 +1,38 @@
|
1 |
+
// Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
|
2 |
+
//
|
3 |
+
// NVIDIA CORPORATION and its licensors retain all intellectual property
|
4 |
+
// and proprietary rights in and to this software, related documentation
|
5 |
+
// and any modifications thereto. Any use, reproduction, disclosure or
|
6 |
+
// distribution of this software and related documentation without an express
|
7 |
+
// license agreement from NVIDIA CORPORATION is strictly prohibited.
|
8 |
+
|
9 |
+
//------------------------------------------------------------------------
|
10 |
+
// CUDA kernel parameters.
|
11 |
+
|
12 |
+
struct bias_act_kernel_params
|
13 |
+
{
|
14 |
+
const void* x; // [sizeX]
|
15 |
+
const void* b; // [sizeB] or NULL
|
16 |
+
const void* xref; // [sizeX] or NULL
|
17 |
+
const void* yref; // [sizeX] or NULL
|
18 |
+
const void* dy; // [sizeX] or NULL
|
19 |
+
void* y; // [sizeX]
|
20 |
+
|
21 |
+
int grad;
|
22 |
+
int act;
|
23 |
+
float alpha;
|
24 |
+
float gain;
|
25 |
+
float clamp;
|
26 |
+
|
27 |
+
int sizeX;
|
28 |
+
int sizeB;
|
29 |
+
int stepB;
|
30 |
+
int loopX;
|
31 |
+
};
|
32 |
+
|
33 |
+
//------------------------------------------------------------------------
|
34 |
+
// CUDA kernel selection.
|
35 |
+
|
36 |
+
template <class T> void* choose_bias_act_kernel(const bias_act_kernel_params& p);
|
37 |
+
|
38 |
+
//------------------------------------------------------------------------
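The roles of `sizeB`, `stepB`, and `loopX` are easiest to see by replaying the kernel's index arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and it assumes a contiguous tensor in which `stepB` equals the number of elements between consecutive entries along the bias dimension (the host code that fills the struct is not part of this section).

```python
def bias_index(xi, step_b, size_b):
    """Which bias element the flat index xi uses: (xi / stepB) % sizeB."""
    return (xi // step_b) % size_b


def loop_indices(block_idx, thread_idx, block_dim, loop_x, size_x):
    """Flat indices one thread visits, following the kernel's for-loop."""
    xi = block_idx * loop_x * block_dim + thread_idx
    visited = []
    for _ in range(loop_x):
        if xi >= size_x:
            break
        visited.append(xi)
        xi += block_dim
    return visited


# Assumed example: x has shape [2, 3, 2, 2] and the bias runs along dim 1,
# so size_b = 3 and step_b = 2 * 2 = 4.
channels = [bias_index(xi, step_b=4, size_b=3) for xi in range(24)]
```

Each block therefore covers `loopX * blockDim.x` consecutive elements, with neighbouring threads touching neighbouring addresses on every loop iteration.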
|
torch_utils/ops/bias_act.py
ADDED
@@ -0,0 +1,212 @@
|
1 |
+
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
|
2 |
+
#
|
3 |
+
# NVIDIA CORPORATION and its licensors retain all intellectual property
|
4 |
+
# and proprietary rights in and to this software, related documentation
|
5 |
+
# and any modifications thereto. Any use, reproduction, disclosure or
|
6 |
+
# distribution of this software and related documentation without an express
|
7 |
+
# license agreement from NVIDIA CORPORATION is strictly prohibited.
|
8 |
+
|
9 |
+
"""Custom PyTorch ops for efficient bias and activation."""
|
10 |
+
|
11 |
+
import os
|
12 |
+
import warnings
|
13 |
+
import numpy as np
|
14 |
+
import torch
|
15 |
+
import dnnlib
|
16 |
+
import traceback
|
17 |
+
|
18 |
+
from .. import custom_ops
|
19 |
+
from .. import misc
|
20 |
+
|
21 |
+
#----------------------------------------------------------------------------
|
22 |
+
|
23 |
+
activation_funcs = {
|
24 |
+
'linear': dnnlib.EasyDict(func=lambda x, **_: x, def_alpha=0, def_gain=1, cuda_idx=1, ref='', has_2nd_grad=False),
|
25 |
+
'relu': dnnlib.EasyDict(func=lambda x, **_: torch.nn.functional.relu(x), def_alpha=0, def_gain=np.sqrt(2), cuda_idx=2, ref='y', has_2nd_grad=False),
|
26 |
+
'lrelu': dnnlib.EasyDict(func=lambda x, alpha, **_: torch.nn.functional.leaky_relu(x, alpha), def_alpha=0.2, def_gain=np.sqrt(2), cuda_idx=3, ref='y', has_2nd_grad=False),
|
27 |
+
'tanh': dnnlib.EasyDict(func=lambda x, **_: torch.tanh(x), def_alpha=0, def_gain=1, cuda_idx=4, ref='y', has_2nd_grad=True),
|
28 |
+
'sigmoid': dnnlib.EasyDict(func=lambda x, **_: torch.sigmoid(x), def_alpha=0, def_gain=1, cuda_idx=5, ref='y', has_2nd_grad=True),
|
29 |
+
'elu': dnnlib.EasyDict(func=lambda x, **_: torch.nn.functional.elu(x), def_alpha=0, def_gain=1, cuda_idx=6, ref='y', has_2nd_grad=True),
|
30 |
+
'selu': dnnlib.EasyDict(func=lambda x, **_: torch.nn.functional.selu(x), def_alpha=0, def_gain=1, cuda_idx=7, ref='y', has_2nd_grad=True),
|
31 |
+
'softplus': dnnlib.EasyDict(func=lambda x, **_: torch.nn.functional.softplus(x), def_alpha=0, def_gain=1, cuda_idx=8, ref='y', has_2nd_grad=True),
|
32 |
+
'swish': dnnlib.EasyDict(func=lambda x, **_: torch.sigmoid(x) * x, def_alpha=0, def_gain=np.sqrt(2), cuda_idx=9, ref='x', has_2nd_grad=True),
|
33 |
+
}
|
34 |
+
|
35 |
+
#----------------------------------------------------------------------------
|
36 |
+
|
37 |
+
_inited = False
|
38 |
+
_plugin = None
|
39 |
+
_null_tensor = torch.empty([0])
|
40 |
+
|
41 |
+
def _init():
|
42 |
+
global _inited, _plugin
|
43 |
+
if not _inited:
|
44 |
+
_inited = True
|
45 |
+
sources = ['bias_act.cpp', 'bias_act.cu']
|
46 |
+
sources = [os.path.join(os.path.dirname(__file__), s) for s in sources]
|
47 |
+
try:
|
48 |
+
_plugin = custom_ops.get_plugin('bias_act_plugin', sources=sources, extra_cuda_cflags=['--use_fast_math'])
|
49 |
+
except Exception:
|
50 |
+
warnings.warn('Failed to build CUDA kernels for bias_act. Falling back to slow reference implementation. Details:\n\n' + traceback.format_exc())
|
51 |
+
return _plugin is not None
|
52 |
+
|
53 |
+
#----------------------------------------------------------------------------
|
54 |
+
|
55 |
+
def bias_act(x, b=None, dim=1, act='linear', alpha=None, gain=None, clamp=None, impl='cuda'):
|
56 |
+
r"""Fused bias and activation function.
|
57 |
+
|
58 |
+
Adds bias `b` to activation tensor `x`, evaluates activation function `act`,
|
59 |
+
and scales the result by `gain`. Each of the steps is optional. In most cases,
|
60 |
+
the fused op is considerably more efficient than performing the same calculation
|
61 |
+
using standard PyTorch ops. It supports first and second order gradients,
|
62 |
+
but not third order gradients.
|
63 |
+
|
64 |
+
Args:
|
65 |
+
x: Input activation tensor. Can be of any shape.
|
66 |
+
b: Bias vector, or `None` to disable. Must be a 1D tensor of the same type
|
67 |
+
as `x`. The shape must be known, and it must match the dimension of `x`
|
68 |
+
corresponding to `dim`.
|
69 |
+
dim: The dimension in `x` corresponding to the elements of `b`.
|
70 |
+
The value of `dim` is ignored if `b` is not specified.
|
71 |
+
act: Name of the activation function to evaluate, or `"linear"` to disable.
|
72 |
+
Can be e.g. `"relu"`, `"lrelu"`, `"tanh"`, `"sigmoid"`, `"swish"`, etc.
|
73 |
+
See `activation_funcs` for a full list. `None` is not allowed.
|
74 |
+
alpha: Shape parameter for the activation function, or `None` to use the default.
|
75 |
+
gain: Scaling factor for the output tensor, or `None` to use default.
|
76 |
+
See `activation_funcs` for the default scaling of each activation function.
|
77 |
+
If unsure, consider specifying 1.
|
78 |
+
clamp: Clamp the output values to `[-clamp, +clamp]`, or `None` to disable
|
79 |
+
the clamping (default).
|
80 |
+
impl: Name of the implementation to use. Can be `"ref"` or `"cuda"` (default).
|
81 |
+
|
82 |
+
Returns:
|
83 |
+
Tensor of the same shape and datatype as `x`.
|
84 |
+
"""
|
85 |
+
assert isinstance(x, torch.Tensor)
|
86 |
+
assert impl in ['ref', 'cuda']
|
87 |
+
if impl == 'cuda' and x.device.type == 'cuda' and _init():
|
88 |
+
return _bias_act_cuda(dim=dim, act=act, alpha=alpha, gain=gain, clamp=clamp).apply(x, b)
|
89 |
+
return _bias_act_ref(x=x, b=b, dim=dim, act=act, alpha=alpha, gain=gain, clamp=clamp)
|
90 |
+
|
91 |
+
#----------------------------------------------------------------------------
|
92 |
+
|
93 |
+
@misc.profiled_function
|
94 |
+
def _bias_act_ref(x, b=None, dim=1, act='linear', alpha=None, gain=None, clamp=None):
|
95 |
+
"""Slow reference implementation of `bias_act()` using standard PyTorch ops.
|
96 |
+
"""
|
97 |
+
assert isinstance(x, torch.Tensor)
|
98 |
+
assert clamp is None or clamp >= 0
|
99 |
+
spec = activation_funcs[act]
|
100 |
+
alpha = float(alpha if alpha is not None else spec.def_alpha)
|
101 |
+
gain = float(gain if gain is not None else spec.def_gain)
|
102 |
+
clamp = float(clamp if clamp is not None else -1)
|
103 |
+
|
104 |
+
# Add bias.
|
105 |
+
if b is not None:
|
106 |
+
assert isinstance(b, torch.Tensor) and b.ndim == 1
|
107 |
+
assert 0 <= dim < x.ndim
|
108 |
+
assert b.shape[0] == x.shape[dim]
|
109 |
+
x = x + b.reshape([-1 if i == dim else 1 for i in range(x.ndim)])
|
110 |
+
|
111 |
+
# Evaluate activation function.
|
112 |
+
alpha = float(alpha)
|
113 |
+
x = spec.func(x, alpha=alpha)
|
114 |
+
|
115 |
+
# Scale by gain.
|
116 |
+
gain = float(gain)
|
117 |
+
if gain != 1:
|
118 |
+
x = x * gain
|
119 |
+
|
120 |
+
# Clamp.
|
121 |
+
if clamp >= 0:
|
122 |
+
x = x.clamp(-clamp, clamp) # pylint: disable=invalid-unary-operand-type
|
123 |
+
return x
|
124 |
+
|
125 |
+
#----------------------------------------------------------------------------
|
126 |
+
|
127 |
+
_bias_act_cuda_cache = dict()
|
128 |
+
|
129 |
+
def _bias_act_cuda(dim=1, act='linear', alpha=None, gain=None, clamp=None):
|
130 |
+
"""Fast CUDA implementation of `bias_act()` using custom ops.
|
131 |
+
"""
|
132 |
+
# Parse arguments.
|
133 |
+
assert clamp is None or clamp >= 0
|
134 |
+
spec = activation_funcs[act]
|
135 |
+
alpha = float(alpha if alpha is not None else spec.def_alpha)
|
136 |
+
gain = float(gain if gain is not None else spec.def_gain)
|
137 |
+
clamp = float(clamp if clamp is not None else -1)
|
138 |
+
|
139 |
+
# Lookup from cache.
|
140 |
+
key = (dim, act, alpha, gain, clamp)
|
141 |
+
if key in _bias_act_cuda_cache:
|
142 |
+
return _bias_act_cuda_cache[key]
|
143 |
+
|
144 |
+
# Forward op.
|
145 |
+
class BiasActCuda(torch.autograd.Function):
|
146 |
+
@staticmethod
|
147 |
+
def forward(ctx, x, b): # pylint: disable=arguments-differ
|
148 |
+
ctx.memory_format = torch.channels_last if x.ndim > 2 and x.stride()[1] == 1 else torch.contiguous_format
|
149 |
+
x = x.contiguous(memory_format=ctx.memory_format)
|
150 |
+
b = b.contiguous() if b is not None else _null_tensor
|
151 |
+
y = x
|
152 |
+
if act != 'linear' or gain != 1 or clamp >= 0 or b is not _null_tensor:
|
153 |
+
y = _plugin.bias_act(x, b, _null_tensor, _null_tensor, _null_tensor, 0, dim, spec.cuda_idx, alpha, gain, clamp)
|
154 |
+
ctx.save_for_backward(
|
155 |
+
x if 'x' in spec.ref or spec.has_2nd_grad else _null_tensor,
|
156 |
+
b if 'x' in spec.ref or spec.has_2nd_grad else _null_tensor,
|
157 |
+
y if 'y' in spec.ref else _null_tensor)
|
158 |
+
return y
|
159 |
+
|
160 |
+
@staticmethod
|
161 |
+
def backward(ctx, dy): # pylint: disable=arguments-differ
|
162 |
+
dy = dy.contiguous(memory_format=ctx.memory_format)
|
163 |
+
x, b, y = ctx.saved_tensors
|
164 |
+
dx = None
|
165 |
+
db = None
|
166 |
+
|
167 |
+
if ctx.needs_input_grad[0] or ctx.needs_input_grad[1]:
|
168 |
+
dx = dy
|
169 |
+
if act != 'linear' or gain != 1 or clamp >= 0:
|
170 |
+
dx = BiasActCudaGrad.apply(dy, x, b, y)
|
171 |
+
|
172 |
+
if ctx.needs_input_grad[1]:
|
173 |
+
db = dx.sum([i for i in range(dx.ndim) if i != dim])
|
174 |
+
|
175 |
+
return dx, db
|
176 |
+
|
177 |
+
# Backward op.
|
178 |
+
class BiasActCudaGrad(torch.autograd.Function):
|
179 |
+
@staticmethod
|
180 |
+
def forward(ctx, dy, x, b, y): # pylint: disable=arguments-differ
|
181 |
+
ctx.memory_format = torch.channels_last if dy.ndim > 2 and dy.stride()[1] == 1 else torch.contiguous_format
|
182 |
+
dx = _plugin.bias_act(dy, b, x, y, _null_tensor, 1, dim, spec.cuda_idx, alpha, gain, clamp)
|
183 |
+
ctx.save_for_backward(
|
184 |
+
dy if spec.has_2nd_grad else _null_tensor,
|
185 |
+
x, b, y)
|
186 |
+
return dx
|
187 |
+
|
188 |
+
@staticmethod
|
189 |
+
def backward(ctx, d_dx): # pylint: disable=arguments-differ
|
190 |
+
d_dx = d_dx.contiguous(memory_format=ctx.memory_format)
|
191 |
+
dy, x, b, y = ctx.saved_tensors
|
192 |
+
d_dy = None
|
193 |
+
d_x = None
|
194 |
+
d_b = None
|
195 |
+
d_y = None
|
196 |
+
|
197 |
+
if ctx.needs_input_grad[0]:
|
198 |
+
d_dy = BiasActCudaGrad.apply(d_dx, x, b, y)
|
199 |
+
|
200 |
+
if spec.has_2nd_grad and (ctx.needs_input_grad[1] or ctx.needs_input_grad[2]):
|
201 |
+
d_x = _plugin.bias_act(d_dx, b, x, y, dy, 2, dim, spec.cuda_idx, alpha, gain, clamp)
|
202 |
+
|
203 |
+
if spec.has_2nd_grad and ctx.needs_input_grad[2]:
|
204 |
+
d_b = d_x.sum([i for i in range(d_x.ndim) if i != dim])
|
205 |
+
|
206 |
+
return d_dy, d_x, d_b, d_y
|
207 |
+
|
208 |
+
# Add to cache.
|
209 |
+
_bias_act_cuda_cache[key] = BiasActCuda
|
210 |
+
return BiasActCuda
|
211 |
+
|
212 |
+
#----------------------------------------------------------------------------
|
torch_utils/ops/conv2d_gradfix.py
ADDED
@@ -0,0 +1,170 @@
|
1 |
+
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
|
2 |
+
#
|
3 |
+
# NVIDIA CORPORATION and its licensors retain all intellectual property
|
4 |
+
# and proprietary rights in and to this software, related documentation
|
5 |
+
# and any modifications thereto. Any use, reproduction, disclosure or
|
6 |
+
# distribution of this software and related documentation without an express
|
7 |
+
# license agreement from NVIDIA CORPORATION is strictly prohibited.
|
8 |
+
|
9 |
+
"""Custom replacement for `torch.nn.functional.conv2d` that supports
|
10 |
+
arbitrarily high order gradients with zero performance penalty."""
|
11 |
+
|
12 |
+
import warnings
|
13 |
+
import contextlib
|
14 |
+
import torch
|
15 |
+
|
16 |
+
# pylint: disable=redefined-builtin
|
17 |
+
# pylint: disable=arguments-differ
|
18 |
+
# pylint: disable=protected-access
|
19 |
+
|
20 |
+
#----------------------------------------------------------------------------
|
21 |
+
|
22 |
+
enabled = False # Enable the custom op by setting this to True.
|
23 |
+
weight_gradients_disabled = False # Forcefully disable computation of gradients with respect to the weights.
|
24 |
+
|
25 |
+
@contextlib.contextmanager
|
26 |
+
def no_weight_gradients():
|
27 |
+
global weight_gradients_disabled
|
28 |
+
old = weight_gradients_disabled
|
29 |
+
weight_gradients_disabled = True
|
30 |
+
yield
|
31 |
+
weight_gradients_disabled = old
|
32 |
+
|
33 |
+
#----------------------------------------------------------------------------
|
34 |
+
|
35 |
+
def conv2d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1):
|
36 |
+
if _should_use_custom_op(input):
|
37 |
+
return _conv2d_gradfix(transpose=False, weight_shape=weight.shape, stride=stride, padding=padding, output_padding=0, dilation=dilation, groups=groups).apply(input, weight, bias)
|
38 |
+
return torch.nn.functional.conv2d(input=input, weight=weight, bias=bias, stride=stride, padding=padding, dilation=dilation, groups=groups)
|
39 |
+
|
40 |
+
def conv_transpose2d(input, weight, bias=None, stride=1, padding=0, output_padding=0, groups=1, dilation=1):
|
41 |
+
if _should_use_custom_op(input):
|
42 |
+
return _conv2d_gradfix(transpose=True, weight_shape=weight.shape, stride=stride, padding=padding, output_padding=output_padding, groups=groups, dilation=dilation).apply(input, weight, bias)
|
43 |
+
return torch.nn.functional.conv_transpose2d(input=input, weight=weight, bias=bias, stride=stride, padding=padding, output_padding=output_padding, groups=groups, dilation=dilation)
|
44 |
+
|
45 |
+
#----------------------------------------------------------------------------
|
46 |
+
|
47 |
+
def _should_use_custom_op(input):
|
48 |
+
assert isinstance(input, torch.Tensor)
|
49 |
+
if (not enabled) or (not torch.backends.cudnn.enabled):
|
50 |
+
return False
|
51 |
+
if input.device.type != 'cuda':
|
52 |
+
return False
|
53 |
+
if any(torch.__version__.startswith(x) for x in ['1.7.', '1.8.', '1.9']):
|
54 |
+
return True
|
55 |
+
warnings.warn(f'conv2d_gradfix not supported on PyTorch {torch.__version__}. Falling back to torch.nn.functional.conv2d().')
|
56 |
+
return False
|
57 |
+
|
58 |
+
def _tuple_of_ints(xs, ndim):
|
59 |
+
xs = tuple(xs) if isinstance(xs, (tuple, list)) else (xs,) * ndim
|
60 |
+
assert len(xs) == ndim
|
61 |
+
assert all(isinstance(x, int) for x in xs)
|
62 |
+
return xs
|
63 |
+
|
64 |
+
#----------------------------------------------------------------------------
|
65 |
+
|
66 |
+
_conv2d_gradfix_cache = dict()
|
67 |
+
|
68 |
+
def _conv2d_gradfix(transpose, weight_shape, stride, padding, output_padding, dilation, groups):
|
69 |
+
# Parse arguments.
|
70 |
+
ndim = 2
|
71 |
+
weight_shape = tuple(weight_shape)
|
72 |
+
stride = _tuple_of_ints(stride, ndim)
|
73 |
+
padding = _tuple_of_ints(padding, ndim)
|
74 |
+
output_padding = _tuple_of_ints(output_padding, ndim)
|
75 |
+
dilation = _tuple_of_ints(dilation, ndim)
|
76 |
+
|
77 |
+
# Lookup from cache.
|
78 |
+
key = (transpose, weight_shape, stride, padding, output_padding, dilation, groups)
|
79 |
+
if key in _conv2d_gradfix_cache:
|
80 |
+
return _conv2d_gradfix_cache[key]
|
81 |
+
|
82 |
+
# Validate arguments.
|
83 |
+
assert groups >= 1
|
84 |
+
assert len(weight_shape) == ndim + 2
|
85 |
+
assert all(stride[i] >= 1 for i in range(ndim))
|
86 |
+
assert all(padding[i] >= 0 for i in range(ndim))
|
87 |
+
assert all(dilation[i] >= 0 for i in range(ndim))
|
88 |
+
if not transpose:
|
89 |
+
assert all(output_padding[i] == 0 for i in range(ndim))
|
90 |
+
else: # transpose
|
91 |
+
assert all(0 <= output_padding[i] < max(stride[i], dilation[i]) for i in range(ndim))
|
92 |
+
|
93 |
+
# Helpers.
|
94 |
+
common_kwargs = dict(stride=stride, padding=padding, dilation=dilation, groups=groups)
|
95 |
+
def calc_output_padding(input_shape, output_shape):
|
96 |
+
if transpose:
|
97 |
+
return [0, 0]
|
98 |
+
return [
|
99 |
+
input_shape[i + 2]
|
100 |
+
- (output_shape[i + 2] - 1) * stride[i]
|
101 |
+
- (1 - 2 * padding[i])
|
102 |
+
- dilation[i] * (weight_shape[i + 2] - 1)
|
103 |
+
for i in range(ndim)
|
104 |
+
]
|
105 |
+
|
106 |
+
# Forward & backward.
|
107 |
+
class Conv2d(torch.autograd.Function):
|
108 |
+
@staticmethod
|
109 |
+
def forward(ctx, input, weight, bias):
|
110 |
+
assert weight.shape == weight_shape
|
111 |
+
if not transpose:
|
112 |
+
output = torch.nn.functional.conv2d(input=input, weight=weight, bias=bias, **common_kwargs)
|
113 |
+
else: # transpose
|
114 |
+
output = torch.nn.functional.conv_transpose2d(input=input, weight=weight, bias=bias, output_padding=output_padding, **common_kwargs)
|
115 |
+
ctx.save_for_backward(input, weight)
|
116 |
+
return output
|
117 |
+
|
118 |
+
@staticmethod
|
119 |
+
def backward(ctx, grad_output):
|
120 |
+
input, weight = ctx.saved_tensors
|
121 |
+
grad_input = None
|
122 |
+
grad_weight = None
|
123 |
+
grad_bias = None
|
124 |
+
|
125 |
+
if ctx.needs_input_grad[0]:
|
126 |
+
p = calc_output_padding(input_shape=input.shape, output_shape=grad_output.shape)
|
127 |
+
grad_input = _conv2d_gradfix(transpose=(not transpose), weight_shape=weight_shape, output_padding=p, **common_kwargs).apply(grad_output, weight, None)
|
128 |
+
assert grad_input.shape == input.shape
|
129 |
+
|
130 |
+
if ctx.needs_input_grad[1] and not weight_gradients_disabled:
|
131 |
+
grad_weight = Conv2dGradWeight.apply(grad_output, input)
|
132 |
+
assert grad_weight.shape == weight_shape
|
133 |
+
|
134 |
+
if ctx.needs_input_grad[2]:
|
135 |
+
grad_bias = grad_output.sum([0, 2, 3])
|
136 |
+
|
137 |
+
return grad_input, grad_weight, grad_bias
|
138 |
+
|
139 |
+
# Gradient with respect to the weights.
|
140 |
+
class Conv2dGradWeight(torch.autograd.Function):
|
141 |
+
@staticmethod
|
142 |
+
def forward(ctx, grad_output, input):
|
143 |
+
op = torch._C._jit_get_operation('aten::cudnn_convolution_backward_weight' if not transpose else 'aten::cudnn_convolution_transpose_backward_weight')
|
144 |
+
flags = [torch.backends.cudnn.benchmark, torch.backends.cudnn.deterministic, torch.backends.cudnn.allow_tf32]
|
145 |
+
grad_weight = op(weight_shape, grad_output, input, padding, stride, dilation, groups, *flags)
|
146 |
+
assert grad_weight.shape == weight_shape
|
147 |
+
ctx.save_for_backward(grad_output, input)
|
148 |
+
return grad_weight
|
149 |
+
|
150 |
+
@staticmethod
|
151 |
+
def backward(ctx, grad2_grad_weight):
|
152 |
+
grad_output, input = ctx.saved_tensors
|
153 |
+
grad2_grad_output = None
|
154 |
+
grad2_input = None
|
155 |
+
|
156 |
+
if ctx.needs_input_grad[0]:
|
157 |
+
grad2_grad_output = Conv2d.apply(input, grad2_grad_weight, None)
|
158 |
+
assert grad2_grad_output.shape == grad_output.shape
|
159 |
+
|
160 |
+
if ctx.needs_input_grad[1]:
|
161 |
+
p = calc_output_padding(input_shape=input.shape, output_shape=grad_output.shape)
|
162 |
+
grad2_input = _conv2d_gradfix(transpose=(not transpose), weight_shape=weight_shape, output_padding=p, **common_kwargs).apply(grad_output, grad2_grad_weight, None)
|
163 |
+
assert grad2_input.shape == input.shape
|
164 |
+
|
165 |
+
return grad2_grad_output, grad2_input
|
166 |
+
|
167 |
+
_conv2d_gradfix_cache[key] = Conv2d
|
168 |
+
return Conv2d
|
169 |
+
|
170 |
+
#----------------------------------------------------------------------------
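The `calc_output_padding()` helper exists because a strided convolution maps several input sizes to the same output size, so the transposed convolution used for the input gradient needs an explicit `output_padding` to recover the original size. The sketch below checks that arithmetic in one dimension. The function names are hypothetical, and the two size formulas are the standard ones for convolution and transposed convolution, not code taken from this file.

```python
def conv_out_size(n, k, stride, padding, dilation):
    """Output length of a 1D convolution over an input of length n."""
    return (n + 2 * padding - dilation * (k - 1) - 1) // stride + 1


def calc_output_padding_1d(in_size, out_size, k, stride, padding, dilation):
    """One-dimensional form of the expression in calc_output_padding()."""
    return (in_size
            - (out_size - 1) * stride
            - (1 - 2 * padding)
            - dilation * (k - 1))


def conv_transpose_out_size(n, k, stride, padding, dilation, output_padding):
    """Output length of a 1D transposed convolution over an input of length n."""
    return (n - 1) * stride - 2 * padding + dilation * (k - 1) + output_padding + 1


def roundtrip(in_size, k=3, stride=2, padding=1, dilation=1):
    """Convolve, then size the transposed convolution for the gradient."""
    out = conv_out_size(in_size, k, stride, padding, dilation)
    op = calc_output_padding_1d(in_size, out, k, stride, padding, dilation)
    back = conv_transpose_out_size(out, k, stride, padding, dilation, op)
    return out, op, back
```

With a 3-tap kernel, stride 2 and padding 1, inputs of length 7 and 8 both produce 4 outputs; the computed padding (0 and 1 respectively) is what lets the gradient come back with the right shape.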
|
torch_utils/ops/conv2d_resample.py
ADDED
@@ -0,0 +1,156 @@
|
1 |
+
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
|
2 |
+
#
|
3 |
+
# NVIDIA CORPORATION and its licensors retain all intellectual property
|
4 |
+
# and proprietary rights in and to this software, related documentation
|
5 |
+
# and any modifications thereto. Any use, reproduction, disclosure or
|
6 |
+
# distribution of this software and related documentation without an express
|
7 |
+
# license agreement from NVIDIA CORPORATION is strictly prohibited.
|
8 |
+
|
9 |
+
"""2D convolution with optional up/downsampling."""
|
10 |
+
|
11 |
+
import torch
|
12 |
+
|
13 |
+
from .. import misc
|
14 |
+
from . import conv2d_gradfix
|
15 |
+
from . import upfirdn2d
|
16 |
+
from .upfirdn2d import _parse_padding
|
17 |
+
from .upfirdn2d import _get_filter_size
|
18 |
+
|
19 |
+
#----------------------------------------------------------------------------
|
20 |
+
|
21 |
+
def _get_weight_shape(w):
|
22 |
+
with misc.suppress_tracer_warnings(): # this value will be treated as a constant
|
23 |
+
shape = [int(sz) for sz in w.shape]
|
24 |
+
misc.assert_shape(w, shape)
|
25 |
+
return shape
|
26 |
+
|
27 |
+
#----------------------------------------------------------------------------
|
28 |
+
|
29 |
+
def _conv2d_wrapper(x, w, stride=1, padding=0, groups=1, transpose=False, flip_weight=True):
|
30 |
+
"""Wrapper for the underlying `conv2d()` and `conv_transpose2d()` implementations.
|
31 |
+
"""
|
32 |
+
out_channels, in_channels_per_group, kh, kw = _get_weight_shape(w)
|
33 |
+
|
34 |
+
# Flip weight if requested.
|
35 |
+
if not flip_weight: # conv2d() actually performs correlation (flip_weight=True) not convolution (flip_weight=False).
|
36 |
+
w = w.flip([2, 3])
|
37 |
+
|
38 |
+
# Workaround performance pitfall in cuDNN 8.0.5, triggered when using
|
39 |
+
# 1x1 kernel + memory_format=channels_last + less than 64 channels.
|
40 |
+
if kw == 1 and kh == 1 and stride == 1 and padding in [0, [0, 0], (0, 0)] and not transpose:
|
41 |
+
if x.stride()[1] == 1 and min(out_channels, in_channels_per_group) < 64:
|
42 |
+
if out_channels <= 4 and groups == 1:
|
43 |
+
in_shape = x.shape
|
44 |
+
x = w.squeeze(3).squeeze(2) @ x.reshape([in_shape[0], in_channels_per_group, -1])
|
45 |
+
x = x.reshape([in_shape[0], out_channels, in_shape[2], in_shape[3]])
|
46 |
+
else:
|
47 |
+
x = x.to(memory_format=torch.contiguous_format)
|
48 |
+
w = w.to(memory_format=torch.contiguous_format)
|
49 |
+
x = conv2d_gradfix.conv2d(x, w, groups=groups)
|
50 |
+
return x.to(memory_format=torch.channels_last)
|
51 |
+
|
52 |
+
# Otherwise => execute using conv2d_gradfix.
|
53 |
+
op = conv2d_gradfix.conv_transpose2d if transpose else conv2d_gradfix.conv2d
|
54 |
+
return op(x, w, stride=stride, padding=padding, groups=groups)
|
55 |
+
|
56 |
+
#----------------------------------------------------------------------------
|
57 |
+
|
58 |
+
@misc.profiled_function
|
59 |
+
def conv2d_resample(x, w, f=None, up=1, down=1, padding=0, groups=1, flip_weight=True, flip_filter=False):
|
60 |
+
r"""2D convolution with optional up/downsampling.
|
61 |
+
|
62 |
+
Padding is performed only once at the beginning, not between the operations.
|
63 |
+
|
64 |
+
Args:
|
65 |
+
x: Input tensor of shape
|
66 |
+
`[batch_size, in_channels, in_height, in_width]`.
|
67 |
+
w: Weight tensor of shape
|
68 |
+
`[out_channels, in_channels//groups, kernel_height, kernel_width]`.
|
69 |
+
f: Low-pass filter for up/downsampling. Must be prepared beforehand by
|
70 |
+
calling upfirdn2d.setup_filter(). None = identity (default).
|
71 |
+
up: Integer upsampling factor (default: 1).
|
72 |
+
down: Integer downsampling factor (default: 1).
|
73 |
+
padding: Padding with respect to the upsampled image. Can be a single number
|
74 |
+
or a list/tuple `[x, y]` or `[x_before, x_after, y_before, y_after]`
|
75 |
+
(default: 0).
|
76 |
+
groups: Split input channels into N groups (default: 1).
|
77 |
+
flip_weight: False = convolution, True = correlation (default: True).
|
78 |
+
flip_filter: False = convolution, True = correlation (default: False).
|
79 |
+
|
80 |
+
Returns:
|
81 |
+
Tensor of shape `[batch_size, out_channels, out_height, out_width]`.
|
82 |
+
"""
|
83 |
+
# Validate arguments.
|
84 |
+
assert isinstance(x, torch.Tensor) and (x.ndim == 4)
|
85 |
+
assert isinstance(w, torch.Tensor) and (w.ndim == 4) and (w.dtype == x.dtype)
|
86 |
+
assert f is None or (isinstance(f, torch.Tensor) and f.ndim in [1, 2] and f.dtype == torch.float32)
|
87 |
+
assert isinstance(up, int) and (up >= 1)
|
88 |
+
assert isinstance(down, int) and (down >= 1)
|
89 |
+
assert isinstance(groups, int) and (groups >= 1)
|
90 |
+
out_channels, in_channels_per_group, kh, kw = _get_weight_shape(w)
|
91 |
+
fw, fh = _get_filter_size(f)
|
92 |
+
px0, px1, py0, py1 = _parse_padding(padding)
|
93 |
+
|
94 |
+
# Adjust padding to account for up/downsampling.
|
95 |
+
if up > 1:
|
96 |
+
px0 += (fw + up - 1) // 2
|
97 |
+
px1 += (fw - up) // 2
|
98 |
+
py0 += (fh + up - 1) // 2
|
99 |
+
py1 += (fh - up) // 2
|
100 |
+
if down > 1:
|
101 |
+
px0 += (fw - down + 1) // 2
|
102 |
+
px1 += (fw - down) // 2
|
103 |
+
py0 += (fh - down + 1) // 2
|
104 |
+
py1 += (fh - down) // 2
|
105 |
+
|
106 |
+
# Fast path: 1x1 convolution with downsampling only => downsample first, then convolve.
|
107 |
+
if kw == 1 and kh == 1 and (down > 1 and up == 1):
|
108 |
+
x = upfirdn2d.upfirdn2d(x=x, f=f, down=down, padding=[px0,px1,py0,py1], flip_filter=flip_filter)
|
109 |
+
x = _conv2d_wrapper(x=x, w=w, groups=groups, flip_weight=flip_weight)
|
110 |
+
return x
|
111 |
+
|
112 |
+
# Fast path: 1x1 convolution with upsampling only => convolve first, then upsample.
|
113 |
+
if kw == 1 and kh == 1 and (up > 1 and down == 1):
|
114 |
+
x = _conv2d_wrapper(x=x, w=w, groups=groups, flip_weight=flip_weight)
|
115 |
+
x = upfirdn2d.upfirdn2d(x=x, f=f, up=up, padding=[px0,px1,py0,py1], gain=up**2, flip_filter=flip_filter)
|
116 |
+
return x
|
117 |
+
|
118 |
+
# Fast path: downsampling only => use strided convolution.
|
119 |
+
if down > 1 and up == 1:
|
120 |
+
x = upfirdn2d.upfirdn2d(x=x, f=f, padding=[px0,px1,py0,py1], flip_filter=flip_filter)
|
121 |
+
x = _conv2d_wrapper(x=x, w=w, stride=down, groups=groups, flip_weight=flip_weight)
|
122 |
+
return x
|
123 |
+
|
124 |
+
# Fast path: upsampling with optional downsampling => use transpose strided convolution.
|
125 |
+
if up > 1:
|
126 |
+
if groups == 1:
|
127 |
+
w = w.transpose(0, 1)
|
128 |
+
else:
|
129 |
+
w = w.reshape(groups, out_channels // groups, in_channels_per_group, kh, kw)
|
130 |
+
w = w.transpose(1, 2)
|
131 |
+
w = w.reshape(groups * in_channels_per_group, out_channels // groups, kh, kw)
|
132 |
+
px0 -= kw - 1
|
133 |
+
px1 -= kw - up
|
134 |
+
py0 -= kh - 1
|
135 |
+
py1 -= kh - up
|
136 |
+
pxt = max(min(-px0, -px1), 0)
|
137 |
+
pyt = max(min(-py0, -py1), 0)
|
138 |
+
x = _conv2d_wrapper(x=x, w=w, stride=up, padding=[pyt,pxt], groups=groups, transpose=True, flip_weight=(not flip_weight))
|
139 |
+
x = upfirdn2d.upfirdn2d(x=x, f=f, padding=[px0+pxt,px1+pxt,py0+pyt,py1+pyt], gain=up**2, flip_filter=flip_filter)
|
140 |
+
if down > 1:
|
141 |
+
x = upfirdn2d.upfirdn2d(x=x, f=f, down=down, flip_filter=flip_filter)
|
142 |
+
return x
|
143 |
+
|
144 |
+
# Fast path: no up/downsampling, padding supported by the underlying implementation => use plain conv2d.
|
145 |
+
if up == 1 and down == 1:
|
146 |
+
if px0 == px1 and py0 == py1 and px0 >= 0 and py0 >= 0:
|
147 |
+
return _conv2d_wrapper(x=x, w=w, padding=[py0,px0], groups=groups, flip_weight=flip_weight)
|
148 |
+
|
149 |
+
# Fallback: Generic reference implementation.
|
150 |
+
x = upfirdn2d.upfirdn2d(x=x, f=(f if up > 1 else None), up=up, padding=[px0,px1,py0,py1], gain=up**2, flip_filter=flip_filter)
|
151 |
+
x = _conv2d_wrapper(x=x, w=w, groups=groups, flip_weight=flip_weight)
|
152 |
+
if down > 1:
|
153 |
+
x = upfirdn2d.upfirdn2d(x=x, f=f, down=down, flip_filter=flip_filter)
|
154 |
+
return x
|
155 |
+
|
156 |
+
#----------------------------------------------------------------------------
|
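Note on conv2d_resample.py: the padding adjustment and the generic fallback path above can be checked with plain integer arithmetic. The following is a minimal sketch that tracks sizes only, not tensor values. The helper names (adjust_padding, upfirdn_width, fallback_output_width) are hypothetical and not part of the repository. The width formula is taken from upfirdn2d.cpp in this same commit, and the sketch assumes a square filter and symmetric user padding.

```python
def adjust_padding(px0, px1, py0, py1, fw, fh, up=1, down=1):
    """Mirror the 'Adjust padding' step of conv2d_resample (sizes only)."""
    if up > 1:
        px0 += (fw + up - 1) // 2
        px1 += (fw - up) // 2
        py0 += (fh + up - 1) // 2
        py1 += (fh - up) // 2
    if down > 1:
        px0 += (fw - down + 1) // 2
        px1 += (fw - down) // 2
        py0 += (fh - down + 1) // 2
        py1 += (fh - down) // 2
    return px0, px1, py0, py1


def upfirdn_width(in_w, fw, up, down, pad0, pad1):
    """Output width formula as written in upfirdn2d.cpp (non-negative inputs)."""
    return (in_w * up + pad0 + pad1 - fw + down) // down


def fallback_output_width(in_w, kw, fw, up=1, down=1, padding=0):
    """Follow the width through the generic fallback path."""
    px0, px1, _, _ = adjust_padding(padding, padding, padding, padding, fw, fw, up, down)
    # Step 1: upsample and pad. The filter is only applied here when up > 1,
    # otherwise the step behaves like a 1-tap identity filter.
    w = upfirdn_width(in_w, fw if up > 1 else 1, up, 1, px0, px1)
    # Step 2: plain convolution with a kw-wide kernel and no extra padding.
    w = w - kw + 1
    # Step 3: optional filtered downsampling without padding.
    if down > 1:
        w = upfirdn_width(w, fw, 1, down, 0, 0)
    return w


# 3x3 kernel, 4-tap filter, user padding of 1.
print(adjust_padding(1, 1, 1, 1, fw=4, fh=4, up=2))                 # padding after adjustment
print(fallback_output_width(8, kw=3, fw=4, up=2, padding=1))        # 2x upsampling
print(fallback_output_width(16, kw=3, fw=4, down=2, padding=1))     # 2x downsampling
```

With these example values, the adjusted padding makes a 3x3 convolution with padding 1 produce exactly twice (or half) the input width, which is the behavior the docstring describes as padding "with respect to the upsampled image".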
torch_utils/ops/fma.py
ADDED
@@ -0,0 +1,60 @@
|
1 |
+
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
|
2 |
+
#
|
3 |
+
# NVIDIA CORPORATION and its licensors retain all intellectual property
|
4 |
+
# and proprietary rights in and to this software, related documentation
|
5 |
+
# and any modifications thereto. Any use, reproduction, disclosure or
|
6 |
+
# distribution of this software and related documentation without an express
|
7 |
+
# license agreement from NVIDIA CORPORATION is strictly prohibited.
|
8 |
+
|
9 |
+
"""Fused multiply-add, with slightly faster gradients than `torch.addcmul()`."""
|
10 |
+
|
11 |
+
import torch
|
12 |
+
|
13 |
+
#----------------------------------------------------------------------------
|
14 |
+
|
15 |
+
def fma(a, b, c): # => a * b + c
|
16 |
+
return _FusedMultiplyAdd.apply(a, b, c)
|
17 |
+
|
18 |
+
#----------------------------------------------------------------------------
|
19 |
+
|
20 |
+
class _FusedMultiplyAdd(torch.autograd.Function): # a * b + c
|
21 |
+
@staticmethod
|
22 |
+
def forward(ctx, a, b, c): # pylint: disable=arguments-differ
|
23 |
+
out = torch.addcmul(c, a, b)
|
24 |
+
ctx.save_for_backward(a, b)
|
25 |
+
ctx.c_shape = c.shape
|
26 |
+
return out
|
27 |
+
|
28 |
+
@staticmethod
|
29 |
+
def backward(ctx, dout): # pylint: disable=arguments-differ
|
30 |
+
a, b = ctx.saved_tensors
|
31 |
+
c_shape = ctx.c_shape
|
32 |
+
da = None
|
33 |
+
db = None
|
34 |
+
dc = None
|
35 |
+
|
36 |
+
if ctx.needs_input_grad[0]:
|
37 |
+
da = _unbroadcast(dout * b, a.shape)
|
38 |
+
|
39 |
+
if ctx.needs_input_grad[1]:
|
40 |
+
db = _unbroadcast(dout * a, b.shape)
|
41 |
+
|
42 |
+
if ctx.needs_input_grad[2]:
|
43 |
+
dc = _unbroadcast(dout, c_shape)
|
44 |
+
|
45 |
+
return da, db, dc
|
46 |
+
|
47 |
+
#----------------------------------------------------------------------------
|
48 |
+
|
49 |
+
def _unbroadcast(x, shape):
|
50 |
+
extra_dims = x.ndim - len(shape)
|
51 |
+
assert extra_dims >= 0
|
52 |
+
dim = [i for i in range(x.ndim) if x.shape[i] > 1 and (i < extra_dims or shape[i - extra_dims] == 1)]
|
53 |
+
if len(dim):
|
54 |
+
x = x.sum(dim=dim, keepdim=True)
|
55 |
+
if extra_dims:
|
56 |
+
x = x.reshape(-1, *x.shape[extra_dims+1:])
|
57 |
+
assert x.shape == shape
|
58 |
+
return x
|
59 |
+
|
60 |
+
#----------------------------------------------------------------------------
|
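Note on fma.py: _unbroadcast reduces a gradient that has the broadcast output shape back to the shape of one input. The sketch below reproduces only the shape bookkeeping of that helper in plain Python, so it can be followed without PyTorch. The function name unbroadcast_plan is hypothetical, and the sketch assumes the target shape has at least one dimension.

```python
from math import prod


def unbroadcast_plan(grad_shape, target_shape):
    """Return (dims_to_sum, final_shape) following the logic of _unbroadcast."""
    extra_dims = len(grad_shape) - len(target_shape)
    assert extra_dims >= 0
    # Sum over leading dims the input did not have, and over dims where the
    # input had size 1 but the broadcast result is larger.
    dims = [i for i in range(len(grad_shape))
            if grad_shape[i] > 1
            and (i < extra_dims or target_shape[i - extra_dims] == 1)]
    # Equivalent of x.sum(dim=dims, keepdim=True).
    summed = tuple(1 if i in dims else n for i, n in enumerate(grad_shape))
    if extra_dims:
        # Equivalent of x.reshape(-1, *x.shape[extra_dims+1:]).
        tail = summed[extra_dims + 1:]
        final = (prod(summed) // prod(tail), *tail)
    else:
        final = summed
    assert final == tuple(target_shape)
    return dims, final


# Gradient of shape [4, 3, 5] flowing back to an input of shape [3, 1].
print(unbroadcast_plan((4, 3, 5), (3, 1)))
```

In this example the leading batch dimension and the last dimension are summed away, leaving a gradient with the input's own shape [3, 1].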
torch_utils/ops/grid_sample_gradfix.py
ADDED
@@ -0,0 +1,83 @@
|
1 |
+
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
|
2 |
+
#
|
3 |
+
# NVIDIA CORPORATION and its licensors retain all intellectual property
|
4 |
+
# and proprietary rights in and to this software, related documentation
|
5 |
+
# and any modifications thereto. Any use, reproduction, disclosure or
|
6 |
+
# distribution of this software and related documentation without an express
|
7 |
+
# license agreement from NVIDIA CORPORATION is strictly prohibited.
|
8 |
+
|
9 |
+
"""Custom replacement for `torch.nn.functional.grid_sample` that
|
10 |
+
supports arbitrarily high order gradients between the input and output.
|
11 |
+
Only works on 2D images and assumes
|
12 |
+
`mode='bilinear'`, `padding_mode='zeros'`, `align_corners=False`."""
|
13 |
+
|
14 |
+
import warnings
|
15 |
+
import torch
|
16 |
+
|
17 |
+
# pylint: disable=redefined-builtin
|
18 |
+
# pylint: disable=arguments-differ
|
19 |
+
# pylint: disable=protected-access
|
20 |
+
|
21 |
+
#----------------------------------------------------------------------------
|
22 |
+
|
23 |
+
enabled = False # Enable the custom op by setting this to true.
|
24 |
+
|
25 |
+
#----------------------------------------------------------------------------
|
26 |
+
|
27 |
+
def grid_sample(input, grid):
|
28 |
+
if _should_use_custom_op():
|
29 |
+
return _GridSample2dForward.apply(input, grid)
|
30 |
+
return torch.nn.functional.grid_sample(input=input, grid=grid, mode='bilinear', padding_mode='zeros', align_corners=False)
|
31 |
+
|
32 |
+
#----------------------------------------------------------------------------
|
33 |
+
|
34 |
+
def _should_use_custom_op():
|
35 |
+
if not enabled:
|
36 |
+
return False
|
37 |
+
if any(torch.__version__.startswith(x) for x in ['1.7.', '1.8.', '1.9']):
|
38 |
+
return True
|
39 |
+
warnings.warn(f'grid_sample_gradfix not supported on PyTorch {torch.__version__}. Falling back to torch.nn.functional.grid_sample().')
|
40 |
+
return False
|
41 |
+
|
42 |
+
#----------------------------------------------------------------------------
|
43 |
+
|
44 |
+
class _GridSample2dForward(torch.autograd.Function):
|
45 |
+
@staticmethod
|
46 |
+
def forward(ctx, input, grid):
|
47 |
+
assert input.ndim == 4
|
48 |
+
assert grid.ndim == 4
|
49 |
+
output = torch.nn.functional.grid_sample(input=input, grid=grid, mode='bilinear', padding_mode='zeros', align_corners=False)
|
50 |
+
ctx.save_for_backward(input, grid)
|
51 |
+
return output
|
52 |
+
|
53 |
+
@staticmethod
|
54 |
+
def backward(ctx, grad_output):
|
55 |
+
input, grid = ctx.saved_tensors
|
56 |
+
grad_input, grad_grid = _GridSample2dBackward.apply(grad_output, input, grid)
|
57 |
+
return grad_input, grad_grid
|
58 |
+
|
59 |
+
#----------------------------------------------------------------------------
|
60 |
+
|
61 |
+
class _GridSample2dBackward(torch.autograd.Function):
|
62 |
+
@staticmethod
|
63 |
+
def forward(ctx, grad_output, input, grid):
|
64 |
+
op = torch._C._jit_get_operation('aten::grid_sampler_2d_backward')
|
65 |
+
grad_input, grad_grid = op(grad_output, input, grid, 0, 0, False)
|
66 |
+
ctx.save_for_backward(grid)
|
67 |
+
return grad_input, grad_grid
|
68 |
+
|
69 |
+
@staticmethod
|
70 |
+
def backward(ctx, grad2_grad_input, grad2_grad_grid):
|
71 |
+
_ = grad2_grad_grid # unused
|
72 |
+
grid, = ctx.saved_tensors
|
73 |
+
grad2_grad_output = None
|
74 |
+
grad2_input = None
|
75 |
+
grad2_grid = None
|
76 |
+
|
77 |
+
if ctx.needs_input_grad[0]:
|
78 |
+
grad2_grad_output = _GridSample2dForward.apply(grad2_grad_input, grid)
|
79 |
+
|
80 |
+
assert not ctx.needs_input_grad[2]
|
81 |
+
return grad2_grad_output, grad2_input, grad2_grid
|
82 |
+
|
83 |
+
#----------------------------------------------------------------------------
|
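Note on grid_sample_gradfix.py: the custom op is gated on the `enabled` flag and on a PyTorch version prefix check. The sketch below isolates that check as a pure function on version strings, which makes it easy to see which versions fall back to torch.nn.functional.grid_sample. The function name custom_op_supported and the sample version strings are illustrative assumptions. The prefix list is copied from _should_use_custom_op above.

```python
SUPPORTED_PREFIXES = ('1.7.', '1.8.', '1.9')  # copied from _should_use_custom_op


def custom_op_supported(version, enabled=True):
    """Return True if the custom grid_sample op would be used for this version."""
    if not enabled:
        return False
    return any(version.startswith(prefix) for prefix in SUPPORTED_PREFIXES)


for version in ['1.7.1', '1.8.1+cu111', '1.9.0', '1.10.0', '2.1.0']:
    print(version, custom_op_supported(version))
```

Under this check, versions such as 1.10.0 or 2.1.0 do not match any prefix, so the module would emit its warning and use the stock grid_sample even when `enabled` is set to True.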
torch_utils/ops/upfirdn2d.cpp
ADDED
@@ -0,0 +1,103 @@
|
1 |
+
// Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
|
2 |
+
//
|
3 |
+
// NVIDIA CORPORATION and its licensors retain all intellectual property
|
4 |
+
// and proprietary rights in and to this software, related documentation
|
5 |
+
// and any modifications thereto. Any use, reproduction, disclosure or
|
6 |
+
// distribution of this software and related documentation without an express
|
7 |
+
// license agreement from NVIDIA CORPORATION is strictly prohibited.
|
8 |
+
|
9 |
+
#include <torch/extension.h>
|
10 |
+
#include <ATen/cuda/CUDAContext.h>
|
11 |
+
#include <c10/cuda/CUDAGuard.h>
|
12 |
+
#include "upfirdn2d.h"
|
13 |
+
|
14 |
+
//------------------------------------------------------------------------
|
15 |
+
|
16 |
+
static torch::Tensor upfirdn2d(torch::Tensor x, torch::Tensor f, int upx, int upy, int downx, int downy, int padx0, int padx1, int pady0, int pady1, bool flip, float gain)
|
17 |
+
{
|
18 |
+
// Validate arguments.
|
19 |
+
TORCH_CHECK(x.is_cuda(), "x must reside on CUDA device");
|
20 |
+
TORCH_CHECK(f.device() == x.device(), "f must reside on the same device as x");
|
21 |
+
TORCH_CHECK(f.dtype() == torch::kFloat, "f must be float32");
|
22 |
+
TORCH_CHECK(x.numel() <= INT_MAX, "x is too large");
|
23 |
+
TORCH_CHECK(f.numel() <= INT_MAX, "f is too large");
|
24 |
+
TORCH_CHECK(x.dim() == 4, "x must be rank 4");
|
25 |
+
TORCH_CHECK(f.dim() == 2, "f must be rank 2");
|
26 |
+
TORCH_CHECK(f.size(0) >= 1 && f.size(1) >= 1, "f must be at least 1x1");
|
27 |
+
TORCH_CHECK(upx >= 1 && upy >= 1, "upsampling factor must be at least 1");
|
28 |
+
TORCH_CHECK(downx >= 1 && downy >= 1, "downsampling factor must be at least 1");
|
29 |
+
|
30 |
+
// Create output tensor.
|
31 |
+
const at::cuda::OptionalCUDAGuard device_guard(device_of(x));
|
32 |
+
int outW = ((int)x.size(3) * upx + padx0 + padx1 - (int)f.size(1) + downx) / downx;
|
33 |
+
int outH = ((int)x.size(2) * upy + pady0 + pady1 - (int)f.size(0) + downy) / downy;
|
34 |
+
TORCH_CHECK(outW >= 1 && outH >= 1, "output must be at least 1x1");
|
35 |
+
torch::Tensor y = torch::empty({x.size(0), x.size(1), outH, outW}, x.options(), x.suggest_memory_format());
|
36 |
+
TORCH_CHECK(y.numel() <= INT_MAX, "output is too large");
|
37 |
+
|
38 |
+
// Initialize CUDA kernel parameters.
|
39 |
+
upfirdn2d_kernel_params p;
|
40 |
+
p.x = x.data_ptr();
|
41 |
+
p.f = f.data_ptr<float>();
|
42 |
+
p.y = y.data_ptr();
|
43 |
+
p.up = make_int2(upx, upy);
|
44 |
+
p.down = make_int2(downx, downy);
|
45 |
+
p.pad0 = make_int2(padx0, pady0);
|
46 |
+
p.flip = (flip) ? 1 : 0;
|
47 |
+
p.gain = gain;
|
48 |
+
p.inSize = make_int4((int)x.size(3), (int)x.size(2), (int)x.size(1), (int)x.size(0));
|
49 |
+
p.inStride = make_int4((int)x.stride(3), (int)x.stride(2), (int)x.stride(1), (int)x.stride(0));
|
50 |
+
p.filterSize = make_int2((int)f.size(1), (int)f.size(0));
|
51 |
+
p.filterStride = make_int2((int)f.stride(1), (int)f.stride(0));
|
52 |
+
p.outSize = make_int4((int)y.size(3), (int)y.size(2), (int)y.size(1), (int)y.size(0));
|
53 |
+
p.outStride = make_int4((int)y.stride(3), (int)y.stride(2), (int)y.stride(1), (int)y.stride(0));
|
54 |
+
p.sizeMajor = (p.inStride.z == 1) ? p.inSize.w : p.inSize.w * p.inSize.z;
|
55 |
+
p.sizeMinor = (p.inStride.z == 1) ? p.inSize.z : 1;
|
56 |
+
|
57 |
+
// Choose CUDA kernel.
|
58 |
+
upfirdn2d_kernel_spec spec;
|
59 |
+
AT_DISPATCH_FLOATING_TYPES_AND_HALF(x.scalar_type(), "upfirdn2d_cuda", [&]
|
60 |
+
{
|
61 |
+
spec = choose_upfirdn2d_kernel<scalar_t>(p);
|
62 |
+
});
|
63 |
+
|
64 |
+
// Set looping options.
|
65 |
+
p.loopMajor = (p.sizeMajor - 1) / 16384 + 1;
|
66 |
+
p.loopMinor = spec.loopMinor;
|
67 |
+
p.loopX = spec.loopX;
|
68 |
+
p.launchMinor = (p.sizeMinor - 1) / p.loopMinor + 1;
|
69 |
+
p.launchMajor = (p.sizeMajor - 1) / p.loopMajor + 1;
|
70 |
+
|
71 |
+
// Compute grid size.
|
72 |
+
dim3 blockSize, gridSize;
|
73 |
+
if (spec.tileOutW < 0) // large
|
74 |
+
{
|
75 |
+
blockSize = dim3(4, 32, 1);
|
76 |
+
gridSize = dim3(
|
77 |
+
((p.outSize.y - 1) / blockSize.x + 1) * p.launchMinor,
|
78 |
+
(p.outSize.x - 1) / (blockSize.y * p.loopX) + 1,
|
79 |
+
p.launchMajor);
|
80 |
+
}
|
81 |
+
else // small
|
82 |
+
{
|
83 |
+
blockSize = dim3(256, 1, 1);
|
84 |
+
gridSize = dim3(
|
85 |
+
((p.outSize.y - 1) / spec.tileOutH + 1) * p.launchMinor,
|
86 |
+
(p.outSize.x - 1) / (spec.tileOutW * p.loopX) + 1,
|
87 |
+
p.launchMajor);
|
88 |
+
}
|
89 |
+
|
90 |
+
// Launch CUDA kernel.
|
91 |
+
void* args[] = {&p};
|
92 |
+
AT_CUDA_CHECK(cudaLaunchKernel(spec.kernel, gridSize, blockSize, args, 0, at::cuda::getCurrentCUDAStream()));
|
93 |
+
return y;
|
94 |
+
}
|
95 |
+
|
96 |
+
//------------------------------------------------------------------------
|
97 |
+
|
98 |
+
PYBIND11_MODULE(TORCH_EXTENSION_NAME, m)
|
99 |
+
{
|
100 |
+
m.def("upfirdn2d", &upfirdn2d);
|
101 |
+
}
|
102 |
+
|
103 |
+
//------------------------------------------------------------------------
|
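Note on upfirdn2d.cpp: the launch configuration is built from repeated ceiling divisions of the form (n - 1) / d + 1. The sketch below mirrors the "small" kernel branch in plain Python to show how the tensor layout changes the grid. The function names are hypothetical, and the tile parameters in the example calls are taken from the kernel specs listed in upfirdn2d.cu in this commit (64,16,1 for contiguous and 16,16,8 for channels_last, both with loopX = 1). All sizes are assumed to be at least 1.

```python
def ceil_div(n, d):
    """Integer ceiling division for n >= 1, written as in upfirdn2d.cpp."""
    return (n - 1) // d + 1


def small_kernel_launch(n, c, out_h, out_w, channels_last,
                        tile_out_w, tile_out_h, loop_minor, loop_x):
    """Return (block, grid) for the small-filter kernel branch."""
    # With channels_last the channel axis has unit stride and becomes "minor";
    # otherwise batch and channels are folded together into "major".
    size_major = n if channels_last else n * c
    size_minor = c if channels_last else 1
    loop_major = ceil_div(size_major, 16384)
    launch_minor = ceil_div(size_minor, loop_minor)
    launch_major = ceil_div(size_major, loop_major)
    block = (256, 1, 1)
    grid = (ceil_div(out_h, tile_out_h) * launch_minor,
            ceil_div(out_w, tile_out_w * loop_x),
            launch_major)
    return block, grid


# Batch 8, 512 channels, 64x64 output.
print(small_kernel_launch(8, 512, 64, 64, False, 64, 16, 1, 1))  # contiguous
print(small_kernel_launch(8, 512, 64, 64, True, 16, 16, 8, 1))   # channels_last
```

The division by 16384 keeps the z dimension of the grid small by letting each block loop over several major slices when batch times channels is very large.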
torch_utils/ops/upfirdn2d.cu
ADDED
@@ -0,0 +1,350 @@
|
1 |
+
// Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
|
2 |
+
//
|
3 |
+
// NVIDIA CORPORATION and its licensors retain all intellectual property
|
4 |
+
// and proprietary rights in and to this software, related documentation
|
5 |
+
// and any modifications thereto. Any use, reproduction, disclosure or
|
6 |
+
// distribution of this software and related documentation without an express
|
7 |
+
// license agreement from NVIDIA CORPORATION is strictly prohibited.
|
8 |
+
|
9 |
+
#include <c10/util/Half.h>
|
10 |
+
#include "upfirdn2d.h"
|
11 |
+
|
12 |
+
//------------------------------------------------------------------------
|
13 |
+
// Helpers.
|
14 |
+
|
15 |
+
template <class T> struct InternalType;
|
16 |
+
template <> struct InternalType<double> { typedef double scalar_t; };
|
17 |
+
template <> struct InternalType<float> { typedef float scalar_t; };
|
18 |
+
template <> struct InternalType<c10::Half> { typedef float scalar_t; };
|
19 |
+
|
20 |
+
static __device__ __forceinline__ int floor_div(int a, int b)
|
21 |
+
{
|
22 |
+
int t = 1 - a / b;
|
23 |
+
return (a + t * b) / b - t;
|
24 |
+
}
|
25 |
+
|
26 |
+
//------------------------------------------------------------------------
|
27 |
+
// Generic CUDA implementation for large filters.
|
28 |
+
|
29 |
+
template <class T> static __global__ void upfirdn2d_kernel_large(upfirdn2d_kernel_params p)
|
30 |
+
{
|
31 |
+
typedef typename InternalType<T>::scalar_t scalar_t;
|
32 |
+
|
33 |
+
// Calculate thread index.
|
34 |
+
int minorBase = blockIdx.x * blockDim.x + threadIdx.x;
|
35 |
+
int outY = minorBase / p.launchMinor;
|
36 |
+
minorBase -= outY * p.launchMinor;
|
37 |
+
int outXBase = blockIdx.y * p.loopX * blockDim.y + threadIdx.y;
|
38 |
+
int majorBase = blockIdx.z * p.loopMajor;
|
39 |
+
if (outXBase >= p.outSize.x | outY >= p.outSize.y | majorBase >= p.sizeMajor)
|
40 |
+
return;
|
41 |
+
|
42 |
+
// Setup Y receptive field.
|
43 |
+
int midY = outY * p.down.y + p.up.y - 1 - p.pad0.y;
|
44 |
+
int inY = min(max(floor_div(midY, p.up.y), 0), p.inSize.y);
|
45 |
+
int h = min(max(floor_div(midY + p.filterSize.y, p.up.y), 0), p.inSize.y) - inY;
|
46 |
+
int filterY = midY + p.filterSize.y - (inY + 1) * p.up.y;
|
47 |
+
if (p.flip)
|
48 |
+
filterY = p.filterSize.y - 1 - filterY;
|
49 |
+
|
50 |
+
// Loop over major, minor, and X.
|
51 |
+
for (int majorIdx = 0, major = majorBase; majorIdx < p.loopMajor & major < p.sizeMajor; majorIdx++, major++)
|
52 |
+
for (int minorIdx = 0, minor = minorBase; minorIdx < p.loopMinor & minor < p.sizeMinor; minorIdx++, minor += p.launchMinor)
|
53 |
+
{
|
54 |
+
int nc = major * p.sizeMinor + minor;
|
55 |
+
int n = nc / p.inSize.z;
|
56 |
+
int c = nc - n * p.inSize.z;
|
57 |
+
for (int loopX = 0, outX = outXBase; loopX < p.loopX & outX < p.outSize.x; loopX++, outX += blockDim.y)
|
58 |
+
{
|
59 |
+
// Setup X receptive field.
|
60 |
+
int midX = outX * p.down.x + p.up.x - 1 - p.pad0.x;
|
61 |
+
int inX = min(max(floor_div(midX, p.up.x), 0), p.inSize.x);
|
62 |
+
int w = min(max(floor_div(midX + p.filterSize.x, p.up.x), 0), p.inSize.x) - inX;
|
63 |
+
int filterX = midX + p.filterSize.x - (inX + 1) * p.up.x;
|
64 |
+
if (p.flip)
|
65 |
+
filterX = p.filterSize.x - 1 - filterX;
|
66 |
+
|
67 |
+
// Initialize pointers.
|
68 |
+
const T* xp = &((const T*)p.x)[inX * p.inStride.x + inY * p.inStride.y + c * p.inStride.z + n * p.inStride.w];
|
69 |
+
const float* fp = &p.f[filterX * p.filterStride.x + filterY * p.filterStride.y];
|
70 |
+
int filterStepX = ((p.flip) ? p.up.x : -p.up.x) * p.filterStride.x;
|
71 |
+
int filterStepY = ((p.flip) ? p.up.y : -p.up.y) * p.filterStride.y;
|
72 |
+
|
73 |
+
// Inner loop.
|
74 |
+
scalar_t v = 0;
|
75 |
+
for (int y = 0; y < h; y++)
|
76 |
+
{
|
77 |
+
for (int x = 0; x < w; x++)
|
78 |
+
{
|
79 |
+
v += (scalar_t)(*xp) * (scalar_t)(*fp);
|
80 |
+
xp += p.inStride.x;
|
81 |
+
fp += filterStepX;
|
82 |
+
}
|
83 |
+
xp += p.inStride.y - w * p.inStride.x;
|
84 |
+
fp += filterStepY - w * filterStepX;
|
85 |
+
}
|
86 |
+
|
87 |
+
// Store result.
|
88 |
+
v *= p.gain;
|
89 |
+
((T*)p.y)[outX * p.outStride.x + outY * p.outStride.y + c * p.outStride.z + n * p.outStride.w] = (T)v;
|
90 |
+
}
|
91 |
+
}
|
92 |
+
}
|
93 |
+
|
94 |
+
//------------------------------------------------------------------------
|
95 |
+
// Specialized CUDA implementation for small filters.
|
96 |
+
|
97 |
+
template <class T, int upx, int upy, int downx, int downy, int filterW, int filterH, int tileOutW, int tileOutH, int loopMinor>
|
98 |
+
static __global__ void upfirdn2d_kernel_small(upfirdn2d_kernel_params p)
|
99 |
+
{
|
100 |
+
typedef typename InternalType<T>::scalar_t scalar_t;
|
101 |
+
const int tileInW = ((tileOutW - 1) * downx + filterW - 1) / upx + 1;
|
102 |
+
const int tileInH = ((tileOutH - 1) * downy + filterH - 1) / upy + 1;
|
103 |
+
__shared__ volatile scalar_t sf[filterH][filterW];
|
104 |
+
__shared__ volatile scalar_t sx[tileInH][tileInW][loopMinor];
|
105 |
+
|
106 |
+
// Calculate tile index.
|
107 |
+
int minorBase = blockIdx.x;
|
108 |
+
int tileOutY = minorBase / p.launchMinor;
|
109 |
+
minorBase -= tileOutY * p.launchMinor;
|
110 |
+
minorBase *= loopMinor;
|
111 |
+
tileOutY *= tileOutH;
|
112 |
+
int tileOutXBase = blockIdx.y * p.loopX * tileOutW;
|
113 |
+
int majorBase = blockIdx.z * p.loopMajor;
|
114 |
+
if (tileOutXBase >= p.outSize.x | tileOutY >= p.outSize.y | majorBase >= p.sizeMajor)
|
115 |
+
return;
|
116 |
+
|
117 |
+
// Load filter (flipped).
|
118 |
+
for (int tapIdx = threadIdx.x; tapIdx < filterH * filterW; tapIdx += blockDim.x)
|
119 |
+
{
|
120 |
+
int fy = tapIdx / filterW;
|
121 |
+
int fx = tapIdx - fy * filterW;
|
122 |
+
scalar_t v = 0;
|
123 |
+
if (fx < p.filterSize.x & fy < p.filterSize.y)
|
124 |
+
{
|
125 |
+
int ffx = (p.flip) ? fx : p.filterSize.x - 1 - fx;
|
126 |
+
int ffy = (p.flip) ? fy : p.filterSize.y - 1 - fy;
|
127 |
+
v = (scalar_t)p.f[ffx * p.filterStride.x + ffy * p.filterStride.y];
|
128 |
+
}
|
129 |
+
sf[fy][fx] = v;
|
130 |
+
}
|
131 |
+
|
132 |
+
// Loop over major and X.
|
133 |
+
for (int majorIdx = 0, major = majorBase; majorIdx < p.loopMajor & major < p.sizeMajor; majorIdx++, major++)
|
134 |
+
{
|
135 |
+
int baseNC = major * p.sizeMinor + minorBase;
|
136 |
+
int n = baseNC / p.inSize.z;
|
137 |
+
int baseC = baseNC - n * p.inSize.z;
|
138 |
+
for (int loopX = 0, tileOutX = tileOutXBase; loopX < p.loopX & tileOutX < p.outSize.x; loopX++, tileOutX += tileOutW)
|
139 |
+
{
|
140 |
+
// Load input pixels.
|
141 |
+
int tileMidX = tileOutX * downx + upx - 1 - p.pad0.x;
|
142 |
+
int tileMidY = tileOutY * downy + upy - 1 - p.pad0.y;
|
143 |
+
int tileInX = floor_div(tileMidX, upx);
|
144 |
+
int tileInY = floor_div(tileMidY, upy);
|
145 |
+
__syncthreads();
|
146 |
+
for (int inIdx = threadIdx.x; inIdx < tileInH * tileInW * loopMinor; inIdx += blockDim.x)
|
147 |
+
{
|
148 |
+
int relC = inIdx;
|
149 |
+
int relInX = relC / loopMinor;
|
150 |
+
int relInY = relInX / tileInW;
|
151 |
+
relC -= relInX * loopMinor;
|
152 |
+
relInX -= relInY * tileInW;
|
153 |
+
int c = baseC + relC;
|
154 |
+
int inX = tileInX + relInX;
|
155 |
+
int inY = tileInY + relInY;
|
156 |
+
scalar_t v = 0;
|
157 |
+
if (inX >= 0 & inY >= 0 & inX < p.inSize.x & inY < p.inSize.y & c < p.inSize.z)
|
158 |
+
v = (scalar_t)((const T*)p.x)[inX * p.inStride.x + inY * p.inStride.y + c * p.inStride.z + n * p.inStride.w];
|
159 |
+
sx[relInY][relInX][relC] = v;
|
160 |
+
}
|
161 |
+
|
162 |
+
// Loop over output pixels.
|
163 |
+
__syncthreads();
|
164 |
+
for (int outIdx = threadIdx.x; outIdx < tileOutH * tileOutW * loopMinor; outIdx += blockDim.x)
|
165 |
+
{
|
166 |
+
int relC = outIdx;
|
167 |
+
int relOutX = relC / loopMinor;
|
168 |
+
int relOutY = relOutX / tileOutW;
|
169 |
+
relC -= relOutX * loopMinor;
|
170 |
+
relOutX -= relOutY * tileOutW;
|
171 |
+
int c = baseC + relC;
|
172 |
+
int outX = tileOutX + relOutX;
|
173 |
+
int outY = tileOutY + relOutY;
|
174 |
+
|
175 |
+
// Setup receptive field.
|
176 |
+
int midX = tileMidX + relOutX * downx;
|
177 |
+
int midY = tileMidY + relOutY * downy;
|
178 |
+
int inX = floor_div(midX, upx);
|
179 |
+
int inY = floor_div(midY, upy);
|
180 |
+
int relInX = inX - tileInX;
|
181 |
+
int relInY = inY - tileInY;
|
182 |
+
int filterX = (inX + 1) * upx - midX - 1; // flipped
|
183 |
+
int filterY = (inY + 1) * upy - midY - 1; // flipped
|
184 |
+
|
185 |
+
// Inner loop.
|
186 |
+
if (outX < p.outSize.x & outY < p.outSize.y & c < p.outSize.z)
|
187 |
+
{
|
188 |
+
scalar_t v = 0;
|
189 |
+
#pragma unroll
|
190 |
+
for (int y = 0; y < filterH / upy; y++)
|
191 |
+
#pragma unroll
|
192 |
+
for (int x = 0; x < filterW / upx; x++)
|
193 |
+
v += sx[relInY + y][relInX + x][relC] * sf[filterY + y * upy][filterX + x * upx];
|
194 |
+
v *= p.gain;
|
195 |
+
((T*)p.y)[outX * p.outStride.x + outY * p.outStride.y + c * p.outStride.z + n * p.outStride.w] = (T)v;
|
196 |
+
}
|
197 |
+
}
|
198 |
+
}
|
199 |
+
}
|
200 |
+
}
|
201 |
+
|
202 |
+
//------------------------------------------------------------------------
|
203 |
+
// CUDA kernel selection.
|
204 |
+
|
205 |
+
template <class T> upfirdn2d_kernel_spec choose_upfirdn2d_kernel(const upfirdn2d_kernel_params& p)
|
206 |
+
{
|
207 |
+
int s = p.inStride.z, fx = p.filterSize.x, fy = p.filterSize.y;
|
208 |
+
|
209 |
+
upfirdn2d_kernel_spec spec = {(void*)upfirdn2d_kernel_large<T>, -1,-1,1, 4}; // contiguous
|
210 |
+
if (s == 1) spec = {(void*)upfirdn2d_kernel_large<T>, -1,-1,4, 1}; // channels_last
|
211 |
+
|
212 |
+
if (s != 1 && p.up.x == 1 && p.up.y == 1 && p.down.x == 1 && p.down.y == 1) // contiguous
|
213 |
+
{
|
214 |
+
if (fx <= 7 && fy <= 7 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 7,7, 64,16,1>, 64,16,1, 1};
|
215 |
+
if (fx <= 6 && fy <= 6 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 6,6, 64,16,1>, 64,16,1, 1};
|
216 |
+
if (fx <= 5 && fy <= 5 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 5,5, 64,16,1>, 64,16,1, 1};
|
217 |
+
if (fx <= 4 && fy <= 4 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 4,4, 64,16,1>, 64,16,1, 1};
|
218 |
+
if (fx <= 3 && fy <= 3 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 3,3, 64,16,1>, 64,16,1, 1};
|
219 |
+
if (fx <= 24 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 24,1, 128,8,1>, 128,8,1, 1};
|
220 |
+
if (fx <= 20 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 20,1, 128,8,1>, 128,8,1, 1};
|
221 |
+
if (fx <= 16 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 16,1, 128,8,1>, 128,8,1, 1};
|
222 |
+
if (fx <= 12 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 12,1, 128,8,1>, 128,8,1, 1};
|
223 |
+
if (fx <= 8 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 8,1, 128,8,1>, 128,8,1, 1};
|
224 |
+
if (fx <= 1 && fy <= 24) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 1,24, 32,32,1>, 32,32,1, 1};
|
225 |
+
if (fx <= 1 && fy <= 20) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 1,20, 32,32,1>, 32,32,1, 1};
|
226 |
+
if (fx <= 1 && fy <= 16) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 1,16, 32,32,1>, 32,32,1, 1};
|
227 |
+
if (fx <= 1 && fy <= 12) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 1,12, 32,32,1>, 32,32,1, 1};
|
228 |
+
if (fx <= 1 && fy <= 8 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 1,8, 32,32,1>, 32,32,1, 1};
|
229 |
+
}
|
230 |
+
if (s == 1 && p.up.x == 1 && p.up.y == 1 && p.down.x == 1 && p.down.y == 1) // channels_last
|
231 |
+
{
|
232 |
+
if (fx <= 7 && fy <= 7 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 7,7, 16,16,8>, 16,16,8, 1};
|
233 |
+
if (fx <= 6 && fy <= 6 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 6,6, 16,16,8>, 16,16,8, 1};
|
234 |
+
if (fx <= 5 && fy <= 5 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 5,5, 16,16,8>, 16,16,8, 1};
|
235 |
+
if (fx <= 4 && fy <= 4 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 4,4, 16,16,8>, 16,16,8, 1};
|
236 |
+
if (fx <= 3 && fy <= 3 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 4,4, 16,16,8>, 16,16,8, 1};
|
237 |
+
if (fx <= 24 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 24,1, 128,1,16>, 128,1,16, 1};
|
238 |
+
if (fx <= 20 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 20,1, 128,1,16>, 128,1,16, 1};
|
239 |
+
if (fx <= 16 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 16,1, 128,1,16>, 128,1,16, 1};
|
240 |
+
if (fx <= 12 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 12,1, 128,1,16>, 128,1,16, 1};
|
241 |
+
if (fx <= 8 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 8,1, 128,1,16>, 128,1,16, 1};
|
242 |
+
if (fx <= 1 && fy <= 24) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 1,24, 1,128,16>, 1,128,16, 1};
|
243 |
+
if (fx <= 1 && fy <= 20) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 1,20, 1,128,16>, 1,128,16, 1};
|
244 |
+
if (fx <= 1 && fy <= 16) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 1,16, 1,128,16>, 1,128,16, 1};
+        if (fx <= 1 && fy <= 12) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 1,12, 1,128,16>, 1,128,16, 1};
+        if (fx <= 1 && fy <= 8 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,1, 1,8, 1,128,16>, 1,128,16, 1};
+    }
+    if (s != 1 && p.up.x == 2 && p.up.y == 2 && p.down.x == 1 && p.down.y == 1) // contiguous
+    {
+        if (fx <= 8 && fy <= 8 ) spec = {(void*)upfirdn2d_kernel_small<T, 2,2, 1,1, 8,8, 64,16,1>, 64,16,1, 1};
+        if (fx <= 6 && fy <= 6 ) spec = {(void*)upfirdn2d_kernel_small<T, 2,2, 1,1, 6,6, 64,16,1>, 64,16,1, 1};
+        if (fx <= 4 && fy <= 4 ) spec = {(void*)upfirdn2d_kernel_small<T, 2,2, 1,1, 4,4, 64,16,1>, 64,16,1, 1};
+        if (fx <= 2 && fy <= 2 ) spec = {(void*)upfirdn2d_kernel_small<T, 2,2, 1,1, 2,2, 64,16,1>, 64,16,1, 1};
+    }
+    if (s == 1 && p.up.x == 2 && p.up.y == 2 && p.down.x == 1 && p.down.y == 1) // channels_last
+    {
+        if (fx <= 8 && fy <= 8 ) spec = {(void*)upfirdn2d_kernel_small<T, 2,2, 1,1, 8,8, 16,16,8>, 16,16,8, 1};
+        if (fx <= 6 && fy <= 6 ) spec = {(void*)upfirdn2d_kernel_small<T, 2,2, 1,1, 6,6, 16,16,8>, 16,16,8, 1};
+        if (fx <= 4 && fy <= 4 ) spec = {(void*)upfirdn2d_kernel_small<T, 2,2, 1,1, 4,4, 16,16,8>, 16,16,8, 1};
+        if (fx <= 2 && fy <= 2 ) spec = {(void*)upfirdn2d_kernel_small<T, 2,2, 1,1, 2,2, 16,16,8>, 16,16,8, 1};
+    }
+    if (s != 1 && p.up.x == 2 && p.up.y == 1 && p.down.x == 1 && p.down.y == 1) // contiguous
+    {
+        if (fx <= 24 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 2,1, 1,1, 24,1, 128,8,1>, 128,8,1, 1};
+        if (fx <= 20 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 2,1, 1,1, 20,1, 128,8,1>, 128,8,1, 1};
+        if (fx <= 16 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 2,1, 1,1, 16,1, 128,8,1>, 128,8,1, 1};
+        if (fx <= 12 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 2,1, 1,1, 12,1, 128,8,1>, 128,8,1, 1};
+        if (fx <= 8 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 2,1, 1,1, 8,1, 128,8,1>, 128,8,1, 1};
+    }
+    if (s == 1 && p.up.x == 2 && p.up.y == 1 && p.down.x == 1 && p.down.y == 1) // channels_last
+    {
+        if (fx <= 24 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 2,1, 1,1, 24,1, 128,1,16>, 128,1,16, 1};
+        if (fx <= 20 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 2,1, 1,1, 20,1, 128,1,16>, 128,1,16, 1};
+        if (fx <= 16 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 2,1, 1,1, 16,1, 128,1,16>, 128,1,16, 1};
+        if (fx <= 12 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 2,1, 1,1, 12,1, 128,1,16>, 128,1,16, 1};
+        if (fx <= 8 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 2,1, 1,1, 8,1, 128,1,16>, 128,1,16, 1};
+    }
+    if (s != 1 && p.up.x == 1 && p.up.y == 2 && p.down.x == 1 && p.down.y == 1) // contiguous
+    {
+        if (fx <= 1 && fy <= 24) spec = {(void*)upfirdn2d_kernel_small<T, 1,2, 1,1, 1,24, 32,32,1>, 32,32,1, 1};
+        if (fx <= 1 && fy <= 20) spec = {(void*)upfirdn2d_kernel_small<T, 1,2, 1,1, 1,20, 32,32,1>, 32,32,1, 1};
+        if (fx <= 1 && fy <= 16) spec = {(void*)upfirdn2d_kernel_small<T, 1,2, 1,1, 1,16, 32,32,1>, 32,32,1, 1};
+        if (fx <= 1 && fy <= 12) spec = {(void*)upfirdn2d_kernel_small<T, 1,2, 1,1, 1,12, 32,32,1>, 32,32,1, 1};
+        if (fx <= 1 && fy <= 8 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,2, 1,1, 1,8, 32,32,1>, 32,32,1, 1};
+    }
+    if (s == 1 && p.up.x == 1 && p.up.y == 2 && p.down.x == 1 && p.down.y == 1) // channels_last
+    {
+        if (fx <= 1 && fy <= 24) spec = {(void*)upfirdn2d_kernel_small<T, 1,2, 1,1, 1,24, 1,128,16>, 1,128,16, 1};
+        if (fx <= 1 && fy <= 20) spec = {(void*)upfirdn2d_kernel_small<T, 1,2, 1,1, 1,20, 1,128,16>, 1,128,16, 1};
+        if (fx <= 1 && fy <= 16) spec = {(void*)upfirdn2d_kernel_small<T, 1,2, 1,1, 1,16, 1,128,16>, 1,128,16, 1};
+        if (fx <= 1 && fy <= 12) spec = {(void*)upfirdn2d_kernel_small<T, 1,2, 1,1, 1,12, 1,128,16>, 1,128,16, 1};
+        if (fx <= 1 && fy <= 8 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,2, 1,1, 1,8, 1,128,16>, 1,128,16, 1};
+    }
+    if (s != 1 && p.up.x == 1 && p.up.y == 1 && p.down.x == 2 && p.down.y == 2) // contiguous
+    {
+        if (fx <= 8 && fy <= 8 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 2,2, 8,8, 32,8,1>, 32,8,1, 1};
+        if (fx <= 6 && fy <= 6 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 2,2, 6,6, 32,8,1>, 32,8,1, 1};
+        if (fx <= 4 && fy <= 4 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 2,2, 4,4, 32,8,1>, 32,8,1, 1};
+        if (fx <= 2 && fy <= 2 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 2,2, 2,2, 32,8,1>, 32,8,1, 1};
+    }
+    if (s == 1 && p.up.x == 1 && p.up.y == 1 && p.down.x == 2 && p.down.y == 2) // channels_last
+    {
+        if (fx <= 8 && fy <= 8 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 2,2, 8,8, 8,8,8>, 8,8,8, 1};
+        if (fx <= 6 && fy <= 6 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 2,2, 6,6, 8,8,8>, 8,8,8, 1};
+        if (fx <= 4 && fy <= 4 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 2,2, 4,4, 8,8,8>, 8,8,8, 1};
+        if (fx <= 2 && fy <= 2 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 2,2, 2,2, 8,8,8>, 8,8,8, 1};
+    }
+    if (s != 1 && p.up.x == 1 && p.up.y == 1 && p.down.x == 2 && p.down.y == 1) // contiguous
+    {
+        if (fx <= 24 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 2,1, 24,1, 64,8,1>, 64,8,1, 1};
+        if (fx <= 20 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 2,1, 20,1, 64,8,1>, 64,8,1, 1};
+        if (fx <= 16 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 2,1, 16,1, 64,8,1>, 64,8,1, 1};
+        if (fx <= 12 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 2,1, 12,1, 64,8,1>, 64,8,1, 1};
+        if (fx <= 8 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 2,1, 8,1, 64,8,1>, 64,8,1, 1};
+    }
+    if (s == 1 && p.up.x == 1 && p.up.y == 1 && p.down.x == 2 && p.down.y == 1) // channels_last
+    {
+        if (fx <= 24 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 2,1, 24,1, 64,1,8>, 64,1,8, 1};
+        if (fx <= 20 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 2,1, 20,1, 64,1,8>, 64,1,8, 1};
+        if (fx <= 16 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 2,1, 16,1, 64,1,8>, 64,1,8, 1};
+        if (fx <= 12 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 2,1, 12,1, 64,1,8>, 64,1,8, 1};
+        if (fx <= 8 && fy <= 1 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 2,1, 8,1, 64,1,8>, 64,1,8, 1};
+    }
+    if (s != 1 && p.up.x == 1 && p.up.y == 1 && p.down.x == 1 && p.down.y == 2) // contiguous
+    {
+        if (fx <= 1 && fy <= 24) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,2, 1,24, 32,16,1>, 32,16,1, 1};
+        if (fx <= 1 && fy <= 20) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,2, 1,20, 32,16,1>, 32,16,1, 1};
+        if (fx <= 1 && fy <= 16) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,2, 1,16, 32,16,1>, 32,16,1, 1};
+        if (fx <= 1 && fy <= 12) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,2, 1,12, 32,16,1>, 32,16,1, 1};
+        if (fx <= 1 && fy <= 8 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,2, 1,8, 32,16,1>, 32,16,1, 1};
+    }
+    if (s == 1 && p.up.x == 1 && p.up.y == 1 && p.down.x == 1 && p.down.y == 2) // channels_last
+    {
+        if (fx <= 1 && fy <= 24) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,2, 1,24, 1,64,8>, 1,64,8, 1};
+        if (fx <= 1 && fy <= 20) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,2, 1,20, 1,64,8>, 1,64,8, 1};
+        if (fx <= 1 && fy <= 16) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,2, 1,16, 1,64,8>, 1,64,8, 1};
+        if (fx <= 1 && fy <= 12) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,2, 1,12, 1,64,8>, 1,64,8, 1};
+        if (fx <= 1 && fy <= 8 ) spec = {(void*)upfirdn2d_kernel_small<T, 1,1, 1,2, 1,8, 1,64,8>, 1,64,8, 1};
+    }
+    return spec;
+}
+
+//------------------------------------------------------------------------
+// Template specializations.
+
+template upfirdn2d_kernel_spec choose_upfirdn2d_kernel<double> (const upfirdn2d_kernel_params& p);
+template upfirdn2d_kernel_spec choose_upfirdn2d_kernel<float> (const upfirdn2d_kernel_params& p);
+template upfirdn2d_kernel_spec choose_upfirdn2d_kernel<c10::Half>(const upfirdn2d_kernel_params& p);
+
+//------------------------------------------------------------------------
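Review note on the selection logic in `upfirdn2d.cu`: the `if` lines inside each block have no `else`, so every matching line overwrites `spec` and the last match wins. Because the limits are listed from largest to smallest, the result is the tightest specialization that still covers the filter. The Python sketch below mimics only that rule. `choose_bucket` and the two bucket lists are illustrative names, and reading `fx`/`fy` as the filter width and height is an assumption, since their definitions fall outside this part of the diff.

```python
def choose_bucket(fx, fy, buckets):
    """Return the last (max_fx, max_fy) bucket that covers an fx-by-fy filter.

    `buckets` must be ordered from largest to smallest, mirroring the order
    of the `if` lines in the diff. Returns None when nothing matches, which
    stands in for keeping whatever `spec` held before the block.
    """
    chosen = None
    for max_fx, max_fy in buckets:
        if fx <= max_fx and fy <= max_fy:
            chosen = (max_fx, max_fy)  # a later, tighter match overwrites
    return chosen


# Limits copied from the 2x2 upsampling block and the horizontal-only block.
UP_2X2 = [(8, 8), (6, 6), (4, 4), (2, 2)]
UP_2X1 = [(24, 1), (20, 1), (16, 1), (12, 1), (8, 1)]

if __name__ == "__main__":
    print(choose_bucket(4, 4, UP_2X2))   # exact fit
    print(choose_bucket(5, 3, UP_2X2))   # rounds up to the next bucket
    print(choose_bucket(9, 9, UP_2X2))   # too large for every bucket
    print(choose_bucket(10, 1, UP_2X1))
```

A filter larger than every listed limit leaves `spec` untouched by that block, so whatever default was assigned earlier in the function stays in effect.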
torch_utils/ops/upfirdn2d.h
ADDED
@@ -0,0 +1,59 @@
+// Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
+//
+// NVIDIA CORPORATION and its licensors retain all intellectual property
+// and proprietary rights in and to this software, related documentation
+// and any modifications thereto. Any use, reproduction, disclosure or
+// distribution of this software and related documentation without an express
+// license agreement from NVIDIA CORPORATION is strictly prohibited.
+
+#include <cuda_runtime.h>
+
+//------------------------------------------------------------------------
+// CUDA kernel parameters.
+
+struct upfirdn2d_kernel_params
+{
+    const void* x;
+    const float* f;
+    void* y;
+
+    int2 up;
+    int2 down;
+    int2 pad0;
+    int flip;
+    float gain;
+
+    int4 inSize; // [width, height, channel, batch]
+    int4 inStride;
+    int2 filterSize; // [width, height]
+    int2 filterStride;
+    int4 outSize; // [width, height, channel, batch]
+    int4 outStride;
+    int sizeMinor;
+    int sizeMajor;
+
+    int loopMinor;
+    int loopMajor;
+    int loopX;
+    int launchMinor;
+    int launchMajor;
+};
+
+//------------------------------------------------------------------------
+// CUDA kernel specialization.
+
+struct upfirdn2d_kernel_spec
+{
+    void* kernel;
+    int tileOutW;
+    int tileOutH;
+    int loopMinor;
+    int loopX;
+};
+
+//------------------------------------------------------------------------
+// CUDA kernel selection.
+
+template <class T> upfirdn2d_kernel_spec choose_upfirdn2d_kernel(const upfirdn2d_kernel_params& p);
+
+//------------------------------------------------------------------------
torch_utils/ops/upfirdn2d.py
ADDED
@@ -0,0 +1,384 @@
+# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
+#
+# NVIDIA CORPORATION and its licensors retain all intellectual property
+# and proprietary rights in and to this software, related documentation
+# and any modifications thereto. Any use, reproduction, disclosure or
+# distribution of this software and related documentation without an express
+# license agreement from NVIDIA CORPORATION is strictly prohibited.
+
+"""Custom PyTorch ops for efficient resampling of 2D images."""
+
+import os
+import warnings
+import numpy as np
+import torch
+import traceback
+
+from .. import custom_ops
+from .. import misc
+from . import conv2d_gradfix
+
+#----------------------------------------------------------------------------
+
+_inited = False
+_plugin = None
+
+def _init():
+    global _inited, _plugin
+    if not _inited:
+        sources = ['upfirdn2d.cpp', 'upfirdn2d.cu']
+        sources = [os.path.join(os.path.dirname(__file__), s) for s in sources]
+        try:
+            _plugin = custom_ops.get_plugin('upfirdn2d_plugin', sources=sources, extra_cuda_cflags=['--use_fast_math'])
+        except Exception:
+            warnings.warn('Failed to build CUDA kernels for upfirdn2d. Falling back to slow reference implementation. Details:\n\n' + traceback.format_exc())
+    return _plugin is not None
+
+def _parse_scaling(scaling):
+    if isinstance(scaling, int):
+        scaling = [scaling, scaling]
+    assert isinstance(scaling, (list, tuple))
+    assert all(isinstance(x, int) for x in scaling)
+    sx, sy = scaling
+    assert sx >= 1 and sy >= 1
+    return sx, sy
+
+def _parse_padding(padding):
+    if isinstance(padding, int):
+        padding = [padding, padding]
+    assert isinstance(padding, (list, tuple))
+    assert all(isinstance(x, int) for x in padding)
+    if len(padding) == 2:
+        padx, pady = padding
+        padding = [padx, padx, pady, pady]
+    padx0, padx1, pady0, pady1 = padding
+    return padx0, padx1, pady0, pady1
+
+def _get_filter_size(f):
+    if f is None:
+        return 1, 1
+    assert isinstance(f, torch.Tensor) and f.ndim in [1, 2]
+    fw = f.shape[-1]
+    fh = f.shape[0]
+    with misc.suppress_tracer_warnings():
+        fw = int(fw)
+        fh = int(fh)
+    misc.assert_shape(f, [fh, fw][:f.ndim])
+    assert fw >= 1 and fh >= 1
+    return fw, fh
+
+#----------------------------------------------------------------------------
+
+def setup_filter(f, device=torch.device('cpu'), normalize=True, flip_filter=False, gain=1, separable=None):
+    r"""Convenience function to set up a 2D FIR filter for `upfirdn2d()`.
+
+    Args:
+        f:           Torch tensor, numpy array, or python list of the shape
+                     `[filter_height, filter_width]` (non-separable),
+                     `[filter_taps]` (separable),
+                     `[]` (impulse), or
+                     `None` (identity).
+        device:      Result device (default: cpu).
+        normalize:   Normalize the filter so that it retains the magnitude
+                     for constant input signal (DC)? (default: True).
+        flip_filter: Flip the filter? (default: False).
+        gain:        Overall scaling factor for signal magnitude (default: 1).
+        separable:   Return a separable filter? (default: select automatically).
+
+    Returns:
+        Float32 tensor of the shape
+        `[filter_height, filter_width]` (non-separable) or
+        `[filter_taps]` (separable).
+    """
+    # Validate.
+    if f is None:
+        f = 1
+    f = torch.as_tensor(f, dtype=torch.float32)
+    assert f.ndim in [0, 1, 2]
+    assert f.numel() > 0
+    if f.ndim == 0:
+        f = f[np.newaxis]
+
+    # Separable?
+    if separable is None:
+        separable = (f.ndim == 1 and f.numel() >= 8)
+    if f.ndim == 1 and not separable:
+        f = f.ger(f)
+    assert f.ndim == (1 if separable else 2)
+
+    # Apply normalize, flip, gain, and device.
+    if normalize:
+        f = f / f.sum() # out-of-place, so a float32 tensor passed by the caller is not modified
+    if flip_filter:
+        f = f.flip(list(range(f.ndim)))
+    f = f * (gain ** (f.ndim / 2))
+    f = f.to(device=device)
+    return f
+
+#----------------------------------------------------------------------------
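Review note on `setup_filter()`: a 1D tap list with fewer than 8 entries is expanded into a 2D kernel by an outer product, longer lists stay 1D (separable), and `gain` is raised to `ndim / 2` so that a separable filter, which is applied once per axis, carries `sqrt(gain)` on each pass. The following is a minimal pure-Python sketch of that logic for 1D input only; `setup_filter_sketch` is a hypothetical helper, not part of the repository, and it leaves out `flip_filter` and device handling.

```python
def setup_filter_sketch(taps, normalize=True, gain=1.0, separable=None):
    """Mimic setup_filter() for a 1D tap list, using nested Python lists."""
    taps = [float(t) for t in taps]
    assert len(taps) > 0
    if separable is None:
        separable = len(taps) >= 8  # same threshold as in the diff
    if separable:
        # 1D result: ndim == 1, so the gain exponent is 1/2.
        total = sum(taps) if normalize else 1.0
        return [t / total * gain ** 0.5 for t in taps]
    # 2D result: outer product of the taps with themselves, gain exponent 1.
    kernel = [[a * b for b in taps] for a in taps]
    total = sum(map(sum, kernel)) if normalize else 1.0
    return [[v / total * gain for v in row] for row in kernel]


if __name__ == "__main__":
    for row in setup_filter_sketch([1, 3, 3, 1]):
        print(row)
    print(setup_filter_sketch([1] * 8, gain=4.0))
```

With `normalize=True` the 2D kernel sums to 1, so a constant image keeps its magnitude after filtering.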
+
+def upfirdn2d(x, f, up=1, down=1, padding=0, flip_filter=False, gain=1, impl='cuda'):
+    r"""Pad, upsample, filter, and downsample a batch of 2D images.
+
+    Performs the following sequence of operations for each channel:
+
+    1. Upsample the image by inserting N-1 zeros after each pixel (`up`).
+
+    2. Pad the image with the specified number of zeros on each side (`padding`).
+       Negative padding corresponds to cropping the image.
+
+    3. Convolve the image with the specified 2D FIR filter (`f`), shrinking it
+       so that the footprint of all output pixels lies within the input image.
+
+    4. Downsample the image by keeping every Nth pixel (`down`).
+
+    This sequence of operations bears close resemblance to scipy.signal.upfirdn().
+    The fused op is considerably more efficient than performing the same calculation
+    using standard PyTorch ops. It supports gradients of arbitrary order.
+
+    Args:
+        x:           Float32/float64/float16 input tensor of the shape
+                     `[batch_size, num_channels, in_height, in_width]`.
+        f:           Float32 FIR filter of the shape
+                     `[filter_height, filter_width]` (non-separable),
+                     `[filter_taps]` (separable), or
+                     `None` (identity).
+        up:          Integer upsampling factor. Can be a single int or a list/tuple
+                     `[x, y]` (default: 1).
+        down:        Integer downsampling factor. Can be a single int or a list/tuple
+                     `[x, y]` (default: 1).
+        padding:     Padding with respect to the upsampled image. Can be a single number
+                     or a list/tuple `[x, y]` or `[x_before, x_after, y_before, y_after]`
+                     (default: 0).
+        flip_filter: False = convolution, True = correlation (default: False).
+        gain:        Overall scaling factor for signal magnitude (default: 1).
+        impl:        Implementation to use. Can be `'ref'` or `'cuda'` (default: `'cuda'`).
+
+    Returns:
+        Tensor of the shape `[batch_size, num_channels, out_height, out_width]`.
+    """
+    assert isinstance(x, torch.Tensor)
+    assert impl in ['ref', 'cuda']
+    if impl == 'cuda' and x.device.type == 'cuda' and _init():
+        return _upfirdn2d_cuda(up=up, down=down, padding=padding, flip_filter=flip_filter, gain=gain).apply(x, f)
+    return _upfirdn2d_ref(x, f, up=up, down=down, padding=padding, flip_filter=flip_filter, gain=gain)
+
+#----------------------------------------------------------------------------
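Review note on the four steps listed in the `upfirdn2d()` docstring: they are easier to follow in one dimension. The sketch below is a plain-Python 1D analogue, not the repository's implementation. `upfirdn1d` and `out_len` are hypothetical names, and the output-length formula is derived here from the pad, valid-convolution, and stride steps of the reference path rather than quoted from the source.

```python
def upfirdn1d(x, f, up=1, down=1, pad0=0, pad1=0, flip_filter=False):
    """1D analogue of upfirdn2d(): upsample, pad or crop, filter, downsample."""
    # 1. Upsample by inserting up-1 zeros after each sample.
    xs = []
    for v in x:
        xs.append(v)
        xs.extend([0.0] * (up - 1))
    # 2. Pad with zeros; negative padding crops instead.
    xs = [0.0] * max(pad0, 0) + xs + [0.0] * max(pad1, 0)
    xs = xs[max(-pad0, 0): len(xs) - max(-pad1, 0)]
    # 3. "Valid" filtering. flip_filter=False means true convolution, which
    #    is correlation with the reversed taps.
    taps = list(f) if flip_filter else list(f)[::-1]
    n = len(xs) - len(taps) + 1
    ys = [sum(xs[i + k] * taps[k] for k in range(len(taps))) for i in range(n)]
    # 4. Downsample by keeping every down-th sample.
    return ys[::down]


def out_len(n, fw, up=1, down=1, pad0=0, pad1=0):
    """Output length implied by the four steps above."""
    return (n * up + pad0 + pad1 - fw + down) // down


if __name__ == "__main__":
    # A two-tap box filter with up=2 reproduces nearest-neighbour upsampling.
    print(upfirdn1d([1.0, 2.0, 3.0], [1.0, 1.0], up=2, pad0=1))
    print(out_len(3, 2, up=2, pad0=1))
```

The 2D op applies the same sequence along both axes; for a separable filter that is literally two 1D passes, as the `else` branches in the reference and CUDA paths show.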
+
+@misc.profiled_function
+def _upfirdn2d_ref(x, f, up=1, down=1, padding=0, flip_filter=False, gain=1):
+    """Slow reference implementation of `upfirdn2d()` using standard PyTorch ops.
+    """
+    # Validate arguments.
+    assert isinstance(x, torch.Tensor) and x.ndim == 4
+    if f is None:
+        f = torch.ones([1, 1], dtype=torch.float32, device=x.device)
+    assert isinstance(f, torch.Tensor) and f.ndim in [1, 2]
+    assert f.dtype == torch.float32 and not f.requires_grad
+    batch_size, num_channels, in_height, in_width = x.shape
+    upx, upy = _parse_scaling(up)
+    downx, downy = _parse_scaling(down)
+    padx0, padx1, pady0, pady1 = _parse_padding(padding)
+
+    # Upsample by inserting zeros.
+    x = x.reshape([batch_size, num_channels, in_height, 1, in_width, 1])
+    x = torch.nn.functional.pad(x, [0, upx - 1, 0, 0, 0, upy - 1])
+    x = x.reshape([batch_size, num_channels, in_height * upy, in_width * upx])
+
+    # Pad or crop.
+    x = torch.nn.functional.pad(x, [max(padx0, 0), max(padx1, 0), max(pady0, 0), max(pady1, 0)])
+    x = x[:, :, max(-pady0, 0) : x.shape[2] - max(-pady1, 0), max(-padx0, 0) : x.shape[3] - max(-padx1, 0)]
+
+    # Setup filter.
+    f = f * (gain ** (f.ndim / 2))
+    f = f.to(x.dtype)
+    if not flip_filter:
+        f = f.flip(list(range(f.ndim)))
+
+    # Convolve with the filter.
+    f = f[np.newaxis, np.newaxis].repeat([num_channels, 1] + [1] * f.ndim)
+    if f.ndim == 4:
+        x = conv2d_gradfix.conv2d(input=x, weight=f, groups=num_channels)
+    else:
+        x = conv2d_gradfix.conv2d(input=x, weight=f.unsqueeze(2), groups=num_channels)
+        x = conv2d_gradfix.conv2d(input=x, weight=f.unsqueeze(3), groups=num_channels)
+
+    # Downsample by throwing away pixels.
+    x = x[:, :, ::downy, ::downx]
+    return x
+
+#----------------------------------------------------------------------------
+
+_upfirdn2d_cuda_cache = dict()
+
+def _upfirdn2d_cuda(up=1, down=1, padding=0, flip_filter=False, gain=1):
+    """Fast CUDA implementation of `upfirdn2d()` using custom ops.
+    """
+    # Parse arguments.
+    upx, upy = _parse_scaling(up)
+    downx, downy = _parse_scaling(down)
+    padx0, padx1, pady0, pady1 = _parse_padding(padding)
+
+    # Lookup from cache.
+    key = (upx, upy, downx, downy, padx0, padx1, pady0, pady1, flip_filter, gain)
+    if key in _upfirdn2d_cuda_cache:
+        return _upfirdn2d_cuda_cache[key]
+
+    # Forward op.
+    class Upfirdn2dCuda(torch.autograd.Function):
+        @staticmethod
+        def forward(ctx, x, f): # pylint: disable=arguments-differ
+            assert isinstance(x, torch.Tensor) and x.ndim == 4
+            if f is None:
+                f = torch.ones([1, 1], dtype=torch.float32, device=x.device)
+            assert isinstance(f, torch.Tensor) and f.ndim in [1, 2]
+            y = x
+            if f.ndim == 2:
+                y = _plugin.upfirdn2d(y, f, upx, upy, downx, downy, padx0, padx1, pady0, pady1, flip_filter, gain)
+            else:
+                y = _plugin.upfirdn2d(y, f.unsqueeze(0), upx, 1, downx, 1, padx0, padx1, 0, 0, flip_filter, np.sqrt(gain))
+                y = _plugin.upfirdn2d(y, f.unsqueeze(1), 1, upy, 1, downy, 0, 0, pady0, pady1, flip_filter, np.sqrt(gain))
+            ctx.save_for_backward(f)
+            ctx.x_shape = x.shape
+            return y
+
+        @staticmethod
+        def backward(ctx, dy): # pylint: disable=arguments-differ
+            f, = ctx.saved_tensors
+            _, _, ih, iw = ctx.x_shape
+            _, _, oh, ow = dy.shape
+            fw, fh = _get_filter_size(f)
+            p = [
+                fw - padx0 - 1,
+                iw * upx - ow * downx + padx0 - upx + 1,
+                fh - pady0 - 1,
+                ih * upy - oh * downy + pady0 - upy + 1,
+            ]
+            dx = None
+            df = None
+
+            if ctx.needs_input_grad[0]:
+                dx = _upfirdn2d_cuda(up=down, down=up, padding=p, flip_filter=(not flip_filter), gain=gain).apply(dy, f)
+
+            assert not ctx.needs_input_grad[1]
+            return dx, df
+
+    # Add to cache.
+    _upfirdn2d_cuda_cache[key] = Upfirdn2dCuda
+    return Upfirdn2dCuda
+
+#----------------------------------------------------------------------------
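Review note on `backward()`: the gradient with respect to `x` is computed by calling the same op again with `up` and `down` swapped, the filter flipped, and the padding list `p`. The 1D sketch below checks that recipe numerically through the adjoint identity `<y, dy> == <x, dx>`. It is an illustration under stated assumptions: `upfirdn1d` is a plain-Python stand-in for the real op, and `upfirdn1d_backward` is a hypothetical helper that copies the padding formula from the diff for the x axis only.

```python
def upfirdn1d(x, f, up=1, down=1, pad0=0, pad1=0, flip_filter=False):
    """1D stand-in for upfirdn2d(): upsample, pad or crop, filter, downsample."""
    xs = []
    for v in x:
        xs.append(v)
        xs.extend([0.0] * (up - 1))
    xs = [0.0] * max(pad0, 0) + xs + [0.0] * max(pad1, 0)
    xs = xs[max(-pad0, 0): len(xs) - max(-pad1, 0)]
    taps = list(f) if flip_filter else list(f)[::-1]
    n = len(xs) - len(taps) + 1
    ys = [sum(xs[i + k] * taps[k] for k in range(len(taps))) for i in range(n)]
    return ys[::down]


def upfirdn1d_backward(dy, f, in_len, up=1, down=1, pad0=0):
    """Gradient w.r.t. x for a forward call made with flip_filter=False."""
    fw = len(f)
    p0 = fw - pad0 - 1                                  # as in `p` in the diff
    p1 = in_len * up - len(dy) * down + pad0 - up + 1
    return upfirdn1d(dy, f, up=down, down=up, pad0=p0, pad1=p1, flip_filter=True)


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


if __name__ == "__main__":
    x, f = [1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0]
    y = upfirdn1d(x, f, down=2, pad0=1, pad1=1)
    dy = [1.0, 10.0]
    dx = upfirdn1d_backward(dy, f, in_len=len(x), down=2, pad0=1)
    print(dot(y, dy), dot(x, dx))  # the two inner products should agree
```

Note that `pad1` never appears in the backward padding directly; it enters through the output length `len(dy)`.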
+
+def filter2d(x, f, padding=0, flip_filter=False, gain=1, impl='cuda'):
+    r"""Filter a batch of 2D images using the given 2D FIR filter.
+
+    By default, the result is padded so that its shape matches the input.
+    User-specified padding is applied on top of that, with negative values
+    indicating cropping. Pixels outside the image are assumed to be zero.
+
+    Args:
+        x:           Float32/float64/float16 input tensor of the shape
+                     `[batch_size, num_channels, in_height, in_width]`.
+        f:           Float32 FIR filter of the shape
+                     `[filter_height, filter_width]` (non-separable),
+                     `[filter_taps]` (separable), or
+                     `None` (identity).
+        padding:     Padding with respect to the output. Can be a single number or a
+                     list/tuple `[x, y]` or `[x_before, x_after, y_before, y_after]`
+                     (default: 0).
+        flip_filter: False = convolution, True = correlation (default: False).
+        gain:        Overall scaling factor for signal magnitude (default: 1).
+        impl:        Implementation to use. Can be `'ref'` or `'cuda'` (default: `'cuda'`).
+
+    Returns:
+        Tensor of the shape `[batch_size, num_channels, out_height, out_width]`.
+    """
+    padx0, padx1, pady0, pady1 = _parse_padding(padding)
+    fw, fh = _get_filter_size(f)
+    p = [
+        padx0 + fw // 2,
+        padx1 + (fw - 1) // 2,
+        pady0 + fh // 2,
+        pady1 + (fh - 1) // 2,
+    ]
+    return upfirdn2d(x, f, padding=p, flip_filter=flip_filter, gain=gain, impl=impl)
+
+#----------------------------------------------------------------------------
+
+def upsample2d(x, f, up=2, padding=0, flip_filter=False, gain=1, impl='cuda'):
+    r"""Upsample a batch of 2D images using the given 2D FIR filter.
+
+    By default, the result is padded so that its shape is a multiple of the input.
+    User-specified padding is applied on top of that, with negative values
+    indicating cropping. Pixels outside the image are assumed to be zero.
+
+    Args:
+        x:           Float32/float64/float16 input tensor of the shape
+                     `[batch_size, num_channels, in_height, in_width]`.
+        f:           Float32 FIR filter of the shape
+                     `[filter_height, filter_width]` (non-separable),
+                     `[filter_taps]` (separable), or
+                     `None` (identity).
+        up:          Integer upsampling factor. Can be a single int or a list/tuple
+                     `[x, y]` (default: 2).
+        padding:     Padding with respect to the output. Can be a single number or a
+                     list/tuple `[x, y]` or `[x_before, x_after, y_before, y_after]`
+                     (default: 0).
+        flip_filter: False = convolution, True = correlation (default: False).
+        gain:        Overall scaling factor for signal magnitude (default: 1).
+        impl:        Implementation to use. Can be `'ref'` or `'cuda'` (default: `'cuda'`).
+
+    Returns:
+        Tensor of the shape `[batch_size, num_channels, out_height, out_width]`.
+    """
+    upx, upy = _parse_scaling(up)
+    padx0, padx1, pady0, pady1 = _parse_padding(padding)
+    fw, fh = _get_filter_size(f)
+    p = [
+        padx0 + (fw + upx - 1) // 2,
+        padx1 + (fw - upx) // 2,
+        pady0 + (fh + upy - 1) // 2,
+        pady1 + (fh - upy) // 2,
+    ]
+    return upfirdn2d(x, f, up=up, padding=p, flip_filter=flip_filter, gain=gain*upx*upy, impl=impl)
+
+#----------------------------------------------------------------------------
+
+def downsample2d(x, f, down=2, padding=0, flip_filter=False, gain=1, impl='cuda'):
+    r"""Downsample a batch of 2D images using the given 2D FIR filter.
+
+    By default, the result is padded so that its shape is a fraction of the input.
+    User-specified padding is applied on top of that, with negative values
+    indicating cropping. Pixels outside the image are assumed to be zero.
+
+    Args:
+        x:           Float32/float64/float16 input tensor of the shape
+                     `[batch_size, num_channels, in_height, in_width]`.
+        f:           Float32 FIR filter of the shape
+                     `[filter_height, filter_width]` (non-separable),
+                     `[filter_taps]` (separable), or
+                     `None` (identity).
+        down:        Integer downsampling factor. Can be a single int or a list/tuple
+                     `[x, y]` (default: 2).
+        padding:     Padding with respect to the input. Can be a single number or a
+                     list/tuple `[x, y]` or `[x_before, x_after, y_before, y_after]`
+                     (default: 0).
+        flip_filter: False = convolution, True = correlation (default: False).
+        gain:        Overall scaling factor for signal magnitude (default: 1).
+        impl:        Implementation to use. Can be `'ref'` or `'cuda'` (default: `'cuda'`).
+
+    Returns:
+        Tensor of the shape `[batch_size, num_channels, out_height, out_width]`.
+    """
+    downx, downy = _parse_scaling(down)
+    padx0, padx1, pady0, pady1 = _parse_padding(padding)
+    fw, fh = _get_filter_size(f)
+    p = [
+        padx0 + (fw - downx + 1) // 2,
+        padx1 + (fw - downx) // 2,
+        pady0 + (fh - downy + 1) // 2,
+        pady1 + (fh - downy) // 2,
+    ]
+    return upfirdn2d(x, f, down=down, padding=p, flip_filter=flip_filter, gain=gain, impl=impl)
+
+#----------------------------------------------------------------------------
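Review note on the default padding in `filter2d()`, `upsample2d()`, and `downsample2d()`: each wrapper adds a filter-dependent amount of padding so that the output size is exactly the input size, `up` times the input size, or the input size divided by `down`. The arithmetic can be checked without PyTorch. In the sketch below the three `*_pad` helpers copy the per-axis expressions from the diff, while `out_len` is an assumed output-length formula derived from the pad, valid-convolution, and stride steps of the reference path; all four names are hypothetical.

```python
def out_len(n, fw, up=1, down=1, pad0=0, pad1=0):
    """Output length of one axis after upsample, pad, valid filter, stride."""
    return (n * up + pad0 + pad1 - fw + down) // down


def filter_pad(fw):
    """Per-axis default padding used by filter2d()."""
    return fw // 2, (fw - 1) // 2


def upsample_pad(fw, up):
    """Per-axis default padding used by upsample2d()."""
    return (fw + up - 1) // 2, (fw - up) // 2


def downsample_pad(fw, down):
    """Per-axis default padding used by downsample2d()."""
    return (fw - down + 1) // 2, (fw - down) // 2


if __name__ == "__main__":
    n = 16
    for fw in range(1, 9):
        p0, p1 = filter_pad(fw)
        same = out_len(n, fw, pad0=p0, pad1=p1)
        p0, p1 = upsample_pad(fw, 2)
        doubled = out_len(n, fw, up=2, pad0=p0, pad1=p1)
        p0, p1 = downsample_pad(fw, 2)
        halved = out_len(n, fw, down=2, pad0=p0, pad1=p1)
        print(fw, same, doubled, halved)
```

Python's floor division keeps the sums correct even when a pad comes out negative (for example `fw=1`, `up=2`), which the op treats as cropping. `upsample2d()` additionally multiplies `gain` by `upx*upy` to compensate for the zeros inserted during upsampling.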