Update README.md
README.md
CHANGED
@@ -1,15 +1,50 @@
|
|
1 |
-
---
|
2 |
-
library_name: keras
|
3 |
-
|
4 |
-
|
5 |
-
|
6 |
-
|
7 |
-
|
8 |
-
|
9 |
-
|
10 |
-
|
11 |
-
|
12 |
-
|
13 |
-
|
14 |
-
|
15 |
-
|
1 |
+
---
|
2 |
+
library_name: keras
|
3 |
+
license: mit
|
4 |
+
language:
|
5 |
+
- en
|
6 |
+
pipeline_tag: image-to-image
|
7 |
+
tags:
|
8 |
+
- art
|
9 |
+
- pixel_art
|
10 |
+
- character_sprite
|
11 |
+
- missing_data_imputation
|
12 |
+
- image_to_image
|
13 |
+
---
|
14 |
+
|
15 |
+
## Model description
|
16 |
+
|
17 |
+
The MDIGAN-Characters model was proposed at SBGames 2024 ([paper on arXiv][paper-arxiv], [page][paper-page], [demo][paper-demo]).
|
18 |
+
It is a model trained for the task of generating characters in a missing pose: for instance,
|
19 |
+
given images of a character facing back, left, and right, it can generate the character facing front (missing data imputation task).
|
20 |
+
![](https://i.imgur.com/s5ONl9Q.png)
|
21 |
+
|
22 |
+
The model's architecture is based on that of [CollaGAN][paper-collagan], a model trained to impute images in missing domains
|
23 |
+
in a multi-domain scenario. In our case, the domains are the sides a character might face, i.e., back, left, front, and right.
|
24 |
+
|
25 |
+
We tested providing 3 images to the model to generate the missing one, and we also evaluated the quality of the generated
|
26 |
+
images when the model receives only 1 or 2 input images.
|
27 |
+
|
28 |
+
The inputs to the model are the target (missing) domain and 4 image-like tensors with size 64x64x4 in the order
|
29 |
+
back, left, front, and right. The input images should be floating point tensors in the range of [-1, 1].
|
30 |
+
In place of the missing image(s), we must provide a tensor with shape 64x64x4 filled with zeros.
|
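The input layout described above can be sketched with NumPy as follows. This is only an illustration under assumptions: the helper names `to_model_range` and `build_inputs` are hypothetical, encoding the target domain as an integer index and stacking the four tensors into a single array are guesses about the interface, and no batch dimension is added. Check the model's actual signature before relying on it.

```python
import numpy as np

# Domain order stated in the description above.
DOMAINS = ("back", "left", "front", "right")


def to_model_range(rgba_uint8):
    """Map a 64x64x4 uint8 RGBA image from [0, 255] to float32 in [-1, 1]."""
    return rgba_uint8.astype(np.float32) / 127.5 - 1.0


def build_inputs(images, target):
    """Build the four image-like tensors for one character.

    images: dict mapping a domain name to a uint8 array of shape (64, 64, 4).
    target: name of the missing domain to generate.
    Domains absent from `images` (and the target itself) are zero-filled.
    Returning an integer index for the target is an assumption.
    """
    target_index = DOMAINS.index(target)
    tensors = []
    for name in DOMAINS:
        if name == target or name not in images:
            tensors.append(np.zeros((64, 64, 4), dtype=np.float32))
        else:
            tensors.append(to_model_range(images[name]))
    return target_index, np.stack(tensors)
```

Passing fewer than three images to `build_inputs` corresponds to the 2-input and 1-input settings mentioned above, since every absent domain is replaced by a zero tensor.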
31 |
+
|
32 |
+
|
33 |
+
[paper-collagan]: https://www.computer.org/csdl/proceedings-article/cvpr/2019/329300c482/1gys5gg67QY
|
34 |
+
[paper-arxiv]: https://arxiv.org/abs/2409.10721
|
35 |
+
[paper-page]: https://fegemo.github.io/mdigan-characters
|
36 |
+
[paper-demo]: https://fegemo.github.io/interactive-generator
|
37 |
+
|
38 |
+
## Intended uses & limitations
|
39 |
+
|
40 |
+
This model is intended for research purposes only. The quality of the generated images varies a lot, and a
|
41 |
+
post-processing step to quantize the colors of the generated image to the intended palette is beneficial.
|
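One possible form of the color quantization step mentioned above is a nearest-color lookup. The sketch below is not the authors' post-processing: the function name `quantize_to_palette` is hypothetical, it assumes the generated image has already been mapped back from [-1, 1] to 0-255 RGBA values, and it uses plain squared Euclidean distance in RGBA space as the similarity measure.

```python
import numpy as np


def quantize_to_palette(image, palette):
    """Replace each pixel with the nearest palette color.

    image: array of shape (H, W, C), e.g. 64x64x4 RGBA in 0-255.
    palette: array of shape (K, C) holding the intended colors.
    Nearest is measured by squared Euclidean distance (an assumption).
    """
    pixels = image.reshape(-1, 1, image.shape[-1]).astype(np.float32)
    colors = palette[np.newaxis].astype(np.float32)
    # Broadcasting yields one distance per (pixel, palette color) pair.
    distances = np.sum((pixels - colors) ** 2, axis=-1)
    nearest = np.argmin(distances, axis=1)
    return palette[nearest].reshape(image.shape)
```

The result has the same shape as the input and the dtype of `palette`, so a `uint8` palette produces an image that can be saved directly.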
42 |
+
|
43 |
+
|
44 |
+
## Training and evaluation data
|
45 |
+
|
46 |
+
The model was trained with the [PAC dataset][pac], which features 12,074 paired images of pixel art characters
|
47 |
+
in 4 directions: back, left, front, and right. Compared to StarGAN and Pix2Pix-based baselines, the MDIGAN-Characters
|
48 |
+
model yielded much better images when it received 3 images, and still good images when only 2 were provided.
|
49 |
+
|
50 |
+
[pac]: https://github.com/plucksquire/pac/
|