---
library_name: diffusers
datasets:
- gvecchio/MatSynth
language:
- en
tags:
- pbr
- materials
- svbrdf
- 3d
- textures
license: openrail
inference: false
pipeline_tag: image-to-3d
---

<!-- # ⚒️ MatForger -->
![alt text](./img/MatForger.png)


> Three Textures for the Designers under the sky, \
> Seven for the Artists in their studios of light, \
> Nine for the Architects doomed to try, \
> One for the Developers on their screens so bright \
> In the Land of Graphics where the Pixels lie. \
> One Forge to craft them all, One Code to find them, \
> One Model to bring them all and to the mesh bind them \
> In the Land of Graphics where the Pixels lie.

<sup><sub>Our deep apologies to J. R. R. Tolkien</sub></sup>


## 🤖 Model Details

### Overview

**MatForger** is a generative diffusion model designed specifically for generating Physically Based Rendering (PBR) materials. Inspired by the [MatFuse](https://arxiv.org/abs/2308.11408) model and trained on the comprehensive [MatSynth](https://huggingface.co/datasets/gvecchio/MatSynth) dataset, MatForger pushes the boundaries of material synthesis. 
It employs the noise rolling technique, derived from [ControlMat](https://arxiv.org/abs/2309.01700), to produce tileable maps. The model generates multiple maps, including basecolor, normal, height, roughness, and metallic, catering to a wide range of material design needs.
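The noise rolling idea can be illustrated in isolation. The sketch below is a NumPy illustration of the concept only, not the pipeline's actual internals: the latent noise is circularly shifted during denoising, so seams are never anchored to fixed image borders and the decoded texture wraps around cleanly.

```python
import numpy as np

rng = np.random.default_rng(0)

def roll_latents(latents, shift_h, shift_w):
    """Circularly shift a (C, H, W) latent along its spatial axes.

    Rolling is lossless and wrap-around, which is the property that
    encourages the final texture to be tileable."""
    return np.roll(latents, shift=(shift_h, shift_w), axis=(-2, -1))

latents = rng.standard_normal((4, 64, 64))
shift_h, shift_w = rng.integers(0, 64, size=2)
rolled = roll_latents(latents, shift_h, shift_w)

# rolling back by the opposite offset recovers the original exactly
restored = roll_latents(rolled, -shift_h, -shift_w)
assert np.array_equal(restored, latents)
```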

### Features
- **High-Quality PBR Material Generation:** Produces detailed and realistic materials suited for various applications.
- **Tileable Textures:** Utilizes a noise rolling approach to ensure textures are tileable, enhancing their usability in larger scenes.
- **Versatile Outputs:** Generates multiple texture maps (basecolor, normal, height, roughness, metallic) to meet the requirements of complex material designs.
- **Text and Image Conditioning:** Can be conditioned with either images or text inputs to guide material generation, offering flexibility in creative workflows.

### Model Description
The MatForger architecture is based on **MatFuse**, but differs in using a continuous VAE instead of a vector-quantized autoencoder (VQ-VAE). Additionally, we distilled the multi-encoder VAE into a single-encoder model, reducing model complexity while retaining the disentangled latent representation of MatFuse.

## ⚒️ MatForger at work

MatForger can be conditioned via text prompts or images to generate high-quality materials. Below are some examples of materials generated using MatForger. For each sample we report the prompt, the generated maps (basecolor, normal, height, roughness, metallic), and the resulting rendering.

<details>
    <summary>Text2Mat samples</summary>
    <img src="./img/MatForger_gen-text.png" alt="Text2Mat generation samples">
</details>

<details>
    <summary>Image2Mat samples</summary>
    <img src="./img/MatForger_gen-img.png" alt="Image2Mat generation samples">
</details>

## 🧑‍💻 How to use

Because its output is a set of texture maps rather than a single image, MatForger requires a custom pipeline.

You can use it in [🧨 diffusers](https://github.com/huggingface/diffusers):

```python
import torch

from PIL import Image

from diffusers import DiffusionPipeline

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

pipe = DiffusionPipeline.from_pretrained(
    "gvecchio/MatForger",
    trust_remote_code=True,
)

pipe.enable_vae_tiling()

pipe.enable_freeu(s1=0.9, s2=0.2, b1=1.1, b2=1.2)
pipe.to(device)

# model prompting with image
prompt = Image.open("bricks.png")
image = pipe(
    prompt,
    guidance_scale=6.0,
    height=512,
    width=512,
    num_inference_steps=25,
).images[0]

# model prompting with text
prompt = "terracotta brick wall"
image = pipe(
    prompt,
    guidance_scale=6.0,
    height=512,
    width=512,
    num_inference_steps=25,
).images[0]

# get maps from prediction
basecolor = image.basecolor
normal = image.normal
height = image.height
roughness = image.roughness
metallic = image.metallic

```
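As a quick sanity check on tileability, you can compare the opposite edges of a generated map: a texture that wraps cleanly has nearly identical pixel values across the wrap-around seam. This helper is not part of the MatForger API, just an illustrative check you could run on a map such as `np.asarray(basecolor)`.

```python
import numpy as np

def seam_error(tex):
    """Mean absolute difference across the wrap-around seams of a 2-D map.

    Small values suggest the texture tiles seamlessly; large values
    indicate a visible jump at the tile boundary."""
    tex = np.asarray(tex, dtype=np.float32)
    horiz = np.abs(tex[:, 0] - tex[:, -1]).mean()  # left vs right edge
    vert = np.abs(tex[0, :] - tex[-1, :]).mean()   # top vs bottom edge
    return (horiz + vert) / 2

# a smooth periodic texture wraps cleanly -> tiny seam error
x = np.linspace(0, 2 * np.pi, 256, endpoint=False)
periodic = np.sin(x)[:, None] * np.sin(x)[None, :] * 127 + 128
assert seam_error(periodic) < 5

# a linear ramp jumps at the seam -> large error
ramp = np.tile(np.arange(256, dtype=np.float32), (256, 1))
assert seam_error(ramp) > 100
```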

## ⚠️ Bias and Limitations

**Quality**: The model was trained on a variety of synthetic and real data from the [MatSynth](https://huggingface.co/datasets/gvecchio/MatSynth) dataset. 
However, it might fail to generate complex materials or patterns that differ significantly from the training data distribution. If the results don't meet your expectations, be patient and keep trying.

**Resolution**: MatForger can generate high-resolution materials via latent patching. The model, however, was trained at **256x256** resolution, and artifacts or inconsistencies may appear at higher resolutions.
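The latent patching idea can be sketched as follows. This is an illustrative NumPy mock, assuming the denoiser is applied to overlapping latent windows whose results are averaged back together; the real implementation lives in the custom pipeline and operates on torch tensors.

```python
import numpy as np

def apply_in_patches(latent, step, patch=64, stride=32):
    """Apply `step` to overlapping windows of a (C, H, W) latent and
    average the overlapping results, letting a model trained at a small
    resolution fill a larger canvas."""
    c, h, w = latent.shape
    out = np.zeros_like(latent)
    count = np.zeros_like(latent)
    for i in range(0, h - patch + 1, stride):
        for j in range(0, w - patch + 1, stride):
            out[:, i:i + patch, j:j + patch] += step(latent[:, i:i + patch, j:j + patch])
            count[:, i:i + patch, j:j + patch] += 1
    return out / count

latent = np.random.default_rng(0).standard_normal((4, 128, 128))
# with an identity "denoising" step, patching must reconstruct the input
result = apply_in_patches(latent, step=lambda p: p)
assert np.allclose(result, latent)
```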

**❗Note:** MatForger is a home-trained model built with limited resources. We will try to update it regularly and improve its performance. \
We welcome contributions, feedback, and suggestions to enhance its capabilities and address its limitations. Please be patient as we work towards making MatForger an even more powerful tool.

## 💡 Upcoming features ideas

As MatForger continues to evolve, we're working on several features aimed at enhancing its utility and effectiveness. Here are some possible upcoming enhancements:

- **Opacity**: Generate opacity maps for materials requiring transparency.

- **Material Inpainting**: A feature designed to allow users to modify and enhance materials by filling in gaps or correcting imperfections directly within the generated textures.

- **Sketch-Based Material Generation**: We're exploring ways to convert simple sketches into detailed materials. This aims to simplify the material creation process, making it more accessible to users without in-depth technical expertise.

- **Color Palette Conditioning**: Future updates will offer improved control over the color palette of generated materials, enabling users to achieve more precise color matching for their projects.

- **Material Estimation from Photographs**: We aim to refine the model's ability to interpret and recreate the material properties observed in photographs, facilitating the creation of materials that closely mimic real-world textures.

### 🎯 Ongoing Development and Openness to Feedback
MatForger is a **research tool**, so its development is an ongoing process shaped by our research agenda.
However, we are committed to improving MatForger's capabilities, addressing its limitations, and implementing suggestions we receive from our users.

### 🤝 How to Contribute
**Feature Suggestions**: If you have ideas for new features or improvements, we're eager to hear them. Reach out to us! Your suggestions play a crucial role in guiding the direction of MatForger's development.

**Dataset Contributions**: Enhancing the diversity of our training data can significantly improve the model's performance. If you have access to textures, materials, or data that could benefit MatForger, consider contributing.

**Feedback**: User feedback is invaluable for identifying areas for improvement. Whether it's through reporting issues or sharing your experiences, your insights help us make MatForger better.

**Training**: If you have spare compute resources and wish to contribute to the training of MatForger, reach out to giuseppevecchio@hotmail.com.

## Terms of Use
We hope that the release of this model will make community-based research efforts more accessible. This model is governed by an OpenRAIL license and is intended for research purposes.