---
license: gpl-3.0
tags:
- img2img
- denoiser
- image
---

# denoise_medium_v1

denoise_medium_v1 is an image denoiser made for images that have low-light noise.

It performs slightly better than [denoise_small_v1](https://huggingface.co/vericudebuget/denoise_small_v1) on images with less colorful noise and can reconstruct a higher level of detail from the original.


## Model Details



### Model Description




- **Developed by:** ConvoLite AI
- **Funded by:** VDB
- **Model type:** img2img
- **License:** gpl-3.0


## Uses

For commercial and noncommercial use.

### Direct Use
For CPU, use the code below:
``` python
import os
import torch
import torch.nn as nn
from PIL import Image
from torchvision.transforms import ToTensor
import numpy as np
from concurrent.futures import ThreadPoolExecutor

class DenoisingModel(nn.Module):
    def __init__(self):
        super(DenoisingModel, self).__init__()
        self.enc1 = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1),
            nn.ReLU()
        )
        self.pool1 = nn.MaxPool2d(2, 2)
        
        self.up1 = nn.ConvTranspose2d(64, 64, 2, stride=2)
        self.dec1 = nn.Sequential(
            nn.Conv2d(64, 64, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1)
        )

    def forward(self, x):
        e1 = self.enc1(x)
        p1 = self.pool1(e1)
        u1 = self.up1(p1)
        d1 = self.dec1(u1)
        return d1

def denoise_patch(model, patch):
    transform = ToTensor()
    input_patch = transform(patch).unsqueeze(0)
    
    with torch.no_grad():
        output_patch = model(input_patch)
    
    denoised_patch = output_patch.squeeze(0).permute(1, 2, 0).numpy() * 255
    denoised_patch = np.clip(denoised_patch, 0, 255).astype(np.uint8)
    
    original_patch = np.array(patch)
    very_bright_mask = original_patch > 240
    bright_mask = (original_patch > 220) & (original_patch <= 240)
    
    denoised_patch[very_bright_mask] = original_patch[very_bright_mask]
    
    blend_factor = 0.7
    denoised_patch[bright_mask] = (
        blend_factor * original_patch[bright_mask] +
        (1 - blend_factor) * denoised_patch[bright_mask]
    )
    
    return denoised_patch

def denoise_image(image_path, model_path, patch_size=256, num_threads=4, overlap=32):
    model = DenoisingModel()
    checkpoint = torch.load(model_path, map_location=torch.device('cpu'))
    model.load_state_dict(checkpoint['model_state_dict'])
    model.eval()
    
    # Load and get original image dimensions
    image = Image.open(image_path).convert("RGB")
    width, height = image.size
    
    # Calculate padding needed
    pad_right = patch_size - (width % patch_size) if width % patch_size != 0 else 0
    pad_bottom = patch_size - (height % patch_size) if height % patch_size != 0 else 0
    
    # Add padding with reflection instead of zeros
    padded_width = width + pad_right
    padded_height = height + pad_bottom
    
    # Create padded image using reflection padding
    padded_image = Image.new("RGB", (padded_width, padded_height))
    padded_image.paste(image, (0, 0))
    
    # Fill right border with reflected content
    if pad_right > 0:
        right_border = image.crop((width - pad_right, 0, width, height))
        padded_image.paste(right_border.transpose(Image.FLIP_LEFT_RIGHT), (width, 0))
    
    # Fill bottom border with reflected content
    if pad_bottom > 0:
        bottom_border = image.crop((0, height - pad_bottom, width, height))
        padded_image.paste(bottom_border.transpose(Image.FLIP_TOP_BOTTOM), (0, height))
    
    # Fill corner if needed
    if pad_right > 0 and pad_bottom > 0:
        corner = image.crop((width - pad_right, height - pad_bottom, width, height))
        padded_image.paste(corner.transpose(Image.FLIP_LEFT_RIGHT).transpose(Image.FLIP_TOP_BOTTOM), 
                          (width, height))
    
    # Generate patches with positions
    patches = []
    positions = []
    for i in range(0, padded_height, patch_size - overlap):
        for j in range(0, padded_width, patch_size - overlap):
            patch = padded_image.crop((j, i, min(j + patch_size, padded_width), min(i + patch_size, padded_height)))
            patches.append(patch)
            positions.append((i, j))
    
    # Process patches in parallel
    with ThreadPoolExecutor(max_workers=num_threads) as executor:
        denoised_patches = list(executor.map(lambda p: denoise_patch(model, p), patches))
    
    # Initialize output arrays
    denoised_image = np.zeros((padded_height, padded_width, 3), dtype=np.float32)
    weight_map = np.zeros((padded_height, padded_width), dtype=np.float32)
    
    # Create smooth blending weights
    for (i, j), denoised_patch in zip(positions, denoised_patches):
        patch_height, patch_width, _ = denoised_patch.shape
        patch_weights = np.ones((patch_height, patch_width), dtype=np.float32)
        if i > 0:
            patch_weights[:overlap, :] *= np.linspace(0, 1, overlap)[:, np.newaxis]
        if j > 0:
            patch_weights[:, :overlap] *= np.linspace(0, 1, overlap)[np.newaxis, :]
        if i + patch_height < padded_height:
            patch_weights[-overlap:, :] *= np.linspace(1, 0, overlap)[:, np.newaxis]
        if j + patch_width < padded_width:
            patch_weights[:, -overlap:] *= np.linspace(1, 0, overlap)[np.newaxis, :]
        
        # Clip the patch values to prevent very bright pixels
        denoised_patch = np.clip(denoised_patch, 0, 255)
        
        denoised_image[i:i + patch_height, j:j + patch_width] += (
            denoised_patch * patch_weights[:, :, np.newaxis]
        )
        weight_map[i:i + patch_height, j:j + patch_width] += patch_weights
    
    # Normalize by weights
    mask = weight_map > 0
    denoised_image[mask] = denoised_image[mask] / weight_map[mask, np.newaxis]
    
    # Crop to original size
    denoised_image = denoised_image[:height, :width]
    denoised_image = np.clip(denoised_image, 0, 255).astype(np.uint8)
    
    # Save the result
    denoised_image_path = os.path.splitext(image_path)[0] + "_denoised.png"
    print(f"Saving denoised image to {denoised_image_path}")
    
    Image.fromarray(denoised_image).save(denoised_image_path)

if __name__ == "__main__":
    image_path = input("Enter the path of the image: ")
    model_path = r"path/to/model.pkl"
    # Set num_threads to the number of threads your processor has.
    denoise_image(image_path, model_path, num_threads=12)
    print("Denoising completed.")
```
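
The padding and overlap logic in `denoise_image` can be easier to follow along a single axis. The following is a minimal pure-Python sketch, not part of the model's code: the helper names (`pad_to_multiple`, `patch_starts`, `ramp`, `axis_weights`, `coverage`) are hypothetical, and it assumes `2 <= overlap < patch_size` and that every patch is at least `overlap` pixels long. It mirrors the padding formula, the `patch_size - overlap` stride, and the linear ramps, then adds up the blend weights of neighbouring patches.

```python
def pad_to_multiple(size, patch_size):
    """Pixels to add so that size becomes a multiple of patch_size."""
    return (-size) % patch_size


def patch_starts(padded_size, patch_size, overlap):
    """Start offsets along one axis, stepping by patch_size - overlap."""
    return list(range(0, padded_size, patch_size - overlap))


def ramp(n, rising):
    """n evenly spaced weights from 0 to 1 (rising) or from 1 to 0 (falling)."""
    values = [k / (n - 1) for k in range(n)]
    return values if rising else values[::-1]


def axis_weights(start, length, padded_size, overlap):
    """1-D blend weights for one patch, using the same edge tests as denoise_image."""
    weights = [1.0] * length
    if start > 0:  # a neighbouring patch precedes this one
        for k, r in enumerate(ramp(overlap, rising=True)):
            weights[k] *= r
    if start + length < padded_size:  # a neighbouring patch follows this one
        for k, r in enumerate(ramp(overlap, rising=False)):
            weights[length - overlap + k] *= r
    return weights


def coverage(padded_size, patch_size, overlap):
    """Sum of the blend weights of all patches at every position of the axis."""
    total = [0.0] * padded_size
    for start in patch_starts(padded_size, patch_size, overlap):
        length = min(start + patch_size, padded_size) - start
        for k, w in enumerate(axis_weights(start, length, padded_size, overlap)):
            total[start + k] += w
    return total


if __name__ == "__main__":
    width, patch_size, overlap = 600, 256, 32
    padded = width + pad_to_multiple(width, patch_size)
    print(padded)                                     # 768
    print(patch_starts(padded, patch_size, overlap))  # [0, 224, 448, 672]
    totals = coverage(padded, patch_size, overlap)
    print(min(totals), max(totals))
```

For a 600-pixel axis with the defaults (`patch_size=256`, `overlap=32`), the rising and falling ramps of neighbouring patches add up to 1 across each overlap. For other sizes the sum can differ from 1; the division by `weight_map` in the original code covers that case.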

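The highlight handling at the end of `denoise_patch` can be read as a rule applied to each channel value independently: values above 240 keep the original, values from 221 to 240 are a 70/30 blend of original and denoised, and everything else uses the denoised value. Below is a minimal sketch of that rule for a single value; the function name `preserve_highlights` is hypothetical, and `int()` is used here to imitate the truncation that happens when floats are written into a `uint8` array.

```python
def preserve_highlights(original, denoised, blend_factor=0.7):
    """Apply the highlight rule from denoise_patch to one 0-255 channel value."""
    if original > 240:
        # Very bright: keep the original value untouched.
        return original
    if original > 220:
        # Bright: weighted blend that favours the original value.
        return int(blend_factor * original + (1 - blend_factor) * denoised)
    # Everything else: use the denoised value as is.
    return denoised


if __name__ == "__main__":
    for original, denoised in [(250, 200), (230, 200), (100, 90)]:
        print(original, denoised, preserve_highlights(original, denoised))
```

Raising `blend_factor` keeps more of the original (and its noise) in bright regions; lowering it trusts the denoised output more.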

### Out-of-Scope Use


This model is not recommended for images that do not have a high level of noise, as it will produce less-than-ideal results on them.


## Training Details

This model was trained on a single Nvidia T4 GPU for around one hour.

### Training Data

Around 10 GB of publicly available images under the Creative Commons license.

#### Speed

On an AMD Ryzen 5 5500, it can denoise a 2K image in approximately 2 seconds using multithreading. It has not yet been tested with CUDA, but that would probably be faster.
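
Since the timing above depends on multithreading, one way to choose the thread count is to query the logical CPU count instead of hard-coding it. This is a minimal sketch under that assumption; `pick_num_threads` and its `reserve` parameter are hypothetical helpers, not part of the model's code.

```python
import os


def pick_num_threads(reserve=0):
    """Return a worker count based on the number of logical CPUs.

    reserve (hypothetical knob) leaves that many logical CPUs free
    for other work; the result is never lower than 1.
    """
    available = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, available - reserve)


if __name__ == "__main__":
    print(f"Using {pick_num_threads()} threads")
```

The result can be passed as `num_threads=pick_num_threads()` when calling `denoise_image`.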



#### Hardware


| Specifications | Minimum | Recommended |
|----------|----------|----------|
| CPU | Intel Core i7-2700K or something else that can run Python | AMD Ryzen 5 5500 |
| RAM | 4 GB | 16 GB |
| GPU | not needed | Nvidia GTX 1660 Ti |


#### Software

Python



## Model Card Authors 

Vericu de Buget


## Model Card Contact

- [convolite@europe.com](mailto:convolite@europe.com)
- [ConvoLite](https://convolite.github.io/selector.html)