---
library_name: transformers
tags: []
---

# Unofficial port of the VQ model in the Latent Diffusion Model (LDM)

This is an unofficial port of the VQ model in the [Latent Diffusion Model (LDM)](https://github.com/CompVis/latent-diffusion).


## Model Details

See `models/first_stage_models/vq-f16` in the [original repository](https://github.com/CompVis/latent-diffusion).
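The `f16` in the config name is the spatial downsampling factor: the VQ model compresses each spatial dimension by 16, so a 256×256 image maps to a 16×16 grid of codebook entries. A minimal sketch of that shape arithmetic (the helper below is illustrative only, not part of this model's API):

```python
def latent_grid_size(height: int, width: int, factor: int = 16) -> tuple[int, int]:
    """Spatial size of the VQ latent grid for an input image.

    `factor` is the downsampling factor (16 for the vq-f16 config).
    """
    if height % factor or width % factor:
        raise ValueError("image dimensions must be divisible by the downsampling factor")
    return height // factor, width // factor


print(latent_grid_size(256, 256))  # → (16, 16)
```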

## How to use

```bash
python -m venv .venv && source .venv/bin/activate
pip install huggingface-hub
# `huggingface-cli download` prints the local path of the fetched file;
# piping it through `cat` materializes requirements.txt in the working directory.
huggingface-cli download ktrk115/ldm-vq-f16 requirements.txt | xargs cat > requirements.txt
pip install -r requirements.txt
```

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

image_processor = AutoImageProcessor.from_pretrained("ktrk115/ldm-vq-f16", trust_remote_code=True)
model = AutoModel.from_pretrained("ktrk115/ldm-vq-f16", trust_remote_code=True)

# Image reconstruction: preprocess, encode/decode through the VQ model, postprocess
img = Image.open("path/to/image.png")
example = image_processor(img)
with torch.inference_mode():
    # the wrapped LDM model returns the reconstruction and the codebook loss
    recon, _ = model.model(example["image"].unsqueeze(0))
recon_img = image_processor.postprocess(recon[0])
recon_img.save("recon.png")
```