---
library_name: transformers
tags: []
---

# Unofficial port of the VQ model in the Latent Diffusion Model (LDM)

This repository is an unofficial port of the VQ model from the [Latent Diffusion Model (LDM)](https://github.com/CompVis/latent-diffusion) codebase, loadable through the `transformers` Auto classes.

## Model Details

This port corresponds to the `vq-f16` first-stage model. For details, see `models/first_stage_models/vq-f16` in the [original repository](https://github.com/CompVis/latent-diffusion).

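This card does not state what `f16` means. In the LDM codebase the suffix conventionally denotes a spatial downsampling factor of 16, so treat that reading as an assumption here. Under that assumption, a minimal sketch of the latent grid size for a given input resolution looks like this (the helper `latent_grid_size` is hypothetical and not part of this repository):

```python
def latent_grid_size(height: int, width: int, factor: int = 16) -> tuple[int, int]:
    """Spatial size of the latent grid, assuming a downsampling factor of `factor`."""
    # Assumption: the encoder reduces each spatial dimension by exactly `factor`,
    # so the input sides should be divisible by it.
    if height % factor or width % factor:
        raise ValueError(f"height and width must be multiples of {factor}")
    return height // factor, width // factor


print(latent_grid_size(256, 256))  # (16, 16)
```

If the assumption holds, a 256×256 input would be represented by a 16×16 grid of codebook entries.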
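
## How to use

Install the dependencies in a fresh virtual environment:

```bash
# Create and activate a virtual environment
python -m venv .venv && source .venv/bin/activate
# Install the Hugging Face Hub CLI
pip install huggingface-hub
# The download command prints the cached file path; `xargs cat` copies its contents into ./requirements.txt
huggingface-cli download ktrk115/ldm-vq-f16 requirements.txt | xargs cat > requirements.txt
pip install -r requirements.txt
```
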
```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

# trust_remote_code=True is needed because the processor and model classes
# are defined in the model repository rather than in transformers itself
image_processor = AutoImageProcessor.from_pretrained("ktrk115/ldm-vq-f16", trust_remote_code=True)
model = AutoModel.from_pretrained("ktrk115/ldm-vq-f16", trust_remote_code=True)

# Image reconstruction
img = Image.open("path/to/image.png")
example = image_processor(img)
with torch.inference_mode():
    # Add a batch dimension; the second return value is not used here
    recon, _ = model.model(example["image"].unsqueeze(0))
recon_img = image_processor.postprocess(recon[0])
recon_img.save("recon.png")
```
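
To get a rough sense of how faithful the reconstruction is, one option is to compute the peak signal-to-noise ratio (PSNR) between the input and `recon.png`. The sketch below is not part of this repository: the helper `psnr` is hypothetical, it assumes 8-bit RGB images, and it resizes the original to the reconstruction's size on the assumption that the processor may have resized the input. If the processor crops instead of resizing, the two images will not be aligned and the value will be misleading.

```python
import numpy as np
from PIL import Image


def psnr(original: Image.Image, reconstruction: Image.Image) -> float:
    """PSNR in dB between two images, treated as 8-bit RGB."""
    # Match the reconstruction's resolution (assumption: preprocessing resized the input)
    original = original.convert("RGB").resize(reconstruction.size)
    a = np.asarray(original, dtype=np.float64)
    b = np.asarray(reconstruction.convert("RGB"), dtype=np.float64)
    mse = float(np.mean((a - b) ** 2))
    if mse == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(255.0 ** 2 / mse))


# Example usage with the files from the snippet above:
# print(psnr(Image.open("path/to/image.png"), Image.open("recon.png")))
```

Higher values indicate a closer match. PSNR is only a pixel-level measure, so it is best read alongside a visual comparison.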