---
library_name: transformers
tags: []
---
# Unofficial port of the VQ model in the Latent Diffusion Model (LDM)

This is an unofficial port of the VQ (vector-quantized) first-stage model from the [Latent Diffusion Model (LDM)](https://github.com/CompVis/latent-diffusion) repository, packaged so it can be loaded through `transformers`.
## Model Details
See `models/first_stage_models/vq-f16` in the [original repository](https://github.com/CompVis/latent-diffusion) for the configuration and pretrained weights. The `f16` suffix denotes a spatial downsampling factor of 16, so a 256×256 image is encoded into a 16×16 grid of latent codes.
## How to use
```bash
python -m venv .venv && source .venv/bin/activate
pip install huggingface-hub

# `huggingface-cli download` prints the path of the cached file;
# copy its contents into a local requirements.txt and install it.
huggingface-cli download ktrk115/ldm-vq-f16 requirements.txt | xargs cat > requirements.txt
pip install -r requirements.txt
```
```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

image_processor = AutoImageProcessor.from_pretrained("ktrk115/ldm-vq-f16", trust_remote_code=True)
model = AutoModel.from_pretrained("ktrk115/ldm-vq-f16", trust_remote_code=True)

# Image reconstruction: encode the image into the VQ latent space and decode it back.
img = Image.open("path/to/image.png")
example = image_processor(img)
with torch.inference_mode():
    recon, _ = model.model(example["image"].unsqueeze(0))

recon_img = image_processor.postprocess(recon[0])
recon_img.save("recon.png")
```
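Beyond full reconstruction, a VQ first-stage model is typically used to obtain discrete latent codes. The sketch below assumes the wrapped `model.model` keeps the upstream LDM `VQModel` interface, where `encode` returns the quantized latents, the codebook loss, and a tuple whose last element holds the codebook indices; this is an illustration under that assumption, not a documented API of this repository.

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

image_processor = AutoImageProcessor.from_pretrained("ktrk115/ldm-vq-f16", trust_remote_code=True)
model = AutoModel.from_pretrained("ktrk115/ldm-vq-f16", trust_remote_code=True)

img = Image.open("path/to/image.png")
example = image_processor(img)

with torch.inference_mode():
    # Assumption: model.model is the upstream LDM VQModel, whose encode() returns
    # (quantized latents, codebook loss, (perplexity, one-hot encodings, code indices)).
    quant, _, (_, _, indices) = model.model.encode(example["image"].unsqueeze(0))
    print(quant.shape)    # latent grid downsampled by a factor of 16
    print(indices.shape)  # flattened codebook indices for the grid

    # Decoding the quantized latents reproduces the reconstruction path.
    recon = model.model.decode(quant)
```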