---
library_name: transformers
tags: []
---

# Unofficial port of the VQ model in the Latent Diffusion Model (LDM)

This is an unofficial port of the VQ model in the Latent Diffusion Model (LDM).

## Model Details

See `models/first_stage_models/vq-f16` in the [original repository](https://github.com/CompVis/latent-diffusion).
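Assuming the `f16` in the model name follows the LDM naming convention of a 16× spatial downsampling factor (an inference from the original repository's `vq-f4`/`vq-f8`/`vq-f16` layout, not stated in this card), the latent grid is the input resolution divided by 16:

```python
# Assumption: "f16" denotes a 16x spatial downsampling factor,
# per the LDM first-stage model naming convention.
downsampling_factor = 16
image_size = 256  # height/width of a square input image, in pixels

# A 256x256 image would map to a 16x16 grid of discrete codes.
latent_size = image_size // downsampling_factor
print(latent_size)  # prints 16
```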

## How to use

```sh
python -m venv .venv && source .venv/bin/activate
pip install huggingface-hub
huggingface-cli download ktrk115/ldm-vq-f16 requirements.txt | xargs cat > requirements.txt
pip install -r requirements.txt
```
```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

image_processor = AutoImageProcessor.from_pretrained("ktrk115/ldm-vq-f16", trust_remote_code=True)
model = AutoModel.from_pretrained("ktrk115/ldm-vq-f16", trust_remote_code=True)

# Image reconstruction
img = Image.open("path/to/image.png")
example = image_processor(img)
with torch.inference_mode():
    recon, _ = model.model(example["image"].unsqueeze(0))
recon_img = image_processor.postprocess(recon[0])
recon_img.save("recon.png")
```
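To sanity-check a reconstruction, you can compare the input and output images numerically, e.g. with peak signal-to-noise ratio (PSNR). This is a minimal sketch using NumPy and Pillow, not part of this model's API; it is shown here with synthetic images so it runs standalone:

```python
import numpy as np
from PIL import Image

def psnr(a: Image.Image, b: Image.Image) -> float:
    """Peak signal-to-noise ratio between two same-sized 8-bit images, in dB."""
    x = np.asarray(a, dtype=np.float64)
    y = np.asarray(b, dtype=np.float64)
    mse = np.mean((x - y) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10 * np.log10(255.0 ** 2 / mse)

# Synthetic example: an all-black image vs. one offset by 8 per channel.
orig = Image.fromarray(np.zeros((64, 64, 3), dtype=np.uint8))
noisy = Image.fromarray(np.full((64, 64, 3), 8, dtype=np.uint8))
print(round(psnr(orig, noisy), 2))  # prints 30.07
```

In practice you would pass `img` and `recon_img` from the snippet above; higher values indicate a more faithful reconstruction.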