Unofficial port of the VQ model in the Latent Diffusion Model (LDM)

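This repository is an unofficial port of the VQ (vector-quantized) first-stage model from Latent Diffusion Models (LDM), packaged so that it can be loaded through the transformers library.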

Model Details

See models/first_stage_models/vq-f16 in the original repository (CompVis/latent-diffusion).
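
For orientation, the "f16" in the name follows the LDM convention of naming first-stage models by their spatial downsampling factor, so an input whose sides are divisible by 16 maps to a latent grid 16 times smaller in each dimension. Each latent vector is then replaced by its nearest codebook entry. The sketch below illustrates both ideas in plain Python. The names `latent_grid` and `quantize` and the toy two-dimensional codebook are hypothetical and unrelated to this checkpoint's actual code or codebook.

```python
def latent_grid(height, width, factor=16):
    """Spatial size of the latent grid for a given downsampling factor."""
    return height // factor, width // factor


def quantize(vector, codebook):
    """Return (index, entry) of the codebook entry closest to `vector`.

    Closeness is measured by squared Euclidean distance.
    """
    def sq_dist(entry):
        return sum((v - e) ** 2 for v, e in zip(vector, entry))

    index = min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))
    return index, codebook[index]


# Toy codebook with three 2-D entries (illustrative values only).
toy_codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, 0.5)]

grid = latent_grid(256, 256)                       # (16, 16)
index, entry = quantize((0.9, 0.8), toy_codebook)  # nearest entry is (1.0, 1.0)
```

In the real model the lookup runs over every position of the latent grid at once. The loop-free tensor version is omitted here to keep the sketch dependency-free.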

How to use

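Create a virtual environment and install the dependencies listed in this repository's requirements.txt:

python -m venv .venv && source .venv/bin/activate
pip install huggingface-hub
huggingface-cli download ktrk115/ldm-vq-f16 requirements.txt | xargs cat > requirements.txt
pip install -r requirements.txt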
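
Then, in Python, load the image processor and model and reconstruct an image:

import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

# trust_remote_code=True is required because the repository ships custom code
image_processor = AutoImageProcessor.from_pretrained("ktrk115/ldm-vq-f16", trust_remote_code=True)
model = AutoModel.from_pretrained("ktrk115/ldm-vq-f16", trust_remote_code=True)

# Image reconstruction
img = Image.open("path/to/image.png")
example = image_processor(img)
with torch.inference_mode():
    # unsqueeze(0) adds the batch dimension; the second return value is not needed here
    recon, _ = model.model(example["image"].unsqueeze(0))
# Convert the first (and only) reconstruction in the batch back to an image and save it
recon_img = image_processor.postprocess(recon[0])
recon_img.save("recon.png")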
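
One way to check how close a reconstruction is to its input is the peak signal-to-noise ratio (PSNR). The helper below is a hedged sketch and not part of this repository: `psnr` is a hypothetical function name, and it works on flat sequences of 8-bit pixel values so that it runs without extra dependencies.

```python
import math


def psnr(original, reconstructed, max_value=255.0):
    """PSNR in dB between two equal-length sequences of pixel values."""
    if len(original) != len(reconstructed):
        raise ValueError("inputs must have the same number of values")
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 4-value "images" (illustrative values only).
score = psnr([0, 64, 128, 255], [0, 66, 126, 255])
```

To apply it to the script above, one option is `psnr(img.tobytes(), recon_img.tobytes())`. This assumes that `postprocess` returns a PIL image, as the `save` call suggests, and that both images have the same size and mode. If the image processor resizes its input, resize `img` to match first.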

Model size: 69.6M params (Safetensors, tensor type F32)