xiaozaa committed on
Commit
f610e83
1 Parent(s): b633718

add fid score

Files changed (2)
  1. README.md +14 -2
  2. script/fid_eval.py +43 -0
README.md CHANGED
@@ -1,8 +1,19 @@
  # catvton-flux

- An advanced virtual try-on solution that combines the power of [CATVTON](https://arxiv.org/abs/2407.15886) (Contrastive Appearance and Topology Virtual Try-On) with Flux fill inpainting model for realistic and accurate clothing transfer.
+ A state-of-the-art virtual try-on solution that combines the power of [CATVTON](https://arxiv.org/abs/2407.15886) (Contrastive Appearance and Topology Virtual Try-On) with the Flux fill inpainting model for realistic and accurate clothing transfer.
  Also inspired by [In-Context LoRA](https://arxiv.org/abs/2410.23775) for prompt engineering.

+ ## Update
+ [![SOTA](https://img.shields.io/badge/SOTA-FID%205.59-brightgreen)](https://drive.google.com/file/d/1T2W5R1xH_uszGVD8p6UUAtWyx43rxGmI/view?usp=sharing)
+ [![Dataset](https://img.shields.io/badge/Dataset-VITON--HD-blue)](https://github.com/shadow2496/VITON-HD)
+
+ ---
+ **Latest Achievement** (2024/11/24):
+ - Released the FID score and a gradio demo
+ - CatVton-Flux-Alpha achieved **SOTA** performance with FID `5.593255043029785` on the VITON-HD dataset (test configuration: scale 30, steps 30). My VITON-HD test inference results are available [here](https://drive.google.com/file/d/1T2W5R1xH_uszGVD8p6UUAtWyx43rxGmI/view?usp=sharing)
+
+ ---
+
  ## Showcase
  | Original | Garment | Result |
  |----------|---------|---------|
@@ -41,9 +52,10 @@ python app.py

  ## TODO:
- - [ ] Release the FID score
+ - [x] Release the FID score
  - [x] Add gradio demo
  - [ ] Release updated weights with better performance
+ - [ ] Train a smaller model

  ## Citation
 
script/fid_eval.py ADDED
@@ -0,0 +1,43 @@
+ from PIL import Image
+ import os
+ import numpy as np
+ from torchvision.transforms import functional as F
+ import torch
+ from torchmetrics.image.fid import FrechetInceptionDistance
+
+ # Paths setup
+ generated_dataset_path = "output/tryon_results"
+ original_dataset_path = "data/VITON-HD/test/image"  # Replace with your actual original dataset path
+
+ # Get generated images
+ image_paths = sorted([os.path.join(generated_dataset_path, x) for x in os.listdir(generated_dataset_path)])
+ generated_images = [np.array(Image.open(path).convert("RGB")) for path in image_paths]
+
+ # Get corresponding original images
+ original_images = []
+ for gen_path in image_paths:
+     # Extract the XXXXXX part from "tryon_XXXXXX.jpg"
+     base_name = os.path.basename(gen_path)  # get filename from path
+     original_id = base_name.replace("tryon_", "")  # remove "tryon_" prefix
+
+     # Construct original image path
+     original_path = os.path.join(original_dataset_path, original_id)
+     original_images.append(np.array(Image.open(original_path).convert("RGB")))
+
+
+ def preprocess_image(image):
+     # HWC uint8 array -> 1xCxHxW float tensor in [0, 1]
+     image = torch.tensor(image).unsqueeze(0)
+     image = image.permute(0, 3, 1, 2) / 255.0
+     # note: torchvision's center_crop takes output_size as (height, width)
+     return F.center_crop(image, (768, 1024))
+
+ real_images = torch.cat([preprocess_image(image) for image in original_images])
+ fake_images = torch.cat([preprocess_image(image) for image in generated_images])
+ print(real_images.shape, fake_images.shape)
+
+ # normalize=True tells torchmetrics the inputs are floats in [0, 1]
+ fid = FrechetInceptionDistance(normalize=True)
+ fid.update(real_images, real=True)
+ fid.update(fake_images, real=False)
+
+ print(f"FID: {float(fid.compute())}")
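Because the script concatenates every real and generated image into two large tensors before calling `fid.update`, memory can become a bottleneck on bigger test sets. A minimal sketch of a batched alternative, assuming the same `torchmetrics` API used above (the `batched` helper is hypothetical, not part of the repo):

```python
# Hypothetical memory-saving variant of script/fid_eval.py: stream images
# through the FID metric in batches instead of materializing one big tensor.

def batched(seq, batch_size):
    """Yield successive slices of seq, each of length <= batch_size."""
    for start in range(0, len(seq), batch_size):
        yield seq[start:start + batch_size]

# Sketch of how it would plug into the script above (not executed here):
#
#   fid = FrechetInceptionDistance(normalize=True)
#   for chunk in batched(image_paths, 16):
#       batch = torch.cat([preprocess_image(np.array(Image.open(p).convert("RGB")))
#                          for p in chunk])
#       fid.update(batch, real=False)
```

Since `torchmetrics` accumulates Inception feature statistics incrementally across `update` calls, the final `fid.compute()` result is the same regardless of batch size.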