CAT-Seg🐱: Cost Aggregation for Open-Vocabulary Semantic Segmentation

This is our official implementation of CAT-Seg🐱!

[arXiv] [Project]
by Seokju Cho*, Heeseong Shin*, Sunghwan Hong, Seungjun An, Seungjun Lee, Anurag Arnab, Paul Hongsuck Seo, Seungryong Kim

Introduction

We introduce cost aggregation to open-vocabulary semantic segmentation, which jointly aggregates both image and text modalities within the matching cost.

Installation

Install required packages.

conda create --name catseg python=3.8
conda activate catseg
conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
pip install -r requirements.txt

Data Preparation

Training

Preparation

you have to blah

Training script

python train.py --config configs/eval/{a847 | pc459 | a150 | pc59 | pas20 | pas20b}.yaml

Evaluation

python eval.py --config configs/eval/{a847 | pc459 | a150 | pc59 | pas20 | pas20b}.yaml

Citing CAT-Seg🐱 :pray:

@article{liang2022open,
  title={Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP},
  author={Liang, Feng and Wu, Bichen and Dai, Xiaoliang and Li, Kunpeng and Zhao, Yinan and Zhang, Hang and Zhang, Peizhao and Vajda, Peter and Marculescu, Diana},
  journal={arXiv preprint arXiv:2210.04150},
  year={2022}
}