zhihfu committed on
Commit fd68db8 · verified · 1 Parent(s): 56d4ce6

Update README.md

  1. README.md +124 -3
README.md CHANGED
@@ -1,3 +1,124 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ tags:
+ - composed-image-retrieval
+ - vision-language
+ - multimodal
+ - noisy-correspondence
+ - blip-2
+ - pytorch
+ ---
+
+ <a id="top"></a>
+ <div align="center">
+ <h1>☁️ Air-Know: Arbiter-Calibrated Knowledge-Internalizing Robust Network for Composed Image Retrieval</h1>
+
+ <p>
+ <b>Zhiheng Fu</b><sup>1</sup>&nbsp;
+ <b>Yupeng Hu</b><sup>1✉</sup>&nbsp;
+ <b>Qianyun Yang</b><sup>1</sup>&nbsp;
+ <b>Shiqi Zhang</b><sup>1</sup>&nbsp;
+ <b>Zhiwei Chen</b><sup>1</sup>&nbsp;
+ <b>Zixu Li</b><sup>1</sup>
+ </p>
+
+ <p>
+ <sup>1</sup>School of Software, Shandong University
+ </p>
+ </div>
+
+ These are the official pre-trained model weights and configuration files for **Air-Know**, a robust framework designed for Composed Image Retrieval (CIR) under Noisy Correspondence Learning (NCL) settings.
+
+ 🔗 **Paper:** [Accepted by CVPR 2026]
+ 🔗 **GitHub Repository:** [ZhihFu/Air-Know](https://github.com/ZhihFu/Air-Know)
+ 🔗 **Project Website:** [Air-Know Webpage](https://zhihfu.github.io/Air-Know.github.io/)
+
+ ---
+
+ ## 📌 Model Information
+
+ ### 1. Model Name
+ **Air-Know** (Arbiter-Calibrated Knowledge-Internalizing Robust Network) Checkpoints.
+
+ ### 2. Task Type & Applicable Tasks
+ - **Task Type:** Composed Image Retrieval (CIR) / Noisy Correspondence Learning / Vision-Language
+ - **Applicable Tasks:** Robust multimodal retrieval that mitigates the impact of Noisy Triplet Correspondence (NTC) in training data while remaining competitive in the traditional fully supervised (0% noise) setting.
+
+ ### 3. Project Introduction
+ **Air-Know** is built upon the BLIP-2/LAVIS framework and tackles the noisy correspondence problem in CIR through three primary modules:
+ - ⚖️ **External Prior Arbitration:** Leverages an offline multimodal expert to generate reliable arbitration priors, bypassing the often-unreliable "small-loss hypothesis".
+ - 🧠 **Expert-Knowledge Internalization:** Transfers these priors into a lightweight proxy network to structurally prevent the memorization of ambiguous partial matches.
+ - 🔄 **Dual-Stream Reconciliation:** Dynamically integrates the internalized knowledge to provide robust online feedback, guiding the final representation learning.
+
+ ### 4. Training Data Source
+ The model was trained and evaluated on standard CIR datasets under several simulated noise ratios (e.g., 0.0, 0.2, 0.5, 0.8):
+ - **FashionIQ** (Fashion Domain)
+ - **CIRR** (Open Domain)
+
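To make the idea of a simulated noise ratio concrete, here is a minimal sketch of one common way to corrupt a fraction of CIR triplets: reassigning the target image among a randomly chosen subset. This is an illustrative assumption, not the procedure used by the Air-Know code; the function name `inject_noise` and the triplet layout are hypothetical.

```python
import random
from typing import List, Tuple

# Hypothetical triplet layout: (reference image, modification text, target image).
Triplet = Tuple[str, str, str]


def inject_noise(triplets: List[Triplet], noise_ratio: float, seed: int = 0) -> List[Triplet]:
    """Return a copy in which about `noise_ratio` of the triplets get a mismatched target."""
    if not 0.0 <= noise_ratio <= 1.0:
        raise ValueError("noise_ratio must be in [0, 1]")
    rng = random.Random(seed)
    n_noisy = int(round(noise_ratio * len(triplets)))
    noisy_idx = rng.sample(range(len(triplets)), n_noisy)

    # Rotate the selected targets by one position so that each corrupted
    # triplet receives the target of another corrupted triplet.
    # (With a single selected triplet the rotation is a no-op.)
    targets = [triplets[i][2] for i in noisy_idx]
    rotated = targets[1:] + targets[:1]

    noisy = list(triplets)
    for i, new_target in zip(noisy_idx, rotated):
        reference, text, _ = noisy[i]
        noisy[i] = (reference, text, new_target)
    return noisy
```

With `noise_ratio=0.0` the data is returned unchanged, which corresponds to the fully supervised setting; reference images and modification texts are never altered in this sketch.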
+ ---
+
+ ## 🚀 Usage & Basic Inference
+
+ These weights are designed to be used directly with the official Air-Know GitHub repository.
+
+ ### Step 1: Prepare the Environment
+ Clone the GitHub repository and install the dependencies (tested with Python 3.8.10 and PyTorch 2.1.0 with CUDA 12.1+):
+ ```bash
+ git clone https://github.com/ZhihFu/Air-Know
+ cd Air-Know
+ conda create -n airknow python=3.8 -y
+ conda activate airknow
+
+ # Install PyTorch
+ pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121
+
+ # Install core dependencies
+ pip install scikit-learn==1.3.2 transformers==4.25.0 salesforce-lavis==1.0.2 timm==0.9.16
+ ```
+
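After installation, a quick sanity check can confirm that the environment matches the pinned versions from Step 1. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `find_mismatches` are hypothetical and not part of the repository.

```python
from importlib import metadata
from typing import Dict, Optional

# Versions pinned in the install commands of Step 1.
PINNED = {
    "torch": "2.1.0",
    "torchvision": "0.16.0",
    "torchaudio": "2.1.0",
    "scikit-learn": "1.3.2",
    "transformers": "4.25.0",
    "salesforce-lavis": "1.0.2",
    "timm": "0.9.16",
}


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(installed: Dict[str, Optional[str]],
                    pinned: Dict[str, str] = PINNED) -> Dict[str, str]:
    """Map each missing or mismatched package name to a short description."""
    problems = {}
    for name, wanted in pinned.items():
        found = installed.get(name)
        if found is None:
            problems[name] = f"missing (expected {wanted})"
        # Ignore local build tags such as "+cu121" when comparing.
        elif found.split("+")[0] != wanted:
            problems[name] = f"found {found}, expected {wanted}"
    return problems


if __name__ == "__main__":
    current = {name: installed_version(name) for name in PINNED}
    for name, issue in find_mismatches(current).items():
        print(f"{name}: {issue}")
```

If the script prints nothing, every pinned package is present at the expected version.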
+ ### Step 2: Download Model Weights & Data
+ Download the checkpoint folders (e.g., `cirr_noise0.8` or `fashioniq_noise0.8`) from this Hugging Face repository and place them in your local `checkpoints/` directory.
+
+ Ensure you also download and structure the base dataset images (CIRR and FashionIQ) as specified in the [GitHub repo's Data Preparation section](https://github.com/ZhihFu/Air-Know).
+
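The checkpoint download can also be scripted. The sketch below assumes the `huggingface_hub` package is available and uses its `snapshot_download` function to fetch only the selected folders into `checkpoints/`. The repository ID is not stated in this README, so `repo_id` is left as a parameter you must supply; the helper names are hypothetical.

```python
from pathlib import Path
from typing import List


def checkpoint_patterns(folders: List[str]) -> List[str]:
    """Build glob patterns that select every file under the given checkpoint folders."""
    return [f"{name.strip('/')}/*" for name in folders]


def download_checkpoints(repo_id: str, folders: List[str],
                         local_dir: str = "checkpoints") -> Path:
    """Download the selected checkpoint folders from the Hub into `local_dir`."""
    # Imported lazily so that checkpoint_patterns works without huggingface_hub.
    from huggingface_hub import snapshot_download

    path = snapshot_download(
        repo_id=repo_id,  # assumption: the ID of this model repository on the Hub
        allow_patterns=checkpoint_patterns(folders),
        local_dir=local_dir,
    )
    return Path(path)
```

For example, `download_checkpoints(repo_id, ["cirr_noise0.8"])` would place the files under `checkpoints/cirr_noise0.8/`, matching the path used in Step 3.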
+ ### Step 3: Run Testing / Inference
+ To generate prediction files on the CIRR dataset for submission to the CIRR Evaluation Server using the downloaded checkpoint, run:
+ ```bash
+ python src/cirr_test_submission.py checkpoints/cirr_noise0.8/
+ ```
+ *(The script automatically outputs a `.json` file based on the best checkpoint in the specified folder.)*
+
+ To train the model under a specific noise ratio (e.g., `0.8`), run:
+ ```bash
+ python train_BLIP2.py \
+ --dataset cirr \
+ --cirr_path "/path/to/CIRR/" \
+ --model_dir "./checkpoints/cirr_noise0.8" \
+ --noise_ratio 0.8 \
+ --batch_size 256 \
+ --num_epochs 20 \
+ --lr 2e-5
+ ```
+
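To reproduce several noise settings in one go, the training command above can be generated per ratio. This is a small sketch that only reuses the flags shown in the command; the helper `build_train_command` is hypothetical, and the `cirr_noise<ratio>` folder naming is assumed from the example checkpoint names.

```python
import shlex
from typing import List

# Noise ratios mentioned in the Training Data Source section.
NOISE_RATIOS = [0.0, 0.2, 0.5, 0.8]


def build_train_command(noise_ratio: float, cirr_path: str = "/path/to/CIRR/") -> List[str]:
    """Assemble the CIRR training command for one noise ratio as an argument list."""
    return [
        "python", "train_BLIP2.py",
        "--dataset", "cirr",
        "--cirr_path", cirr_path,
        # Assumed naming scheme, following the `cirr_noise0.8` example.
        "--model_dir", f"./checkpoints/cirr_noise{noise_ratio}",
        "--noise_ratio", str(noise_ratio),
        "--batch_size", "256",
        "--num_epochs", "20",
        "--lr", "2e-5",
    ]


if __name__ == "__main__":
    for ratio in NOISE_RATIOS:
        # Print each command; pass the list to subprocess.run(cmd, check=True) to launch it.
        print(shlex.join(build_train_command(ratio)))
```

Printing the commands first makes it easy to review paths and hyperparameters before starting long training runs.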
+ ---
+
+ ## ⚠️ Limitations & Notes
+
+ **Disclaimer:** This framework and its pre-trained weights are strictly intended for **academic research purposes**.
+ - The model requires access to the original source datasets (CIRR, FashionIQ) for full evaluation. Users must comply with the original licenses of those datasets.
+ - The `noise_ratio` parameter controls noise that is simulated during training; performance under real-world, unstructured noise may differ.
+
+ ---
+
+ ## 📝⭐️ Citation
+
+ If you find our work or these model weights useful in your research, please consider leaving a **Star** ⭐️ on our GitHub repo and citing our paper:
+
+ ```bibtex
+ @InProceedings{Air-Know,
+ title={Air-Know: Arbiter-Calibrated Knowledge-Internalizing Robust Network for Composed Image Retrieval},
+ author={Fu, Zhiheng and Hu, Yupeng and Yang, Qianyun and Zhang, Shiqi and Chen, Zhiwei and Li, Zixu},
+ booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
+ year={2026}
+ }
+ ```