<!---
Copyright 2022 The OFA-Sys Team. 
All rights reserved.
This source code is licensed under the Apache 2.0 license found in the LICENSE file in the root directory.
-->

## Prompt Tuning for Generative Multimodal Pretrained Models

### Overview
This is the code for **"Prompt Tuning for Generative Multimodal Pretrained Models"**, [check our paper on arXiv](https://arxiv.org/abs/2208.02532). This paper explores prompt tuning for generative multimodal pretrained models, instead of contrastive learning models. We specifically focus on the unified sequence-to-sequence learning framework and implement our method on the OFA models.
<br>

### Requirements
* python 3.7.4
* pytorch 1.8.1
* torchvision 0.9.1
* JAVA 1.8 (for COCO evaluation)
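
For an isolated setup, the sketch below is one way to prepare such an environment (the conda workflow and the environment name `ofa-prompt` are assumptions, not part of the original instructions; JAVA 1.8 still needs to be installed separately for COCO evaluation):

```bash
# Illustrative environment setup; package versions follow the list above.
conda create -n ofa-prompt python=3.7.4 -y
conda activate ofa-prompt
pip install torch==1.8.1 torchvision==0.9.1
```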
<br>

### Installation
```bash
pip install -r requirements.txt
```
<br>

### Datasets and Checkpoints
See [datasets.md](datasets.md) and [checkpoints.md](checkpoints.md).
<br>

### Training
We provide a demo script (`run_scripts/refcoco/train_refcoco_prefix.sh`) that contains all the settings required for training.

```sh
sh ./run_scripts/refcoco/train_refcoco_prefix.sh
```
A few options of note (the sketch after this list shows how they might be combined):
*   `--encoder-prompt` :: whether to insert prompts into the encoder
*   `--decoder-prompt` :: whether to insert prompts into the decoder
*   `--encoder-prompt-length` :: encoder prompt length
*   `--decoder-prompt-length` :: decoder prompt length
*   `--bitfit` :: whether to use BitFit
*   `--adapter` :: whether to use adapters
*   `--adapter-dim` :: adapter projection dimension

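To make the role of these flags concrete, the sketch below groups them into hypothetical flag bundles for the three tuning methods (the length and dimension values are placeholders, not the paper's defaults; the actual settings live inside `run_scripts/refcoco/train_refcoco_prefix.sh`):

```bash
# Illustrative flag bundles only; values are placeholders.
# Prompt tuning: insert learnable prompts into both the encoder and the decoder.
prompt_args="--encoder-prompt --decoder-prompt --encoder-prompt-length 64 --decoder-prompt-length 64"
# BitFit: tune only the bias terms.
bitfit_args="--bitfit"
# Adapter: insert adapter layers with a small projection dimension.
adapter_args="--adapter --adapter-dim 200"
```
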
We recommend organizing your workspace directory like this:
```
OFA/
├── checkpoints/
│   ├── ofa_base.pt
│   ├── ofa_large.pt
│   └── ...
├── criterions/
├── data/
├── dataset/
│   ├── caption_data/
│   ├── refcoco_data/
│   └── ...
├── fairseq/
├── models/
├── run_scripts/
├── tasks/
├── train.py
├── trainer.py
└── utils/
```
<br>