BridgeTower/bridgetower-base-itm-mlm


anahita-b committed on Dec 13, 2022

Commit ab358db • 1 Parent(s): 27d6951

Update README.md

Files changed (1)

README.md +4 -7


@@ -10,13 +10,15 @@ datasets:
 - mscoco_captions
 ---
-# BridgeTower base-itm model
+# BridgeTower base-itm-mlm model
 The BridgeTower model was proposed in [BridgeTower: Building Bridges Between Encoders in Vision-Language Representative Learning] by Xiao Xu, Chenfei Wu, Shachar Rosenman, Vasudev Lal, Wanxiang Che, Nan Duan.
 The model was pretrained model on English language using masked language modeling (MLM) and image text matching (ITM)objectives. It was introduced in
 [this paper](https://arxiv.org/pdf/2206.08657.pdf) and first released in
 [this repository](https://github.com/microsoft/BridgeTower).
+BridgeTower got accepted to [AAAI'23](https://aaai.org/Conferences/AAAI-23/).
 ## Model description
 The abstract from the paper is the following:
@@ -24,7 +26,6 @@ Vision-Language (VL) models with the Two-Tower architecture have dominated visua
 ## Intended uses & limitations(TODO)
-You can use the raw model for image and text retrieval.
 ### How to use
@@ -103,11 +104,7 @@ The model was pre-trained for 100k steps on 8 NVIDIA A100 GPUs with a batch size
 The optimizer used was AdamW with a learning rate of 1e-5. No data augmentation was used except for center-crop. The image resolution in pre-training is set to 288 x 288.
 ## Evaluation results
-When fine-tuned on downstream tasks, this model achieves the following results:
-| Task | | | | | | | | |
-|:----:|:----:|:----:|:----:|:-----:|:----:|:-----:|:----:|:----:|
-| | | | | | | | | |
+Please refer to [Table 5](https://arxiv.org/pdf/2206.08657.pdf) for BridgeTower's performance on Image Retrieval and other down stream tasks.
 ### BibTeX entry and citation info
 ```bibtex