Commit 22d77c4 by AhmedSSabir (1 parent: cf0ecfa)

Update README.md

Files changed (1)
  1. README.md +31 -0
README.md CHANGED
@@ -24,6 +24,37 @@ The model is trained with a strict filter of 0.4 similarity distance thresholds
  For the [dataset](https://huggingface.co/datasets/AhmedSSabir/Textual-Image-Caption-Dataset)
 
 
+
+ ## Results with the SoTA pre-trained image captioning model BLIP
+
+
+ Comparison with BLIP (pre-trained on 125M images) on the COCO Caption Karpathy test set; see [Table 7 of the BLIP paper](https://arxiv.org/pdf/2201.12086.pdf).
+ For results with the VilBERT model (pre-trained on 3.5M images), please refer to the paper.
+
+ ## Accuracy
+
+ | Model | B-1 | B-2 | B-3 | B-4 | METEOR | ROUGE-L | CIDEr | SPICE | BERTScore |
+ |----------------------------------|---------|-------|--------|-------|--------|--------|-------|--------|---------|
+ | BLIP Beam Search b=3 | .797 | .649 | **.514** | **.403** | **.311** | **.606** | **1.365** | **.243** | **.9484** |
+ | + BERT-CNN $th=0$ | .798 | .646 | .506 | .392 | .305 | .598 | 1.339 | .238 | .9473 |
+ | + BERT-CNN $th\geq0.2$ | .798 | .647 | .507 | .393 | .306 | .600 | 1.342 | .238 | .9473 |
+ | + BERT-CNN $th\geq0.3$ | .802 | .651 | .511 | .397 | .307 | .601 | 1.349 | .238 | .9479 |
+ | + BERT-CNN $th\geq0.4$ | **.806** | **.654** | .513 | .397 | .303 | .599 | 1.343 | .235 | .9476 |
+
+ ## Diversity
+
+ | Model | Uniq | V | mBLEU-1↓ | Div-1 | Div-2 | SBERT-sts |
+ |----------------------------------|---------|-------|----------|-------|-------|----------|
+ | BLIP Beam Search b=3 | **8.60** | 1406 | .461 | .68 | .80 | .8058 |
+ | + BERT-CNN $th=0$ | 8.49 | **1532** | .457 | .68 | .80 | .8046 |
+ | + BERT-CNN $th\geq0.2$ | 8.48 | 1486 | .458 | .68 | .80 | .8052 |
+ | + BERT-CNN $th\geq0.3$ | 8.41 | 1448 | .458 | .68 | .80 | **.8060** |
+ | + BERT-CNN $th\geq0.4$ | 8.30 | 1448 | **.455** | .68 | .80 | .8053 |
+ | Human | 9.14 | 3425 | .375 | .74 | .84 | NA |
+
+
+
+
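The Div-1 and Div-2 columns in the diversity table can be illustrated with a minimal sketch. It assumes the common definition of Div-n as the number of distinct n-grams divided by the total number of words in a set of captions; whether the reported numbers were computed exactly this way (tokenization, per-image averaging) is not stated in the README. The helper names `ngrams` and `div_n` and the sample captions are hypothetical.

```python
from typing import List, Tuple


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def div_n(captions: List[str], n: int) -> float:
    """Distinct n-grams divided by the total word count over a caption set.

    Assumption: naive lowercase whitespace tokenization; the original
    evaluation may use a different tokenizer.
    """
    distinct = set()
    total_words = 0
    for caption in captions:
        tokens = caption.lower().split()
        total_words += len(tokens)
        distinct.update(ngrams(tokens, n))
    return len(distinct) / total_words if total_words else 0.0


# Hypothetical captions, for illustration only.
captions = [
    "a dog runs on the beach",
    "a dog plays on the sand",
]
print(f"Div-1: {div_n(captions, 1):.2f}")
print(f"Div-2: {div_n(captions, 2):.2f}")
```

Higher values indicate less repetition across the caption set, which matches the direction of the human row in the table.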
  ```
  conda create -n BERT_visual python=3.6 anaconda
  conda activate BERT_visual