Feature Extraction
PyTorch
michelecafagna26 committed on
Commit
cd8b8ac
1 Parent(s): dbbde98

Update README.md

  1. README.md +67 -0
README.md CHANGED
@@ -1,3 +1,70 @@
1
  ---
2
  license: mit
3
  ---
1
  ---
2
  license: mit
3
+
4
+ datasets:
5
+ - coco
6
+ - openimagesv5
7
+ - objects365v1
8
+ - visualgenome
9
+
10
+ library_name: pytorch
11
+ tags:
12
+ - pytorch
13
  ---
14
+
15
+ # Model Card: VinVL VisualBackbone
16
+
17
+ Disclaimer: The model is taken from the official repository, which can be found here: [microsoft/scene_graph_benchmark](https://github.com/microsoft/scene_graph_benchmark)
18
+
19
+ # Usage
20
+
21
+ More info about how to use this model can be found here: [michelecafagna26/vinvl-visualbackbone](https://github.com/michelecafagna26/vinvl-visualbackbone)
22
+
23
+ # Quick start: Feature extraction
24
+
25
+ ```python
26
+ from scene_graph_benchmark.wrappers import VinVLVisualBackbone
27
+
28
+ img_file = "scene_graph_benchmark/demo/woman_fish.jpg"
29
+
30
+ detector = VinVLVisualBackbone()
31
+
32
+ dets = detector(img_file)
33
+
34
+ ```
35
+
36
+ `dets` contains the following keys: ["boxes", "classes", "scores", "features", "spatial_features"]
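A common next step is to drop low-confidence detections before using the features. The sketch below is a minimal illustration, not part of the wrapper: `filter_detections` is a hypothetical helper, and it assumes that every value in `dets` is array-like and aligned along the first axis (one row per detected box). The dummy dictionary only mimics the keys listed above; its values and the label strings are made up.

```python
import numpy as np


def filter_detections(dets, min_score=0.5):
    """Hypothetical helper: keep detections with score >= min_score.

    Assumes every value in `dets` has one entry per box along axis 0.
    """
    keep = np.asarray(dets["scores"]) >= min_score
    return {key: np.asarray(value)[keep] for key, value in dets.items()}


# Dummy stand-in for the detector output (3 boxes); not real model output.
dummy_dets = {
    "boxes": np.array(
        [[0, 0, 10, 10], [5, 5, 20, 20], [1, 2, 3, 4]], dtype=np.float32
    ),
    "classes": np.array(["woman", "fish", "hat"]),  # assumed label format
    "scores": np.array([0.9, 0.4, 0.7], dtype=np.float32),
    "features": np.zeros((3, 2048), dtype=np.float32),
    "spatial_features": np.zeros((3, 6), dtype=np.float32),
}

filtered = filter_detections(dummy_dets, min_score=0.5)
print(filtered["classes"], filtered["features"].shape)
```

If the real output stores any of these fields as lists or tensors, convert them first or adapt the indexing accordingly.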
37
+
38
+ You can obtain the full VinVL visual features by concatenating "features" and "spatial_features":
39
+
40
+ ```python
41
+ import numpy as np
42
+
43
+ v_feats = np.concatenate((dets['features'], dets['spatial_features']), axis=1)
44
+ # v_feats.shape = (num_boxes, 2054)
45
+ ```
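To check the resulting layout without loading the model, the concatenation can be reproduced on random arrays. This is only a sketch: the 2048 + 6 split is an assumption inferred from the 2054 total stated above, and the arrays are random placeholders rather than real detector output.

```python
import numpy as np

num_boxes = 4      # arbitrary number of detected regions
feat_dim = 2048    # assumed size of each region feature vector
spatial_dim = 6    # assumed size of each box position encoding

# Random placeholders standing in for dets["features"] and dets["spatial_features"].
features = np.random.rand(num_boxes, feat_dim).astype(np.float32)
spatial_features = np.random.rand(num_boxes, spatial_dim).astype(np.float32)

# axis=1 joins the two vectors of each box side by side, one row per box.
v_feats = np.concatenate((features, spatial_features), axis=1)
print(v_feats.shape)

# The two parts can be recovered by slicing along the feature axis.
region_part = v_feats[:, :feat_dim]
spatial_part = v_feats[:, feat_dim:]
```

Using `axis=1` matters here: with the default `axis=0` the call would fail, because the two arrays have different numbers of columns.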
46
+
47
+ # Citations
48
+
49
+ Please consider citing the original project and the VinVL paper:
50
+
51
+ ```BibTeX
52
+
53
+ @misc{han2021image,
54
+ title={Image Scene Graph Generation (SGG) Benchmark},
55
+ author={Xiaotian Han and Jianwei Yang and Houdong Hu and Lei Zhang and Jianfeng Gao and Pengchuan Zhang},
56
+ year={2021},
57
+ eprint={2107.12604},
58
+ archivePrefix={arXiv},
59
+ primaryClass={cs.CV}
60
+ }
61
+
62
+ @inproceedings{zhang2021vinvl,
63
+ title={{VinVL}: Revisiting visual representations in vision-language models},
64
+ author={Zhang, Pengchuan and Li, Xiujun and Hu, Xiaowei and Yang, Jianwei and Zhang, Lei and Wang, Lijuan and Choi, Yejin and Gao, Jianfeng},
65
+ booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
66
+ pages={5579--5588},
67
+ year={2021}
68
+ }
69
+
70
+ ```