smellslikeml
committed on
Commit
·
e122c7f
1
Parent(s):
7796f30
update config, README
- README.md +25 -1
- config.json +2 -2
README.md
CHANGED
@@ -30,6 +30,30 @@ With a pipeline of expert models, we can infer spatial relationships between obj
|
|
30 |
|
31 |
Use this model to query spatial relationships between objects in a scene.
|
32 |
|
33 |
Try it on Discord: http://discord.gg/b2yGuCNpuC
|
34 |
|
35 |
## Citation
|
@@ -55,4 +79,4 @@ Try it on Discord: http://discord.gg/b2yGuCNpuC
|
|
55 |
journal={arXiv preprint arXiv:2402.03766},
|
56 |
year={2024}
|
57 |
}
|
58 |
-
```
|
|
|
30 |
|
31 |
Use this model to query spatial relationships between objects in a scene.
|
32 |
|
33 |
+
Run it using [MobileVLM inference](https://github.com/Meituan-AutoML/MobileVLM/tree/main?tab=readme-ov-file#example-for-mobilevlmmobilevlm-v2-model-inference) code:
|
34 |
+
```python
|
35 |
+
# assuming cwd is /path/to/MobileVLM/
|
36 |
+
from scripts.inference import inference_once
|
37 |
+
model_path = "/path/to/SpaceLLaVA-lite"
|
38 |
+
image_file = "/path/to/your-image.jpg"
|
39 |
+
prompt_str = "For each object in the scene, describe the distance between objects in meters"
|
40 |
+
|
41 |
+
args = type('Args', (), {
|
42 |
+
"model_path": model_path,
|
43 |
+
"image_file": image_file,
|
44 |
+
"prompt": prompt_str,
|
45 |
+
"conv_mode": "v1",
|
46 |
+
"temperature": 0,
|
47 |
+
"top_p": None,
|
48 |
+
"num_beams": 1,
|
49 |
+
"max_new_tokens": 512,
|
50 |
+
"load_8bit": False,
|
51 |
+
"load_4bit": False,
|
52 |
+
})()
|
53 |
+
|
54 |
+
inference_once(args)
|
55 |
+
```
|
56 |
+
|
57 |
Try it on Discord: http://discord.gg/b2yGuCNpuC
|
58 |
|
59 |
## Citation
|
|
|
79 |
journal={arXiv preprint arXiv:2402.03766},
|
80 |
year={2024}
|
81 |
}
|
82 |
+
```
|
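Reviewer note on the README snippet: the added example builds its argument object with `type('Args', (), {...})()`. A minimal alternative sketch is shown here, assuming `inference_once` only reads plain attributes from the object it receives (not verified against the MobileVLM source). The helper name `build_inference_args` is hypothetical and not part of MobileVLM. The default values are copied from the README diff above.

```python
from types import SimpleNamespace


def build_inference_args(model_path, image_file, prompt, **overrides):
    """Hypothetical helper: bundle the fields used in the README snippet.

    Defaults mirror the values in the README diff. Unknown keyword
    arguments are rejected so that typos do not pass silently.
    """
    defaults = {
        "conv_mode": "v1",
        "temperature": 0,
        "top_p": None,
        "num_beams": 1,
        "max_new_tokens": 512,
        "load_8bit": False,
        "load_4bit": False,
    }
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise TypeError(f"unexpected argument(s): {sorted(unknown)}")
    defaults.update(overrides)
    return SimpleNamespace(
        model_path=model_path,
        image_file=image_file,
        prompt=prompt,
        **defaults,
    )


args = build_inference_args(
    "/path/to/SpaceLLaVA-lite",
    "/path/to/your-image.jpg",
    "For each object in the scene, describe the distance between objects in meters",
)

# With cwd set to a MobileVLM checkout, the call from the README would then be:
# from scripts.inference import inference_once
# inference_once(args)
```

Keeping the defaults in one place makes it easier to change a single setting, for example `build_inference_args(..., load_4bit=True)`, without retyping the whole dictionary.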
config.json
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
-
size
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:4a4e390b44b0c584bf6f4f2f97a153cab472a7b6d5264eeffc6c277d224847f4
|
3 |
+
size 1156
|
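Reviewer note on the config.json change: the diff shows a Git LFS pointer (three `key value` lines), not the JSON itself. As a rough sketch of how such a pointer can be read, the hypothetical helper below splits each line into a key and a value. It assumes the simple three-field layout shown in this diff and does not implement the full Git LFS specification.

```python
def parse_lfs_pointer(text):
    """Hypothetical helper: parse the 'key value' lines of an LFS pointer."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "oid": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


# New-side pointer content copied from the config.json diff above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:4a4e390b44b0c584bf6f4f2f97a153cab472a7b6d5264eeffc6c277d224847f4
size 1156
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])
```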