Correct output from test.py
README.md
CHANGED
@@ -18,7 +18,7 @@ PG-InstructBLIP is finetuned using the [PhysObjects dataset](https://drive.googl
 
 ## Example Usage and Installation
 
-This model is designed to be used with the LAVIS library. Please install [salesforce-lavis](https://pypi.org/project/salesforce-lavis/) and download this model through git-lfs or direct downloading.
+This model is designed to be used with the LAVIS library. Please install [salesforce-lavis](https://pypi.org/project/salesforce-lavis/) from source and download this model through git-lfs or direct downloading.
 
 After loading the model, you can disable the qformer text input to follow the same configuration we used for fine-tuning. However, the model still works well with it enabled, so we recommend users to experiment with both and choose the optimal configuration on a case-by-case basis.
 
@@ -62,7 +62,7 @@ question_samples = {
 
 answers, scores = generate(vlm, question_samples, length_penalty=0, repetition_penalty=1, num_captions=3)
 print(answers, scores)
-#
+# ['opaque', 'translucent', 'transparent'] tensor([-0.0373, -4.2404, -4.4436], device='cuda:0')
 ```
 
 Note that the output of the generate function includes the log probabilities of each generation. For categorical properties (like material, transparency, and contents), these probabilities can be interpreted as confidences, as typical with VLMs. In the example above, PG-InstructBLIP is very confident that the ceramic bowl is opaque, which is true.
test.py
CHANGED
@@ -35,4 +35,4 @@ question_samples = {
 
 answers, scores = generate(vlm, question_samples, length_penalty=0, repetition_penalty=1, num_captions=3)
 print(answers, scores)
-#
+# ['opaque', 'translucent', 'transparent'] tensor([-0.0373, -4.2404, -4.4436], device='cuda:0')
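The scores in the corrected output comment are per-generation log probabilities, so exponentiating them recovers the confidences the README describes. A minimal sketch using those values (plain Python, independent of LAVIS; `answers` and `log_probs` are copied from the example output above):

```python
import math

# Answers and log-probability scores from the example output above,
# one score per generated answer.
answers = ['opaque', 'translucent', 'transparent']
log_probs = [-0.0373, -4.2404, -4.4436]

# exp(log p) recovers each generation's probability, which can be read
# as a confidence for categorical properties such as transparency.
confidences = [math.exp(lp) for lp in log_probs]
for ans, conf in zip(answers, confidences):
    print(f"{ans}: {conf:.3f}")
```

Here the top answer, "opaque", comes out at roughly 0.96, matching the README's remark that the model is very confident the ceramic bowl is opaque.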