Text location in screenshots
Is it possible to add a resize to 1080x1920 for this demo specifically? I think that will lead to better performance
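For reference, a minimal sketch of how the target size could be computed before resizing. It assumes 1080x1920 means width x height (portrait) and that the aspect ratio should be preserved; neither is confirmed in this thread. `fit_within` is a hypothetical helper, not part of the demo's code.

```python
def fit_within(width, height, max_width=1080, max_height=1920):
    """Return (new_width, new_height) scaled to fit inside the target box.

    Assumption: the aspect ratio is preserved, so the result may be
    smaller than the box along one axis. Smaller images are scaled up.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Use the smaller of the two ratios so both sides fit inside the box.
    scale = min(max_width / width, max_height / height)
    return max(1, round(width * scale)), max(1, round(height * scale))


# A 2160x3840 screenshot has the same aspect ratio as the target box.
print(fit_within(2160, 3840))  # (1080, 1920)
```

The resulting tuple could then be passed to whatever image library the demo uses for the actual resize.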
In our experience, a newline at the end of the prompt typically seems to help :)
@somaniarushi thanks! For VQA, though, I found worse results with the newline, at least in some of the examples. For example, the Walmart receipt example returns "There are four items sold." with the newline, whereas the version without the newline generates "5". It's true that the version without the newline is a bit too concise in this case, but the answer is right.
I would suggest we use the existing prompts for now, keep testing on more examples, and update the prompts if appropriate. What do you think?
Note also that the "coco style" caption does include a newline, as recommended in your original code and tests.
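To make further testing easier, here is a rough sketch of a side-by-side comparison of the two prompt variants. The `generate` callable is a placeholder for whatever function runs the model on a prompt (for VQA it would also need the image, for example by closing over it); it is an assumption, not the demo's actual API.

```python
from typing import Callable, Dict, List


def newline_variants(prompt: str) -> Dict[str, str]:
    """Build both versions of a prompt: without and with a trailing newline."""
    base = prompt.rstrip("\n")
    return {"without_newline": base, "with_newline": base + "\n"}


def compare_newline(generate: Callable[[str], str], prompts: List[str]) -> List[Dict[str, str]]:
    """Run the (hypothetical) generate function on both variants of each prompt."""
    rows = []
    for prompt in prompts:
        row = {"prompt": prompt.rstrip("\n")}
        for name, variant in newline_variants(prompt).items():
            row[name] = generate(variant)
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Stub standing in for the real model call, only to show the output shape.
    def fake_generate(prompt: str) -> str:
        return "long answer" if prompt.endswith("\n") else "short answer"

    for row in compare_newline(fake_generate, ["How many items are sold?"]):
        print(row)
```

Collecting the rows over more examples would show whether the newline helps or hurts for each task.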
@pcuenq For sure, let's go with evidence over my intuition!
@somaniarushi oh, but your intuition is better informed than mine! :) We'll continue to test on more examples and share our findings!