Can this grounding model output bbox?
#3, opened by xxheyu
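import torch

# Inference only, so gradient tracking is disabled.
with torch.no_grad():
    outputs = model.generate(**inputs, **gen_kwargs)
    # Drop the prompt tokens so only the newly generated tokens remain.
    outputs = outputs[:, inputs['input_ids'].shape[1]:]
    print(tokenizer.decode(outputs[0]))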
It only outputs a description of the image. How can I get the bounding box?
Thanks in advance!
I just ran into the same issue.
The box is in the output text, written as [[x1,y1,x2,y2]].
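If the decoded text does contain boxes in that form, something like the following could pull them out. This is only a rough sketch: `extract_boxes` and `BOX_PATTERN` are names made up for illustration, the sample string is invented rather than real model output, and it assumes each box is four non-negative integers inside its own double brackets. Whether the numbers are pixels or normalized coordinates is not something this thread confirms.

```python
import re

# Matches one box written as [[x1,y1,x2,y2]]; spaces around the numbers are tolerated.
BOX_PATTERN = re.compile(
    r"\[\[\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\]\]"
)

def extract_boxes(text):
    """Return every [[x1,y1,x2,y2]] group found in text as a tuple of four ints."""
    return [tuple(int(v) for v in m.groups()) for m in BOX_PATTERN.finditer(text)]

# Invented example string, not actual model output.
sample = "A dog [[120,340,560,780]] next to a ball [[600,700,680,790]]"
boxes = extract_boxes(sample)
print(boxes)
```

An empty list back means the text had no box in that format, which would match the description-only output reported above.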
I tried several prompts, but found that the HF version of the model does not output box coordinates.
I then switched to the SAT version of the model, which does generate them.