Hi OFA team,
Congrats on the awesome work.
Are you planning to share a notebook example for fine-tuning OFA for visual grounding?
I would be interested in benchmarking OFA against Donut on the UI RefExp task. Here is my work in progress with Donut:
I know the visual grounding model is pre-trained on the RefCOCO family, which covers mostly physical objects, while UI RefExp consists primarily of RICO Android mobile app screenshots. Nevertheless, I am curious how quickly OFA can transfer to RICO RefExp and what final performance it reaches. Happy to share my results, as I am doing with Donut.