Transfer Learning from Visual Grounding to UI RefExp

Hi OFA team,

Congrats on the awesome work.

Are you planning to share a notebook example for fine-tuning OFA on visual grounding?

I would be interested in benchmarking OFA against Donut on the UI RefExp task. Here is my work in progress with Donut:

I know the visual grounding model is pre-trained on the RefCOCO family, which covers mostly physical objects, while UI RefExp is primarily RICO Android mobile app screenshots. Nevertheless, I am curious how quickly OFA can transfer to RICO RefExp and what performance it ultimately reaches. Happy to share my results, as I am doing with Donut.



