google/matcha-plotqa-v1 · Results on PlotQA-V1 do not match the paper

Hello,

I tested the model on the PlotQA-V1 test set (1.2M Q/A pairs), and my results differ from those reported in the original paper. Was any preprocessing applied in the paper's experiments?

I loaded the processor and model as follows and reached only ~40% accuracy:
from transformers import Pix2StructProcessor, Pix2StructForConditionalGeneration
processor = Pix2StructProcessor.from_pretrained('google/matcha-base')
model = Pix2StructForConditionalGeneration.from_pretrained('google/matcha-plotqa-v1')

Some example outputs:
idx, ground truth answer, generated answer

700000,bottom right,top right
700001,3,4
700002,vertical,vertical
700003,Number of mobile cellular subscribers,Net bilateral aid flow in an economy from Canada
700004,No,Yes
700005,Years,Years
700006,No. of subscribers (per 100 people),Aid flow (current US$)
700007,3,4
700008,No,No
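
One thing that may matter when comparing against the paper is the metric. Below is a minimal sketch of how the rows above could be scored under plain exact match and under a relaxed match, assuming the reported numbers use a relaxed accuracy (exact match for text, a 5% tolerance for numeric answers). That definition is an assumption here and should be checked against the paper. The helper names (`to_float`, `exact_match`, `relaxed_match`, `score`) are hypothetical and not part of any library.

```python
import csv
import io

# Rows copied from the example outputs above: idx, ground truth, prediction.
ROWS = """\
700000,bottom right,top right
700001,3,4
700002,vertical,vertical
700003,Number of mobile cellular subscribers,Net bilateral aid flow in an economy from Canada
700004,No,Yes
700005,Years,Years
700006,No. of subscribers (per 100 people),Aid flow (current US$)
700007,3,4
700008,No,No
"""


def to_float(text):
    """Return the value as a float, or None if the text is not numeric."""
    try:
        return float(text.strip())
    except ValueError:
        return None


def exact_match(target, prediction):
    """Case-insensitive string equality after trimming whitespace."""
    return target.strip().lower() == prediction.strip().lower()


def relaxed_match(target, prediction, tolerance=0.05):
    """ASSUMED metric: numeric answers may deviate by up to `tolerance`
    (relative to the target); everything else falls back to exact match."""
    t, p = to_float(target), to_float(prediction)
    if t is not None and p is not None:
        if t == 0:
            return p == 0
        return abs(p - t) / abs(t) <= tolerance
    return exact_match(target, prediction)


def score(rows, match):
    """Fraction of rows whose prediction matches the ground truth."""
    hits = sum(match(gt, pred) for _, gt, pred in rows)
    return hits / len(rows)


rows = list(csv.reader(io.StringIO(ROWS)))
print(f"exact match:   {score(rows, exact_match):.3f}")
print(f"relaxed match: {score(rows, relaxed_match):.3f}")
```

On these nine rows the two metrics give the same score, because the numeric misses (3 vs 4) are well outside a 5% tolerance. The choice of metric alone would therefore not explain the gap for this sample.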
