Junming Yang committed
Commit
20d077e
1 Parent(s): e2f94e9

add VQA meta_data

Files changed (1)
meta_data.py +21 -0
meta_data.py CHANGED
@@ -157,4 +157,25 @@ LEADERBOARD_MD['RealWorldQA'] = """
 ## RealWorldQA Evaluation Results
 
 - RealWorldQA is a benchmark designed to evaluate the real-world spatial understanding capabilities of multimodal AI models, contributed by XAI. It assesses how well these models comprehend physical environments. The benchmark consists of 700+ images, each accompanied by a question and a verifiable answer. These images are drawn from real-world scenarios, including those captured from vehicles. The goal is to advance AI models' understanding of our physical world.
 """
+
+LEADERBOARD_MD['TextVQA_VAL'] = """
+## TextVQA Evaluation Results
+
+- TextVQA is a dataset to benchmark visual reasoning based on text in images. TextVQA requires models to read and reason about text in images to answer questions about them. Specifically, models need to incorporate a new modality of text present in the images and reason over it to answer TextVQA questions.
+- Note that some models may not be able to generate standardized responses based on the prompt. We currently do not have reports for these models.
+"""
+
+LEADERBOARD_MD['ChartQA_TEST'] = """
+## ChartQA Evaluation Results
+
+- ChartQA is a benchmark for question answering about charts with visual and logical reasoning.
+- Note that some models may not be able to generate standardized responses based on the prompt. We currently do not have reports for these models.
+"""
+
+LEADERBOARD_MD['OCRVQA_TESTCORE'] = """
+## OCRVQA Evaluation Results
+
+- OCRVQA is a benchmark for visual question answering by reading text in images. It presents a large-scale dataset, OCR-VQA-200K, comprising over 200,000 images of book covers. The study combines techniques from the Optical Character Recognition (OCR) and Visual Question Answering (VQA) domains to address the challenges associated with this new task and dataset.
+- Note that some models may not be able to generate standardized responses based on the prompt. We currently do not have reports for these models.
+"""