tttoaster committed
Commit d093c93
1 Parent(s): 8b3d9a9

Update constants.py

Files changed (1)
  1. constants.py +7 -0
constants.py CHANGED
@@ -80,12 +80,19 @@ SUBMIT_INTRODUCTION = """# Submit on SEED Benchmark Introduction
 
 TABLE_INTRODUCTION = """In the table below, we summarize the per-task performance of all the models.
 We use accuracy (%) as the primary evaluation metric for each task.
+
 SEED-Bench-1 calculates the overall accuracy by dividing the total number of correct QA answers by the total number of QA questions.
+
 SEED-Bench-2 computes the overall accuracy as the average of the per-dimension accuracies.
+
 For the PPL evaluation method, we compute the loss for each answer candidate and select the candidate with the lowest loss. For details, please refer to [InternLM_Xcomposer_VL_interface](https://github.com/AILab-CVC/SEED-Bench/blob/387a067b6ba99ae5e8231f39ae2d2e453765765c/SEED-Bench-2/model/InternLM_Xcomposer_VL_interface.py#L74).
+
 For the PPL A/B/C/D evaluation method, please refer to [EVAL_SEED.md](https://github.com/QwenLM/Qwen-VL/blob/master/eval_mm/seed_bench/EVAL_SEED.md) for more information.
+
 For the Generate evaluation method, please refer to [Evaluation.md](https://github.com/haotian-liu/LLaVA/blob/main/docs/Evaluation.md#seed-bench) for details.
+
 NG indicates that the evaluation method is Not Given.
+
 If you have any questions, please feel free to contact us.
 """
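The two overall-accuracy conventions described in the string above give different numbers whenever dimensions contain unequal question counts. A minimal sketch of both rules, using hypothetical per-dimension `(correct, total)` counts that are not from the benchmark:

```python
# Sketch of the two overall-accuracy conventions described above.
# The per-dimension counts below are hypothetical placeholders.

def seed_bench_1_overall(per_dim: dict[str, tuple[int, int]]) -> float:
    """SEED-Bench-1 style micro-average: total correct / total questions."""
    correct = sum(c for c, _ in per_dim.values())
    total = sum(t for _, t in per_dim.values())
    return 100.0 * correct / total

def seed_bench_2_overall(per_dim: dict[str, tuple[int, int]]) -> float:
    """SEED-Bench-2 style macro-average: mean of per-dimension accuracies."""
    accs = [100.0 * c / t for c, t in per_dim.values()]
    return sum(accs) / len(accs)

counts = {"scene": (90, 100), "instance": (30, 50)}  # hypothetical
print(seed_bench_1_overall(counts))  # 80.0  (120 correct / 150 questions)
print(seed_bench_2_overall(counts))  # 75.0  (mean of 90% and 60%)
```

The gap between 80.0 and 75.0 here is exactly why the table needs to state which convention each benchmark version uses.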
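The PPL rule itself (score every answer candidate by language-model loss, pick the lowest) can be sketched as below. This is an illustrative stand-in, not the SEED-Bench code: `gpt2` is a placeholder model, the prompts are toy text, and the linked interfaces score image-conditioned inputs with their own tokenizers and loss masking.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model for illustration only; SEED-Bench's PPL interfaces
# use the evaluated multimodal model instead.
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

@torch.no_grad()
def candidate_loss(prompt: str, candidate: str) -> float:
    """Mean cross-entropy over the candidate's tokens, given the prompt."""
    prompt_ids = tok(prompt, return_tensors="pt").input_ids
    full_ids = tok(prompt + " " + candidate, return_tensors="pt").input_ids
    labels = full_ids.clone()
    labels[:, : prompt_ids.shape[1]] = -100  # mask prompt; score answer only
    return model(full_ids, labels=labels).loss.item()

def select_by_ppl(prompt: str, candidates: list[str]) -> str:
    # The PPL rule from the text: the lowest-loss candidate wins.
    return min(candidates, key=lambda c: candidate_loss(prompt, c))

print(select_by_ppl("Q: What color is the sky? A:", ["blue", "a sandwich"]))
```

The PPL A/B/C/D variant differs only in what gets scored: the loss is computed over the option letter rather than the full answer text, per the linked EVAL_SEED.md.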