clefourrier (HF staff) committed
Commit
6b1ca6b
1 Parent(s): 7e39ef6

Update content.py

Files changed (1)
  1. content.py +9 -2
content.py CHANGED
@@ -9,6 +9,15 @@ It is therefore divided in 3 levels, where level 1 should be breakable by very g
 
  GAIA data can be found in [this dataset](https://huggingface.co/datasets/gaia-benchmark/GAIA). Questions are contained in `metadata.jsonl`. Some questions come with an additional file, that can be found in the same folder and whose id is given in the field `file_name`.
 
+ **Please do not repost the public dev set, nor use it in training data for your models.**
+
+ ## Leaderboard
+ Submission made by our team are labelled "GAIA authors". While we report average scores over different runs when possible in our paper, we only report the best run in the leaderboard.
+
+ See below for submissions.
+ """
+
+ SUBMISSION_TEXT = """
  ## Submissions
  Results can be submitted for both validation and test. Scores are expressed as the percentage of correct answers for a given split.
 
@@ -17,9 +26,7 @@ We expect submissions to be json-line files with the following format. The first
  {"task_id": "task_id_1", "model_answer": "Answer 1 from your model", "reasoning_trace": "The different steps by which your model reached answer 1"}
  {"task_id": "task_id_2", "model_answer": "Answer 2 from your model", "reasoning_trace": "The different steps by which your model reached answer 2"}
  ```
- Submission made by our team are labelled "GAIA authors". While we report average scores over different runs when possible in our paper, we only report the best run in the leaderboard.
 
- **Please do not repost the public dev set, nor use it in training data for your models.**
  """
 
  CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results"
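
For reference, the json-lines submission format shown in the diff can be produced with a few lines of Python. This is only a minimal sketch, not part of the commit: the helper names `write_submission` and `read_submission` are hypothetical, and because the hunk header truncates the sentence describing the fields, the sketch simply assumes all three keys from the example rows (`task_id`, `model_answer`, `reasoning_trace`) are present.

```python
import json
from pathlib import Path

# Keys taken from the example rows in the diff. Whether any of them is
# optional is not visible in the truncated text, so all are required here.
FIELDS = ("task_id", "model_answer", "reasoning_trace")


def write_submission(rows, path):
    """Write one JSON object per line (hypothetical helper)."""
    # Build every record first so a missing key raises KeyError
    # before anything is written to disk.
    records = [{key: row[key] for key in FIELDS} for row in rows]
    with Path(path).open("w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_submission(path):
    """Read a json-lines file back into a list of dicts, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Calling `write_submission(rows, "submission.jsonl")` with a list of dicts shaped like the two example rows yields a file with one object per line, which `read_submission` can load back for a quick sanity check before uploading.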