Clémentine committed on
Commit
01d1bbb
1 Parent(s): c177f62

text reorg

Files changed (1)
  1. content.py +2 -5
content.py CHANGED
@@ -1,17 +1,14 @@
  TITLE = """<h1 align="center" id="space-title">GAIA Leaderboard</h1>"""
 
- CANARY_STRING = "" # TODO
-
  INTRODUCTION_TEXT = """
- GAIA is a benchmark which aims at evaluating next-generation LLMs (LLMs with augmented capabilities due to added tooling, efficient prompting, access to search, etc.).
- (See our paper for more details.)
+ GAIA is a benchmark which aims at evaluating next-generation LLMs (LLMs with augmented capabilities due to added tooling, efficient prompting, access to search, etc.). (See our paper for more details.)
 
  ## Context
  GAIA is made of more than 450 non-trivial questions with an unambiguous answer, requiring different levels of tooling and autonomy to solve. GAIA data can be found in this space (https://huggingface.co/datasets/gaia-benchmark/GAIA). Questions are contained in `metadata.jsonl`. Some questions come with an additional file, which can be found in the same folder and whose id is given in the field `file_name`.
 
  It is divided into 3 levels, where level 1 should be breakable by very good LLMs, and level 3 indicates a strong jump in model capabilities; each level is divided into a fully public dev set for validation, and a test set with private answers and metadata.
 
- ## Submissions
+ # Submissions
  Results can be submitted for both validation and test. Scores are expressed as the percentage of correct answers for a given split.
 
  We expect submissions to be JSON-lines files with the following format. The first two fields are mandatory, `reasoning_trace` is optional:
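The diff is truncated before the field specification, so the exact field names of the two mandatory fields are not shown here. As a minimal sketch, assuming hypothetical field names `task_id` and `model_answer` for the mandatory fields, a JSON-lines submission file could be produced like this:

```python
import json

# Hypothetical records: two assumed mandatory fields plus the optional
# `reasoning_trace` named in the text. The field names `task_id` and
# `model_answer` are assumptions -- the diff truncates before the format spec.
records = [
    {"task_id": "task-001", "model_answer": "42", "reasoning_trace": "..."},
    {"task_id": "task-002", "model_answer": "Paris"},  # optional trace omitted
]

# JSON Lines format: one JSON object serialized per line.
with open("submission.jsonl", "w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")
```

Each line must parse as a standalone JSON object; scorers typically read the file line by line rather than as one JSON document.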