mfajcik committed
Commit 5be5d06 • Parent: e60cafc

Update content.py

Files changed (1):
  content.py  +15 -4
content.py CHANGED
@@ -4,9 +4,11 @@ This file contains the text content for the leaderboard client.
 HEADER_MARKDOWN = """
 # 🇨🇿 BenCzechMark [Beta Preview]
 
-Welcome to the leaderboard! Here you can compare models on tasks in Czech language and/or submit your own model. Head to submission page to learn about submission details.
-We use our modified fork of [lm-evaluation-harness](https://github.com/DCGM/lm-evaluation-harness) to evaluate every model under same protocol.
-See about page for brief description of our evaluation protocol & win score mechanism, citation information, and future directions for this benchmark.
+Welcome to the leaderboard! Here you can compare models on Czech-language tasks and/or submit your own model. We use our modified fork of [lm-evaluation-harness](https://github.com/DCGM/lm-evaluation-harness) to evaluate every model under the same protocol.
+
+- Head to the submission page to learn about submission details.
+- See the about page for a brief description of our evaluation protocol & win-score mechanism, citation information, and future directions for this benchmark.
+
 """
 LEADERBOARD_TAB_TITLE_MARKDOWN = """
 ## Leaderboard
@@ -17,7 +19,16 @@ SUBMISSION_TAB_TITLE_MARKDOWN = """
 1. Head over to our modified fork of [lm-evaluation-harness](https://github.com/DCGM/lm-evaluation-harness).
 Follow the instructions and evaluate your model on all 🇨🇿 BenCzechMark tasks, while logging your lm-harness outputs into a designated folder.
 
-2. Use our script <TODO: add script> for processing log files from your designated folder into single compact submission file that contains everything we need.
+2. Use our script [compile_log_files.py](https://huggingface.co/spaces/CZLC/BenCzechMark/blob/main/compile_log_files.py) to process the log files from your designated folder into a single compact submission file that contains everything we need.
+Example usage:
+- Download sample outputs for csmpt7b from [csmpt_logdir.zip](https://czechllm.fit.vutbr.cz/csmpt7b/sample_results/csmpt_logdir.zip).
+- Unzip the archive.
+- Run the script with Python (requires the `jsonlines` and `tqdm` libraries):
+```bash
+python compile_log_files.py \
+    -i "<your_local_path_to_folder>/csmpt_logdir/csmpt/eval_csmpt7b*" \
+    -o "<your_local_path_to_outfolder>/sample_submission.json"
+```
 
 3. Upload your file and fill in the form below!
 
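As an illustration of step 1 above, here is a minimal sketch of a harness run with per-sample logging, assuming the DCGM fork keeps upstream lm-evaluation-harness's CLI (`--model`, `--model_args`, `--tasks`, `--output_path`, `--log_samples`); the model ID, task list, and log folder are placeholders, not values from this commit:

```bash
# Sketch only: assumes the fork's CLI matches upstream lm-evaluation-harness.
# <your_model> and <benczechmark_tasks> are placeholders; use the model and
# task names from the fork's instructions.
python -m lm_eval \
    --model hf \
    --model_args pretrained=<your_model> \
    --tasks <benczechmark_tasks> \
    --output_path <your_local_path_to_folder>/my_logdir \
    --log_samples
```

The resulting log folder is what the `-i` glob of step 2's compile script should point at.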
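To reproduce the step-2 example end to end, the sample logs and the script's two dependencies can be fetched first; a small sketch, assuming `wget` and `unzip` are available:

```bash
# Grab the sample csmpt7b logs linked in step 2 and install the two
# libraries the compile script needs (jsonlines, tqdm).
wget https://czechllm.fit.vutbr.cz/csmpt7b/sample_results/csmpt_logdir.zip
unzip csmpt_logdir.zip
pip install jsonlines tqdm
```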