mfajcik committed
Commit 5be5d06 • Parent: e60cafc

Update content.py

Files changed (1):
  content.py  +15 -4
content.py CHANGED
@@ -4,9 +4,11 @@ This file contains the text content for the leaderboard client.
 HEADER_MARKDOWN = """
 # 🇨🇿 BenCzechMark [Beta Preview]
 
-Welcome to the leaderboard! Here you can compare models on tasks in Czech language and/or submit your own model. Head to submission page to learn about submission details.
-We use our modified fork of [lm-evaluation-harness](https://github.com/DCGM/lm-evaluation-harness) to evaluate every model under same protocol.
-See about page for brief description of our evaluation protocol & win score mechanism, citation information, and future directions for this benchmark.
+Welcome to the leaderboard! Here you can compare models on Czech-language tasks and/or submit your own model. We use our modified fork of [lm-evaluation-harness](https://github.com/DCGM/lm-evaluation-harness) to evaluate every model under the same protocol.
+
+- Head to the submission page to learn about submission details.
+- See the about page for a brief description of our evaluation protocol & win-score mechanism, citation information, and future directions for this benchmark.
+
 """
 LEADERBOARD_TAB_TITLE_MARKDOWN = """
 ## Leaderboard
@@ -17,7 +19,16 @@ SUBMISSION_TAB_TITLE_MARKDOWN = """
 1. Head over to our modified fork of [lm-evaluation-harness](https://github.com/DCGM/lm-evaluation-harness).
 Follow the instructions and evaluate your model on all 🇨🇿 BenCzechMark tasks, while logging your lm-harness outputs into a designated folder.
 
-2. Use our script <TODO: add script> for processing log files from your designated folder into single compact submission file that contains everything we need.
+2. Use our script [compile_log_files.py](https://huggingface.co/spaces/CZLC/BenCzechMark/blob/main/compile_log_files.py) to process the log files from your designated folder into a single compact submission file that contains everything we need.
+Example usage:
+- Download sample outputs for csmpt7b from [csmpt_logdir.zip](https://czechllm.fit.vutbr.cz/csmpt7b/sample_results/csmpt_logdir.zip).
+- Unzip the archive.
+- Run the script with Python (requires the `jsonlines` and `tqdm` libraries):
+```bash
+python compile_log_files.py \
+    -i "<your_local_path_to_folder>/csmpt_logdir/csmpt/eval_csmpt7b*" \
+    -o "<your_local_path_to_outfolder>/sample_submission.json"
+```
 
 3. Upload your file and fill in the form below!
 
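As an illustration of step 1 above, here is a minimal sketch of a harness run with per-sample logging, assuming the DCGM fork keeps upstream lm-evaluation-harness's CLI (`--model`, `--model_args`, `--tasks`, `--output_path`, `--log_samples`); the model ID, task list, and log folder are placeholders, not values from this commit:

```bash
# Sketch only: assumes the fork's CLI matches upstream lm-evaluation-harness.
# <your_model> and <benczechmark_tasks> are placeholders; use the model and
# task names from the fork's instructions.
python -m lm_eval \
    --model hf \
    --model_args pretrained=<your_model> \
    --tasks <benczechmark_tasks> \
    --output_path <your_local_path_to_folder>/my_logdir \
    --log_samples
```

The resulting log folder is what the `-i` glob of step 2's compile script should point at.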
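To reproduce the step-2 example end to end, the sample logs and the script's two dependencies can be fetched first; a small sketch, assuming `wget` and `unzip` are available:

```bash
# Grab the sample csmpt7b logs linked in step 2 and install the two
# libraries the compile script needs (jsonlines, tqdm).
wget https://czechllm.fit.vutbr.cz/csmpt7b/sample_results/csmpt_logdir.zip
unzip csmpt_logdir.zip
pip install jsonlines tqdm
```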