binwang committed on
Commit
a591976
1 Parent(s): 6a36a7e
Files changed (1)
  1. app.py +0 -4
app.py CHANGED
@@ -2210,12 +2210,8 @@ with block:
  - **Number of Languages**: > 8
  - **Number of Models**: {NUM_MODELS}
  - **Mode of Evaluation**: Zero-Shot, Five-Shot
-
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
  Know Issues:
  - For base models, the output of base model is not truncated as no EOS detected. Evaluation could be affected, especially with length-aware metrics.
-
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
  The following table shows the performance of the models on the SeaEval benchmark.
  - For **Zero-shot** performance, it is the median value from 5 distinct prompts shown on the above leaderboard to mitigate the influence of random variations induced by prompts.
  - (-1) value indicates the results are ready yet.
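
The zero-shot aggregation described in the leaderboard text (median over 5 distinct prompts) could look roughly like the sketch below. This is not taken from app.py: `median_zero_shot` and `NOT_READY` are hypothetical names, and the sketch assumes per-prompt scores arrive as a list of floats, with `None` marking a prompt whose result is missing. It also assumes the -1 placeholder is meant for results that are not available yet.

```python
from statistics import median

# Assumption: -1 is the placeholder the leaderboard shows for unavailable results.
NOT_READY = -1


def median_zero_shot(prompt_scores):
    """Return the median of the per-prompt scores.

    Falls back to NOT_READY when the list is empty or any prompt
    has no score yet (represented here as None).
    """
    if not prompt_scores or any(score is None for score in prompt_scores):
        return NOT_READY
    return median(prompt_scores)


# Five hypothetical per-prompt accuracies for one model on one dataset.
scores = [0.5, 0.7, 0.6, 0.9, 0.4]
zero_shot_score = median_zero_shot(scores)
```

Using the median rather than the mean keeps a single unusually good or bad prompt from shifting the reported score, which matches the stated goal of reducing prompt-induced variation.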