Nathan Habib committed on
Commit
bfab6ae
1 Parent(s): c19cedb
Files changed (1)
  1. dist/index.html +5 -3
dist/index.html CHANGED
@@ -274,7 +274,7 @@
  <div class="main-plot-container">
  <figure><img src="assets/images/ranking_top10_bottom10.png"/></figure>
  <div id="ranking">
- <iframe src="rankings_change.html" title="description", height="700" width="100%", style="border:none;"></iframe>
+ <iframe src="rankings_change.html" title="description", height="800" width="100%", style="border:none;"></iframe>
  </div>
  </div>
 
@@ -283,9 +283,11 @@
 
  <p>For example, our different evaluations results are not all correlated with one another, which is expected.</p>
 
- <div class="l-body">
+ <div class="main-plot-container">
  <figure><img src="assets/images/v2_correlation_heatmap.png"/></figure>
- <div id="heatmap"></div>
+ <div id="heatmap">
+ <iframe src="correlation_heatmap.html" title="description", height="800" width="100%", style="border:none;"></iframe>
+ </div>
  </div>
 
  <p>MMLU-Pro, BBH and ARC-challenge are well correlated together. It is known that these 3 are well correlated with human preference (as they tend to align with human judgment on LMSys’s chatbot arena).</p>