Other benchmarks such as MT-Bench and/or AlpacaEval

#14

by alvarobartt HF staff - opened Nov 29, 2023

Nov 29, 2023

Hi there! Are you also planning to run MT-Bench and/or AlpacaEval? Those benchmarks seem closer to real-world usage than lm-eval-harness, and I'd be interested in the results if there are any. Thanks in advance!

(Maybe those already exist, but I couldn't find them within the model on the Hub.)

lvkaokao

Intel org Nov 30, 2023

Hi, we will update the results soon~

alvarobartt

Nov 30, 2023

Hi @lvkaokao, that's great to hear! Feel free to ping me when they're uploaded, I'm really looking forward to those!
