lewtun HF staff committed on
Commit
5a6c0cd
1 Parent(s): 9c6d73c

Add tables

Files changed (1)
  1. README.md +12 -9
README.md CHANGED
@@ -49,15 +49,18 @@ Zephyr is a series of language models that are trained to act as helpful assista
 
  ## Performance
 
- At the time of release, Zephyr 7B Gemma is the highest ranked 7B chat model on the [MT-Bench](https://huggingface.co/spaces/lmsys/mt-bench) and [AlpacaEval](https://tatsu-lab.github.io/alpaca_eval/) benchmarks:
-
-
-
- In particular, on several categories of MT-Bench, Zephyr 7B Gemma has strong performance compared to larger open models like Llama2-Chat-70B:
-
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6200d0a443eb0913fa2df7cc/raxvt5ma16d7T23my34WC.png)
-
- However, on more complex tasks like coding and mathematics, Zephyr 7B Gemma lags behind proprietary models and more research is needed to close the gap.
+ | Model |MT Bench|IFEval|
+ |-----------------------------------------------------------------------|------:|------:|
+ |[zephyr-7b-gemma](https://huggingface.co/HuggingFaceH4/zephyr-7b-gemma)| 7.81 | 28.76|
+ |[zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) | 7.34 | 43.81|
+ |[gemma-7b-it](https://huggingface.co/google/gemma-7b-it) | 6.38 | 38.01|
+
+
+ | Model |AGIEval|GPT4All|TruthfulQA|BigBench|Average|
+ |-----------------------------------------------------------------------|------:|------:|---------:|-------:|------:|
+ |[zephyr-7b-gemma](https://huggingface.co/HuggingFaceH4/zephyr-7b-gemma)| 34.22| 66.37| 52.19| 37.10| 47.47|
+ |[zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) | 37.52| 71.77| 55.26| 39.77| 51.08|
+ |[gemma-7b-it](https://huggingface.co/google/gemma-7b-it) | 21.33| 40.84| 41.70| 30.25| 33.53|
 
  ## Intended uses & limitations
 
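
As a sanity check on the second table added in this commit, the `Average` column appears to be the unweighted mean of the four benchmark scores (AGIEval, GPT4All, TruthfulQA, BigBench). The sketch below recomputes it from the numbers in the table. The names `SCORES`, `REPORTED_AVERAGE`, and `mean_score` are illustrative only and are not part of the repository, and the averaging rule is an assumption inferred from the figures rather than something the README states.

```python
# Scores copied from the AGIEval/GPT4All/TruthfulQA/BigBench table in this commit.
# Assumption: "Average" is the plain arithmetic mean of the four benchmark columns.
SCORES = {
    "zephyr-7b-gemma": {"AGIEval": 34.22, "GPT4All": 66.37, "TruthfulQA": 52.19, "BigBench": 37.10},
    "zephyr-7b-beta": {"AGIEval": 37.52, "GPT4All": 71.77, "TruthfulQA": 55.26, "BigBench": 39.77},
    "gemma-7b-it": {"AGIEval": 21.33, "GPT4All": 40.84, "TruthfulQA": 41.70, "BigBench": 30.25},
}

# Values reported in the table's "Average" column.
REPORTED_AVERAGE = {
    "zephyr-7b-gemma": 47.47,
    "zephyr-7b-beta": 51.08,
    "gemma-7b-it": 33.53,
}


def mean_score(scores: dict[str, float]) -> float:
    """Return the unweighted mean of the benchmark scores, rounded to 2 decimals."""
    return round(sum(scores.values()) / len(scores), 2)


for model, scores in SCORES.items():
    computed = mean_score(scores)
    # Compare with a small tolerance to avoid floating-point noise.
    matches = abs(computed - REPORTED_AVERAGE[model]) < 0.005
    print(f"{model}: computed={computed:.2f} reported={REPORTED_AVERAGE[model]:.2f} match={matches}")
```

Under that assumption, all three recomputed means agree with the reported `Average` values to two decimal places.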