BerenMillidge committed · Commit bf23724
Parent(s): 5622826
Update README.md

README.md CHANGED
```diff
@@ -55,34 +55,12 @@ Zamba2-2.7B-Instruct punches dramatically above its weight, achieving extremely
 | Model | Size | MT-Bench | IFEval |
 |-------------|----|----|----|
 | **Zamba2-2.6B-Instruct** | 2.6B | **72.40** | **53.96** |
-| Mistral-7B-Instruct | 7B |
+| Mistral-7B-Instruct | 7B | 66.4 | 45.3 |
 | Gemma2-2B-Instruct | 2.7B | 51.69 | 48.8 |
 | H2O-Danube-4B-Chat | 4B | 52.57 | 45.44 |
 | StableLM-Zephyr-3B | 3B | 66.43 | 36.83 |
 
 
-
-| Model | Size | Alignment | MT-Bench (score) | AlpacaEval (win rate %) |
-|--------|-----|----|---------------|--------------|
-| **StableLM Zephyr 3B** 🪁 | 3B | DPO | 6.64 | 76.00 |
-| StableLM Zephyr (SFT only) | 3B | SFT | 6.04 | 71.15 |
-| Capybara v1.9 | 3B | dSFT | 5.94 | - |
-| MPT-Chat | 7B | dSFT | 5.42 | - |
-| Xwin-LM v0.1 | 7B | dPPO | 6.19 | 87.83 |
-| Mistral-Instruct v0.1 | 7B | - | 6.84 | - |
-| Zephyr-7b-α | 7B | dDPO | 6.88 | - |
-| Zephyr-7b-β | 7B | dDPO | 7.34 | 90.60 |
-| Falcon-Instruct | 40B | dSFT | 5.17 | 45.71 |
-| Guanaco | 65B | SFT | 6.41 | 71.80 |
-| Llama2-Chat | 70B | RLHF | 6.86 | 92.66 |
-| Vicuna v1.3 | 33B | dSFT | 7.12 | 88.99 |
-| WizardLM v1.0 | 70B | dSFT | 7.71 | - |
-| Xwin-LM v0.1 | 70B | dPPO | - | 95.57 |
-| GPT-3.5-turbo | - | RLHF | 7.94 | 89.37 |
-| Claude 2 | - | RLHF | 8.06 | 91.36 |
-| GPT-4 | - | RLHF | 8.99 | 95.28 |
-
-
 Moreover, due to its unique hybrid SSM architecture, Zamba2-2.7B-Instruct achieves extremely low inference latency and rapid generation with a significantly smaller memory footprint than comparable transformer based models.
 
 Time to First Token (TTFT) | Output Generation
```
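The Time-to-First-Token and output-generation figures referenced in the README can be measured with a simple timing harness around any streaming generate call. A minimal sketch, where `fake_stream` is a hypothetical stand-in for a real token stream (e.g. transformers' `TextIteratorStreamer` wrapped around the model's `generate`):

```python
import time
from typing import Iterator, Tuple


def measure_ttft_and_throughput(stream: Iterator[str]) -> Tuple[float, float]:
    """Consume a token stream; return (time to first token in s, tokens/s)."""
    start = time.perf_counter()
    ttft = float("nan")
    count = 0
    for _ in stream:
        count += 1
        if count == 1:
            # Latency until the first token arrives (TTFT).
            ttft = time.perf_counter() - start
    total = time.perf_counter() - start
    # Throughput over the whole generation.
    tps = count / total if total > 0 else 0.0
    return ttft, tps


def fake_stream(n_tokens: int = 8) -> Iterator[str]:
    """Hypothetical stub standing in for a model's streaming output."""
    for i in range(n_tokens):
        yield f"tok{i}"


ttft, tps = measure_ttft_and_throughput(fake_stream())
```

Swapping the stub for a real streamer gives the two quantities the benchmark plots compare across models.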