Commit `4141e0d` by IChahed (1 parent: `8c8f700`)

Update README.md

Files changed (1): README.md (+2 −2)
README.md CHANGED
```diff
@@ -189,7 +189,7 @@ The model training took roughly two months.
 
 ## Benchmarks
 
-We evaluate our model on all benchmarks of the leaderboard's version 2 using the `lm-evaluation-harness` package, and we evaluate it on the benchmarks of version 1 using `lighteval`.
+We evaluate our model on all benchmarks of the leaderboard's version 2 using the `lm-evaluation-harness` package, and we evaluate it on the benchmarks of version 1 using `lighteval`. The reported evaluation results on the leaderboard version 2 are normalized following HuggingFace score normalization.
 
 
 | `model name` |`IFEval`| `BBH` |`MATH LvL5`| `GPQA`| `MUSR`|`MMLU-PRO`|`Average`|
@@ -217,7 +217,7 @@ We evaluate our model on all benchmarks of the leaderboard's version 2 using the
 | `TRI-ML/mamba-7b-rw` | 46.48 | 80.24 | 57.72 | 76.40 | - | 4.70 | - |
 |***Hybrid SSM-attention models***| | | | | | | |
 | `recurrentgemma-9b` | 52.00 | 80.40 | 60.50 | 73.60 | 38.60 | 42.60 | 57.95 |
-| `Zyphra/Zamba-7B-v1` | 46.48 | 80.24 | 57.72 | 76.40 | - | 30.78 | - |
+| `Zyphra/Zamba-7B-v1` | 56.14 | 82.23 | 58.11 | 79.87 | 36.23 | 30.78 | 57.23 |
 |***Transformer models*** | | | | | | | |
 | `Falcon2-11B` | 59.73 | 82.91 | 58.37 | 78.30 | 52.56 | 53.83 | **64.28** |
 | `Meta-Llama-3-8B` | 60.24 | 82.23 | 66.70 | 78.45 | 42.93 | 45.19 | 62.62 |
```
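The "HuggingFace score normalization" mentioned in the updated paragraph can be sketched as follows. This is an illustrative assumption about the scheme (rescaling each raw score between the task's random-guess baseline and a perfect score, clamping sub-baseline results to zero), not the leaderboard's exact implementation; the function name and signature are hypothetical.

```python
def normalize_score(raw: float, random_baseline: float) -> float:
    """Rescale a raw accuracy in [0, 1] so that the random-guess
    baseline maps to 0 and a perfect score maps to 100.

    Assumed normalization: scores at or below the baseline are
    clamped to 0; the remaining range is stretched linearly to 0-100.
    """
    if raw <= random_baseline:
        return 0.0
    return (raw - random_baseline) / (1.0 - random_baseline) * 100.0


# Example: a 4-way multiple-choice task has a random baseline of 0.25,
# so a raw accuracy of 0.625 sits halfway between chance and perfect.
print(normalize_score(0.625, 0.25))  # → 50.0
```

Under this scheme, per-benchmark normalized scores are averaged to produce the `Average` column, which is why normalized tables are not directly comparable with raw-accuracy tables.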