IChahed committed on
Commit
c78f432
1 Parent(s): 3adb85d

Update README.md

Files changed (1): README.md +4 -4
README.md CHANGED
@@ -202,7 +202,7 @@ We evaluate our model on all benchmarks of the new leaderboard's version using t
 | `TRI-ML/mamba-7b-rw`<sup>*</sup>| 22.46 | 6.71 | 0.45 | 1.12 | 5.51 | 1.69 | 6.25 |
 |***Hybrid SSM-attention models*** | | | | | | |
 |`recurrentgemma-9b` | 30.76 | 14.80 | 4.83 | 4.70 | 6.60 | 17.88 | 13.20 |
-| `Zyphra/Zamba-7B-v1` | 24.06 | 21.12 | 3.32 | 3.03 | 7.74 | 16.02 | 12.55 |
+| `Zyphra/Zamba-7B-v1`<sup>*</sup> | 24.06 | 21.12 | 3.32 | 3.03 | 7.74 | 16.02 | 12.55 |
 |***Transformer models*** | | | | | | | |
 | `Falcon2-11B` | 32.61 | 21.94 | 2.34 | 2.80 | 7.53 | 15.44 | 13.78 |
 | `Meta-Llama-3-8B` | 14.55 | 24.50 | 3.25 | 7.38 | 6.24 | 24.55 | 13.41 |
@@ -218,11 +218,11 @@ Also, we evaluate our model on the benchmarks of the first leaderboard using `li
 | `model name` |`ARC`|`HellaSwag` |`MMLU` |`Winogrande`|`TruthfulQA`|`GSM8K`|`Average` |
 |:-----------------------------|:------:|:---------:|:-----:|:----------:|:----------:|:-----:|:----------------:|
 | ***Pure SSM models*** | | | | | | | |
-| `FalconMamba-7B` |62.03 | 80.82 | 62.11 | 73.64 | 53.42 | 52.54 | **64.09** |
-| `TRI-ML/mamba-7b-rw` | 51.25 | 80.85 | 33.41 | 71.11 | 23.13 | 4.70 | 44.03 |
+| `FalconMamba-7B`<sup>*</sup> |62.03 | 80.82 | 62.11 | 73.64 | 53.42 | 52.54 | **64.09** |
+| `TRI-ML/mamba-7b-rw`<sup>*</sup> | 51.25 | 80.85 | 33.41 | 71.11 | 23.13 | 4.70 | 44.03 |
 |***Hybrid SSM-attention models***| | | | | | | |
 | `recurrentgemma-9b` |52.00 | 80.40 | 60.50 | 73.60 | 38.60 | 42.60 | 57.95 |
-| `Zyphra/Zamba-7B-v1` | 56.14 | 82.23 | 58.11 | 79.87 | 36.23 | 30.78 | 57.23 |
+| `Zyphra/Zamba-7B-v1`<sup>*</sup> | 56.14 | 82.23 | 58.11 | 79.87 | 36.23 | 30.78 | 57.23 |
 |***Transformer models*** | | | | | | | |
 | `Falcon2-11B` | 59.73 | 82.91 | 58.37 | 78.30 | 52.56 | 53.83 | **64.28** |
 | `Meta-Llama-3-8B` | 60.24 | 82.23 | 66.70 | 78.45 | 42.93 | 45.19 | 62.62 |