Tags: Text Generation · Transformers · Safetensors · English · falcon_mamba · Eval Results · Inference Endpoints
Commit 5bb4402 (1 parent: cdf905f)
Ilyas Chahed committed: Update README.md

Files changed (1):
1. README.md +35 -39
README.md CHANGED
@@ -203,8 +203,8 @@ We evaluate our model on all benchmarks of the leaderboard's version 2 using the
  | `meta-llama/Meta-Llama-3-8B` | 14.55 | 24.50 | 3.25 | 7.38 | 6.24 | 24.55 | 13.41 |
  | `tiiuae/falcon2-11B` | 32.61 | 21.94 | 2.34 | 2.8 | 7.53 | 15.44 | 13.78 |
  | `mistralai/Mistral-7B-v0.1` | 23.86 | 22.02 | 2.49 | 5.59 | 10.68 | 22.36 | 14.50 |
- | `Zyphra/Zamba-7B-v1` | 24.06 | 21.12 | 3.32 | 3.03 | 7.74 | 16.02 | 12.55 |
- | `TRI-ML/mamba-7b-rw` | 22.46 | 6.71 | 0.45 | 1.12 | 5.51 | 1.69 | 6.25 |
+
+
  | Ours | 33.36 | 19.88 | 3.63 | 8.05 | 10.85 | 14.47 | **15.04** |
 
  | model_name | ARC | HellaSwag | MMLU | Winogrande | TruthfulQA | GSM8K | **Average L1** |
@@ -214,43 +214,39 @@ We evaluate our model on all benchmarks of the leaderboard's version 2 using the
  | `mistralai/Mistral-7B-v0.1` | 59.98 | 83.31 | 64.16 | 78.37 | 42.15 | 37.83 | 60.97 |
  | `Zyphra/Zamba-7B-v1` | 46.48 | 80.24 | 57.72 | 76.4 | - | 30.78 | - |
  | `TRI-ML/mamba-7b-rw` | 46.48 | 80.24 | 57.72 | 76.4 | - | 4.7 | - |
- | Ours | 62.03 | 80.82 | 62.11 | 73.64 | 53.42 | 52.54 | 64.09 |
-
-
- | `model name` |`IFEval`| `BBH` |`MATH LvL5`| `GPQA`| `MUSR`|`MMLU-PRO`|`Average`|
- |:-------------------|:------:|:-----:|:---------:|:-----:|:-----:|:--------:|:--------:|
- | ***Pure SSM models***| | | | | | | |
- | `Falcon-Mamba-7B` | 33.36 | 19.88 | 3.63 | 8.05 | 10.86 | 14.47 | 15.04 |
- | `mamba1` | 00.00 | 00.00 | 0.00 | 0.00 | 0.00 | 00.00 | 00.00 |
- | `mamba2` | 00.00 | 00.00 | 0.00 | 0.00 | 0.00 | 00.00 | 00.00 |
- | `mamba3` | 00.00 | 00.00 | 0.00 | 0.00 | 0.00 | 00.00 | 00.00 |
- |***Hybrid SSM-attention models***| | | | | | | |
- | `hybrid1` | 00.00 | 00.00 | 0.00 | 0.00 | 0.00 | 00.00 | 00.00 |
- | `hybrid2` | 00.00 | 00.00 | 0.00 | 0.00 | 0.00 | 00.00 | 00.00 |
- | `hybrid3` | 00.00 | 00.00 | 0.00 | 0.00 | 0.00 | 00.00 | 00.00 |
- |***Transformer models***| | | | | | | |
- | `Meta-Llama-3-8B` | 14.55 | 24.50 | 3.25 | 7.38 | 6.24 | 24.55 | 13.41 |
- | `gemma-7B` | 00.00 | 00.00 | 0.00 | 0.00 | 0.00 | 00.00 | 00.00 |
- | `falcon2-11B` | 32.61 | 21.94 | 2.34 | 2.8 | 7.53 | 15.44 | 13.78 |
- | `Mistral-7B-v0.1` | 23.86 | 22.02 | 2.49 | 5.59 | 10.68 | 22.36 | 14.50 |
-
-
- | `model name` |`ARC`|`HellaSwag`|`MMLU`|`Winogrande`|`TruthfulQA`|`GSM8K`|`Average`|
- |:-------------------|:---:|:---------:|:----:|:----------:|:----------:|:-----:|:-------:|
- | ***Pure SSM models***| | | | | | | |
- | `Falcon-Mamba-7B` |62.03| 80.82 | 62.11| 73.64 | 53.42 | 52.54 | 64.09 |
- | `mamba1` |00.00| 00.00 | 00.00| 00.00 | 00.00 | 00.00 | 00.00 |
- | `mamba2` |00.00| 00.00 | 00.00| 00.00 | 00.00 | 00.00 | 00.00 |
- | `mamba3` |00.00| 00.00 | 00.00| 00.00 | 00.00 | 00.00 | 00.00 |
- |***Hybrid SSM-attention models***| | | | | | | |
- | `hybrid1` |00.00| 00.00 | 00.00| 00.00 | 00.00 | 00.00 | 00.00 |
- | `hybrid2` |00.00| 00.00 | 00.00| 00.00 | 00.00 | 00.00 | 00.00 |
- | `hybrid3` |00.00| 00.00 | 00.00| 00.00 | 00.00 | 00.00 | 00.00 |
- |***Transformer models***| | | | | | | |
- | `Meta-Llama-3-8B` |00.00| 00.00 | 00.00| 00.00 | 00.00 | 00.00 | 00.00 |
- | `gemma-7B` |00.00| 00.00 | 00.00| 00.00 | 00.00 | 00.00 | 00.00 |
- | `falcon2-11B` |00.00| 00.00 | 00.00| 00.00 | 00.00 | 00.00 | 00.00 |
- | `Mistral-7B-v0.1` |00.00| 00.00 | 00.00| 00.00 | 00.00 | 00.00 | 00.00 |
+ | Ours | 62.03 | 80.82 | 62.11 | 73.64 | 53.42 | 52.54 | **64.09** |
+
+
+ | `model name` |`IFEval`| `BBH` |`MATH LvL5`| `GPQA`| `MUSR`|`MMLU-PRO`|`Average`|
+ |:--------------------------|:------:|:-----:|:---------:|:-----:|:-----:|:--------:|:-------:|
+ | ***Pure SSM models*** | | | | | | | |
+ | `Falcon-Mamba-7B` | 33.36 | 19.88 | 3.63 | 8.05 | 10.86 | 14.47 |**15.04**|
+ | `TRI-ML/mamba-7b-rw` | 22.46 | 6.71 | 0.45 | 1.12 | 5.51 | 1.69 | 6.25 |
+ |***Hybrid SSM-attention models*** | | | | | | | |
+ | `Zamba-7B-v1` | 24.06 | 21.12 | 3.32 | 3.03 | 7.74 | 16.02 | 12.55 |
+ |`recurrentgemma-9b` | 30.76 | 14.80 | 4.83 | 4.70 | 6.60 | 17.88 | 13.26 |
+ |***Transformer models*** | | | | | | | |
+ | `Falcon2-11B` | 32.61 | 21.94 | 2.34 | 2.8 | 7.53 | 15.44 | 13.78 |
+ | `Meta-Llama-3-8B` | 14.55 | 24.50 | 3.25 | 7.38 | 6.24 | 24.55 | 13.41 |
+ | `gemma-7B` | 26.59 | 21.12 | 6.42 | 4.92 | 10.98 | 21.64 |**15.28**|
+ | `Mistral-7B-v0.1` | 23.86 | 22.02 | 2.49 | 5.59 | 10.68 | 22.36 | 14.50 |
+ | `Mistral-Nemo-Base` | 16.83 | 29.37 | 4.98 | 5.82 | 6.52 | 27.46 | 15.08 |
+
+
+
+ | `model name` |`ARC`|`HellaSwag` |`MMLU` |`Winogrande`|`TruthfulQA`|`GSM8K`|`Average` |
+ |:-----------------------------|:------:|:---------:|:-----:|:----------:|:----------:|:-----:|:----------------:|
+ | ***Pure SSM models*** | | | | | | | |
+ | `Falcon-Mamba-7B` |62.03 | 80.82 | 62.11 | 73.64 | 53.42 | 52.54 | **64.09** |
+ | `TRI-ML/mamba-7b-rw` | 46.48 | 80.24 | 57.72 | 76.4 | - | 4.7 | - |
+ |***Hybrid SSM-attention models***| | | | | | | |
+ | `recurrentgemma-9b` |52.00 | 80.40 | 60.50 | 73.60 | 38.60 | 42.60 | 57.95 |
+ | `Zyphra/Zamba-7B-v1` | 46.48 | 80.24 | 57.72 | 76.4 | - | 30.78 | - |
+ |***Transformer models*** | | | | | | | |
+ | `Falcon2-11B` | 59.73 | 82.91 | 58.37 | 78.30 | 52.56 | 53.83 | **64.28** |
+ | `Meta-Llama-3-8B` | 60.24 | 82.23 | 66.70 | 78.45 | 42.93 | 45.19 | 62.62 |
+ | `gemma-7B` | 61.09 | 82.20 | 64.56 | 79.01 | 44.79 | 50.87 | 63.75 |
+ | `Mistral-7B-v0.1` | 59.98 | 83.31 | 64.16 | 78.37 | 42.15 | 37.83 | 60.97 |
 
  ## Throughput
 
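The hunk context above ("We evaluate our model on all benchmarks of the leaderboard's version 2 using the...") is truncated, but these are the tasks that EleutherAI's lm-evaluation-harness runs for Open LLM Leaderboard v2. Below is a minimal reproduction sketch, assuming the checkpoint is published on the Hub as `tiiuae/falcon-mamba-7b` and that the installed harness ships the `leaderboard_*` task group; neither detail is stated in this commit, so treat both as assumptions.

```python
# Minimal sketch of a leaderboard-v2-style run with lm-evaluation-harness
# (pip install lm-eval). The leaderboard_* task names exist in recent harness
# releases but may differ in yours; `lm_eval --tasks list` shows what is available.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    # "tiiuae/falcon-mamba-7b" is the assumed Hub id for the Falcon-Mamba-7B
    # checkpoint scored in the tables above.
    model_args="pretrained=tiiuae/falcon-mamba-7b,dtype=bfloat16",
    tasks=[
        "leaderboard_ifeval",
        "leaderboard_bbh",
        "leaderboard_math_hard",  # MATH Lvl 5
        "leaderboard_gpqa",
        "leaderboard_musr",
        "leaderboard_mmlu_pro",
    ],
    batch_size="auto",
)

# Per-task metric dicts, keyed by task name.
for task, metrics in results["results"].items():
    print(task, metrics)
```

Note that the leaderboard normalizes per-task scores before averaging, so raw harness metrics will not necessarily match the table values verbatim.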
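Given the model card's Text Generation / Transformers tags, a minimal generation sketch may also be useful; again, the Hub id `tiiuae/falcon-mamba-7b` is an assumption, and loading the `falcon_mamba` architecture requires a sufficiently recent transformers release.

```python
# Minimal text-generation sketch for the Falcon-Mamba-7B checkpoint,
# assuming the Hub id "tiiuae/falcon-mamba-7b" (not stated in this diff).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-mamba-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the dtype used in the eval sketch
    device_map="auto",           # requires accelerate
)

inputs = tokenizer("The Falcon Mamba architecture is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```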