Update README.md
README.md (changed)
@@ -76,10 +76,10 @@ Our evaluation is based on the framework lm-evaluation-harness and opencompass.
- Huggingface LLM Leaderboard tasks.
- Other Popular Benchmarks: We report the average accuracies on Big Bench Hard (BBH, 3-shot) and HumanEval (see the reproduction sketch after the table below).

|              | Average  | MMLU  | Winogrande | TruthfulQA | Hellaswag | GSM8K | Arc-C | HumanEval | BBH   |
| ------------ | -------- | ----- | ---------- | ---------- | --------- | ----- | ----- | --------- | ----- |
| Bamboo       | **57.1** | 63.89 | 76.16      | 44.06      | 82.17     | 52.84 | 62.20 | 25.6      | 50.35 |
| Mistral-v0.1 | **56.5** | 62.65 | 79.24      | 42.62      | 83.32     | 40.18 | 61.43 | 26.21     | 56.35 |
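As a rough guide to how these accuracies can be reproduced, the sketch below runs the Huggingface LLM Leaderboard tasks through the lm-evaluation-harness Python API. It is an illustration only, not this repo's official evaluation script: the model id is a placeholder, the task names assume harness v0.4+, and BBH (3-shot) and HumanEval are evaluated separately (e.g. via opencompass) rather than in this call.

```python
# Hedged sketch: leaderboard-style accuracies via EleutherAI's
# lm-evaluation-harness (v0.4+). "your-org/your-model" is a placeholder
# Hugging Face model id; exact task names can vary between harness versions.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=your-org/your-model,dtype=bfloat16",
    tasks=["mmlu", "winogrande", "truthfulqa_mc2", "hellaswag", "gsm8k", "arc_challenge"],
    batch_size=8,
)

# Per-task metrics; BBH and HumanEval are run separately and not included here.
for task, metrics in results["results"].items():
    print(task, metrics)
```

Up to rounding, the Average column in the table above is the plain mean of the eight per-task scores in each row.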
## Inference Speed Evaluation Results