Update README.md

Commit: 3cc5a3b
Parent(s): 3d1ddae
Committed by: Raincleared
--- a/README.md
+++ b/README.md
@@ -10,13 +10,14 @@ license: apache-2.0
 ---
 
 
-# 
+# MiniCPM-S-1B-sft-llama-format
 
 - Original model: [MiniCPM-1B-sft-bf16](https://huggingface.co/openbmb/MiniCPM-1B-sft-bf16)
 - Model creator and fine-tuned by: [ModelBest](https://modelbest.cn/), [OpenBMB](https://huggingface.co/openbmb), and [THUNLP](https://nlp.csai.tsinghua.edu.cn/)
-- Paper: [link](https://arxiv.org/pdf/2402.13516.pdf)
+- Paper: [link](https://arxiv.org/pdf/2402.13516.pdf) (Note: `MiniCPM-S-1B` is denoted as `ProSparse-1B` in the paper.)
+- Adapted PowerInfer version: [https://huggingface.co/openbmb/MiniCPM-S-1B-sft-gguf](https://huggingface.co/openbmb/MiniCPM-S-1B-sft-gguf)
 
-**This model is converted from [
+**This model is converted from [MiniCPM-S-1B-sft](https://huggingface.co/openbmb/MiniCPM-S-1B-sft/) as a LLaMA format to make its usage more convenient.**
 
 ### Introduction
 
@@ -77,10 +78,10 @@ The evaluation results on the above benchmarks demonstrate the advantage of ProS
 | **ProSparse-13B**\* | 87.97 | **45.07** | 29.03 | 69.75 | 67.54 | 25.40 | 54.78 | 40.20 | 28.76 |
 | **ProSparse-13B** | **88.80** | 44.90 | 28.42 | 69.76 | 66.91 | 26.31 | 54.35 | 39.90 | 28.67 |
 | MiniCPM-1B | - | 44.44 | 36.85 | 63.67 | 60.90 | 35.48 | 50.44 | 35.03 | 28.71 |
-| **
-| **
+| **MiniCPM-S-1B**\* | 86.25 | **44.72** | 41.38 | 64.55 | 60.69 | 34.72 | 49.36 | 34.04 | 28.27 |
+| **MiniCPM-S-1B** | **87.89** | **44.72** | 42.04 | 64.37 | 60.73 | 34.57 | 49.51 | 34.08 | 27.77 |
 
-**Notes**: "Original" refers to the original Swish-activated LLaMA2 versions. ReluLLaMA-7B and ReluLLaMA-13B are available at [7B](https://huggingface.co/SparseLLM/ReluLLaMA-7B) and [13B](https://huggingface.co/SparseLLM/ReluLLaMA-13B) respectively. MiniCPM-1B is available at [1B](https://huggingface.co/openbmb/MiniCPM-1B-sft-bf16). "ProSparse-7B\*", "ProSparse-13B\*", and "
+**Notes**: "Original" refers to the original Swish-activated LLaMA2 versions. ReluLLaMA-7B and ReluLLaMA-13B are available at [7B](https://huggingface.co/SparseLLM/ReluLLaMA-7B) and [13B](https://huggingface.co/SparseLLM/ReluLLaMA-13B) respectively. MiniCPM-1B is available at [1B](https://huggingface.co/openbmb/MiniCPM-1B-sft-bf16). "ProSparse-7B\*", "ProSparse-13B\*", and "MiniCPM-S-1B\*" denote the ProSparse versions without activation threshold shifting.
 
 ### Evaluation Issues with LM-Eval
 
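
Reviewer note: the **Notes** line changed in the second hunk distinguishes the starred rows as versions "without activation threshold shifting". The diff itself does not define the term, so the following is only a minimal sketch under an assumption: that shifting means replacing ReLU's zero cutoff with a small positive threshold, so that more activations become exactly zero. The function names, the sample values, and the threshold of `0.01` are all hypothetical and chosen purely for illustration; they are not taken from the README or the model.

```python
from typing import List


def relu(x: float) -> float:
    """Plain ReLU: negative inputs become zero."""
    return x if x > 0.0 else 0.0


def shifted_relu(x: float, threshold: float) -> float:
    """ReLU with a positive cutoff (assumed form of threshold shifting).

    Inputs below `threshold` are zeroed; others pass through unchanged.
    """
    return x if x >= threshold else 0.0


def sparsity(values: List[float]) -> float:
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0.0) / len(values)


# Hypothetical pre-activation values, not measured from any model.
pre_activations = [-0.5, 0.004, 0.02, -0.1, 0.3, 0.009, 0.0, 1.2]

plain = [relu(x) for x in pre_activations]
shifted = [shifted_relu(x, threshold=0.01) for x in pre_activations]

print(f"sparsity with plain ReLU:   {sparsity(plain):.3f}")
print(f"sparsity with shifted ReLU: {sparsity(shifted):.3f}")
```

On this toy input the small positive values fall under the cutoff, so the shifted variant yields a higher fraction of zeros than plain ReLU. Whether that matches the behavior behind the starred and unstarred rows would need to be checked against the linked paper.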