Raincleared committed on
Commit
3cc5a3b
1 Parent(s): 3d1ddae

Update README.md

Files changed (1)
  1. README.md +7 -6
README.md CHANGED
@@ -10,13 +10,14 @@ license: apache-2.0
  ---
 
 
- # ProSparse-MiniCPM-1B-sft-llama-format
+ # MiniCPM-S-1B-sft-llama-format
 
  - Original model: [MiniCPM-1B-sft-bf16](https://huggingface.co/openbmb/MiniCPM-1B-sft-bf16)
  - Model creator and fine-tuned by: [ModelBest](https://modelbest.cn/), [OpenBMB](https://huggingface.co/openbmb), and [THUNLP](https://nlp.csai.tsinghua.edu.cn/)
- - Paper: [link](https://arxiv.org/pdf/2402.13516.pdf)
+ - Paper: [link](https://arxiv.org/pdf/2402.13516.pdf) (Note: `MiniCPM-S-1B` is denoted as `ProSparse-1B` in the paper.)
+ - Adapted PowerInfer version: [https://huggingface.co/openbmb/MiniCPM-S-1B-sft-gguf](https://huggingface.co/openbmb/MiniCPM-S-1B-sft-gguf)
 
- **This model is converted from [ProSparse-MiniCPM-1B-sft](https://huggingface.co/openbmb/ProSparse-MiniCPM-1B-sft/) as a LLaMA format to make its usage more convenient.**
+ **This model is converted from [MiniCPM-S-1B-sft](https://huggingface.co/openbmb/MiniCPM-S-1B-sft/) as a LLaMA format to make its usage more convenient.**
 
  ### Introduction
 
@@ -77,10 +78,10 @@ The evaluation results on the above benchmarks demonstrate the advantage of ProS
  | **ProSparse-13B**\* | 87.97 | **45.07** | 29.03 | 69.75 | 67.54 | 25.40 | 54.78 | 40.20 | 28.76 |
  | **ProSparse-13B** | **88.80** | 44.90 | 28.42 | 69.76 | 66.91 | 26.31 | 54.35 | 39.90 | 28.67 |
  | MiniCPM-1B | - | 44.44 | 36.85 | 63.67 | 60.90 | 35.48 | 50.44 | 35.03 | 28.71 |
- | **ProSparse-1B**\* | 86.25 | **44.72** | 41.38 | 64.55 | 60.69 | 34.72 | 49.36 | 34.04 | 28.27 |
- | **ProSparse-1B** | **87.89** | **44.72** | 42.04 | 64.37 | 60.73 | 34.57 | 49.51 | 34.08 | 27.77 |
+ | **MiniCPM-S-1B**\* | 86.25 | **44.72** | 41.38 | 64.55 | 60.69 | 34.72 | 49.36 | 34.04 | 28.27 |
+ | **MiniCPM-S-1B** | **87.89** | **44.72** | 42.04 | 64.37 | 60.73 | 34.57 | 49.51 | 34.08 | 27.77 |
 
- **Notes**: "Original" refers to the original Swish-activated LLaMA2 versions. ReluLLaMA-7B and ReluLLaMA-13B are available at [7B](https://huggingface.co/SparseLLM/ReluLLaMA-7B) and [13B](https://huggingface.co/SparseLLM/ReluLLaMA-13B) respectively. MiniCPM-1B is available at [1B](https://huggingface.co/openbmb/MiniCPM-1B-sft-bf16). "ProSparse-7B\*", "ProSparse-13B\*", and "ProSparse-1B\*" denote the ProSparse versions without activation threshold shifting.
+ **Notes**: "Original" refers to the original Swish-activated LLaMA2 versions. ReluLLaMA-7B and ReluLLaMA-13B are available at [7B](https://huggingface.co/SparseLLM/ReluLLaMA-7B) and [13B](https://huggingface.co/SparseLLM/ReluLLaMA-13B) respectively. MiniCPM-1B is available at [1B](https://huggingface.co/openbmb/MiniCPM-1B-sft-bf16). "ProSparse-7B\*", "ProSparse-13B\*", and "MiniCPM-S-1B\*" denote the ProSparse versions without activation threshold shifting.
 
  ### Evaluation Issues with LM-Eval
 