Update README.md
README.md
CHANGED
@@ -16,7 +16,7 @@ tags:
|
|
16 |
</div>
|
17 |
|
18 |
<p align="center">
|
19 |
-
<a href="
|
20 |
<a href="https://github.com/OpenBMB/OmniLMM/" target="_blank">OmniLMM 多模态模型 Multi-modal Model</a> |
|
21 |
<a href="https://luca.cn/" target="_blank">CPM-C 千亿模型试用 ~100B Model Trial </a>
|
22 |
</p>
|
@@ -51,9 +51,18 @@ We release all model parameters for research and limited commercial use. We also
|
|
51 |
- The INT4 quantized version **MiniCPM-2B-SFT/DPO-Int4** based on MiniCPM-2B-SFT/DPO
|
52 |
- Mobile phone application based on MLC-LLM and LLMFarm. Both the language model and the multimodal model can run inference on smartphones.
|
53 |
|
|
|
54 |
|
|
|
55 |
|
56 |
-
57 |
|
58 |
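- Due to the limited model size, the model may hallucinate. Because the DPO model generates longer responses, it is more prone to hallucination. We will continue to iterate on and improve the MiniCPM model;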
|
59 |
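- To keep the model general-purpose for academic research, we did not perform any identity training on the model. Because part of our training data comes from the open-source ShareGPT corpus, the model may output identity information similar to that of the GPT series models;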
|
@@ -130,8 +139,8 @@ print(responds)
|
|
130 |
|
131 |
## 工作引用 Citation
|
132 |
|
133 |
-
* If you find MiniCPM helpful for your work, please consider citing the following [technical report (Chinese)](
|
134 |
-
* Please cite our [technical report]() if you find our work valuable.
|
135 |
|
136 |
```
|
137 |
@inproceedings{minicpm2024,
|
@@ -140,4 +149,3 @@ print(responds)
|
|
140 |
year={2024}
|
141 |
}
|
142 |
```
|
143 |
-
|
|
|
16 |
</div>
|
17 |
|
18 |
<p align="center">
|
19 |
+
<a href="https://shengdinghu.notion.site/MiniCPM-c805a17c5c8046398914e47f0542095a?pvs=4" target="_blank">MiniCPM 技术报告</a><a href="https://shengdinghu.notion.site/MiniCPM-Unveiling-the-Potential-of-End-side-Large-Language-Models-d4d3a8c426424654a4e80e42a711cb20?pvs=4" target="_blank"> Technical Report</a> |
|
20 |
<a href="https://github.com/OpenBMB/OmniLMM/" target="_blank">OmniLMM 多模态模型 Multi-modal Model</a> |
|
21 |
<a href="https://luca.cn/" target="_blank">CPM-C 千亿模型试用 ~100B Model Trial </a>
|
22 |
</p>
|
|
|
51 |
- The INT4 quantized version **MiniCPM-2B-SFT/DPO-Int4** based on MiniCPM-2B-SFT/DPO
|
52 |
- Mobile phone application based on MLC-LLM and LLMFarm. Both the language model and the multimodal model can run inference on smartphones.
|
53 |
|
54 |
+
### 评测结果 Evaluation Results
|
55 |
|
56 |
+
Detailed evaluation results (Chinese) are in the [GitHub repo](https://github.com/OpenBMB/MiniCPM?tab=readme-ov-file#%E8%AF%84%E6%B5%8B%E7%BB%93%E6%9E%9C)
|
57 |
|
58 |
+
Detailed evaluation results are in the [GitHub repo](https://github.com/OpenBMB/MiniCPM/blob/main/README-en.md#evaluation-results)
|
59 |
+
|
60 |
+
注意:我们发现使用Huggingface生成质量略差于vLLM,因此推荐使用vLLM进行测试。我们正在排查原因。
|
61 |
+
|
62 |
+
Notice: We found that generation quality with Huggingface is slightly lower than with vLLM, so we recommend benchmarking with vLLM.
|
63 |
+
We are investigating the cause now.
|
64 |
+
|
65 |
+
### 局限性 Limitations
|
66 |
|
67 |
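- Due to the limited model size, the model may hallucinate. Because the DPO model generates longer responses, it is more prone to hallucination. We will continue to iterate on and improve the MiniCPM model;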
|
68 |
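- To keep the model general-purpose for academic research, we did not perform any identity training on the model. Because part of our training data comes from the open-source ShareGPT corpus, the model may output identity information similar to that of the GPT series models;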
|
|
|
139 |
|
140 |
## 工作引用 Citation
|
141 |
|
142 |
+
* If you find MiniCPM helpful for your work, please consider citing the following [technical report (Chinese)](https://shengdinghu.notion.site/MiniCPM-c805a17c5c8046398914e47f0542095a?pvs=4)
|
143 |
+
* Please cite our [technical report](https://shengdinghu.notion.site/MiniCPM-Unveiling-the-Potential-of-End-side-Large-Language-Models-d4d3a8c426424654a4e80e42a711cb20?pvs=4) if you find our work valuable.
|
144 |
|
145 |
```
|
146 |
@inproceedings{minicpm2024,
|
|
|
149 |
year={2024}
|
150 |
}
|
151 |
```
|
|