hyx21 committed
Commit 153ab05
Parent: b9d9abf

Update README.md

Files changed (1): README.md (+14 -5)
README.md CHANGED
@@ -16,7 +16,7 @@ tags:
  </div>
 
  <p align="center">
- <a href="XXXX" target="_blank">MiniCPM 技术报告 Technical Report</a> |
+ <a href="https://shengdinghu.notion.site/MiniCPM-c805a17c5c8046398914e47f0542095a?pvs=4" target="_blank">MiniCPM 技术报告</a><a href="https://shengdinghu.notion.site/MiniCPM-Unveiling-the-Potential-of-End-side-Large-Language-Models-d4d3a8c426424654a4e80e42a711cb20?pvs=4" target="_blank"> Technical Report</a> |
  <a href="https://github.com/OpenBMB/OmniLMM/" target="_blank">OmniLMM 多模态模型 Multi-modal Model</a> |
  <a href="https://luca.cn/" target="_blank">CPM-C 千亿模型试用 ~100B Model Trial</a> |
  </p>
@@ -51,15 +51,24 @@ We release all model parameters for research and limited commercial use. We also
  - The INT4 quantized version **MiniCPM-2B-SFT/DPO-Int4** based on MiniCPM-2B-SFT/DPO
  - Mobile phone application based on MLC-LLM and LLMFarm. Both the language model and the multimodal model can run inference on smartphones.
 
- ### 局限性 Limitations:
+ ### 评测结果 Evaluation Results
+
+ 详细的评测结果位于[github仓库](https://github.com/OpenBMB/MiniCPM?tab=readme-ov-file#%E8%AF%84%E6%B5%8B%E7%BB%93%E6%9E%9C)
+
+ Detailed evaluation results are in the [github repo](https://github.com/OpenBMB/MiniCPM/blob/main/README-en.md#evaluation-results)
+
+ 注意:我们发现使用Huggingface生成质量略差于vLLM,因此推荐使用vLLM进行测试。我们正在排查原因。
+
+ Notice: We have found that generation quality with Hugging Face is slightly lower than with vLLM, so we recommend benchmarking with vLLM. We are investigating the cause.
+
+ ### 局限性 Limitations
 
  - 受限于模型规模,模型可能出现幻觉性问题。其中由于DPO模型生成的回复内容更长,更容易出现幻觉。我们也将持续进行MiniCPM模型的迭代改进;
  - 为了保证在学术研究用途上模型的通用性,我们未对模型进行任何身份认同训练。同时由于我们用ShareGPT开源语料作为部分训练数据,模型可能会输出类似GPT系列模型的身份认同信息;
  - 受限于模型规模,模型的输出受到提示词(prompt)的影响较大,可能多次尝试产生不一致的结果;
  - 受限于模型容量,模型的知识记忆较不准确,后续我们将结合RAG方法来增强模型的知识记忆能力。
+
  - Due to limitations in model size, the model may experience hallucination issues. Since the DPO model tends to generate longer responses, hallucinations are more likely to occur. We will continue to iterate on and improve the MiniCPM model.
  - To ensure the generality of the model for academic research purposes, we did not conduct any identity training on the model. Meanwhile, since we use the ShareGPT open-source corpus as part of the training data, the model may output identity information similar to GPT-series models.
  - Due to the limitation of model size, the model's output is strongly influenced by the prompt, so multiple attempts may produce inconsistent results.
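
The Int4 build listed in the hunk above loads through the standard `transformers` path. A minimal sketch, assuming the repository's custom `chat` helper exposed via `trust_remote_code`; the repo id, prompt, and sampling values here are illustrative, not taken from this commit:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id for the Int4 variant; adjust to the checkpoint you actually pulled.
path = "openbmb/MiniCPM-2B-dpo-int4"

# trust_remote_code loads the custom MiniCPM modeling and chat code shipped with the repo.
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(path, device_map="cuda", trust_remote_code=True)

# `chat` returns the reply plus the running conversation history.
responds, history = model.chat(tokenizer, "山东省最高的山是哪座山?", temperature=0.8, top_p=0.8)
print(responds)
```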
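The new notice recommends vLLM for benchmarking. A minimal sketch of that path, assuming a vLLM build that supports the MiniCPM architecture; the model id and sampling settings are again illustrative:

```python
from vllm import LLM, SamplingParams

# MiniCPM ships custom model code, hence trust_remote_code.
llm = LLM(model="openbmb/MiniCPM-2B-dpo-bf16", trust_remote_code=True)
params = SamplingParams(temperature=0.8, top_p=0.8, max_tokens=256)

# generate() takes a batch of prompts; each RequestOutput holds its completions.
outputs = llm.generate(["山东省最高的山是哪座山?"], params)
print(outputs[0].outputs[0].text)
```
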
@@ -130,8 +139,8 @@ print(responds)
 
  ## 工作引用 Citation
 
- * 如果觉得MiniCPM有助于您的工作,请考虑引用下列[技术报告](todo)
- * Please cite our [techinical report]() if you find our work valuable.
+ * 如果觉得MiniCPM有助于您的工作,请考虑引用下列[技术报告](https://shengdinghu.notion.site/MiniCPM-c805a17c5c8046398914e47f0542095a?pvs=4)
+ * Please cite our [technical report](https://shengdinghu.notion.site/MiniCPM-Unveiling-the-Potential-of-End-side-Large-Language-Models-d4d3a8c426424654a4e80e42a711cb20?pvs=4) if you find our work valuable.
 
  ```
  @inproceedings{minicpm2024,
 