Commit d426f71 (parent 3266ecd): Update README.md

---

# ChatGLM2-6B
<p align="center">
 <a href="https://github.com/THUDM/ChatGLM2-6B" target="_blank">Github Repo</a>
</p>

## Introduction

ChatGLM**2**-6B is the second-generation version of the open-source Chinese-English bilingual dialogue model [ChatGLM-6B](https://github.com/THUDM/ChatGLM-6B). It retains the smooth conversation and low deployment threshold of the first-generation model while introducing the following new features:

1. **Stronger Performance**: Building on the development experience of the first-generation ChatGLM model, we have fully upgraded the base model of ChatGLM2-6B. ChatGLM2-6B uses the hybrid objective function of [GLM](https://github.com/THUDM/GLM) and has been pre-trained on 1.4T Chinese and English tokens with human preference alignment training. [Evaluation results](#evaluation-results) show that, compared with the first-generation model, ChatGLM2-6B achieves large improvements on MMLU (+23%), CEval (+33%), GSM8K (+571%), BBH (+60%), and other datasets, making it highly competitive among open-source models of the same size.
2. **Longer Context**: Based on [FlashAttention](https://github.com/HazyResearch/flash-attention) technology, we have extended the context length of the base model from ChatGLM-6B's 2K to 32K, and trained with a context length of 8K during the dialogue stage, allowing more rounds of dialogue. However, the current version of ChatGLM2-6B has limited ability to understand single-turn ultra-long documents, which we will focus on optimizing in subsequent iterations.
3. **More Efficient Inference**: Based on [Multi-Query Attention](http://arxiv.org/abs/1911.02150) technology, ChatGLM2-6B has faster inference and lower GPU memory usage: under the official model implementation, the inference speed has increased by 42% compared to the first generation, and under INT4 quantization, the dialogue length supported by 6G of GPU memory has increased from 1K to 8K.
4. **More Open License**: ChatGLM2-6B weights are **completely open** for academic research, and **free commercial use** is also allowed after completing the [questionnaire](https://open.bigmodel.cn/mla/form) for registration.
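The Multi-Query Attention technique in point 3 saves memory by sharing a single key/value head across all query heads, which shrinks the KV cache that grows with dialogue length. A back-of-envelope sketch of that saving (the layer count, head count, and head dimension below are illustrative assumptions, not ChatGLM2-6B's actual configuration):

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_el: int = 2) -> int:
    """Size of the attention KV cache: keys and values are each
    [seq_len, n_kv_heads, head_dim] per layer."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_el

# Hypothetical 28-layer model, 32 query heads of dim 128, fp16 cache, 8K context.
mha = kv_cache_bytes(28, 32, 128, 8192)  # multi-head: one KV head per query head
mqa = kv_cache_bytes(28, 1, 128, 8192)   # multi-query: one KV head shared by all
print(f"MHA cache: {mha / 2**30:.1f} GiB, MQA cache: {mqa / 2**30:.1f} GiB")
print(mha // mqa)  # the KV cache shrinks by the number of query heads: 32
```

A smaller KV cache is exactly what allows longer dialogues to fit in a fixed GPU memory budget.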
## Software Dependencies

```shell
pip install protobuf transformers==4.30.2 cpm_kernels torch>=2.0 gradio mdtex2html sentencepiece accelerate
```
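Note that `torch>=2.0` is a version range rather than an exact pin. As an illustration of how such a constraint is evaluated (a simplified sketch; pip's real resolver, per PEP 440, additionally handles pre-releases and more — `meets_min_version` is a hypothetical helper):

```python
def meets_min_version(installed: str, minimum: str) -> bool:
    """Numerically compare dotted version strings, e.g. '2.1.0' satisfies '>=2.0'."""
    def parse(v: str):
        # Drop local build tags like '+cu118', then split numeric components.
        return [int(p) for p in v.split("+")[0].split(".")]
    inst, need = parse(installed), parse(minimum)
    # Pad the shorter list with zeros so '2.0' compares against '2.0.0'.
    width = max(len(inst), len(need))
    inst += [0] * (width - len(inst))
    need += [0] * (width - len(need))
    return inst >= need

print(meets_min_version("2.1.0", "2.0"))   # True: torch 2.1 satisfies torch>=2.0
print(meets_min_version("1.13.1", "2.0"))  # False: torch 1.x does not
```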

## Code Usage

You can call the ChatGLM2-6B model to generate a conversation with the following code:

```ipython
>>> from transformers import AutoTokenizer, AutoModel
>>> tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
>>> model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True).half().cuda()
>>> model = model.eval()
>>> response, history = model.chat(tokenizer, "你好", history=[])
>>> print(response)
Hello 👋! I am the artificial intelligence assistant ChatGLM-6B. Nice to meet you. You are welcome to ask me any questions.
>>> response, history = model.chat(tokenizer, "What should I do if I can't sleep at night", history=history)
>>> print(response)
Staying awake at night may make you feel anxious or uncomfortable, but here are some things you can do to help you fall asleep:

1. Establish a regular sleep schedule: Maintaining a regular sleep schedule can help you establish healthy sleep habits and make it easier for you to fall asleep. Try to go to bed at the same time every day and get up at the same time.
2. Create a comfortable sleeping environment: Make sure the sleeping environment is comfortable, quiet, dark and at a suitable temperature. Use comfortable bedding and keep the room ventilated.
3. Relax your body and mind: Doing some relaxing activities before going to bed, such as taking a hot bath, listening to some soft music, reading some interesting books, etc., can help relieve tension and anxiety and make it easier for you to fall asleep.
4. Avoid drinking beverages containing caffeine: Caffeine is a stimulating substance that can affect your sleep quality. Try to avoid drinking caffeinated beverages such as coffee, tea and cola before bed.
5. Avoid doing things in bed that are not related to sleep: Doing things in bed that are not related to sleep, such as watching movies, playing games or working, may interfere with your sleep.
6. Try breathing techniques: Deep breathing is a relaxation technique that can help you relieve tension and anxiety and make it easier for you to fall asleep. Try to inhale slowly, hold for a few seconds, and then exhale slowly.

If these methods don't help you fall asleep, you may consider talking to your doctor or sleep specialist for further advice.
```
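In the snippet above, `history` is a list of (query, response) pairs that carries the conversation across turns of `model.chat`. A minimal stand-in with the same call shape (a sketch only: `EchoChatModel` is hypothetical, and the real model generates text instead of echoing) shows the multi-turn pattern:

```python
class EchoChatModel:
    """Hypothetical stand-in mimicking model.chat's interface:
    (tokenizer, query, history) -> (response, updated_history)."""
    def chat(self, tokenizer, query, history=None):
        history = list(history or [])
        response = f"echo: {query}"        # a real model would generate text here
        history.append((query, response))  # each turn is kept for later context
        return response, history

model = EchoChatModel()
response, history = model.chat(None, "hello", history=[])
response, history = model.chat(None, "tell me more", history=history)
print(len(history))  # 2: both turns are retained
print(history[0])    # ('hello', 'echo: hello')
```

Passing the returned `history` back into the next call is what gives the model its multi-turn context; starting with `history=[]` begins a fresh conversation.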

For more instructions, including how to run CLI and web demos, and model quantization, please refer to our [Github Repo](https://github.com/THUDM/ChatGLM2-6B).
## Change Log
* v1.0
## License
The code in this repository is open-sourced under the [Apache-2.0](LICENSE) license. Use of the ChatGLM2-6B model weights must follow the [Model License](MODEL_LICENSE).

## Citation
If you find our work helpful, please consider citing the following papers. The ChatGLM2-6B paper will be published in the near future, so stay tuned.
```
@article{zeng2022glm,
|