Commit d426f71 (parent 3266ecd): Update README.md

---

# ChatGLM2-6B
<p align="center">
 <a href="https://github.com/THUDM/ChatGLM2-6B" target="_blank">Github Repo</a>
</p>

## Introduction

ChatGLM**2**-6B is the second-generation version of the open-source Chinese-English bilingual dialogue model [ChatGLM-6B](https://github.com/THUDM/ChatGLM-6B). It retains the smooth conversation and low deployment threshold of the first-generation model while introducing the following new features:

1. **Stronger Performance**: Building on the development experience of the first-generation ChatGLM model, we have fully upgraded the base model of ChatGLM2-6B. ChatGLM2-6B uses the hybrid objective function of [GLM](https://github.com/THUDM/GLM) and has been pre-trained on 1.4T Chinese and English tokens with human preference alignment training. [Evaluation results](#evaluation-results) show that, compared with the first-generation model, ChatGLM2-6B achieves large improvements on MMLU (+23%), CEval (+33%), GSM8K (+571%), BBH (+60%), and other datasets, making it highly competitive among open-source models of the same size.
2. **Longer Context**: Based on [FlashAttention](https://github.com/HazyResearch/flash-attention) technology, we have extended the context length of the base model from ChatGLM-6B's 2K to 32K, and trained with a context length of 8K during the dialogue stage, allowing more rounds of dialogue. However, the current version of ChatGLM2-6B has limited ability to understand single-turn ultra-long documents, which we will focus on optimizing in subsequent iterations.
3. **More Efficient Inference**: Based on [Multi-Query Attention](http://arxiv.org/abs/1911.02150) technology, ChatGLM2-6B has faster inference and lower GPU memory usage: under the official model implementation, the inference speed has increased by 42% compared to the first generation, and under INT4 quantization, the dialogue length supported by 6G of GPU memory has increased from 1K to 8K.
4. **More Open License**: ChatGLM2-6B weights are **completely open** for academic research, and **free commercial use** is also allowed after completing the [questionnaire](https://open.bigmodel.cn/mla/form) for registration.
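The Multi-Query Attention technique in point 3 saves memory by sharing a single key/value head across all query heads, which shrinks the KV cache that grows with dialogue length. A back-of-envelope sketch of that saving (the layer count, head count, and head dimension below are illustrative assumptions, not ChatGLM2-6B's actual configuration):

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_el: int = 2) -> int:
    """Size of the attention KV cache: keys and values are each
    [seq_len, n_kv_heads, head_dim] per layer."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_el

# Hypothetical 28-layer model, 32 query heads of dim 128, fp16 cache, 8K context.
mha = kv_cache_bytes(28, 32, 128, 8192)  # multi-head: one KV head per query head
mqa = kv_cache_bytes(28, 1, 128, 8192)   # multi-query: one KV head shared by all
print(f"MHA cache: {mha / 2**30:.1f} GiB, MQA cache: {mqa / 2**30:.1f} GiB")
print(mha // mqa)  # the KV cache shrinks by the number of query heads: 32
```

A smaller KV cache is exactly what allows longer dialogues to fit in a fixed GPU memory budget.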
## Software Dependencies

```shell
pip install protobuf transformers==4.30.2 cpm_kernels torch>=2.0 gradio mdtex2html sentencepiece accelerate
```
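Note that `torch>=2.0` is a version range rather than an exact pin. As an illustration of how such a constraint is evaluated (a simplified sketch; pip's real resolver, per PEP 440, additionally handles pre-releases and more — `meets_min_version` is a hypothetical helper):

```python
def meets_min_version(installed: str, minimum: str) -> bool:
    """Numerically compare dotted version strings, e.g. '2.1.0' satisfies '>=2.0'."""
    def parse(v: str):
        # Drop local build tags like '+cu118', then split numeric components.
        return [int(p) for p in v.split("+")[0].split(".")]
    inst, need = parse(installed), parse(minimum)
    # Pad the shorter list with zeros so '2.0' compares against '2.0.0'.
    width = max(len(inst), len(need))
    inst += [0] * (width - len(inst))
    need += [0] * (width - len(need))
    return inst >= need

print(meets_min_version("2.1.0", "2.0"))   # True: torch 2.1 satisfies torch>=2.0
print(meets_min_version("1.13.1", "2.0"))  # False: torch 1.x does not
```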

## Code Usage

You can call the ChatGLM2-6B model to generate a conversation with the following code:

```ipython
>>> from transformers import AutoTokenizer, AutoModel
>>> tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
>>> model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True).half().cuda()
>>> model = model.eval()
>>> response, history = model.chat(tokenizer, "你好", history=[])
>>> print(response)
Hello 👋! I am the artificial intelligence assistant ChatGLM-6B. Nice to meet you. You are welcome to ask me any questions.
>>> response, history = model.chat(tokenizer, "What should I do if I can't sleep at night", history=history)
>>> print(response)
Staying awake at night may make you feel anxious or uncomfortable, but here are some things you can do to help you fall asleep:

1. Establish a regular sleep schedule: Maintaining a regular sleep schedule can help you establish healthy sleep habits and make it easier for you to fall asleep. Try to go to bed at the same time every day and get up at the same time.
2. Create a comfortable sleeping environment: Make sure the sleeping environment is comfortable, quiet, dark and at a suitable temperature. Use comfortable bedding and keep the room ventilated.
3. Relax your body and mind: Doing some relaxing activities before going to bed, such as taking a hot bath, listening to some soft music, reading some interesting books, etc., can help relieve tension and anxiety and make it easier for you to fall asleep.
4. Avoid drinking beverages containing caffeine: Caffeine is a stimulating substance that can affect your sleep quality. Try to avoid drinking caffeinated beverages such as coffee, tea and cola before bed.
5. Avoid doing things in bed that are not related to sleep: Doing things in bed that are not related to sleep, such as watching movies, playing games or working, may interfere with your sleep.
6. Try breathing techniques: Deep breathing is a relaxation technique that can help you relieve tension and anxiety and make it easier for you to fall asleep. Try to inhale slowly, hold for a few seconds, and then exhale slowly.

If these methods don't help you fall asleep, you may consider talking to your doctor or sleep specialist for further advice.
```
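In the snippet above, `history` is a list of (query, response) pairs that carries the conversation across turns of `model.chat`. A minimal stand-in with the same call shape (a sketch only: `EchoChatModel` is hypothetical, and the real model generates text instead of echoing) shows the multi-turn pattern:

```python
class EchoChatModel:
    """Hypothetical stand-in mimicking model.chat's interface:
    (tokenizer, query, history) -> (response, updated_history)."""
    def chat(self, tokenizer, query, history=None):
        history = list(history or [])
        response = f"echo: {query}"        # a real model would generate text here
        history.append((query, response))  # each turn is kept for later context
        return response, history

model = EchoChatModel()
response, history = model.chat(None, "hello", history=[])
response, history = model.chat(None, "tell me more", history=history)
print(len(history))  # 2: both turns are retained
print(history[0])    # ('hello', 'echo: hello')
```

Passing the returned `history` back into the next call is what gives the model its multi-turn context; starting with `history=[]` begins a fresh conversation.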

For more instructions, including how to run CLI and web demos, and model quantization, please refer to our [Github Repo](https://github.com/THUDM/ChatGLM2-6B).
## Change Log
* v1.0
## License
The code in this repository is open-sourced under the [Apache-2.0](LICENSE) license. Use of the ChatGLM2-6B model weights must follow the [Model License](MODEL_LICENSE).

## Citation
If you find our work helpful, please consider citing the following papers. The ChatGLM2-6B paper will be published in the near future, so stay tuned.
```
@article{zeng2022glm,
|