xverse/XVERSE-13B-Chat-GPTQ-Int8

@@ -9,7 +9,7 @@ inference: false
 ## 更新信息
-- **[2024/03/25]** 发布XVERSE-13B-Chat-GPTQ-Int8量化模型，支持vLLM推理xverse-13b量化模型。
 - **[2023/11/06]** 发布新版本的 **XVERSE-13B-2** 底座模型和 **XVERSE-13B-Chat-2** 对话模型，相较于原始版本，新版本的模型训练更加充分（从 1.4T 增加到 3.2T），各方面的能力均得到大幅提升，同时新增工具调用能力。
 - **[2023/09/26]** 发布 7B 尺寸的 [XVERSE-7B](https://github.com/xverse-ai/XVERSE-7B) 底座模型和 [XVERSE-7B-Chat](https://github.com/xverse-ai/XVERSE-7B) 对话模型，支持在单张消费级显卡部署运行，并保持高性能、全开源、免费可商用。
 - **[2023/08/22]** 发布经过指令精调的 XVERSE-13B-Chat 对话模型。
@@ -17,7 +17,7 @@ inference: false
 ## Update Information
-- **[2024/03/25]** Released the XVERSE-13B-Chat-GPTQ-Int8 quantized model, which supports vLLM inference for the quantized xverse-13b model.
 - **[2023/11/06]** The new versions of the **XVERSE-13B-2** base model and the **XVERSE-13B-Chat-2** model have been released. Compared to the original versions, the new models have undergone more extensive training (increasing from 1.4T to 3.2T), resulting in significant improvements in all capabilities, along with the addition of Function Call abilities.
 - **[2023/09/26]** Released the [XVERSE-7B](https://github.com/xverse-ai/XVERSE-7B) base model and [XVERSE-7B-Chat](https://github.com/xverse-ai/XVERSE-7B) instruct-finetuned model with 7B size, which support deployment and operation on a single consumer-grade graphics card while maintaining high performance, full open source, and free for commercial use.
 - **[2023/08/22]** Released the aligned instruct-finetuned model XVERSE-13B-Chat.
@@ -55,7 +55,7 @@ We advise you to clone [`vllm`](https://github.com/vllm-project/vllm.git) and in
 ## 使用方法
-我们演示了如何使用 `vllm` 来运行XVERSE-13B-Chat-GPTQ-Int8量化模型:
 ```python
 from vllm import LLM, SamplingParams
@@ -83,7 +83,7 @@ for output in outputs:
 ## Usage
-We demonstrate how to use 'vllm' to run the XVERSE-13B-Chat-GPTQ-Int8 quantized model:
 ```python
 from vllm import LLM, SamplingParams

 ## 更新信息
+- **[2024/03/25]** 发布XVERSE-13B-Chat-GPTQ-Int8量化模型，支持vLLM推理XVERSE-13B-Chat量化模型。
 - **[2023/11/06]** 发布新版本的 **XVERSE-13B-2** 底座模型和 **XVERSE-13B-Chat-2** 对话模型，相较于原始版本，新版本的模型训练更加充分（从 1.4T 增加到 3.2T），各方面的能力均得到大幅提升，同时新增工具调用能力。
 - **[2023/09/26]** 发布 7B 尺寸的 [XVERSE-7B](https://github.com/xverse-ai/XVERSE-7B) 底座模型和 [XVERSE-7B-Chat](https://github.com/xverse-ai/XVERSE-7B) 对话模型，支持在单张消费级显卡部署运行，并保持高性能、全开源、免费可商用。
 - **[2023/08/22]** 发布经过指令精调的 XVERSE-13B-Chat 对话模型。
 ## Update Information
+- **[2024/03/25]** Released the XVERSE-13B-Chat-GPTQ-Int8 quantized model, which supports vLLM inference for the quantized XVERSE-13B-Chat model.
 - **[2023/11/06]** The new versions of the **XVERSE-13B-2** base model and the **XVERSE-13B-Chat-2** model have been released. Compared to the original versions, the new models have undergone more extensive training (increasing from 1.4T to 3.2T), resulting in significant improvements in all capabilities, along with the addition of Function Call abilities.
 - **[2023/09/26]** Released the [XVERSE-7B](https://github.com/xverse-ai/XVERSE-7B) base model and [XVERSE-7B-Chat](https://github.com/xverse-ai/XVERSE-7B) instruct-finetuned model with 7B size, which support deployment and operation on a single consumer-grade graphics card while maintaining high performance, full open source, and free for commercial use.
 - **[2023/08/22]** Released the aligned instruct-finetuned model XVERSE-13B-Chat.
 ## 使用方法
+我们演示了如何使用 vLLM 来运行XVERSE-13B-Chat-GPTQ-Int8量化模型:
 ```python
 from vllm import LLM, SamplingParams
 ## Usage
+We demonstrate how to use vLLM to run the XVERSE-13B-Chat-GPTQ-Int8 quantized model:
 ```python
 from vllm import LLM, SamplingParams