About Quantization

We use the modelscope swift repository to perform AWQ 4-bit quantization. The quantization documentation can be found here. The quantization command is as follows:

```shell
# Experimental Environment: A100
swift export \
    --quant_bits 4 \
    --model_type yi-1_5-6b-chat \
    --quant_method awq \
    --quant_n_samples 64 \
    --dataset alpaca-zh alpaca-en sharegpt-gpt4-mini \
    --quant_seqlen 4096
```
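To illustrate what `--quant_bits 4` means for each weight group, here is a minimal, self-contained sketch of group-wise asymmetric 4-bit quantization in plain Python. The function names are illustrative assumptions, not swift's or AWQ's actual API; real AWQ additionally rescales salient weight channels using activation statistics before quantizing.

```python
# Hypothetical sketch of group-wise 4-bit quantization (illustrative only,
# not the swift/AWQ implementation).

def quantize_group(weights, bits=4):
    """Map a group of float weights to unsigned ints plus a scale/zero-point."""
    qmax = (1 << bits) - 1                 # 15 for 4-bit
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / qmax or 1.0        # guard against constant groups
    zero = round(-lo / scale)
    q = [max(0, min(qmax, round(w / scale) + zero)) for w in weights]
    return q, scale, zero

def dequantize_group(q, scale, zero):
    """Recover approximate float weights from the quantized group."""
    return [(v - zero) * scale for v in q]

# One example group of 8 weights (made-up values).
w = [0.12, -0.5, 0.33, 0.07, -0.21, 0.44, -0.02, 0.18]
q, s, z = quantize_group(w)
w_hat = dequantize_group(q, s, z)
err = max(abs(a - b) for a, b in zip(w, w_hat))  # bounded by scale / 2
```

Each 4-bit code fits in half a byte, which is where the ~4x memory reduction over FP16 comes from; the per-group `scale` and `zero` add a small overhead on top.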

Inference:

```shell
CUDA_VISIBLE_DEVICES=0 swift infer --model_type yi-1_5-6b-chat-awq-int4
```
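The practical payoff of int4 weights at inference time is memory. A rough back-of-the-envelope comparison (the ~6.1B parameter count is an assumed round figure, and the small overhead of per-group scales and zero-points is ignored):

```python
# Approximate weight-memory footprint of a ~6B-parameter model.
# 6.1e9 is an assumed rounded parameter count, not read from the checkpoint;
# the int4 figure ignores per-group scale/zero-point overhead.
params = 6.1e9
fp16_gib = params * 2 / 2**30   # FP16: 2 bytes per weight
int4_gib = params / 2 / 2**30   # int4: 0.5 bytes per weight
print(f"FP16 ~ {fp16_gib:.1f} GiB, int4 ~ {int4_gib:.1f} GiB")
```

At roughly a quarter of the FP16 footprint, the quantized weights fit comfortably on a single consumer GPU.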

SFT:

```shell
CUDA_VISIBLE_DEVICES=0 swift sft --model_type yi-1_5-6b-chat-awq-int4 --dataset leetcode-python-en
```

Original Model:

Yi-1.5-6B-Chat

Model size: 1.27B params (Safetensors; tensor types: I32, FP16)