Quantization made by Richard Erkhov.
Qwen1.5-MoE-A2.7B - GGUF
- Model creator: https://huggingface.co/Qwen/
- Original model: https://huggingface.co/Qwen/Qwen1.5-MoE-A2.7B/
| Name | Quant method | Size |
| --- | --- | --- |
| Qwen1.5-MoE-A2.7B.Q2_K.gguf | Q2_K | 5.49GB |
| Qwen1.5-MoE-A2.7B.IQ3_XS.gguf | IQ3_XS | 6.07GB |
| Qwen1.5-MoE-A2.7B.IQ3_S.gguf | IQ3_S | 6.37GB |
| Qwen1.5-MoE-A2.7B.Q3_K_S.gguf | Q3_K_S | 6.37GB |
| Qwen1.5-MoE-A2.7B.IQ3_M.gguf | IQ3_M | 6.46GB |
| Qwen1.5-MoE-A2.7B.Q3_K.gguf | Q3_K | 6.93GB |
| Qwen1.5-MoE-A2.7B.Q3_K_M.gguf | Q3_K_M | 6.93GB |
| Qwen1.5-MoE-A2.7B.Q3_K_L.gguf | Q3_K_L | 7.21GB |
| Qwen1.5-MoE-A2.7B.IQ4_XS.gguf | IQ4_XS | 7.4GB |
| Qwen1.5-MoE-A2.7B.Q4_0.gguf | Q4_0 | 7.59GB |
| Qwen1.5-MoE-A2.7B.IQ4_NL.gguf | IQ4_NL | 7.68GB |
| Qwen1.5-MoE-A2.7B.Q4_K_S.gguf | Q4_K_S | 8.11GB |
| Qwen1.5-MoE-A2.7B.Q4_K.gguf | Q4_K | 8.84GB |
| Qwen1.5-MoE-A2.7B.Q4_K_M.gguf | Q4_K_M | 8.84GB |
| Qwen1.5-MoE-A2.7B.Q4_1.gguf | Q4_1 | 8.41GB |
| Qwen1.5-MoE-A2.7B.Q5_0.gguf | Q5_0 | 9.22GB |
| Qwen1.5-MoE-A2.7B.Q5_K_S.gguf | Q5_K_S | 9.46GB |
| Qwen1.5-MoE-A2.7B.Q5_K.gguf | Q5_K | 10.09GB |
| Qwen1.5-MoE-A2.7B.Q5_K_M.gguf | Q5_K_M | 10.09GB |
| Qwen1.5-MoE-A2.7B.Q5_1.gguf | Q5_1 | 10.04GB |
| Qwen1.5-MoE-A2.7B.Q6_K.gguf | Q6_K | 11.89GB |
| Qwen1.5-MoE-A2.7B.Q8_0.gguf | Q8_0 | 14.18GB |
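To fetch a single quantized file rather than cloning the whole repository, the `huggingface_hub` client can download it by name. The sketch below is not from the original card; the `repo_id` is a placeholder you should replace with this quantization repository's actual id, and the filename can be any row from the table above.

```python
# Minimal sketch: download one GGUF file from this quantization repo.
from huggingface_hub import hf_hub_download

local_path = hf_hub_download(
    repo_id="<quantizer-account>/<this-gguf-repo>",   # placeholder: use the id shown on the model page
    filename="Qwen1.5-MoE-A2.7B.Q4_K_M.gguf",         # pick any file listed in the table
)
print(local_path)  # path to the downloaded .gguf file on disk
```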
Original model description:
license: other
license_name: tongyi-qianwen
license_link: >-
  https://huggingface.co/Qwen/Qwen1.5-MoE-A2.7B/blob/main/LICENSE
language:
- en
pipeline_tag: text-generation
tags:
- pretrained
- moe
Qwen1.5-MoE-A2.7B
Introduction
Qwen1.5-MoE is a transformer-based MoE decoder-only language model pretrained on a large amount of data.
For more details, please refer to our blog post and GitHub repo.
Model Details
Qwen1.5-MoE employs a Mixture of Experts (MoE) architecture in which the models are upcycled from dense language models. For instance, Qwen1.5-MoE-A2.7B is upcycled from Qwen-1.8B. It has 14.3B parameters in total and 2.7B activated parameters at runtime. While achieving performance comparable to Qwen1.5-7B, it requires only 25% of the training resources. We also observed that its inference speed is 1.74 times that of Qwen1.5-7B.
Requirements
The code for Qwen1.5-MoE is included in the latest Hugging Face transformers. We advise you to build from source with the command `pip install git+https://github.com/huggingface/transformers`, or you might encounter the following error:
`KeyError: 'qwen2_moe'`
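As a quick environment check (a sketch added here, not part of the original card), you can confirm that your installed transformers build recognizes the `qwen2_moe` architecture; an outdated install will raise the `KeyError` above when loading the config.

```python
# Sanity check: does the installed transformers know the qwen2_moe architecture?
import transformers
from transformers import AutoConfig

print(transformers.__version__)  # must be a build that includes qwen2_moe support

config = AutoConfig.from_pretrained("Qwen/Qwen1.5-MoE-A2.7B")
print(config.model_type)  # expected: "qwen2_moe"; older versions raise KeyError here
```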
Usage
We do not advise you to use base language models for text generation. Instead, you can apply post-training, e.g., SFT, RLHF, continued pretraining, etc., on this model.
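For completeness, here is a minimal loading sketch with transformers (not from the original card): it treats the base model as a starting point for post-training or a quick smoke test, since a base (non-instruct) model only produces raw continuations. The bf16 dtype is an assumption about your hardware; use float16 or float32 otherwise.

```python
# Minimal sketch: load the base model and run a quick smoke-test generation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen1.5-MoE-A2.7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16-capable GPU
    device_map="auto",
)

# Base-model output is a plain text continuation, not a chat-style answer.
inputs = tokenizer("Mixture-of-experts models are", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```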