imgurmeet4u/qwen1.5-llm-quantized

The "qwen1.5-llm-quantized" model is a quantized version of the original Qwen1.5-110B model. Qwen1.5 is a transformer-based, decoder-only language model pretrained on a large amount of data. The Qwen1.5 series comes in multiple sizes, with dense models ranging from 0.5B to 110B parameters and an MoE (Mixture of Experts) model with 14B total parameters, of which 2.7B are activated. Compared with the previous Qwen release, the chat models show significant performance improvements, both base and chat models offer multilingual support, and models of all sizes provide stable support for a 32K context length. Quantization stores the model weights at lower numerical precision, which reduces the model size and computational requirements while aiming to keep performance close to that of the original model.
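To give a rough sense of why quantization reduces model size, the sketch below estimates the memory needed for the weights alone at a few bit widths. It is a back-of-the-envelope calculation, not a description of how this particular checkpoint was produced: the parameter count is taken from the nominal "110B" name, the bit widths are illustrative assumptions (the quantization scheme used here is not stated), and the helper `weight_memory_gib` is a hypothetical name introduced for this example. Activations, the KV cache, and quantization metadata such as scales and zero points are ignored.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GiB.

    Ignores activations, the KV cache, and quantization metadata.
    """
    bytes_total = num_params * bits_per_weight / 8
    return bytes_total / 2**30


# Assumption: nominal parameter count implied by the "110B" model name.
PARAMS = 110e9

# Assumption: illustrative bit widths; the actual scheme is not stated here.
BIT_WIDTHS = [("fp16/bf16", 16), ("int8", 8), ("int4", 4)]

if __name__ == "__main__":
    for label, bits in BIT_WIDTHS:
        size = weight_memory_gib(PARAMS, bits)
        print(f"{label:>9}: ~{size:.0f} GiB for weights")
```

Under these assumptions, halving the bits per weight halves the weight footprint, so a 4-bit variant would need roughly a quarter of the memory of a 16-bit one for its weights.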

For more details about the original Qwen1.5-110B model, see the Hugging Face model page and the GitHub repository provided by the Qwen team at Alibaba Cloud.

https://huggingface.co/Qwen/Qwen1.5-110B
https://github.com/QwenLM/Qwen1.5