Edit model card
Configuration Parsing Warning: In config.json: "quantization_config.bits" must be an integer

WizardLM-2-4x7B-MoE-exl2-3_5bpw

This is a quantized version of WizardLM-2-4x7B-MoE an experimental MoE model made with Mergekit. Quantization was done using version 0.0.18 of ExLlamaV2.

Please be sure to set experts per token to 4 for the best results! Context length should be the same as Mistral-7B-Instruct-v0.1 (8k tokens). For instruction templates, Vicuna-v1.1 is recommended.

For more information see the original repository.

Downloads last month
3
Inference API
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.