GreenBit LLMs

These are GreenBitAI's pretrained low-bit LLMs, which combine extreme compression with strong performance.

Please refer to our GitHub page for the code to run the model and for more information.

Zero-shot Evaluation

We evaluate the zero-shot ability of low-bit quantized Qwen1.5 models using the llm_eval library and list the results below:

| Repository (Qwen Family) | Avg Acc. | OpenBQ | ARC-E | Winogr. | HellaS. | ARC-C | PIQA | BoolQ | RACE | ANLI-R1 | ANLI-R2 | ANLI-R3 | WiC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Qwen-1.5-0.5B-layer-mix-bpw-2.2 | 0.398 | 0.170 | 0.443 | 0.527 | 0.332 | 0.238 | 0.634 | 0.620 | 0.318 | 0.332 | 0.338 | 0.330 | 0.500 |
| Qwen-1.5-0.5B-layer-mix-bpw-2.5 | 0.394 | 0.170 | 0.514 | 0.541 | 0.337 | 0.232 | 0.637 | 0.496 | 0.318 | 0.316 | 0.358 | 0.326 | 0.490 |
| Qwen-1.5-0.5B-layer-mix-bpw-3.0 | 0.407 | 0.198 | 0.533 | 0.536 | 0.348 | 0.234 | 0.671 | 0.552 | 0.323 | 0.330 | 0.333 | 0.335 | 0.495 |
| Qwen-1.5-1.8B-layer-mix-bpw-2.2 | 0.415 | 0.218 | 0.539 | 0.586 | 0.392 | 0.260 | 0.678 | 0.622 | 0.333 | 0.333 | 0.333 | 0.336 | 0.464 |
| Qwen-1.5-1.8B-layer-mix-bpw-2.5 | 0.423 | 0.222 | 0.592 | 0.585 | 0.406 | 0.267 | 0.695 | 0.629 | 0.336 | 0.314 | 0.339 | 0.361 | 0.507 |
| Qwen-1.5-1.8B-layer-mix-bpw-3.0 | 0.438 | 0.246 | 0.576 | 0.563 | 0.413 | 0.277 | 0.694 | 0.645 | 0.352 | 0.323 | 0.336 | 0.343 | 0.492 |
| Qwen-1.5-4B-layer-mix-bpw-2.2 | 0.480 | 0.254 | 0.663 | 0.623 | 0.463 | 0.339 | 0.712 | 0.718 | 0.349 | 0.326 | 0.355 | 0.384 | 0.513 |
| Qwen-1.5-4B-layer-mix-bpw-2.5 | 0.490 | 0.266 | 0.677 | 0.629 | 0.473 | 0.365 | 0.732 | 0.717 | 0.351 | 0.372 | 0.352 | 0.360 | 0.502 |
| Qwen-1.5-4B-layer-mix-bpw-3.0 | 0.502 | 0.268 | 0.678 | 0.642 | 0.494 | 0.358 | 0.755 | 0.757 | 0.380 | 0.395 | 0.395 | 0.392 | 0.519 |
| Qwen-1.5-7B-layer-mix-bpw-2.2 | 0.513 | 0.278 | 0.669 | 0.654 | 0.504 | 0.389 | 0.741 | 0.759 | 0.376 | 0.383 | 0.410 | 0.403 | 0.517 |
| Qwen-1.5-7B-layer-mix-bpw-2.5 | 0.520 | 0.294 | 0.705 | 0.650 | 0.520 | 0.387 | 0.750 | 0.769 | 0.371 | 0.445 | 0.424 | 0.398 | 0.564 |
| Qwen-1.5-7B-layer-mix-bpw-3.0 | 0.531 | 0.292 | 0.713 | 0.654 | 0.545 | 0.405 | 0.764 | 0.807 | 0.383 | 0.424 | 0.393 | 0.414 | 0.627 |
| Qwen-1.5-14B-layer-mix-bpw-2.5 | 0.553 | 0.318 | 0.727 | 0.682 | 0.564 | 0.413 | 0.775 | 0.792 | 0.390 | 0.472 | 0.434 | 0.446 | 0.623 |
| Qwen-1.5-32B-layer-mix-bpw-3.0 | 0.599 | 0.346 | 0.775 | 0.722 | 0.620 | 0.492 | 0.807 | 0.853 | 0.444 | 0.515 | 0.494 | 0.478 | 0.642 |

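
The results above do not say how Avg Acc. is computed. The reported values appear consistent, to within rounding, with an unweighted mean over the twelve task columns. The following minimal sketch checks that assumption for the Qwen-1.5-32B-layer-mix-bpw-3.0 row; the `scores` dictionary and the `average_accuracy` helper are illustrative names, not part of any GreenBitAI tooling.

```python
# Per-task zero-shot accuracies copied from the
# Qwen-1.5-32B-layer-mix-bpw-3.0 row of the results above.
scores = {
    "OpenBQ": 0.346,
    "ARC-E": 0.775,
    "Winogr.": 0.722,
    "HellaS.": 0.620,
    "ARC-C": 0.492,
    "PIQA": 0.807,
    "BoolQ": 0.853,
    "RACE": 0.444,
    "ANLI-R1": 0.515,
    "ANLI-R2": 0.494,
    "ANLI-R3": 0.478,
    "WiC": 0.642,
}


def average_accuracy(task_scores):
    """Unweighted mean over tasks (an assumed definition of 'Avg Acc.')."""
    return sum(task_scores.values()) / len(task_scores)


avg = average_accuracy(scores)
print(f"{avg:.3f}")  # compare with the reported Avg Acc. of 0.599
```

The same helper can be applied to any other row to compare models on a subset of tasks, for example by dropping the ANLI columns before averaging.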
Model size (Safetensors): 4.92B params
Tensor types: FP16, I32, I16