---
license: apache-2.0
base_model: stepfun-ai/Step-3.5-Flash
base_model_relation: quantized
quantized_by: turboderp
tags:
  - exl3
---

EXL3 quants of Step-3.5-Flash

⚠️ Requires ExLlamaV3 v0.0.23 (or v0.0.22 dev branch)

Base bitrates:

- 2.00 bits per weight
- 3.00 bits per weight
- 4.00 bits per weight

Optimized:

- 2.08 bits per weight
- 3.05 bits per weight
- (more coming soon)
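As a rough guide to what these bitrates mean for storage, the sketch below converts bits per weight into an approximate weight footprint. The parameter count used here is a hypothetical placeholder, not the actual size of Step-3.5-Flash, and the estimate ignores any file-format overhead, so treat the output as a ballpark only.

```python
def estimate_weight_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate storage for the weights alone, in GiB (2**30 bytes)."""
    total_bits = num_params * bits_per_weight
    return total_bits / 8 / 2**30


# HYPOTHETICAL parameter count, for illustration only.
# Substitute the real figure from the base model's card.
EXAMPLE_PARAMS = 100e9

for bpw in (2.00, 2.08, 3.00, 3.05, 4.00):
    size = estimate_weight_gib(EXAMPLE_PARAMS, bpw)
    print(f"{bpw:.2f} bpw -> ~{size:.1f} GiB of weights")
```

The footprint scales linearly with the bitrate, so a 4.00 bpw quant needs about twice the space of a 2.00 bpw quant for the same model. KV cache and activations come on top of this.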

| Quant | Ppl¹ | KL-div |
|---|---|---|
| 2.00 bpw | 2.629 | 0.653 |
| 2.08 bpw | 2.154 | 0.466 |
| 3.00 bpw | 1.521 | 0.142 |
| 3.05 bpw | 1.478 | 0.118 |
| 4.00 bpw | 1.379 | 0.053 |
| Original | 1.336 | |

¹ (10 rows of wikitext2)
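For readers unfamiliar with the two metrics, here is a minimal sketch of how perplexity and KL divergence can be computed from per-token logits. It is not the evaluation script used to produce the numbers above. The function names, the toy logits, the direction of the KL term (original relative to quantized) and the use of natural logarithms are all assumptions made for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one vocabulary-sized list."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def perplexity(logits_per_pos, target_ids):
    """exp of the mean negative log-likelihood of the target tokens."""
    nll = 0.0
    for logits, target in zip(logits_per_pos, target_ids):
        nll -= log_softmax(logits)[target]
    return math.exp(nll / len(target_ids))


def mean_kl_div(ref_logits, quant_logits):
    """Mean over positions of KL(P_ref || P_quant), in nats (assumed direction)."""
    total = 0.0
    for ref, quant in zip(ref_logits, quant_logits):
        log_p, log_q = log_softmax(ref), log_softmax(quant)
        total += sum(math.exp(a) * (a - b) for a, b in zip(log_p, log_q))
    return total / len(ref_logits)


# Toy data: 2 positions, vocabulary of 4 tokens (made-up values).
ref = [[2.0, 0.5, 0.1, -1.0], [0.0, 3.0, 0.2, 0.1]]
quant = [[1.8, 0.7, 0.0, -0.9], [0.1, 2.6, 0.4, 0.0]]
targets = [0, 1]

print("ppl (original): ", perplexity(ref, targets))
print("ppl (quantized):", perplexity(quant, targets))
print("mean KL-div:    ", mean_kl_div(ref, quant))
```

Both metrics are lower-is-better. Perplexity depends on the evaluation text, while KL divergence measures how far the quantized model's output distribution drifts from the original's and is zero only when the two distributions match.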