metadata
license: other
language:
- en
EXL2 Quantization of Gryphe's MythoMax L2 13B.
Other quantized models are available from TheBloke: GGML - GPTQ - GGUF - AWQ
Model details
Branch | Bits | Perplexity | Description |
---|---|---|---|
main | 5 | 6.1018 | Up to 6144 tokens of context on a T4 GPU |
6bit | 6 | 6.1182 | Up to 4096 tokens of context on a T4 GPU |
- | 7 | 6.1056 | Up to 2048 tokens of context on a T4 GPU |
- | 8 | 6.1027 | Just, why? |
I'll upload the 7-bit and 8-bit quants if someone requests them. (I don't know why the 5-bit quant's perplexity is lower than that of the higher-bit quants; this needs more testing.)
Prompt Format
Alpaca format:
### Instruction:
### Response:
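
A minimal sketch of how a prompt in this format could be assembled in Python. The helper name `build_alpaca_prompt` is hypothetical, and the blank line between the instruction and the response header is an assumption; only the two section headers come from the format shown above.

```python
def build_alpaca_prompt(instruction: str) -> str:
    """Wrap an instruction in the two Alpaca section headers."""
    # Assumption: one blank line separates the instruction text from
    # the response header, and the model continues after "### Response:".
    return f"### Instruction:\n{instruction.strip()}\n\n### Response:\n"


prompt = build_alpaca_prompt("Write a haiku about autumn.")
print(prompt)
```

The returned string would be passed as-is to whichever EXL2-capable loader you use; the model's reply is whatever it generates after the `### Response:` line.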