---
license: other
license_name: yi-license
license_link: https://huggingface.co/01-ai/Yi-34B-200K/blob/main/LICENSE
language:
- en
library_name: transformers
---
|
See: https://huggingface.co/01-ai/Yi-34B-200K |
|
|
|
Yi-34B-200K quantized to 3.9bpw, which should allow for roughly 50K context on 24GB GPUs. Ask if you need another size.
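As a rough sanity check on that figure, here is a back-of-envelope VRAM estimate. The model shape numbers come from the Yi-34B-200K config; the ~2 GB activations/overhead figure is an assumption, so treat the result as a ballpark, not a guarantee:

```python
# Back-of-envelope fit check for a 3.9 bpw quant of Yi-34B-200K on a 24 GB GPU.
PARAMS = 34e9          # parameter count
BPW = 3.9              # bits per weight after quantization
VRAM_GB = 24.0

HIDDEN_LAYERS = 60     # Yi-34B-200K num_hidden_layers
KV_HEADS = 8           # num_key_value_heads (GQA)
HEAD_DIM = 128         # hidden_size 7168 / 56 attention heads
CACHE_BYTES = 1        # 8-bit KV cache

weights_gb = PARAMS * BPW / 8 / 1e9
# K and V, per layer, per token
kv_bytes_per_token = 2 * KV_HEADS * HEAD_DIM * HIDDEN_LAYERS * CACHE_BYTES
free_gb = VRAM_GB - weights_gb - 2.0   # assumed ~2 GB activations/overhead
max_context = int(free_gb * 1e9 / kv_bytes_per_token)

print(f"weights: {weights_gb:.1f} GB, cache/token: {kv_bytes_per_token / 1e6:.3f} MB")
print(f"rough max context: {max_context} tokens")
```

This lands in the ~45-50K range with the 8-bit cache, consistent with the claim above.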
|
|
|
Quantized with 8K calibration rows drawn from a mix of wikitext, prompt-formatting data, and my own RP stories.
|
|
|
Use with `--trust-remote-code` in text-generation-webui. Load with the ExLlamav2_HF loader, enable the 8-bit cache, and *disable* the `fast_tokenizer` option. The TFS preset seems to work well with Yi.
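If you prefer launching from the command line, something like the following should map onto the settings above. The flag names are assumptions based on text-generation-webui's CLI at the time of writing and may differ between versions (check `python server.py --help`); `Yi-34B-200K-exl2` is a placeholder for your local model folder:

```shell
python server.py \
  --model Yi-34B-200K-exl2 \
  --loader ExLlamav2_HF \
  --trust-remote-code \
  --cache_8bit \
  --no_use_fast
```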
|
|
|
|
|
|
|
License: https://huggingface.co/01-ai/Yi-34B-200K/blob/main/LICENSE