---
license: other
license_link: https://huggingface.co/01-ai/Yi-34B-200K/blob/main/LICENSE
language:
- en
library_name: transformers
---
See: https://huggingface.co/01-ai/Yi-34B-200K

Yi-34B-200K quantized to 3.9 bpw, which should allow for roughly 50K tokens of context on 24GB GPUs. Ask if you need another size.
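
As a rough sanity check on the ~50K figure, here is a back-of-envelope VRAM estimate. The layer count, KV-head count, and head dimension below are assumptions pulled from Yi-34B's published config rather than anything in this repo:

```python
# Rough VRAM budget for 3.9 bpw Yi-34B with an 8-bit KV cache.
# Architecture numbers are assumptions from Yi-34B's config, not this repo.
PARAMS      = 34.4e9   # total weights
BPW         = 3.9      # bits per weight after quantization
LAYERS      = 60
KV_HEADS    = 8        # Yi-34B uses grouped-query attention
HEAD_DIM    = 128      # 7168 hidden size / 56 attention heads
CACHE_BYTES = 1        # 8-bit cache -> 1 byte per cached value
CONTEXT     = 50_000

GIB = 1024 ** 3
weights_gib = PARAMS * BPW / 8 / GIB
kv_per_token = 2 * LAYERS * KV_HEADS * HEAD_DIM * CACHE_BYTES   # K and V
kv_gib = kv_per_token * CONTEXT / GIB

print(f"weights  ~{weights_gib:.1f} GiB")                      # ~15.6 GiB
print(f"KV cache ~{kv_gib:.1f} GiB")                           # ~5.7 GiB
print(f"total    ~{weights_gib + kv_gib:.1f} GiB of 24 GiB")   # leaves headroom for activations
```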

Quantized using 8K calibration rows drawn from a mix of wikitext, prompt formatting examples, and my own RP stories.

Use `--trust-remote-code` in text-gen-ui. Load with the ExLlamav2_HF loader, enable the 8-bit cache, and *disable* the `fast_tokenizer` option. The TFS preset seems to work well with Yi.
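
If you'd rather load it outside text-gen-ui, here's a minimal sketch using the exllamav2 Python library directly. The model path, context length, and sampler values are placeholders, and the API names reflect exllamav2 as of late 2023, so they may differ in newer versions:

```python
# Minimal sketch: load this quant with the exllamav2 library directly.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache_8bit, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "/models/Yi-34B-200K-exl2-3.9bpw"  # hypothetical local path
config.prepare()
config.max_seq_len = 51200          # ~50K context; lower this if you run out of VRAM

model = ExLlamaV2(config)
cache = ExLlamaV2Cache_8bit(model, lazy=True)   # 8-bit cache, as recommended above
model.load_autosplit(cache)

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.9
settings.top_p = 0.95
# In text-gen-ui itself, the TFS preset works well with Yi (see note above).

print(generator.generate_simple("Once upon a time,", settings, 200))
```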
