---
license: mit
---

This model is a re-merge built on Yi-34B-Llama as the base. The original 200K-context base did not seem to perform well after several non-200K-context LoRAs were merged into it, so this merge uses the base that matches the LoRAs.

The base has a 4096-token context. According to the original Yi-34B claims, it supports up to 32K context at inference (Alpha 8); I recommend 8K context (Alpha 2.5).

For this merge I changed the LoRA merge order and moved limarpv3 to the end.

### acsr-y34b-4bpw-hb6-exl2

- base model: [Yi-34B-Llama](https://huggingface.co/chargoddard/Yi-34B-Llama)
- LoRA: [Yi-34b-alpaca-cot-lora](https://huggingface.co/zzlgreat/Yi-34b-alpaca-cot-lora) supports the Alpaca format
- LoRA: [Yi-34B-Spicyboros-3.1-LoRA](https://huggingface.co/LoneStriker/Yi-34B-Spicyboros-3.1-LoRA) unofficial conversation dataset
- LoRA: [limarpv3-yi-llama-34b-lora](https://huggingface.co/Doctor-Shotgun/limarpv3-yi-llama-34b-lora) roleplay-style long replies

### description

- This is a test model for [exllamav2](https://github.com/turboderp/exllamav2).
- 4bpw `python convert.py -i acsr-y34b -c exl2/0000.parquet -o acsr-y34b-4bpw-hb6-exl2 -hb 6 -l 4096 -b 4.15`
- [convert doc](https://github.com/turboderp/exllamav2/blob/master/doc/convert.md)
- calibration dataset: [WikiText-2-v1](https://huggingface.co/datasets/wikitext/blob/refs%2Fconvert%2Fparquet/wikitext-2-v1/test/0000.parquet)
- For oobabooga/text-generation-webui, add `--trust-remote-code` to CMD_FLAGS.txt and load the model with ExLlamav2.
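
The Alpha values mentioned earlier (2.5 for 8K, 8 for 32K) are NTK-aware RoPE scaling factors. As a rough sketch of what such a factor does, the snippet below applies the commonly used rule `base * alpha ** (d / (d - 2))` to a RoPE base frequency. The function name `ntk_rope_base`, the head dimension of 128, and the base of 10000.0 are illustrative assumptions, not values read from this model's config, and a given loader may compute this differently.

```python
def ntk_rope_base(base: float, alpha: float, head_dim: int) -> float:
    """Return the RoPE base after NTK-aware alpha scaling.

    Assumed rule: base * alpha ** (head_dim / (head_dim - 2)).
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if head_dim <= 2:
        raise ValueError("head_dim must be greater than 2")
    return base * alpha ** (head_dim / (head_dim - 2))


if __name__ == "__main__":
    # Hypothetical inputs for illustration only.
    assumed_base = 10000.0
    assumed_head_dim = 128
    for alpha in (1.0, 2.5, 8.0):
        scaled = ntk_rope_base(assumed_base, alpha, assumed_head_dim)
        print(f"alpha={alpha}: scaled RoPE base = {scaled:.1f}")
```

An alpha of 1.0 leaves the base unchanged; larger values raise the base, which stretches the positional encoding over a longer context at some cost in quality.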