
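This model is a re-merge built on the Yi-34B-Llama base. The original 200K-context base did not seem to perform well after several non-200K-context LoRAs were merged into it, so this version is re-merged on the base that matches those LoRAs. The base has a 4096-token context. According to the original Yi-34B claims, it supports up to 32K context at inference time (Alpha 8); I recommend 8K context (Alpha 2.5). For this merge I also changed the LoRA merge order and moved limarpv3 to the end.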
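The Alpha values above refer to NTK-aware RoPE scaling as exposed by loaders such as ExLlamaV2. As a rough sketch of what the setting does, the snippet below applies the commonly used rule that raises the rotary base by `alpha ** (head_dim / (head_dim - 2))`. The function name, the base value of 5,000,000 and the head dimension of 128 are assumptions for illustration, not values taken from this card; check the model's `config.json` for the real `rope_theta` and head size.

```python
def ntk_scaled_rope_base(base: float, alpha: float, head_dim: int) -> float:
    """Return the rotary base after NTK-aware alpha scaling.

    Uses the commonly cited rule base * alpha ** (d / (d - 2)),
    where d is the per-head dimension. This is a sketch of the idea,
    not a copy of any particular loader's code.
    """
    if head_dim <= 2:
        raise ValueError("head_dim must be greater than 2")
    return base * alpha ** (head_dim / (head_dim - 2))


# Assumed values for illustration only; read the real ones from config.json.
ASSUMED_ROPE_BASE = 5_000_000.0
ASSUMED_HEAD_DIM = 128

for alpha in (1.0, 2.5, 8.0):
    scaled = ntk_scaled_rope_base(ASSUMED_ROPE_BASE, alpha, ASSUMED_HEAD_DIM)
    print(f"alpha={alpha}: scaled rope base = {scaled:,.0f}")
```

With alpha set to 1.0 the base is unchanged; larger alpha values stretch the rotary frequencies so that positions beyond the trained 4096 tokens stay distinguishable, usually at some cost in quality.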

acsr-y34b-4bpw-hb6-exl2

description

  • This is a test quantization for ExLlamaV2.
  • 4bpw: python convert.py -i acsr-y34b -c exl2/0000.parquet -o acsr-y34b-4bpw-hb6-exl2 -hb 6 -l 4096 -b 4.15
  • convert doc
  • calibration dataset: WikiText-2-v1
  • For oobabooga/text-generation-webui, add --trust-remote-code to CMD_FLAGS.txt and load the model with the ExLlamav2 loader.
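
The conversion command listed above can also be assembled from named parameters, which makes it easier to vary the bitrate or head bits for other quants. The sketch below only rebuilds the same command line; the helper name `build_convert_command` and its parameter names are hypothetical. The comments on each flag reflect my reading of the ExLlamaV2 conversion doc and should be treated as assumptions.

```python
import shlex


def build_convert_command(model_dir, calibration_parquet, out_dir,
                          bits_per_weight, head_bits, row_length):
    """Assemble the convert.py invocation as an argument list."""
    return [
        "python", "convert.py",
        "-i", model_dir,              # input model directory
        "-c", calibration_parquet,    # calibration dataset (parquet file)
        "-o", out_dir,                # output / working directory
        "-hb", str(head_bits),        # bits for the output head layer
        "-l", str(row_length),        # tokens per calibration row
        "-b", str(bits_per_weight),   # target average bits per weight
    ]


cmd = build_convert_command(
    model_dir="acsr-y34b",
    calibration_parquet="exl2/0000.parquet",
    out_dir="acsr-y34b-4bpw-hb6-exl2",
    bits_per_weight=4.15,
    head_bits=6,
    row_length=4096,
)

# Print a shell-ready string; pass `cmd` to subprocess.run to execute it
# from inside an ExLlamaV2 checkout.
print(shlex.join(cmd))
```

Keeping the command as a list avoids shell-quoting problems if a path ever contains spaces.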