
In my personal experience, this is currently the model with the best support for Chinese role-play chat.

acsr-y34b-4bpw-hb6-exl2

  • base model: Yi-34B-Chat
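  • LoRA: Yi-34b-alpaca-cot-lora. Adds support for Alpaca-format dialogue, but the results are poor; using the Alpaca instruction format is not recommended.
  • LoRA: Yi-34B-Spicyboros-3.1-LoRA. Unofficial dialogue dataset.
  • LoRA: limarpv3-yi-llama-34b-lora. Long role-play replies.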
  • Instruction template: ChatML
  • Original max token size: 4096
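  • When using text-generation-webui as the backend and SillyTavern as the frontend, if MaxToken in the webui is set to 8K, MaxToken in SillyTavern has to be set to 18K; otherwise the WebUI truncates the context early.
  • With an 8K context length and Alpha set to about 2.5, repeated replies and a drop in reply quality are still unavoidable once the chat goes past 6K tokens.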
  • To get long, Roleplay-style replies in SillyTavern, make the following adjustments:
    • Use ChatML as the instruction format
    • Check "Wrap Sequences with Newline"
    • Set Last Output Sequence to ### Response (2 paragraphs, engaging, natural, authentic, descriptive, creative):
    • Set Custom Stopping Strings to ["<|im_end|>用户", "\n### Input"]
    • Set System Prompt to Avoid repetition, don't loop. Develop the plot slowly, always stay in character. Describe all actions in full, elaborate, explicit, graphic, and vivid detail. Mention all relevant sensory perceptions.
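
As a rough illustration of what the SillyTavern settings above produce, the sketch below assembles a ChatML prompt whose final line is the custom last output sequence, and trims a raw completion at the first stop string. The function names `build_prompt` and `truncate_at_stop` are hypothetical, and the exact way SillyTavern joins the sequences is an assumption here; treat this as a sketch of the idea, not as SillyTavern's implementation.

```python
# Minimal sketch (assumed layout): ChatML turns, then the custom last output
# sequence in place of the usual "<|im_start|>assistant" header.
IM_START, IM_END = "<|im_start|>", "<|im_end|>"
LAST_OUTPUT_SEQUENCE = (
    "### Response (2 paragraphs, engaging, natural, authentic, "
    "descriptive, creative):"
)
# Stop strings copied verbatim from the settings above.
STOP_STRINGS = ["<|im_end|>用户", "\n### Input"]


def build_prompt(system, turns):
    """Build a prompt from a system message and (role, text) turns.

    role is "user" or "assistant"; every sequence is followed by a newline,
    mirroring the "Wrap Sequences with Newline" option.
    """
    parts = [f"{IM_START}system\n{system}{IM_END}\n"]
    for role, text in turns:
        parts.append(f"{IM_START}{role}\n{text}{IM_END}\n")
    parts.append(LAST_OUTPUT_SEQUENCE + "\n")
    return "".join(parts)


def truncate_at_stop(text, stops=STOP_STRINGS):
    """Cut a raw completion at the earliest occurrence of any stop string."""
    cut = len(text)
    for stop in stops:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


prompt = build_prompt(
    "Avoid repetition, don't loop.",
    [("user", "Hello!"), ("assistant", "Hi there."), ("user", "Tell me a story.")],
)
print(prompt)
print(truncate_at_stop("Once upon a time.\n### Input: next"))
```

Printing the prompt before sending it to the backend is a quick way to check that the last line really is the `### Response (...)` sequence rather than a plain ChatML assistant header.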

description

  • This is a test model for exllamav2.
  • 4bpw conversion command: python convert.py -i acsr-v2-y34b -c exl2/0000.parquet -o acsr-v2-y34b-4bpw-hb6-exl2 -hb 6 -l 4096 -b 4.15
  • convert doc
  • calibration dataset: WikiText-2-v1
  • For oobabooga/text-generation-webui, add --trust-remote-code to CMD_FLAGS.txt and load the model with the ExLlamav2 loader.
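
The conversion command listed above can be broken into its parts as follows. This is only a sketch: `build_convert_command` is a hypothetical helper, and the comments describing each flag reflect my reading of the exllamav2 convert script, so check them against the convert doc before relying on them. The helper builds the argument list and prints it; it does not run the conversion.

```python
import shlex


def build_convert_command(model_dir, calibration_parquet, out_dir,
                          head_bits=6, length=4096, bits=4.15):
    """Assemble the argument list for the convert.py call quoted above."""
    return [
        "python", "convert.py",
        "-i", model_dir,            # input: directory of the unquantized model
        "-c", calibration_parquet,  # calibration data as a parquet file
        "-o", out_dir,              # output / working directory
        "-hb", str(head_bits),      # bits used for the output (head) layer
        "-l", str(length),          # tokens per calibration row
        "-b", str(bits),            # target average bits per weight
    ]


cmd = build_convert_command(
    "acsr-v2-y34b",
    "exl2/0000.parquet",
    "acsr-v2-y34b-4bpw-hb6-exl2",
)
# Print the command as a single shell-safe string for inspection.
print(shlex.join(cmd))
```

To actually launch the conversion, the list could be passed to `subprocess.run(cmd, check=True)` from inside the exllamav2 checkout.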