---
inference: false
language:
  - zh
  - en
license: unknown
model_name: Rain-2x7B-MoE-32k-v0.2
pipeline_tag: text-generation
prompt_template: '<s> SYS_PROMPT [INST] QUERY1 [/INST] RESPONSE1 [INST] QUERY2 [/INST]'
tags:
  - nlp
  - chinese
  - mistral
  - mixtral
  - traditional_chinese
  - merge
  - mergekit
  - MediaTek-Research/Breeze-7B-Instruct-v0_1
  - beowolx/CodeNinja-1.0-OpenChat-7B
  - mlabonne/Marcoro14-7B-slerp
---

# 小雨同學 2x7B

A Mandarin MoE (Mixture-of-Experts) model built on MediaTek's Breeze-7B-Instruct as the base, with two expert models.

Please interact with the model using the prompt format recommended by either Marcoro14-7B or Breeze-7B-Instruct; the model configuration is given below.

- v0.2 updates the tokenizer parameters.

# Rain-2x7B-MoE-32k-v0.2

This is an experimental Mixtral-architecture MoE model built from two 7B-sized fine-tunes: Breeze and CodeNinja serve as the experts on top of Marcoro14-7B-slerp as the base.

Model configuration is as follows:
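The original configuration file did not survive in this copy of the card. The mergekit-moe sketch below is an illustrative reconstruction based on the models named above and the activation keywords listed in the Notes; `gate_mode`, `dtype`, and the positive prompts for the Breeze expert are assumptions.

```yaml
# Illustrative reconstruction, not the actual merge config used for this model.
# gate_mode, dtype, and the Breeze positive_prompts are assumptions.
base_model: mlabonne/Marcoro14-7B-slerp
gate_mode: hidden
dtype: bfloat16
experts:
  - source_model: MediaTek-Research/Breeze-7B-Instruct-v0_1
    positive_prompts:        # assumed: route Mandarin / Traditional Chinese queries here
      - "中文"
      - "繁體中文"
  - source_model: beowolx/CodeNinja-1.0-OpenChat-7B
    positive_prompts:        # keywords listed in the Notes section below
      - "code"
      - "python"
      - "typescript"
      - "javascript"
      - "programming"
      - "algorithm"
```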

To use the model, follow one of the prompt templates suggested by the base models, as in the sketch below.
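A minimal usage sketch with Transformers, formatting a single-turn prompt according to the template in the metadata; the repository id, system prompt, and generation settings are illustrative assumptions, not part of this card.

```python
# Minimal usage sketch. The repository id, system prompt, and generation
# settings are illustrative assumptions, not part of this card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "yuuko-eth/Rain-2x7B-MoE-32k-v0.2"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

# Single-turn prompt following the template from the metadata:
#   <s> SYS_PROMPT [INST] QUERY1 [/INST] ...
# Most tokenizers prepend <s> automatically, so it is omitted from the string here.
sys_prompt = "你是一個樂於助人的繁體中文助理。"
query = "請簡短介紹一下台灣的夜市文化。"
prompt = f"{sys_prompt} [INST] {query} [/INST]"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```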

## Notes

Please evaluate the model before using it in any application pipeline. The coding expert is activated by keywords such as 'code', 'python', 'typescript', 'javascript', 'programming', and 'algorithm'.