---
inference: false
language:
  - zh
  - en
license: unknown
model_name: Chihiro-7B-v0.1
pipeline_tag: text-generation
prompt_template: <s> SYS_PROMPT [INST] QUERY1 [/INST] RESPONSE1 [INST] QUERY2 [/INST]
tags:
  - nlp
  - chinese
  - mistral
  - traditional_chinese
  - merge
  - mergekit
  - MediaTek-Research/Breeze-7B-Instruct-v0_1
  - mlabonne/Zebrafish-7B
---

# 千尋 7B v0.1

An experimental general-purpose Traditional Chinese foundation model, built as a SLERP merge of Zebrafish 7B and Breeze 7B 📚

GGUF Quants 👉 Chihiro-7B-v0.1-GGUF

Please use the prompt format recommended by Mistral 7B Instruct or Breeze 7B Instruct; the model configuration is shown below.

## Chihiro 7B v0.1

This is an experimental Mistral-architecture SLERP merge of two strong base models, Zebrafish 7B and Breeze 7B Instruct.

Model configuration is as follows:
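
The exact mergekit configuration block does not survive in this copy of the card. Below is a minimal, representative SLERP sketch for mergekit using the two base models listed in the tags; the layer ranges, interpolation schedule `t`, choice of `base_model`, and `dtype` are illustrative assumptions, not the settings actually used.

```yaml
# Hypothetical mergekit SLERP config (values are assumptions, not the actual ones used)
slices:
  - sources:
      - model: MediaTek-Research/Breeze-7B-Instruct-v0_1
        layer_range: [0, 32]
      - model: mlabonne/Zebrafish-7B
        layer_range: [0, 32]
merge_method: slerp
base_model: mlabonne/Zebrafish-7B    # assumed base; could equally be Breeze
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]   # example per-layer interpolation for attention weights
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]   # example per-layer interpolation for MLP weights
    - value: 0.5                     # default interpolation for all other tensors
dtype: bfloat16
```

A config in this form is typically applied with mergekit's `mergekit-yaml` command.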

To use the model, follow the prompt template suggested by either base model, or simply use the standard Mistral Instruct format.
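
As a concrete illustration of the `prompt_template` in the metadata, a single-turn prompt would look like the following (the system prompt and query are placeholder examples):

```
<s> You are a helpful assistant. [INST] Please introduce yourself in Traditional Chinese. [/INST]
```

Multi-turn conversations continue the same pattern: append the model's response, then the next `[INST] ... [/INST]` turn.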



## Benchmarks

Evaluation suite: OpenLLM

| Model | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K |
| --- | --- | --- | --- | --- | --- | --- |
| Chihiro-7B-v0.1 | 68.52 | 85.95 | (not yet evaluated) | 63.81 | 81.77 | 64.22 |

Evaluation suite: Nous

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
| --- | --- | --- | --- | --- | --- |
| Chihiro-7B-v0.1 | 45.16 | 75.26 | 63.82 | 47.38 | 57.91 |

Evaluated on Apr. 27, 2024, on an NVIDIA RTX 4090.