Update README.md
---
language:
- zh
---

# Kyara: Knowledge Yielding Adaptive Retrieval Augmentation for LLM Fine-tuning

<p align="left">
🤗 <a href="https://huggingface.co/zake7749/gemma-2-2b-it-chinese-kyara-dpo">Hugging Face</a>  | 🚀 <a href="https://github.com/zake7749/kyara">Github</a>  |  📑 <a href="#">Paper</a>  |  📖 <a href="https://github.com/zake7749/kyara/blob/main/document/README_EN.md">English</a>  |  📖 <a href="https://github.com/zake7749/kyara">Chinese</a>
</p>
<div style="text-align: center;">
<img src="https://i.imgur.com/QiWlcYJ.jpeg" alt="kyara"/>
All evaluations are based on zero-shot.

| **[CRUX](https://github.com/yuchenlin/ZeroEval)** | **21.25** | 20.75 |
| **Chinese-Reason-Bench** | **3.97** | 3.44 |

The aggregation method for the groups in TMMLUPlus is the macro average, following the practice in the official implementation.
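The aggregation step above can be sketched as follows. This is an illustrative sketch only: `macro_average` is a hypothetical helper, and the subject names and scores are invented, not taken from TMMLUPlus.

```python
# Macro average: the unweighted mean over subject scores, so every subject
# in a TMMLUPlus group counts equally regardless of how many questions it
# contains. (A micro average would instead weight subjects by question count.)
def macro_average(subject_scores: dict[str, float]) -> float:
    return sum(subject_scores.values()) / len(subject_scores)

# Invented example group with per-subject accuracies:
social_science_group = {
    "education": 0.42,
    "economics": 0.35,
    "geography": 0.40,
}
print(f"{macro_average(social_science_group):.4f}")  # -> 0.3900
```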
### **Alignment Benchmark**

| Metric | Kyara-2b-it | Gemma-2-2b-it |
|--------------------------|-------------|---------------|
| **[AlpacaEval LC Win Rate](https://github.com/tatsu-lab/alpaca_eval)** | **41.15** | 32.37 |
| **[AlpacaEval Win Rate](https://github.com/tatsu-lab/alpaca_eval)** | **43.95** | 32.94 |
| **[MT-Bench-TW](https://huggingface.co/datasets/MediaTek-Research/TCEval-v2)** | **6.94** | 6.59 |
| **[MT-Bench](https://huggingface.co/spaces/lmsys/mt-bench)** | 7.92 | **8.17** |
| **[Chatbot-Arena-Hard](https://github.com/lm-sys/arena-hard-auto)** | 18.9 | **19.4** |
#### [AlignBench](https://github.com/THUDM/AlignBench)

| Fold | Kyara-2b-it-CHT | Kyara-2b-it-CHS | Gemma-2-2b-it |
|---------------------------------|-----------------|-----------------|---------------|
| Fundamental Language Ability | 6.45 | **6.54** | 6.42 |
| Advanced Chinese Understanding | **5.27** | 5.20 | 5.03 |
| Open-ended Questions | 7.71 | **7.97** | 7.52 |
| Writing Ability | 7.98 | **8.06** | 7.76 |
| Logical Reasoning | **4.68** | 4.52 | 4.20 |
| Mathematics | **5.10** | 4.72 | 5.05 |
| Task-oriented Role Play | 7.87 | **8.05** | 7.42 |
| Professional Knowledge | **6.68** | 6.32 | 5.79 |
| Reasoning AVG. | **4.89** | 4.62 | 4.63 |
| Chinese Language AVG. | 6.99 | **7.02** | 6.66 |
| Overall | **5.94** | 5.82 | 5.64 |

The postfixes CHT and CHS denote Traditional Chinese and Simplified Chinese, respectively. To evaluate performance on Traditional Chinese in AlignBench, we used [OpenCC](https://github.com/BYVoid/OpenCC) with the `s2twp` configuration to convert all questions, evidence, and references from Simplified Chinese to Traditional Chinese.

## Feature

##### RLAIF

Dataset: [zake7749/kyara-chinese-preference-dpo-s0-30K](https://huggingface.co/datasets/zake7749/kyara-chinese-preference-dpo-s0-30K)

We extracted Chinese prompts from `Magpie-Align/Magpie-Qwen2-Pro-200K-Chinese`, `hfl/stem_zh_instruction`, and `FreedomIntelligence/Evol-Instruct-Chinese-GPT4`, and distributed the same prompt to four different LLMs. The competitors include:
```
如何解釋「俠者,不受名利所牽,但求俯仰無愧。正邪存乎在我,聲名留問諸君。非我桀傲不馴,乃因唐門薪火,自古如今!」
```

However, the model would respond that this quote is from "The Legend of the Condor Heroes."

```
這段話出自《射鵰英雄傳》中的唐門內部經典,由郭靖和黃蓉在唐門大殿上宣讀。這段話主要表達了唐門門人對於俠義精神和內在修為的追求,以及他們對外界的名利和聲名的不屑。以下是詳細解釋:

...

總結來說,這段話強調了唐門門人對名利和聲名的不屑,他們追求的是內在的修為和道德,不為外界的評價所動搖。這也體現了他們俠義精神和對正義的堅持。
```

It is recommended to exercise caution when using language models.