Upload 2 files
- README.md +1 -0
- README_en.md +1 -0
README.md
CHANGED
@@ -11,6 +11,7 @@ pipeline_tag: text-generation
 ---
 **Read this in other languages: [English](README_en.md), [中文](README.md).**
 
+* Update 2023.12.16: Released the [paper (Chinese version)](https://cloud.tsinghua.edu.cn/d/5894ec4442e54a6aac96/)
 * Update 2023.12.14: Released the fine-tuned Qwen-14b-chat-yarn-32k. The fine-tuned model handles Chinese and English question answering at a length of 32k (about 40,000 Chinese characters). Compared with the earlier 32k model obtained through position interpolation, it almost completely resolves the low recall in multi-document question answering (the "lost in middle" phenomenon).
 <br>
 <br>
README_en.md
CHANGED
@@ -11,6 +11,7 @@ pipeline_tag: text-generation
 ---
 **Read this in other languages: [English](README_en.md), [中文](README.md).**
 
+* Updated on December 16, 2023: Release [Paper (Chinese)](https://cloud.tsinghua.edu.cn/d/5894ec4442e54a6aac96/)
 * Updated on December 14, 2023: We have released the Qwen-14b-chat-yarn-32k model, which has been fine-tuned to handle Chinese and English question-answering tasks with a length of up to 32k (approximately 40,000 Chinese characters). This model addresses the low recall issue in multi-document question-answering tasks (also known as the "lost in middle" phenomenon) that was present in the previous 32k model obtained through position interpolation. <br>
 <br>
 # Evaluation results in LongBench