Update README.md
README.md CHANGED
@@ -21,15 +21,15 @@ Good at solving text summarization tasks, after fine-tuning on multiple Chinese
 
 | 需求 Demand | 任务 Task | 系列 Series | 模型 Model | 参数 Parameter | 额外 Extra |
 | :----: | :----: | :----: | :----: | :----: | :----: |
-| 通用 General | 自然语言转换 NLT | 燃灯 Randeng | PEGASUS |
+| 通用 General | 自然语言转换 NLT | 燃灯 Randeng | PEGASUS | 523M | 文本摘要任务-中文 Summary-Chinese |
 
 ## 模型信息 Model Information
 
 参考论文:[PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization](https://arxiv.org/pdf/1912.08777.pdf)
 
-基于[Randeng-Pegasus-523M-Chinese](https://huggingface.co/IDEA-CCNL/Randeng-Pegasus-523M-Chinese),我们在收集的7个中文领域的文本摘要数据集(约4M个样本)上微调得到了文本摘要版本(summary)。这7个数据集为:education, new2016zh, nlpcc, shence, sohu, thucnews和weibo。
+基于[Randeng-Pegasus-523M-Chinese](https://huggingface.co/IDEA-CCNL/Randeng-Pegasus-523M-Chinese),我们在收集的7个中文领域的文本摘要数据集(约4M个样本)的基础上,使用实体过滤后的数据集(约1.8M个样本)重新微调,在不损伤下游指标的情况下提升了摘要对原文的忠实度,得到了summary-v1版本。这7个数据集为:education, new2016zh, nlpcc, shence, sohu, thucnews和weibo。
 
-Based on [Randeng-Pegasus-523M-Chinese](https://huggingface.co/IDEA-CCNL/Randeng-Pegasus-523M-Chinese), we fine-tuned a text summarization version (summary) on 7 Chinese text summarization datasets totaling around 4M samples. The datasets include: education, new2016zh, nlpcc, shence, sohu, thucnews and weibo.
+Based on [Randeng-Pegasus-523M-Chinese](https://huggingface.co/IDEA-CCNL/Randeng-Pegasus-523M-Chinese), we fine-tuned a text summarization version (summary-v1) on an entity-filtered subset (about 1.8M samples) of the 7 Chinese text summarization datasets we collected (about 4M samples in total). This filtering improves the faithfulness of the generated summaries to the source text without hurting downstream metrics (e.g., ROUGE-L on LCSTS). The datasets include: education, new2016zh, nlpcc, shence, sohu, thucnews and weibo.
 
 
 ## 使用 Usage
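
The hunk ends at the `## 使用 Usage` heading, so the usage snippet itself is outside this diff. For orientation, inference with a Randeng-Pegasus checkpoint would look roughly like the sketch below; the tokenizer class and generation settings are assumptions rather than content from this commit, and the repo ID shown is the base model linked in the card (substitute the summary-v1 checkpoint's own ID).

```python
# A minimal usage sketch, assuming the standard Hugging Face `transformers`
# Pegasus classes. Note: Fengshenbang-LM Pegasus models may require a custom
# tokenizer script shipped in the model repo, so treat PegasusTokenizer here
# as an assumption and check the model card first.
from transformers import PegasusForConditionalGeneration, PegasusTokenizer

model_id = "IDEA-CCNL/Randeng-Pegasus-523M-Chinese"  # base model; swap in the summary-v1 repo ID

tokenizer = PegasusTokenizer.from_pretrained(model_id)
model = PegasusForConditionalGeneration.from_pretrained(model_id)

text = "请在此处填入待摘要的中文新闻原文。"  # placeholder input document
inputs = tokenizer(text, max_length=1024, truncation=True, return_tensors="pt")

# Beam-search settings here are illustrative, not the authors' settings.
summary_ids = model.generate(inputs["input_ids"], max_length=64, num_beams=4)
print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0])
```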
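
Separately, the new Model Information paragraphs describe filtering the roughly 4M training pairs down to about 1.8M using entities, but name no tool or rule. Below is a hedged sketch of one plausible reading, with jieba's POS tags standing in for whatever NER step the authors actually used.

```python
# Hypothetical entity-based filtering sketch. ENTITY_FLAGS uses jieba's POS
# tags (nr = person, ns = place, nt = organization) as a stand-in NER; the
# keep rule drops any (document, summary) pair whose summary mentions an
# entity that never appears in the source document.
import jieba.posseg as pseg

ENTITY_FLAGS = {"nr", "ns", "nt"}

def entities(text: str) -> set:
    """Entity-like tokens in `text`, via jieba POS tagging (approximate)."""
    return {token.word for token in pseg.cut(text) if token.flag in ENTITY_FLAGS}

def is_faithful(document: str, summary: str) -> bool:
    """True when every entity in the summary also occurs in the document."""
    return all(ent in document for ent in entities(summary))

# Toy example: the second pair would likely be dropped, since "李四"
# never appears in its source document.
pairs = [
    ("新华社报道,张三昨日在北京出席会议并发言。", "张三在北京出席会议"),
    ("公司今日发布财报,营收同比增长两成。", "李四宣布营收增长两成"),
]
filtered = [(doc, summ) for doc, summ in pairs if is_faithful(doc, summ)]
print(len(filtered), "pair(s) kept")
```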