init commit
README.md CHANGED
@@ -18,9 +18,9 @@ widget:

 ## 简介 Brief Introduction

-在Randeng-T5-Char-57M的基础上,收集了100个左右的中文数据集,进行Text2Text统一范式的有监督任务预训练。
+在Randeng-T5-Char-57M-Chinese的基础上,收集了100个左右的中文数据集,进行Text2Text统一范式的有监督任务预训练。

-On the basis of Randeng-T5-Char-57M, about 100 Chinese datasets were collected and used for supervised pre-training under a unified Text2Text paradigm.
+On the basis of Randeng-T5-Char-57M-Chinese, about 100 Chinese datasets were collected and used for supervised pre-training under a unified Text2Text paradigm.

 ## 模型分类 Model Taxonomy

@@ -33,9 +33,9 @@ On the basis of Randeng-T5-Char-57M, about 100 Chinese datasets were collected a

 参考论文 (Reference paper):[Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](http://jmlr.org/papers/v21/20-074.html)

-基于[Randeng-T5-Char-57M](https://huggingface.co/IDEA-CCNL/Randeng-T5-Char-57M),我们在收集的100+个中文领域的多任务数据集(从中采样了30w+个样本)上微调了它,得到了此多任务版本。这些多任务包括:情感分析,新闻分类,文本分类,意图识别,自然语言推理,多项选择,指代消解,抽取式阅读理解,实体识别,关键词抽取,生成式摘要。
+基于[Randeng-T5-Char-57M-Chinese](https://huggingface.co/IDEA-CCNL/Randeng-T5-Char-57M-Chinese),我们在收集的100+个中文领域的多任务数据集(从中采样了30w+个样本)上微调了它,得到了此多任务版本。这些多任务包括:情感分析,新闻分类,文本分类,意图识别,自然语言推理,多项选择,指代消解,抽取式阅读理解,实体识别,关键词抽取,生成式摘要。

-Based on [Randeng-T5-Char-57M](https://huggingface.co/IDEA-CCNL/Randeng-T5-Char-57M), we fine-tuned it on a collection of 100+ Chinese multi-task datasets (from which 300,000+ samples were drawn) to obtain this multi-task version. The tasks include: sentiment analysis, news classification, text classification, intent recognition, natural language inference, multiple choice, coreference resolution, extractive reading comprehension, entity recognition, keyword extraction, and abstractive summarization.
+Based on [Randeng-T5-Char-57M-Chinese](https://huggingface.co/IDEA-CCNL/Randeng-T5-Char-57M-Chinese), we fine-tuned it on a collection of 100+ Chinese multi-task datasets (from which 300,000+ samples were drawn) to obtain this multi-task version. The tasks include: sentiment analysis, news classification, text classification, intent recognition, natural language inference, multiple choice, coreference resolution, extractive reading comprehension, entity recognition, keyword extraction, and abstractive summarization.


 ## 使用 Usage
@@ -54,7 +54,7 @@ model.resize_token_embeddings(len(tokenizer))
 model.eval()

 # tokenize
-text = "
+text = "情感分析任务:【房间还是比较舒适的,酒店服务良好】这篇文章的情感态度是什么?正面/负面"
 encode_dict = tokenizer(text, max_length=512, padding='max_length', truncation=True)

 inputs = {
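The sentiment-analysis input in the usage snippet appears to follow a pattern: a task name, the passage wrapped in 【】, a question, and the candidate labels joined by `/`. The sketch below rebuilds that string from its parts. The helper name `build_prompt` and its parameters are hypothetical, and the template is inferred from this single example only; the other tasks listed in the README may be phrased differently.

```python
from typing import Sequence


def build_prompt(task: str, text: str, question: str, choices: Sequence[str]) -> str:
    """Assemble a Text2Text prompt shaped like the README's sentiment example.

    Hypothetical helper: the layout (task name, fullwidth colon, passage in
    【】, question, '/'-joined labels) is inferred from one example string.
    """
    return f"{task}:【{text}】{question}{'/'.join(choices)}"


# Rebuild the example input from the usage section piece by piece.
prompt = build_prompt(
    task="情感分析任务",
    text="房间还是比较舒适的,酒店服务良好",
    question="这篇文章的情感态度是什么?",
    choices=["正面", "负面"],
)
print(prompt)
```

The resulting string can be passed to the tokenizer in place of the hard-coded `text` literal, which keeps the passage and the label set easy to swap.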