Update README.md
#2
by ifurther

README.md CHANGED
@@ -1,3 +1,7 @@
+---
+language:
+- zh
+---
 # Chinese Pretrained Longformer Model | Longformer_ZH with PyTorch
 
 Compared with the O(n^2) complexity of the standard Transformer, Longformer offers a way to process document sequences of up to 4K characters with linear complexity. Longformer attention combines the usual local self-attention with a global attention mechanism, which helps the model learn information from very long sequences.
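A minimal usage sketch of the local-plus-global attention pattern described above. It assumes the checkpoint loads with the stock `transformers` Longformer classes and a BERT-style Chinese tokenizer; neither is confirmed by this diff, and the repository's own `LongformerZhForMaksedLM` wrapper that appears in the hunk context below may be the intended entry point.

```python
# Sketch only: assumes standard transformers Longformer classes and a
# BERT-style tokenizer work for 'ValkyriaLenneth/longformer_zh'.
import torch
from transformers import BertTokenizer, LongformerModel

tokenizer = BertTokenizer.from_pretrained("ValkyriaLenneth/longformer_zh")
model = LongformerModel.from_pretrained("ValkyriaLenneth/longformer_zh")

long_text = "……"  # a long Chinese document, up to roughly 4K characters
inputs = tokenizer(long_text, return_tensors="pt", truncation=True, max_length=4096)

# Sliding-window (local) attention everywhere, global attention on [CLS].
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1

with torch.no_grad():
    outputs = model(**inputs, global_attention_mask=global_attention_mask)
print(outputs.last_hidden_state.shape)  # (1, seq_len, hidden_size)
```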
@@ -37,8 +41,8 @@ LongformerZhForMaksedLM.from_pretrained('ValkyriaLenneth/longformer_zh')
 ## About Pretraining
 - Our pretraining corpus comes from https://github.com/brightmart/nlp_chinese_corpus. Following the setup of the original Longformer paper, we pretrain on a mixture of four different Chinese corpora.
-- Our model is based on Roberta_zh_mid.
+- Our model is based on [Roberta_zh_mid](https://github.com/brightmart/roberta_zh); the pretraining script is adapted from https://github.com/allenai/longformer/blob/master/scripts/convert_model_to_long.ipynb (a rough sketch of that conversion follows this hunk).
 
 - On top of the original setup, we use `Whole-Word-Masking` during pretraining to better fit the characteristics of Chinese (illustrated after the conversion sketch below).
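A rough sketch of the position-embedding extension idea behind the `convert_model_to_long.ipynb` notebook referenced above, as a simplification only: the checkpoint path is a placeholder, and the real notebook also handles the position-index offset of RoBERTa-style embeddings and copies the attention weights into Longformer's sliding-window attention.

```python
# Simplified sketch: extend a 512-position checkpoint to 4096 positions
# by tiling the pretrained position-embedding table.
# ("path/to/roberta_zh_mid" is a placeholder, not a published model ID.)
import torch
from transformers import BertModel

model = BertModel.from_pretrained("path/to/roberta_zh_mid")
old_pos = model.embeddings.position_embeddings.weight.detach()  # (512, hidden)
max_pos = 4096

new_pos = old_pos.new_empty(max_pos, old_pos.size(1))
step = old_pos.size(0)
# Copy the pretrained table repeatedly so that long positions start from
# trained values instead of random initialization.
for start in range(0, max_pos, step):
    end = min(start + step, max_pos)
    new_pos[start:end] = old_pos[: end - start]

model.embeddings.position_embeddings = torch.nn.Embedding.from_pretrained(new_pos, freeze=False)
model.config.max_position_embeddings = max_pos
```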
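And a character-level illustration of the `Whole-Word-Masking` idea: all characters that the segmenter groups into one word are masked together, instead of masking characters independently. The `jieba` segmenter and the helper below are illustrative assumptions, not the project's actual data pipeline.

```python
# Character-level illustration of Whole-Word-Masking for Chinese.
# (Hypothetical helper; the project's real pipeline may differ.)
import random
import jieba  # assumed Chinese word segmenter

def whole_word_mask(text, mask_token="[MASK]", mask_prob=0.15):
    out = []
    for word in jieba.cut(text):
        if random.random() < mask_prob:
            out.extend([mask_token] * len(word))  # mask the whole word at once
        else:
            out.extend(list(word))
    return out

print(whole_word_mask("中文预训练Longformer模型"))
```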
@@ -97,6 +101,4 @@ LongformerZhForMaksedLM.from_pretrained('ValkyriaLenneth/longformer_zh')
 ## Acknowledgements
 We thank the Okumura-Funakoshi Lab at Tokyo Institute of Technology for providing the computing resources.
 
 Thanks to the Okumura-Funakoshi Lab at Tokyo Institute of Technology, which provided the devices and the opportunity for me to finish this project.
-
-