mrm8488 committed on
Commit 4b323c3
1 Parent(s): c949892

Create README.md

---
language:
- es
license: mit
widget:
- text: "Manuel Romero ha creado con el equipo de BERTIN un modelo que procesa documentos <mask> largos."
tags:
- longformer
- bertin
- spanish
datasets:
- spanish_large_corpus
---

# longformer-base-4096-spanish

## [Longformer](https://arxiv.org/abs/2004.05150) is a Transformer model for long documents.

20
+ `longformer-base-4096` is a BERT-like model started from the RoBERTa checkpoint (**BERTIN** in this case) and pre-trained for *MLM* on long documents (from BETO's `all_wikis`). It supports sequences of length up to 4,096!
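To see why 4,096 tokens is feasible here, compare the cost of standard full self-attention with Longformer's sliding-window attention. This is a rough back-of-the-envelope sketch in plain Python (the window size of 512 follows the Longformer paper; it is not read from this model's config):

```python
def full_attention_scores(n):
    # Standard self-attention scores every token against every other token: O(n^2).
    return n * n

def sliding_window_scores(n, w):
    # Sliding-window (local) attention scores each token only against a
    # window of w neighbours, so the cost grows linearly in n: O(n * w).
    return n * w

n, w = 4096, 512
print(full_attention_scores(n))     # 16777216 score computations
print(sliding_window_scores(n, w))  # 2097152 -- 8x fewer at this length
```

The gap widens linearly as sequences grow, which is what makes pre-training on long documents practical.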
**Longformer** uses a combination of sliding-window (*local*) attention and *global* attention. Global attention is user-configured based on the task, allowing the model to learn task-specific representations.
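The combined pattern can be sketched as a boolean attention mask: each token sees a local window of neighbours, while user-designated global tokens see, and are seen by, every position. This is an illustrative toy only (the real implementation uses fused banded attention kernels, and the sequence length, window, and global index below are made up):

```python
def attention_pattern(seq_len, window, global_idx=()):
    """Return a boolean matrix: mask[i][j] is True if token i attends to j.

    Local attention: each token sees up to `window` neighbours on each side.
    Global attention: tokens in `global_idx` attend everywhere, and every
    token attends back to them (symmetric).
    """
    mask = [[abs(i - j) <= window for j in range(seq_len)] for i in range(seq_len)]
    for g in global_idx:
        for j in range(seq_len):
            mask[g][j] = True  # the global token attends to all positions
            mask[j][g] = True  # all positions attend to the global token
    return mask

# Tiny example: 8 tokens, window of 1, token 0 (a [CLS]-style token) global.
m = attention_pattern(8, 1, global_idx=[0])
print(m[0])  # row 0: the global token attends to every position
print(m[4])  # row 4: its local window {3, 4, 5} plus the global token 0
```

Tokens given global attention are typically task-specific, e.g. the classification token for sequence classification or the question tokens for QA.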
This model was built following the research by [Iz Beltagy, Matthew E. Peters, and Arman Cohan](https://arxiv.org/abs/2004.05150).