markussagen committed on
Commit • c03980e • 1 Parent(s): adb08df
Improved README
README.md CHANGED
@@ -1,5 +1,5 @@
-## XLM-R Longformer Model
-XLM-R Longformer is a XLM-R model
+## XLM-R Longformer Model / XLM-Long
+XLM-R Longformer (or XLM-Long for short) is an XLM-R model that has been extended to allow sequence lengths of up to 4096 tokens, instead of the regular 512. The model was pre-trained from the XLM-RoBERTa checkpoint using the Longformer [pre-training scheme](https://github.com/allenai/longformer/blob/master/scripts/convert_model_to_long.ipynb) on the English WikiText-103 corpus.

 The reason for this was to investigate methods for creating efficient Transformers for low-resource languages, such as Swedish, without the need to pre-train them on long-context datasets in each respective language. The trained model came as the result of a master's thesis project at [Peltarion](https://peltarion.com/) and was fine-tuned on multilingual question-answering tasks, with code available [here](https://github.com/MarkusSagen/Master-Thesis-Multilingual-Longformer#xlm-r).

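A minimal usage sketch of what the updated description implies: loading the checkpoint and encoding an input longer than XLM-R's usual 512-token limit. The Hub model ID `markussagen/xlm-roberta-longformer-base-4096` and the generic `AutoModel`/`AutoTokenizer` classes are assumptions for illustration; they are not stated in this commit.

```python
# Hedged sketch: the model ID and classes below are assumptions, not part of this commit.
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "markussagen/xlm-roberta-longformer-base-4096"  # assumed Hub ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)

# A long Swedish input: XLM-Long accepts up to 4096 tokens per sequence,
# where plain XLM-R would truncate at 512.
text = "Stockholm är Sveriges huvudstad. " * 300
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=4096)
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, <=4096, hidden_size)
```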