metadata
tags:
- longformer
- xlmr
- XLM-RoBERTa
language: multilingual
license: apache-2.0
datasets:
- wikitext
XLM-R Longformer Model / XLM-Long
This is an XLM-RoBERTa Longformer model that was pre-trained from the XLM-RoBERTa checkpoint using the Longformer pre-training scheme on the English WikiText-103 corpus.
This model is identical to markussagen's XLM-R Longformer model, except that the weights have been transferred to a Longformer model so that it can be loaded with AutoModel.from_pretrained() without external libraries.
How to Use
The model can be fine-tuned on a downstream task in the usual way, for instance question answering (QA):
import torch
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

MAX_SEQUENCE_LENGTH = 4096
MODEL_NAME_OR_PATH = "AshtonIsNotHere/xlm-roberta-long-base-4096"

# Load the tokenizer that ships with the checkpoint.
tokenizer = AutoTokenizer.from_pretrained(
    MODEL_NAME_OR_PATH,
    max_length=MAX_SEQUENCE_LENGTH,
    padding="max_length",
    truncation=True,
)

# Load the model with a question-answering head, ready for fine-tuning.
model = AutoModelForQuestionAnswering.from_pretrained(
    MODEL_NAME_OR_PATH,
    max_length=MAX_SEQUENCE_LENGTH,
)
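After fine-tuning, a question-answering head produces one start score and one end score per token, and these still have to be turned into an answer span. The following is a minimal sketch of one common way to do that, not part of the original model card: the helper name `best_answer_span`, the `max_answer_len` default, and the toy logits are all assumptions made for illustration. It uses plain Python lists so that it runs without downloading the model.

```python
from typing import List, Tuple


def best_answer_span(
    start_logits: List[float],
    end_logits: List[float],
    max_answer_len: int = 30,
) -> Tuple[int, int, float]:
    """Return (start, end, score) of the highest-scoring valid span.

    Hypothetical helper: a span is valid when start <= end and it covers
    at most `max_answer_len` tokens. The score is start + end logit.
    """
    if len(start_logits) != len(end_logits):
        raise ValueError("start_logits and end_logits must have the same length")
    if not start_logits:
        raise ValueError("logits must not be empty")

    best = (0, 0, float("-inf"))
    for start, start_score in enumerate(start_logits):
        # Only consider end positions at or after the start,
        # and no further than the maximum answer length allows.
        last_end = min(start + max_answer_len, len(end_logits))
        for end in range(start, last_end):
            score = start_score + end_logits[end]
            if score > best[2]:
                best = (start, end, score)
    return best


# Toy logits for a 6-token input (made-up numbers, not model output).
start_logits = [0.1, 0.2, 3.0, 0.3, 0.1, 0.0]
end_logits = [0.0, 0.1, 0.5, 2.5, 0.2, 0.1]

start, end, score = best_answer_span(start_logits, end_logits)
print(start, end, score)
```

With real model output, the two lists would come from the `start_logits` and `end_logits` fields of the question-answering output for a single example, and the returned token indices would be mapped back to text with the tokenizer.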