<!--Copyright 2021 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# RoFormer

## Overview
The RoFormer model was proposed in [RoFormer: Enhanced Transformer with Rotary Position Embedding](https://arxiv.org/pdf/2104.09864v1.pdf) by Jianlin Su, Yu Lu, Shengfeng Pan, Bo Wen and Yunfeng Liu.

The abstract from the paper is the following:
*Position encoding in the transformer architecture provides supervision for dependency modeling between elements at
different positions in the sequence. We investigate various methods to encode positional information in
transformer-based language models and propose a novel implementation named Rotary Position Embedding (RoPE). The
proposed RoPE encodes absolute positional information with a rotation matrix and naturally incorporates explicit
relative position dependency in the self-attention formulation. Notably, RoPE comes with valuable properties such as
the flexibility of being expanded to any sequence length, decaying inter-token dependency with increasing relative
distances, and the capability of equipping linear self-attention with relative position encoding. As a result, the
enhanced transformer with rotary position embedding, or RoFormer, achieves superior performance in tasks with long
texts. We release the theoretical analysis along with some preliminary experimental results on Chinese data. The
ongoing experiments on English benchmarks will be updated soon.*
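The rotation trick is easy to see in isolation. Below is a minimal NumPy sketch of the idea described in the abstract (an illustrative toy, not the library's implementation): each pair of feature dimensions is rotated by a position-dependent angle, so the dot product of a rotated query and key depends only on their relative offset.

```python
import numpy as np

def rope(x, pos, base=10000):
    """Rotate vector x (even dim) as if it sits at sequence position `pos`.

    Feature pair (2i, 2i+1) is rotated by the angle pos * base**(-2i/dim),
    which encodes the absolute position as a rotation.
    """
    dim = x.shape[-1]
    theta = base ** (-np.arange(0, dim, 2) / dim)  # per-pair frequencies
    cos, sin = np.cos(pos * theta), np.sin(pos * theta)
    x1, x2 = x[0::2], x[1::2]
    out = np.empty_like(x)
    out[0::2] = x1 * cos - x2 * sin  # standard 2D rotation, applied per pair
    out[1::2] = x1 * sin + x2 * cos
    return out

rng = np.random.default_rng(0)
q, k = rng.normal(size=8), rng.normal(size=8)
# Scores for (pos 3, pos 5) and (pos 10, pos 12) match: both pairs are
# offset by 2, so relative position falls out of the absolute rotations.
print(np.allclose(rope(q, 3) @ rope(k, 5), rope(q, 10) @ rope(k, 12)))  # True
```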
Tips:

- RoFormer is a BERT-like autoencoding model with rotary position embeddings. Rotary position embeddings have shown
improved performance on classification tasks with long texts.
This model was contributed by [junnyu](https://huggingface.co/junnyu). The original code can be found [here](https://github.com/ZhuiyiTechnology/roformer).
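For a quick start, here is a minimal masked language modeling sketch. The checkpoint name is an assumption based on the contributor's Hub account, and the Chinese tokenizer may additionally require the `rjieba` package; any RoFormer checkpoint works the same way.

```python
import torch
from transformers import RoFormerForMaskedLM, RoFormerTokenizer

# Assumed checkpoint: the contributor's Chinese base model on the Hub.
tokenizer = RoFormerTokenizer.from_pretrained("junnyu/roformer_chinese_base")
model = RoFormerForMaskedLM.from_pretrained("junnyu/roformer_chinese_base")

inputs = tokenizer("今天天气非常[MASK]。", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Take the highest-scoring token at the masked position
mask_index = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
predicted_id = logits[0, mask_index].argmax(-1)
print(tokenizer.decode([int(predicted_id)]))
```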
## Documentation resources

- [Text classification task guide](../tasks/sequence_classification)
- [Token classification task guide](../tasks/token_classification)
- [Question answering task guide](../tasks/question_answering)
- [Causal language modeling task guide](../tasks/language_modeling)
- [Masked language modeling task guide](../tasks/masked_language_modeling)
- [Multiple choice task guide](../tasks/multiple_choice)
## RoFormerConfig

[[autodoc]] RoFormerConfig

## RoFormerTokenizer

[[autodoc]] RoFormerTokenizer
- build_inputs_with_special_tokens
- get_special_tokens_mask
- create_token_type_ids_from_sequences
- save_vocabulary

## RoFormerTokenizerFast

[[autodoc]] RoFormerTokenizerFast
- build_inputs_with_special_tokens

## RoFormerModel

[[autodoc]] RoFormerModel
- forward

## RoFormerForCausalLM

[[autodoc]] RoFormerForCausalLM
- forward

## RoFormerForMaskedLM

[[autodoc]] RoFormerForMaskedLM
- forward

## RoFormerForSequenceClassification

[[autodoc]] RoFormerForSequenceClassification
- forward

## RoFormerForMultipleChoice

[[autodoc]] RoFormerForMultipleChoice
- forward

## RoFormerForTokenClassification

[[autodoc]] RoFormerForTokenClassification
- forward

## RoFormerForQuestionAnswering

[[autodoc]] RoFormerForQuestionAnswering
- forward

## TFRoFormerModel

[[autodoc]] TFRoFormerModel
- call

## TFRoFormerForMaskedLM

[[autodoc]] TFRoFormerForMaskedLM
- call

## TFRoFormerForCausalLM

[[autodoc]] TFRoFormerForCausalLM
- call

## TFRoFormerForSequenceClassification

[[autodoc]] TFRoFormerForSequenceClassification
- call

## TFRoFormerForMultipleChoice

[[autodoc]] TFRoFormerForMultipleChoice
- call

## TFRoFormerForTokenClassification

[[autodoc]] TFRoFormerForTokenClassification
- call

## TFRoFormerForQuestionAnswering

[[autodoc]] TFRoFormerForQuestionAnswering
- call

## FlaxRoFormerModel

[[autodoc]] FlaxRoFormerModel
- __call__

## FlaxRoFormerForMaskedLM

[[autodoc]] FlaxRoFormerForMaskedLM
- __call__

## FlaxRoFormerForSequenceClassification

[[autodoc]] FlaxRoFormerForSequenceClassification
- __call__

## FlaxRoFormerForMultipleChoice

[[autodoc]] FlaxRoFormerForMultipleChoice
- __call__

## FlaxRoFormerForTokenClassification

[[autodoc]] FlaxRoFormerForTokenClassification
- __call__

## FlaxRoFormerForQuestionAnswering

[[autodoc]] FlaxRoFormerForQuestionAnswering
- __call__