Transformers documentation

MPNet

Transformers

You are viewing v4.24.0 version. A newer version v4.57.1 is available.

Join the Hugging Face community

and get access to the augmented documentation experience

Collaborate on models, datasets and Spaces

Faster examples with accelerated inference

Switch between documentation themes

to get started

MPNet

Overview

The MPNet model was proposed in MPNet: Masked and Permuted Pre-training for Language Understanding by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu.

MPNet adopts a novel pre-training method, named masked and permuted language modeling, to inherit the advantages of masked language modeling and permuted language modeling for natural language understanding.

The abstract from the paper is the following:

BERT adopts masked language modeling (MLM) for pre-training and is one of the most successful pre-training models. Since BERT neglects dependency among predicted tokens, XLNet introduces permuted language modeling (PLM) for pre-training to address this problem. However, XLNet does not leverage the full position information of a sentence and thus suffers from position discrepancy between pre-training and fine-tuning. In this paper, we propose MPNet, a novel pre-training method that inherits the advantages of BERT and XLNet and avoids their limitations. MPNet leverages the dependency among predicted tokens through permuted language modeling (vs. MLM in BERT), and takes auxiliary position information as input to make the model see a full sentence and thus reducing the position discrepancy (vs. PLM in XLNet). We pre-train MPNet on a large-scale dataset (over 160GB text corpora) and fine-tune on a variety of down-streaming tasks (GLUE, SQuAD, etc). Experimental results show that MPNet outperforms MLM and PLM by a large margin, and achieves better results on these tasks compared with previous state-of-the-art pre-trained methods (e.g., BERT, XLNet, RoBERTa) under the same model setting.

Tips:

MPNet doesn’t have token_type_ids, you don’t need to indicate which token belongs to which segment. just separate your segments with the separation token tokenizer.sep_token (or [sep]).

The original code can be found here.

Transformers

MPNet

Overview

MPNetConfig

class transformers.MPNetConfig

MPNetTokenizer

class transformers.MPNetTokenizer

build_inputs_with_special_tokens

get_special_tokens_mask

create_token_type_ids_from_sequences

save_vocabulary

MPNetTokenizerFast

class transformers.MPNetTokenizerFast

create_token_type_ids_from_sequences

MPNetModel

class transformers.MPNetModel

forward

MPNetForMaskedLM

class transformers.MPNetForMaskedLM

forward

MPNetForSequenceClassification

class transformers.MPNetForSequenceClassification

forward

MPNetForMultipleChoice

class transformers.MPNetForMultipleChoice

forward

MPNetForTokenClassification

class transformers.MPNetForTokenClassification

forward

MPNetForQuestionAnswering

class transformers.MPNetForQuestionAnswering

forward

TFMPNetModel

class transformers.TFMPNetModel

call

TFMPNetForMaskedLM

class transformers.TFMPNetForMaskedLM

call

TFMPNetForSequenceClassification

class transformers.TFMPNetForSequenceClassification

call

TFMPNetForMultipleChoice

class transformers.TFMPNetForMultipleChoice

call

TFMPNetForTokenClassification

class transformers.TFMPNetForTokenClassification

call

TFMPNetForQuestionAnswering

class transformers.TFMPNetForQuestionAnswering

call