PMLM is the language model described in [Probabilistically Masked Language Model Capable of Autoregressive Generation in Arbitrary Word Order](https://arxiv.org/abs/2004.11579), which is trained with probabilistic masking. This is the "PMLM-A" variant, adapted from [the authors' original implementation](https://github.com/huawei-noah/Pretrained-Language-Model/tree/master/PMLM).