AntiBERTa2 🧬

AntiBERTa2 is an antibody-specific language model based on the RoFormer architecture and pre-trained using masked language modelling. We also provide a multimodal version, AntiBERTa2-CSSP, which has been trained using a contrastive objective similar to the CLIP method. Further details on both AntiBERTa2 and AntiBERTa2-CSSP are described in our paper accepted at the NeurIPS MLSB Workshop 2023.
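To make "a contrastive objective similar to CLIP" concrete, here is a minimal sketch of the generic symmetric contrastive (InfoNCE) loss over a batch of paired embeddings from two modalities. It is not taken from the AntiBERTa2-CSSP training code: the function names, the `temperature` value, and the toy embeddings are all assumptions made for illustration.

```python
import math


def l2_normalise(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax of a list of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def clip_style_loss(emb_a, emb_b, temperature=0.07):
    """Symmetric contrastive loss; emb_a[i] and emb_b[i] are a matched pair.

    `temperature` is an illustrative default, not a value from the paper.
    """
    a = [l2_normalise(v) for v in emb_a]
    b = [l2_normalise(v) for v in emb_b]
    n = len(a)
    # logits[i][j] = cosine similarity of a[i] and b[j], scaled by temperature
    logits = [[sum(x * y for x, y in zip(u, v)) / temperature for v in b] for u in a]
    # Direction A -> B: row i should assign the highest probability to column i
    loss_ab = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Direction B -> A: same computation on the transposed similarity matrix
    columns = [list(col) for col in zip(*logits)]
    loss_ba = -sum(log_softmax(columns[j])[j] for j in range(n)) / n
    return 0.5 * (loss_ab + loss_ba)


# Toy batch: matched pairs point in the same direction, mismatched pairs do not
aligned = clip_style_loss([[1.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 3.0]])
shuffled = clip_style_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 3.0], [2.0, 0.0]])
```

The loss is small when each embedding is closest to its own partner and large when the pairing is shuffled, which is what pulls the two modalities into a shared space during training.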

Both AntiBERTa2 models are available for non-commercial use only. Output antibody sequences (e.g. from infilling via masked language models) may likewise only be used for non-commercial purposes. If you are seeking commercial use of our models or generated antibodies, please reach out to us at info@alchemab.com.

| Model variant | Parameters | Config (layers, attention heads, hidden size) |
| --- | --- | --- |
| AntiBERTa2 | 202M | 16L, 16H, 1024d |
| AntiBERTa2-CSSP | 202M | 16L, 16H, 1024d |
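As a rough sanity check on the parameter count, the sketch below estimates the encoder weight matrices from the config, reading "16L, 16H, 1024d" as 16 layers, 16 attention heads, and a hidden size of 1024. The feed-forward width of 4x the hidden size is an assumption (the usual Transformer default, not stated in this card), and biases, layer norms, embeddings, and the language-modelling head are ignored.

```python
def approx_encoder_weights(num_layers, hidden_size, ffn_multiplier=4):
    """Back-of-envelope count of encoder weight-matrix parameters.

    ffn_multiplier=4 is an assumed default, not a value from the model card.
    """
    # Query, key, value, and output projections: four hidden x hidden matrices
    attention = 4 * hidden_size * hidden_size
    # Feed-forward block: hidden -> ffn width -> hidden
    feed_forward = 2 * hidden_size * (ffn_multiplier * hidden_size)
    return num_layers * (attention + feed_forward)


estimate = approx_encoder_weights(num_layers=16, hidden_size=1024)
print(f"{estimate / 1e6:.1f}M encoder weights")
```

Under these assumptions the estimate comes to about 201M, which is in line with the 202M listed above once the omitted terms are added. The number of attention heads does not change the total, since the heads split the same projection matrices.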

Example usage

>>> from transformers import (
...     RoFormerForMaskedLM,
...     RoFormerForSequenceClassification,
...     RoFormerTokenizer,
...     pipeline,
... )
>>> tokenizer = RoFormerTokenizer.from_pretrained("alchemab/antiberta2")
>>> model = RoFormerForMaskedLM.from_pretrained("alchemab/antiberta2")

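>>> # The task must be named explicitly when passing a model object rather than a model id
>>> filler = pipeline("fill-mask", model=model, tokenizer=tokenizer)
>>> filler("Ḣ Q V Q ... C A [MASK] D ... T V S S")  # fill in the mask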

>>> # Loading the checkpoint for sequence classification raises a warning that
>>> # the newly added classification head is randomly initialized.
>>> new_model = RoFormerForSequenceClassification.from_pretrained("alchemab/antiberta2")
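The fill-mask example above passes residues separated by single spaces, with `[MASK]` standing in for the position to predict. A small helper along the following lines can build that string from a plain amino-acid sequence. The function name `format_for_antiberta2` and the toy sequence are hypothetical and not part of the model's API.

```python
def format_for_antiberta2(sequence, mask_positions=()):
    """Space-separate residues and replace the given 0-based positions with [MASK].

    Hypothetical convenience helper; it only builds the input string.
    """
    sequence = sequence.strip()
    masked = set(mask_positions)
    for pos in masked:
        if not 0 <= pos < len(sequence):
            raise IndexError(f"mask position {pos} is outside the sequence")
    tokens = ["[MASK]" if i in masked else residue for i, residue in enumerate(sequence)]
    return " ".join(tokens)


# Toy fragment for illustration only, not a complete antibody sequence
text = format_for_antiberta2("QVQLVQSG", mask_positions=[3])
print(text)
```

The resulting string can be passed to the `filler` pipeline in place of a hand-written one.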