random-roberta-mini

We introduce random-roberta-mini, an unpretrained version of a mini RoBERTa model (4 layers and 256 heads). The weights of random-roberta-mini are randomly initialized, which is particularly useful when training a language model from scratch or benchmarking the effect of pretraining.

Note that the tokenizer of random-roberta-mini is the same as that of roberta-base: obtaining a random tokenizer is not a trivial task, and a random tokenizer is less meaningful than random weights.

A debatable advantage of pulling random-roberta-mini from Hugging Face is that you get the same random weights every time without having to fix a random seed.
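
For comparison, the seed-based alternative mentioned above might look like the following minimal sketch. It assumes PyTorch's global generator drives the weight initialization, and the helper names `build_random_roberta` and `same_weights` are illustrative, not part of this model card's code.

```python
import torch
from transformers import RobertaConfig, RobertaModel


def build_random_roberta(seed=None):
    """Build a 4-layer, 256-head RoBERTa, optionally fixing the seed first."""
    if seed is not None:
        # Assumption: initialization draws from torch's global RNG
        torch.manual_seed(seed)
    config = RobertaConfig(num_attention_heads=256, num_hidden_layers=4)
    return RobertaModel(config)


def same_weights(model_a, model_b):
    """Return True if both models hold exactly the same tensors."""
    state_a, state_b = model_a.state_dict(), model_b.state_dict()
    return state_a.keys() == state_b.keys() and all(
        torch.equal(state_a[name], state_b[name]) for name in state_a
    )


# Same seed: the two models should be identical
seeded_match = same_weights(build_random_roberta(seed=0), build_random_roberta(seed=0))

# No seed: each call draws fresh random weights
unseeded_match = same_weights(build_random_roberta(), build_random_roberta())

print(seeded_match, unseeded_match)
```

With a fixed seed, everyone who wants the same weights has to agree on the seed and on the library versions; downloading a stored checkpoint sidesteps that.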

The code to obtain such a random model:


from transformers import AutoTokenizer, RobertaConfig, RobertaModel


def get_custom_blank_roberta(h=768, l=12):
    # Build a RoBERTa configuration with h attention heads and l hidden layers
    configuration = RobertaConfig(num_attention_heads=h, num_hidden_layers=l)

    # Instantiate a model from the configuration; its weights are randomly initialized
    model = RobertaModel(configuration)

    return model


rank = "mini"
h = 256
l = 4
model_type = "roberta"
# The tokenizer is reused from roberta-base
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model_name = "random-" + model_type + "-" + rank
model = get_custom_blank_roberta(h, l)
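
As a quick sanity check on the settings above, the sketch below works out the per-head dimension. It assumes the `RobertaConfig` default `hidden_size` of 768, since the code above does not override it, and that the hidden size must divide evenly across the attention heads. The helper `head_dim` is hypothetical and only mirrors that arithmetic; it is not a transformers function.

```python
def head_dim(hidden_size, num_attention_heads):
    """Per-head dimension, assuming the hidden size is split evenly across heads."""
    if hidden_size % num_attention_heads != 0:
        raise ValueError(
            f"hidden_size {hidden_size} is not a multiple of "
            f"num_attention_heads {num_attention_heads}"
        )
    return hidden_size // num_attention_heads


# random-roberta-mini: 256 heads over the assumed default hidden size of 768
mini_head_dim = head_dim(768, 256)

# Default RoBERTa base configuration: 12 heads over a hidden size of 768
base_head_dim = head_dim(768, 12)

print(mini_head_dim, base_head_dim)
```

Under these assumptions each of the 256 heads is only 3 dimensions wide, compared with 64 for the default 12-head configuration.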