
This is a masked language model trained on the IMDB dataset by fine-tuning DistilBERT.

Note:

I published a tutorial explaining how transformers work and how to train a masked language model with the Transformers library: https://olafenwaayoola.medium.com/the-concept-of-transformers-and-training-a-transformers-model-45a09ae7fb50
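
For reference, the sketch below shows how a DistilBERT checkpoint could be fine-tuned on IMDB for masked language modeling with the Transformers Trainer. It is a minimal illustration, not the exact training script used for this model; the base checkpoint, sequence length, masking probability, and other hyperparameters are assumptions.

from datasets import load_dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

# Base checkpoint and hyperparameters below are illustrative assumptions.
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("distilbert-base-uncased")

# Tokenize the raw IMDB reviews.
imdb = load_dataset("imdb", split="train")
tokenized = imdb.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=256),
    batched=True, remove_columns=imdb.column_names)

# The collator randomly masks tokens so the model learns to fill them in.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(output_dir="Masked-Language-Model",
                         per_device_train_batch_size=16, num_train_epochs=1)
Trainer(model=model, args=args, train_dataset=tokenized,
        data_collator=collator).train()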

Rest API Code for Testing the Masked Language Model

Inference API Python code for testing the masked language model.

import requests

API_URL = "https://api-inference.huggingface.co/models/ayoolaolafenwa/Masked-Language-Model"
# Replace the placeholder below with your own Hugging Face access token.
headers = {"Authorization": "Bearer hf_YourAccessToken"}

def query(payload):
    # Send the input text to the hosted Inference API and return the JSON response.
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

output = query({
    "inputs": "Washington DC is the [MASK] of USA.",
})
# The API returns a ranked list of candidate fills; print the top sequence.
print(output[0]["sequence"])

Output

washington dc is the capital of usa.

It produces the correct output: washington dc is the capital of usa.
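
The Inference API returns a ranked list of candidate fills, and the snippet above only prints the top one. If you want to inspect the alternatives and their confidence scores, a small addition like the following works with the query function defined above:

for candidate in query({"inputs": "Washington DC is the [MASK] of USA."}):
    print(candidate["sequence"], candidate["score"])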

Load the Masked Language Model with Transformers

You can easily load the masked language model with the Transformers library using this code.

from transformers import AutoTokenizer, AutoModelForMaskedLM
import torch

tokenizer = AutoTokenizer.from_pretrained("ayoolaolafenwa/Masked-Language-Model")
model = AutoModelForMaskedLM.from_pretrained("ayoolaolafenwa/Masked-Language-Model")

# Tokenize a sentence containing the [MASK] token.
inputs = tokenizer("The internet [MASK] amazing.", return_tensors="pt")

# Run a forward pass without tracking gradients.
with torch.no_grad():
    logits = model(**inputs).logits

# Retrieve the position of [MASK] in the input ids.
mask_token_index = (inputs.input_ids == tokenizer.mask_token_id)[0].nonzero(as_tuple=True)[0]

# Pick the highest-scoring token at the masked position and decode it.
predicted_token_id = logits[0, mask_token_index].argmax(axis=-1)
output = tokenizer.decode(predicted_token_id)
print(output)

Output

is 

It prints out the predicted masked word, is.
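
If you prefer not to handle the mask index yourself, the fill-mask pipeline from Transformers wraps the same steps. The short sketch below prints the top predictions together with their scores.

from transformers import pipeline

fill_mask = pipeline("fill-mask", model="ayoolaolafenwa/Masked-Language-Model")
for prediction in fill_mask("The internet [MASK] amazing."):
    print(prediction["sequence"], prediction["score"])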
