

MLRC (Medical, Legal, Regulatory, and Compliance) teams take weeks, and sometimes months, to review consumer-facing content submitted by marketing agencies, e.g., website text, Facebook ads, Instagram posts, and TV ads. Content may be text, audio, images, or video. This review process, which involves tens of people across medical, legal, regulatory, and compliance functions, slows the release of ad campaigns and website content to consumers. Because MLRC receives thousands of content jobs to review each month, the backlog cuts into the time reviewers have for their actual day jobs, and the volume of review jobs keeps growing along with pressure to speed up reviews.


Inabia AI will reduce review time from weeks to days by front-loading the review onto the text-content creators (e.g., marketing agencies) through a Grammarly-like web UI that performs four levels of review, similar to those MLRC reviewers conduct on actual content.

Level-1-Review (Detection)

Find the location of problem sentences or clauses, i.e., error detection.
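A detection pass of this kind typically produces token-level flags that must be mapped back to character spans so problem text can be highlighted in the UI. The function below is a minimal, hypothetical post-processing sketch (not part of this repository): it merges runs of flagged tokens into character spans.

```python
def flagged_spans(offsets, labels):
    """Merge consecutive flagged tokens into character spans.

    offsets: list of (start, end) character offsets, one per token
    labels:  list of 0/1 flags per token (1 = problematic token)
    Returns a list of (start, end) spans covering runs of flagged tokens.
    """
    spans = []
    current = None
    for (start, end), label in zip(offsets, labels):
        if label == 1:
            if current is None:
                current = [start, end]  # open a new span
            else:
                current[1] = end        # extend the open span
        elif current is not None:
            spans.append(tuple(current))
            current = None
    if current is not None:
        spans.append(tuple(current))
    return spans


text = "This cure is guaranteed to work for everyone."
# Illustrative token offsets and flags for the claim "guaranteed to work"
offsets = [(0, 4), (5, 9), (10, 12), (13, 23), (24, 26), (27, 31), (32, 35), (36, 44), (44, 45)]
labels = [0, 0, 0, 1, 1, 1, 0, 0, 0]
print(flagged_spans(offsets, labels))  # [(13, 31)]
```

The returned spans can be highlighted directly in the editor, independent of the tokenizer used by the underlying model.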

Fine-tuned BERT-large on the MLRC dataset

This custom BERT-large model was fine-tuned on a balanced MLRC dataset.

Model description

BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it was pretrained on the raw texts only, with no humans labeling them in any way (which is why it can use lots of publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it was pretrained with two objectives:

  • Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input, then runs the entire masked sentence through the model and has to predict the masked words. This is different from traditional recurrent neural networks (RNNs), which usually see the words one after the other, and from autoregressive models like GPT, which internally mask the future tokens. It allows the model to learn a bidirectional representation of the sentence.
  • Next sentence prediction (NSP): the model concatenates two masked sentences as inputs during pretraining. Sometimes they correspond to sentences that were next to each other in the original text, sometimes not. The model then has to predict whether the two sentences followed each other or not.
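The 15% masking step in MLM can be sketched in plain Python. This is a toy illustration of the data-preparation idea only, not the actual pretraining code: for simplicity every selected token becomes [MASK], whereas real BERT replaces 80% of selected tokens with [MASK], 10% with a random token, and leaves 10% unchanged.

```python
import random


def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]", seed=0):
    """Randomly select ~15% of tokens as MLM prediction targets.

    Returns (masked_tokens, targets), where targets maps each masked
    position back to the original token the model must predict.
    """
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            targets[i] = tok          # remember the ground-truth token
            masked.append(mask_token)
        else:
            masked.append(tok)
    return masked, targets


sentence = "the model learns a bidirectional representation".split()
masked, targets = mask_tokens(sentence)
print(masked, targets)
```

Because the model sees the full (masked) sentence at once, it can use context on both sides of each [MASK] to make its prediction, which is what makes the learned representation bidirectional.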

This way, the model learns an inner representation of the English language that can then be used to extract features useful for downstream tasks: if you have a dataset of labeled sentences, for instance, you can train a standard classifier using the features produced by the BERT model as inputs.

The detailed release history can be found in the google-research/bert README on GitHub.

| Model | #params | Language |
|---|---|---|
| bert-base-uncased | 110M | English |
| bert-large-uncased | 340M | English |
| bert-base-cased | 110M | English |
| bert-large-cased | 340M | English |
| bert-base-chinese | 110M | Chinese |
| bert-base-multilingual-cased | 110M | Multiple |
| bert-large-uncased-whole-word-masking | 340M | English |
| bert-large-cased-whole-word-masking | 340M | English |

Here is how to use this model to get the features of a given text in PyTorch:

from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("Inabia-AI/bert-large-uncased-mlrc")
model = BertModel.from_pretrained("Inabia-AI/bert-large-uncased-mlrc")

text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors="pt")  # "pt" -> PyTorch tensors
output = model(**encoded_input)

and in TensorFlow:

from transformers import BertTokenizer, TFBertModel

tokenizer = BertTokenizer.from_pretrained("Inabia-AI/bert-large-uncased-mlrc")
model = TFBertModel.from_pretrained("Inabia-AI/bert-large-uncased-mlrc")

text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors="tf")  # "tf" -> TensorFlow tensors
output = model(encoded_input)

Evaluation results

When fine-tuned on downstream tasks (text classification), this model achieves the following results:

| Training dataset | TP:TN ratio | # of TPs | # of TNs | Precision | Recall | F1 |
|---|---|---|---|---|---|---|
| MLRC | 1:1 | 200 | 200 | 64% | 55% | 50% |
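For reference, precision, recall, and F1 are derived from confusion-matrix counts as below. This is a generic sketch of the metric definitions; the counts in the example are illustrative and are not the MLRC evaluation data.

```python
def prf1(tp, fp, fn):
    """Precision, recall, and F1 from confusion-matrix counts.

    tp: true positives, fp: false positives, fn: false negatives.
    Returns (precision, recall, f1), each 0.0 when undefined.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Illustrative counts only
print(prf1(tp=50, fp=25, fn=50))
```

F1 is the harmonic mean of precision and recall, so it penalizes a detector that trades one heavily for the other.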