PrivBERT

PrivBERT is a language model for privacy policies. Starting from the pretrained RoBERTa model, we continued pre-training on ~1 million privacy policies. The data is available at https://privaseer.ist.psu.edu/data

Usage

from transformers import AutoTokenizer, AutoModel

# Download (or load from the local cache) the PrivBERT tokenizer and encoder weights
tokenizer = AutoTokenizer.from_pretrained("mukund/privbert")
model = AutoModel.from_pretrained("mukund/privbert")
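
Because PrivBERT starts from RoBERTa, it can also be queried as a masked language model. The following is a minimal sketch, not an official recipe: it assumes the published checkpoint includes a masked-language-modeling head and uses RoBERTa's `<mask>` token. The helper names `make_masked` and `top_predictions` are hypothetical and exist only for illustration.

```python
MODEL_ID = "mukund/privbert"  # model ID taken from the usage snippet above
MASK_TOKEN = "<mask>"         # RoBERTa-style mask token (assumed to apply to PrivBERT)


def make_masked(sentence: str, target: str) -> str:
    """Replace the first occurrence of `target` in `sentence` with the mask token."""
    if target not in sentence:
        raise ValueError(f"{target!r} does not occur in the sentence")
    return sentence.replace(target, MASK_TOKEN, 1)


def top_predictions(masked_text: str, top_k: int = 5):
    """Return (token, score) pairs for a text containing exactly one mask token."""
    # Imported lazily so the pure-Python helper above works without transformers.
    from transformers import pipeline

    # Downloads the model on first use; assumes the checkpoint ships an MLM head.
    fill = pipeline("fill-mask", model=MODEL_ID)
    results = fill(masked_text, top_k=top_k)
    return [(r["token_str"].strip(), r["score"]) for r in results]
```

For example, `top_predictions(make_masked("We do not share your personal information with third parties.", "information"))` would return the model's five highest-scoring candidates for the masked word. The actual tokens and scores depend on the checkpoint, so none are shown here.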

License

If you use this dataset in research, you must cite the paper below.

Mukund Srinath, Shomir Wilson and C. Lee Giles. Privacy at Scale: Introducing the PrivaSeer Corpus of Web Privacy Policies. In Proc. ACL 2021.

For research, teaching, and scholarship purposes, the model is available under a CC BY-NC-SA license. Please contact us for any requests regarding commercial use.
