
ProcBERT

ProcBERT is a pre-trained language model designed for procedural text. It was pre-trained on a large-scale procedural corpus of over 12B tokens, drawn from PubMed articles, chemical patents, and cooking recipes, and performs well on downstream procedural-text tasks. More details can be found in the following paper:

@inproceedings{bai-etal-2021-pre,
    title = "Pre-train or Annotate? Domain Adaptation with a Constrained Budget",
    author = "Bai, Fan  and
              Ritter, Alan  and
              Xu, Wei",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2021",
    address = "Online and Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
}

Usage

from transformers import AutoTokenizer, AutoModelForTokenClassification
tokenizer = AutoTokenizer.from_pretrained("fbaigt/procbert")
model = AutoModelForTokenClassification.from_pretrained("fbaigt/procbert")
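
Fine-tuning a token-classification model usually needs word-level labels mapped onto the tokenizer's subword pieces. The following is a minimal sketch of that alignment step, not something taken from the ProcBERT paper or repository. It assumes a fast tokenizer whose encoding exposes `word_ids()`. The helper name `align_labels_to_subwords` and the example label ids are hypothetical.

```python
from typing import List, Optional

# -100 is the default ignore_index of PyTorch's cross-entropy loss,
# so positions carrying this value do not contribute to the loss.
IGNORE_INDEX = -100


def align_labels_to_subwords(
    word_ids: List[Optional[int]],
    word_labels: List[int],
    label_all_subwords: bool = False,
) -> List[int]:
    """Map word-level labels onto subword tokens (hypothetical helper).

    word_ids holds, for each subword token, the index of the word it came
    from, or None for special tokens such as [CLS] and [SEP].
    """
    aligned = []
    previous = None
    for wid in word_ids:
        if wid is None:
            # Special tokens carry no label.
            aligned.append(IGNORE_INDEX)
        elif wid != previous:
            # First subword of a word takes the word's label.
            aligned.append(word_labels[wid])
        else:
            # Later subwords are either labeled like the word or ignored.
            aligned.append(word_labels[wid] if label_all_subwords else IGNORE_INDEX)
        previous = wid
    return aligned


# Illustrative input: three words, the second split into two subwords.
example_word_ids = [None, 0, 1, 1, 2, None]
example_labels = [0, 3, 0]  # made-up label ids
print(align_labels_to_subwords(example_word_ids, example_labels))
```

With a fast tokenizer, `word_ids` would typically come from `tokenizer(words, is_split_into_words=True).word_ids()`. Whether to label only the first subword or every subword is a design choice that depends on the task and the evaluation script.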

More usage details can be found here.
