Model Card for roberta-base-cuad

Model Details

Model Description

Developed by: Hendrycks et al.
Model type: Question Answering
Language(s) (NLP): en
License: cc-by-4.0
Related Models:
- Parent Model: RoBERTa
Resources for more information:
- GitHub Repo: TheAtticusProject
- Associated Paper: CUAD: An Expert-Annotated NLP Dataset for Legal Contract Review
- Project website: Contract Understanding Atticus Dataset (CUAD)

Uses

Direct Use

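This model can be used for extractive question answering on legal contracts: given a question about a clause type and the text of a contract, it returns the span of the contract that answers the question.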
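As a rough illustration of direct use, the sketch below phrases a clause category as a question and passes it to the Transformers `question-answering` pipeline. The question template follows the wording commonly seen in the CUAD dataset, but treat it as an assumption and check the dataset card for the exact phrasing. The helper names `build_cuad_question` and `ask` are hypothetical, and the model identifier is the one used in this card's getting-started snippet.

```python
def build_cuad_question(category: str, details: str = "") -> str:
    """Format a clause category as a CUAD-style question (assumed template)."""
    question = (
        f'Highlight the parts (if any) of this contract related to "{category}" '
        "that should be reviewed by a lawyer."
    )
    if details:
        question += f" Details: {details}"
    return question


def ask(contract_text: str, category: str, details: str = ""):
    """Hypothetical helper: run extractive QA over a contract.

    Builds a new pipeline on every call; keep the pipeline object around
    if you plan to ask many questions.
    """
    # Imported lazily so build_cuad_question has no third-party dependencies.
    from transformers import pipeline

    # Model identifier copied from this card's getting-started snippet.
    qa = pipeline("question-answering", model="mgigena/cuad-roberta-base")
    return qa(question=build_cuad_question(category, details), context=contract_text)
```

Calling `ask(contract_text, "Governing Law")` should return a dictionary with `answer`, `score`, `start`, and `end` keys, which is the standard output of the question-answering pipeline for a single input.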

Training Details

See the associated paper, CUAD: An Expert-Annotated NLP Dataset for Legal Contract Review, for detailed information on the training procedure, dataset preprocessing, and evaluation.

Training Data, Procedure, Preprocessing, etc.

See the CUAD dataset card for more information.

Evaluation

Testing Data, Factors & Metrics

Testing Data

See the CUAD dataset card for more information.

Software

Python, Transformers

Citation

BibTeX:

@article{hendrycks2021cuad,
     title={CUAD: An Expert-Annotated NLP Dataset for Legal Contract Review}, 
     author={Dan Hendrycks and Collin Burns and Anya Chen and Spencer Ball},
     journal={NeurIPS},
     year={2021}
}

How to Get Started with the Model

Use the code below to get started with the model.

from transformers import AutoTokenizer, AutoModelForQuestionAnswering

# Load the tokenizer and the fine-tuned question-answering model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("mgigena/cuad-roberta-base")
model = AutoModelForQuestionAnswering.from_pretrained("mgigena/cuad-roberta-base")
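Once the tokenizer and model are loaded, an answer still has to be decoded from the model's start and end logits. The following is a minimal sketch of one way to do that, not the procedure used by the CUAD authors. `best_span` and `answer` are hypothetical helpers; the 512-token limit and the 30-token answer cap are assumptions chosen for illustration. For simplicity the span search does not mask out question tokens, and contracts longer than the limit are truncated rather than split into overlapping windows.

```python
from typing import List, Tuple


def best_span(start_logits: List[float], end_logits: List[float],
              max_answer_len: int = 30) -> Tuple[int, int]:
    """Return (start, end) token indices maximizing start_logit + end_logit."""
    best, best_score = (0, 0), float("-inf")
    for s, s_logit in enumerate(start_logits):
        # Only consider ends at or after the start, within the length cap.
        for e in range(s, min(s + max_answer_len, len(end_logits))):
            score = s_logit + end_logits[e]
            if score > best_score:
                best, best_score = (s, e), score
    return best


def answer(question: str, context: str) -> str:
    """Hypothetical helper: run the model once and decode the best span."""
    import torch
    from transformers import AutoTokenizer, AutoModelForQuestionAnswering

    name = "mgigena/cuad-roberta-base"  # identifier copied from the snippet above
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForQuestionAnswering.from_pretrained(name)

    # truncation="only_second" trims the contract text, never the question.
    inputs = tokenizer(question, context, truncation="only_second",
                       max_length=512, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    start, end = best_span(outputs.start_logits[0].tolist(),
                           outputs.end_logits[0].tolist())
    # Special tokens are dropped, so a span covering only the leading
    # special token decodes to an empty string.
    return tokenizer.decode(inputs["input_ids"][0][start:end + 1],
                            skip_special_tokens=True)
```

An empty string from `answer` can be read as "no relevant clause found", assuming the model was trained with unanswerable questions in the SQuAD 2.0 style. For full-length contracts, the `question-answering` pipeline handles windowing over long contexts and is usually the more convenient entry point.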
