Model Card for rcds/neg-xlm-roberta-base

This model is based on XLM-R-Base. It was fine-tuned for negation scope resolution using NegBERT (Khandelwal and Sawant 2020). For training we used the Multi Legal Neg Dataset, a multilingual dataset of legal data annotated for negation cues and scopes, together with ConanDoyle-neg (Morante and Blanco 2012), SFU Review (Konstantinova et al. 2012), BioScope (Szarvas et al. 2008) and Dalloux (Dalloux et al. 2020).

Model Details

Model Description

Model type: Transformer-based language model (XLM-R-base)
Languages: de, fr, it, en
License: CC BY-SA
Finetune Task: Negation Scope Resolution

Uses

See LegalNegBERT for details on the training process and how to use this model.
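As a rough starting point, the sketch below shows one way the checkpoint might be loaded with the transformers `pipeline` API. It assumes the checkpoint loads as a standard token-classification model. The label names are not stated in this card, so the helper takes the label as a parameter. `load_tagger`, `tokens_with_label` and `demo` are hypothetical names used only for illustration. Any input preprocessing that LegalNegBERT applies (for example, marking negation cues) is not reproduced here.

```python
from typing import Dict, List

MODEL_ID = "rcds/neg-xlm-roberta-base"  # model name as given in this card


def load_tagger(model_id: str = MODEL_ID):
    """Build a token-classification pipeline (downloads weights on first call)."""
    # Imported lazily so the helper below works without transformers installed.
    from transformers import pipeline

    return pipeline("token-classification", model=model_id)


def tokens_with_label(predictions: List[Dict], label: str) -> List[str]:
    """Return the sub-word tokens whose predicted label equals `label`."""
    return [p["word"] for p in predictions if p["entity"] == label]


def demo() -> None:
    """Hypothetical end-to-end call; requires network access and transformers."""
    tagger = load_tagger()
    # Inspect the real label names before filtering on one of them.
    print(tagger.model.config.id2label)
    predictions = tagger("The court did not accept the appeal.")
    print(predictions)
```

Calling `demo()` prints the label mapping stored in the model configuration, which is the place to look up the label that marks in-scope tokens before passing it to `tokens_with_label`.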

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.

Training Data

This model was fine-tuned on the Multi Legal Neg Dataset.

Evaluation

We evaluate neg-xlm-roberta-base on the test sets in the Multi Legal Neg Dataset.

Test Dataset	F1-score
fr	92.49
it	88.81
de (DE)	95.66
de (CH)	87.82
SFU Review	88.53
ConanDoyle-neg	90.47
BioScope	95.59
Dalloux	93.99
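This card does not state how the F1-score is computed. A common choice for scope resolution is token-level F1 over in-scope versus out-of-scope tokens, and the sketch below illustrates that variant only as an assumption. `token_f1` is a hypothetical helper, and the binary encoding (1 = in scope, 0 = out of scope) is chosen for illustration.

```python
from typing import Sequence


def token_f1(gold: Sequence[int], pred: Sequence[int]) -> float:
    """Token-level F1 for binary scope labels (1 = in scope, 0 = out of scope)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    # Count true positives, false positives and false negatives per token.
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)

    if tp == 0:
        # No correctly predicted in-scope token: F1 is defined as 0 here.
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

The function returns a value between 0 and 1, so it would need to be multiplied by 100 to be on the same scale as the scores in the table.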

Software

PyTorch, Transformers.

Citation

Please cite the following preprint:

@misc{christen2023resolving,
      title={Resolving Legalese: A Multilingual Exploration of Negation Scope Resolution in Legal Documents}, 
      author={Ramona Christen and Anastassia Shaitarova and Matthias Stürmer and Joel Niklaus},
      year={2023},
      eprint={2309.08695},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
