---
datasets:
- rcds/MultiLegalNeg
language:
- de
- fr
- it
- en
tags:
- legal
---

# Model Card for joelito/legal-swiss-longformer-base

This model is based on [XLM-R-Base](https://huggingface.co/xlm-roberta-base).
It was fine-tuned for negation scope resolution using the NegBERT approach ([Khandelwal and Sawant 2020](https://arxiv.org/abs/1911.04211); [implementation](https://github.com/adityak6798/Transformers-For-Negation-and-Speculation/blob/master/Transformers_for_Negation_and_Speculation.ipynb)).
For training we used the [Multi Legal Neg Dataset](https://huggingface.co/datasets/rcds/MultiLegalNeg), a multilingual dataset of legal texts annotated for negation cues and scopes, which also includes the existing negation corpora ConanDoyle-neg ([Morante and Blanco 2012](https://aclanthology.org/S12-1035/)), SFU Review ([Konstantinova et al. 2012](http://www.lrec-conf.org/proceedings/lrec2012/pdf/533_Paper.pdf)), BioScope ([Szarvas et al. 2008](https://aclanthology.org/W08-0606/)), and Dalloux ([Dalloux et al. 2020](https://clementdalloux.fr/?page_id=28)).

## Model Details

### Model Description

- **Model type:** Transformer-based language model (XLM-R-base)
- **Languages:** de, fr, it, en
- **License:** CC BY-SA
- **Finetune Task:** Negation Scope Resolution

## Uses

See [LegalNegBERT](https://github.com/RamonaChristen/Multilingual_Negation_Scope_Resolution_on_Legal_Data/blob/main/LegalNegBERT) for details on the training process and how to use this model. 
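As a starting point, the sketch below loads the model for inference with the Hugging Face `transformers` library. It assumes a token-classification head, and the repository id and example sentence are placeholders taken from this card rather than verified values; for the exact pre- and post-processing used in the paper, refer to the LegalNegBERT repository linked above.

```python
# Minimal inference sketch (assumptions: token-classification head, repository id from this card's title).
# For the exact pre-/post-processing used in the paper, see the LegalNegBERT repository.
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

model_id = "joelito/legal-swiss-longformer-base"  # assumed from this card's title

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

nlp = pipeline(
    "token-classification",
    model=model,
    tokenizer=tokenizer,
    aggregation_strategy="simple",  # merge sub-word tokens into word-level spans
)

# Tokens predicted to be inside a negation scope are returned with their label.
print(nlp("The contract is not valid without a signature."))
```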

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.


### Training Data

This model was fine-tuned on the [Multi Legal Neg Dataset](https://huggingface.co/datasets/rcds/MultiLegalNeg).
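For reference, the dataset can be loaded with the Hugging Face `datasets` library; a minimal sketch follows. The configuration name used below is an assumption for illustration, so check the dataset card for the configurations that actually exist.

```python
# Minimal sketch: load the Multi Legal Neg Dataset with the `datasets` library.
# The configuration name "fr" is an assumption for illustration; see
# https://huggingface.co/datasets/rcds/MultiLegalNeg for the available configurations.
from datasets import load_dataset

dataset = load_dataset("rcds/MultiLegalNeg", "fr", split="train")
print(dataset[0])  # one example with its negation cue and scope annotations
```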

## Evaluation

We evaluate neg-xlm-roberta-base on the test sets in the [Multi Legal Neg Dataset](https://huggingface.co/datasets/rcds/MultiLegalNeg).
| Test Dataset               | F1-score  |
| :------------------------- | :-------- |
| fr                         | 92.49     |
| it                         | 88.81     |
| de (DE)                    | 95.66     |
| de (CH)                    | 87.82     |
| SFU Review                 | 88.53     |
| ConanDoyle-neg             | 90.47     |
| BioScope                   | 95.59     |
| Dalloux                    | 93.99     |


#### Software

PyTorch, Transformers.

## Citation
Please cite the following preprint:

```
@misc{christen2023resolving,
      title={Resolving Legalese: A Multilingual Exploration of Negation Scope Resolution in Legal Documents}, 
      author={Ramona Christen and Anastassia Shaitarova and Matthias Stürmer and Joel Niklaus},
      year={2023},
      eprint={2309.08695},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```