---
datasets:
- rcds/MultiLegalNeg
language:
- de
- fr
- it
- en
tags:
- legal
---
# Model Card for neg-xlm-roberta-base
This model is based on [XLM-R-Base](https://huggingface.co/xlm-roberta-base).
It was fine-tuned for negation scope resolution with [NegBERT](https://github.com/adityak6798/Transformers-For-Negation-and-Speculation/blob/master/Transformers_for_Negation_and_Speculation.ipynb) ([Khandelwal and Sawant 2020](https://arxiv.org/abs/1911.04211)).
For training we used the [Multi Legal Neg Dataset](https://huggingface.co/datasets/rcds/MultiLegalNeg), a multilingual dataset that combines legal texts annotated for negation cues and scopes with ConanDoyle-neg ([Morante and Blanco 2012](https://aclanthology.org/S12-1035/)), SFU Review ([Konstantinova et al. 2012](http://www.lrec-conf.org/proceedings/lrec2012/pdf/533_Paper.pdf)), BioScope ([Szarvas et al. 2008](https://aclanthology.org/W08-0606/)), and Dalloux ([Dalloux et al. 2020](https://clementdalloux.fr/?page_id=28)).
## Model Details
### Model Description
- **Model type:** Transformer-based language model (XLM-R-base)
- **Languages:** de, fr, it, en
- **License:** CC BY-SA
- **Fine-tuning task:** Negation Scope Resolution
## Uses
See [LegalNegBERT](https://github.com/RamonaChristen/Multilingual_Negation_Scope_Resolution_on_Legal_Data/blob/main/LegalNegBERT) for details on the training process and how to use this model.
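A minimal usage sketch is shown below, assuming the model is published as a standard token-classification checkpoint on the Hugging Face Hub (the repository id, example sentence, and label scheme are illustrative assumptions; check the repository and the LegalNegBERT code for the actual details):

```python
# Hedged sketch: negation scope prediction as token classification.
# The repository id below is an assumption, not confirmed by this card.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

model_id = "rcds/neg-xlm-roberta-base"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)
model.eval()

sentence = "The court did not accept the appeal."
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, num_labels)

# Map each subword token to its predicted label.
predictions = logits.argmax(dim=-1).squeeze(0).tolist()
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"].squeeze(0))
for token, label_id in zip(tokens, predictions):
    print(token, model.config.id2label[label_id])
```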
### Recommendations
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.
### Training Data
This model was fine-tuned on the [Multi Legal Neg Dataset](https://huggingface.co/datasets/rcds/MultiLegalNeg).
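A sketch of loading the data from the Hub is shown below (the configuration name is an assumption; see the dataset card for the available configurations and splits):

```python
# Hedged sketch of loading the training data with the datasets library.
from datasets import load_dataset

# A configuration name may be required; "fr" here is illustrative.
dataset = load_dataset("rcds/MultiLegalNeg", "fr")
print(dataset)
```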
## Evaluation
We evaluate neg-xlm-roberta-base on the test sets in the [Multi Legal Neg Dataset](https://huggingface.co/datasets/rcds/MultiLegalNeg).
| Test Dataset | F1-score (%) |
| :------------------------- | :-------- |
| fr | 92.49 |
| it | 88.81 |
| de (DE) | 95.66 |
| de (CH) | 87.82 |
| SFU Review | 88.53 |
| ConanDoyle-neg | 90.47 |
| BioScope | 95.59 |
| Dalloux | 93.99 |
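For reference, a hedged sketch of how such a token-level F1 score can be computed, assuming binary in-scope/out-of-scope labels per token (the labels below are made up for illustration; the paper's exact evaluation protocol may differ):

```python
# Hedged sketch: token-level F1 over in-scope labels with scikit-learn.
from sklearn.metrics import f1_score

gold = [1, 1, 1, 0, 0, 1]  # 1 = token inside a negation scope (made up)
pred = [1, 1, 0, 0, 0, 1]
print(f"F1: {f1_score(gold, pred) * 100:.2f}")  # prints "F1: 85.71"
```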
#### Software
PyTorch, Transformers.
## Citation
Please cite the following preprint:
```
@misc{christen2023resolving,
  title={Resolving Legalese: A Multilingual Exploration of Negation Scope Resolution in Legal Documents},
  author={Ramona Christen and Anastassia Shaitarova and Matthias Stürmer and Joel Niklaus},
  year={2023},
  eprint={2309.08695},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```