ltg/ssa-perin



	---
	license: apache-2.0
	datasets:
	- ltg/norec
	language:
	- 'no'
	pipeline_tag: token-classification


	model-index:
	- name: SSA-Perin
	  results:
	  - task:
	      type: structured sentiment analysis
	    dataset:
	      name: NoReC
	      type: NoReC
	    metrics:
	    - name: Unlabeled sentiment tuple F1
	      type: Unlabeled sentiment tuple F1
	      value: 44.12%
	    - name: Target F1
	      type: Target F1
	      value: 56.44%
	    - name: Relative polarity precision
	      type: Relative polarity precision
	      value: 93.19%
	---



	This repository contains a pretrained model (and an easy-to-run wrapper for it) for structured sentiment analysis of Norwegian text, trained on the [NoReC_fine dataset](https://github.com/ltgoslo/norec_fine).
	It implements the method described in:
	```bibtex
	@misc{samuel2022direct,
	title={Direct parsing to sentiment graphs},
	author={David Samuel and Jeremy Barnes and Robin Kurtz and Stephan Oepen and Lilja Øvrelid and Erik Velldal},
	year={2022},
	eprint={2203.13209},
	archivePrefix={arXiv},
	primaryClass={cs.CL}
	}
	```
	The main repository, which also contains the scripts for training the model, is on [GitHub](https://github.com/jerbarnes/direct_parsing_to_sent_graph).
	The model is also available as a [Hugging Face Space](https://huggingface.co/spaces/ltg/ssa-perin).


	The sentiment graph model is built on top of a masked language model, [NorBERT 2](https://huggingface.co/ltg/norbert2).
	The method proposes three ways to encode the sentiment graph: "node-centric", "labeled-edge", and "opinion-tuple".
	The current model:
	- uses the "labeled-edge" graph encoding
	- does not use character-level embeddings
	- keeps all other hyperparameters at their [default values](https://github.com/jerbarnes/direct_parsing_to_sent_graph/blob/main/perin/config/edge_norec.yaml)

	It achieves the following results on the held-out set of the dataset:

	| Unlabeled sentiment tuple F1 | Target F1 | Relative polarity precision |
	|:----------------------------:|:---------:|:---------------------------:|
	| 0.434 | 0.541 | 0.926 |


	The model can be easily used for predicting sentiment tuples as follows:

	```python
	>>> import model_wrapper
	>>> model = model_wrapper.PredictionModel()
	>>> model.predict(['vi liker svart kaffe'])
	[{'sent_id': '0',
	  'text': 'vi liker svart kaffe',
	  'opinions': [{'Source': [['vi'], ['0:2']],
	                'Target': [['svart', 'kaffe'], ['9:14', '15:20']],
	                'Polar_expression': [['liker'], ['3:8']],
	                'Polarity': 'Positive'}]}]
	```
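
Each opinion in the output above stores a role as a pair of lists: the tokens and their `start:end` character offsets into `text`. The sketch below shows one possible way to flatten such predictions into plain rows. It is not part of `model_wrapper`: the helper names `parse_spans` and `flatten_opinions` are hypothetical, the sample is copied from the example above so that the snippet runs without the model, and it assumes that a role with no prediction is returned as two empty lists.

```python
from typing import Dict, List, Tuple

# Sample prediction copied from the example above, so the model itself is not needed here.
prediction = [{
    'sent_id': '0',
    'text': 'vi liker svart kaffe',
    'opinions': [{
        'Source': [['vi'], ['0:2']],
        'Target': [['svart', 'kaffe'], ['9:14', '15:20']],
        'Polar_expression': [['liker'], ['3:8']],
        'Polarity': 'Positive',
    }],
}]


def parse_spans(offsets: List[str]) -> List[Tuple[int, int]]:
    """Turn 'start:end' strings into (start, end) integer pairs."""
    spans = []
    for offset in offsets:
        start, end = offset.split(':')
        spans.append((int(start), int(end)))
    return spans


def flatten_opinions(predictions: List[dict]) -> List[Dict[str, str]]:
    """Hypothetical helper: one flat row per predicted opinion."""
    rows = []
    for sentence in predictions:
        text = sentence['text']
        for opinion in sentence['opinions']:
            row = {'sent_id': sentence['sent_id'], 'polarity': opinion['Polarity']}
            for role in ('Source', 'Target', 'Polar_expression'):
                # Assumption: an absent role is [[], []], which yields an empty string.
                _tokens, offsets = opinion[role]
                # Slice the original text so the row reflects the exact surface form.
                row[role.lower()] = ' '.join(text[s:e] for s, e in parse_spans(offsets))
            rows.append(row)
    return rows


for row in flatten_opinions(prediction):
    print(row)
```

Slicing `text` with the offsets, rather than joining the token lists, keeps each row tied to the original string, which helps when the spans are later used for highlighting.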