license: apache-2.0
language:
  - 'no'
pipeline_tag: token-classification
model-index:
  - name: SSA-Perin
    results:
      - task:
          type: structured sentiment analysis
        dataset:
          name: NoReC
          type: NoReC
        metrics:
          - name: Unlabeled sentiment tuple F1
            type: Unlabeled sentiment tuple F1
            value: 44.12%
          - name: Target F1
            type: Target F1
            value: 56.44%
          - name: Relative polarity precision
            type: Relative polarity precision
            value: 93.19%

This repository contains a pretrained model (and an easy-to-run wrapper for it) for structured sentiment analysis of Norwegian text, trained on the NoReC_fine dataset. It implements the method described in:

@misc{samuel2022direct,
      title={Direct parsing to sentiment graphs},
      author={David Samuel and Jeremy Barnes and Robin Kurtz and Stephan Oepen and Lilja Øvrelid and Erik Velldal},
      year={2022},
      eprint={2203.13209},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

The main repository, which also contains the scripts for training the model, can be found on the project GitHub. The model is also available as a Hugging Face Space.

The sentiment graph model is built on top of a masked language model, NorBERT 2. The paper proposes three ways to encode a sentiment graph: "node-centric", "labeled-edge", and "opinion-tuple". The current model

  • uses the "labeled-edge" graph encoding
  • does not use character-level embeddings
  • keeps all other hyperparameters at their default values

It achieves the following results on the held-out set of the dataset:

| Unlabeled sentiment tuple F1 | Target F1 | Relative polarity precision |
|---|---|---|
| 0.434 | 0.541 | 0.926 |

The model can be easily used for predicting sentiment tuples as follows:

>>> import model_wrapper
>>> model = model_wrapper.PredictionModel()
>>> model.predict(['vi liker svart kaffe'])
[{'sent_id': '0',
  'text': 'vi liker svart kaffe',
  'opinions': [{'Source': [['vi'], ['0:2']],
    'Target': [['svart', 'kaffe'], ['9:14', '15:20']],
    'Polar_expression': [['liker'], ['3:8']],
    'Polarity': 'Positive'}]}]
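
The returned structure nests each opinion's spans as a pair of lists (tokens, then `start:end` character offsets). As a minimal sketch of how it might be post-processed, the snippet below flattens the predictions into one row per opinion and recovers each span from the offsets. The helpers `span_text` and `flatten` are hypothetical names that are not part of `model_wrapper`. The sketch assumes the offsets are end-exclusive character indices into `text` (which matches the example above) and that a missing role is either absent or given as a pair of empty lists. The `predictions` value is copied from the example output, so the snippet runs without loading the model.

```python
from typing import Dict, List

# Example output copied from the README above; in practice this would be
# the value returned by model.predict([...]).
predictions = [{'sent_id': '0',
                'text': 'vi liker svart kaffe',
                'opinions': [{'Source': [['vi'], ['0:2']],
                              'Target': [['svart', 'kaffe'], ['9:14', '15:20']],
                              'Polar_expression': [['liker'], ['3:8']],
                              'Polarity': 'Positive'}]}]


def span_text(text: str, offsets: List[str]) -> str:
    """Join the substrings addressed by 'start:end' character offsets.

    Assumption: offsets are end-exclusive, like Python slices.
    """
    pieces = []
    for offset in offsets:
        start, end = (int(x) for x in offset.split(':'))
        pieces.append(text[start:end])
    return ' '.join(pieces)


def flatten(preds: List[Dict]) -> List[Dict[str, str]]:
    """Turn nested predictions into one flat dict per opinion (hypothetical helper)."""
    rows = []
    for sent in preds:
        for opinion in sent['opinions']:
            row = {'sent_id': sent['sent_id'], 'polarity': opinion['Polarity']}
            for role in ('Source', 'Target', 'Polar_expression'):
                # Assumption: a missing role is absent or stored as [[], []].
                _tokens, offsets = opinion.get(role, [[], []])
                row[role.lower()] = span_text(sent['text'], offsets)
            rows.append(row)
    return rows


rows = flatten(predictions)
for row in rows:
    print(row)
```

Reading the spans back from the offsets, rather than joining the token lists, doubles as a quick check that the offsets line up with the input text.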