Commit a36345a by erikve (1 parent: 65b1c32)

Updated model card

  1. README.md +47 -17
README.md CHANGED
@@ -27,18 +27,36 @@ model-index:
  value: 93.19%
  ---



  We here release a pretrained model (and an easy-to-run wrapper) for structured sentiment analysis of Norwegian text, pre-trained on the [NoReC_fine dataset](https://github.com/ltgoslo/norec_fine).
- This is an implementation of the method described in the paper [Direct parsing to sentiment graphs](https://aclanthology.org/2022.acl-short.51/) by Samuel et al., 2022.

- To see a demo of how it works, you can try the model in a [Hugging Face Space](https://huggingface.co/spaces/ltg/ssa-perin).

- ## Example usage

- The model attempts to identify the following components for a given sentence: source expressions (the opinion holder), target expressions (what the opinion is directed towards), polar expressions (the part of the text indicating that an opinion expressed), and finally the polarity (positive or negative). For more information about the definition of these concepts, please the paper [A Fine-grained Sentiment Dataset for Norwegian](https://aclanthology.org/2020.lrec-1.618/) by Øvrelid et al. 2020. For each identified expression, the character offsets in the text are also provided.

- Here is an eaxmple showing to use the model for predicting such sentiment tuples:

  ```python
  >>> import model_wrapper
@@ -52,26 +70,35 @@ Here is an eaxmple showing to use the model for predicting such sentiment tuple
  'Polarity': 'Positive'}]}]
  ```


- ## Details about the model configuration

- The method proposed by Samuel et al. 2022 suggests three different ways to encode the sentiment graph: "node-centric", "labeled-edge", and "opinion-tuple".
- The model released here
- - uses "labeled-edge" graph encoding,
- - does not use character-level embedding,
  - all other hyperparameters are set to [default values](https://github.com/jerbarnes/direct_parsing_to_sent_graph/blob/main/perin/config/edge_norec.yaml),
- - is trained on top of underlying masked language model [NorBERT 2](https://huggingface.co/ltg/norbert2).

- It achieves the following results on the held-out test set of NoReC_fine:

- | Unlabeled sentiment tuple F1 | Target F1 | Relative polarity precision |
- |:----------------------------:|:----------:|:---------------------------:|
- | 0.434 | 0.541 | 0.926 |

- The scripts used for training can be found on the [github](https://github.com/jerbarnes/direct_parsing_to_sent_graph) repository accompanying the paper by Samuel et al., 2022.


- ## Quote us

  If you use this model in your academic work, please quote the following paper:
  ```bibtex
@@ -85,3 +112,6 @@ If you use this model in your academic work, please quote the following paper:
  address = "Dublin, Ireland"
  }
  ```

@@ -27,18 +27,36 @@ model-index:
  value: 93.19%
  ---

+ # Model Card for SSA-PERIN for Norwegian


+ ## Model Details
+
  We here release a pretrained model (and an easy-to-run wrapper) for structured sentiment analysis of Norwegian text, pre-trained on the [NoReC_fine dataset](https://github.com/ltgoslo/norec_fine).
+ This is an implementation of the method described in the paper [Direct parsing to sentiment graphs](https://aclanthology.org/2022.acl-short.51/) by Samuel et al. (2022), which demonstrated how a graph-based semantic parser can be applied to the task of structured sentiment analysis, directly predicting sentiment graphs from text.
+
+
+ ### Model Description
+
+ - **Developed by:** The [SANT](https://www.mn.uio.no/ifi/english/research/projects/sant/) project (Sentiment Analysis for Norwegian Text) at [the Language Technology Group](https://www.mn.uio.no/ifi/english/research/groups/ltg/) (LTG) at the University of Oslo.
+ - **Funded by:** [SANT](https://www.mn.uio.no/ifi/english/research/projects/sant/) is funded by the Research Council of Norway.
+ - **Language(s):** Norwegian (Bokmål/Nynorsk)
+ - **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
+
+ ### Model Sources
+
+ - **Paper:** [Direct parsing to sentiment graphs](https://aclanthology.org/2022.acl-short.51/) by D. Samuel, J. Barnes, R. Kurtz, S. Oepen, L. Øvrelid, and E. Velldal, in Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, Dublin, 2022.
+ - **Repository:** The scripts used for training can be found in the [GitHub repository](https://github.com/jerbarnes/direct_parsing_to_sent_graph) accompanying the paper by Samuel et al. (2022) above.
+ - **Demo:** To see a demo of how it works, you can try the model in our [Hugging Face Space](https://huggingface.co/spaces/ltg/ssa-perin).
+ - **Limitations:** The training data is based on professional reviews covering multiple domains, but the model may not necessarily generalize to other text types or domains.


+ ## How to Get Started with the Model


+ For each sentence it deems sentiment-bearing, the model attempts to identify the following components: _source expressions_ (the opinion holder), _target expressions_ (what the opinion is directed towards), _polar expressions_ (the part of the text indicating that an opinion is expressed), and finally the _polarity_ (positive or negative). For more information about how these categories are defined in the training data, please see the paper [A Fine-grained Sentiment Dataset for Norwegian](https://aclanthology.org/2020.lrec-1.618/) by Øvrelid et al. (2020). For each identified expression, the character offsets in the text are also provided.
+
+ Here is an example showing how to use the model for predicting such sentiment tuples:

  ```python
  >>> import model_wrapper
@@ -52,26 +70,35 @@
  'Polarity': 'Positive'}]}]
  ```
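
The character offsets mentioned above can be used to recover each expression from the original sentence. The following is only a minimal sketch and not part of the released wrapper: the `sample_prediction` dictionary is hand-written, and its key names (`text`, `opinions`, `Source`, `Target`, `Polar_expression`) and the `"start:end"` offset strings are assumptions about the output layout; only the `Polarity` key is visible in the diff. The helpers `parse_offset`, `spans_from_offsets` and `summarize` are hypothetical names introduced for illustration.

```python
from typing import Dict, List, Tuple

# Hand-written, hypothetical prediction for one sentence. The key names and
# the "start:end" offset strings are assumptions, not confirmed output.
sample_prediction = {
    "sent_id": "0",
    "text": "Jeg elsker denne boka",
    "opinions": [
        {
            "Source": [["Jeg"], ["0:3"]],
            "Target": [["denne boka"], ["11:21"]],
            "Polar_expression": [["elsker"], ["4:10"]],
            "Polarity": "Positive",
        }
    ],
}


def parse_offset(offset: str) -> Tuple[int, int]:
    """Turn a 'start:end' string into a (start, end) pair of ints."""
    start, end = offset.split(":")
    return int(start), int(end)


def spans_from_offsets(text: str, offsets: List[str]) -> List[str]:
    """Slice the original text with each character offset."""
    return [text[start:end] for start, end in map(parse_offset, offsets)]


def summarize(prediction: Dict) -> List[Dict[str, object]]:
    """Map each opinion to its polarity and the text spans of each role."""
    text = prediction["text"]
    rows = []
    for opinion in prediction["opinions"]:
        row: Dict[str, object] = {"Polarity": opinion["Polarity"]}
        for role in ("Source", "Target", "Polar_expression"):
            # Each role is assumed to hold a [tokens, offsets] pair.
            _tokens, offsets = opinion[role]
            row[role] = spans_from_offsets(text, offsets)
        rows.append(row)
    return rows


for row in summarize(sample_prediction):
    print(row)
```

Slicing by offsets rather than relying on the token strings keeps the extracted spans aligned with the input text, which is useful when highlighting expressions in the original sentence. If the real output uses a different layout, only the key names and `parse_offset` would need adjusting.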

+ ## Training Details
+
+ ### Training Data
+
+ The model is trained on NoReC_fine, a dataset for fine-grained sentiment analysis in Norwegian. It is based on a subset of documents from the Norwegian Review Corpus (NoReC), which consists of professionally authored reviews from multiple news sources and across a wide variety of domains, including literature, games, music, products, movies and more.

+ - **Paper:** [A Fine-grained Sentiment Dataset for Norwegian](https://aclanthology.org/2020.lrec-1.618/) by L. Øvrelid, P. Mæhlum, J. Barnes, and E. Velldal, in the Proceedings of the 12th Language Resources and Evaluation Conference, Marseille, 2020.
+ - **Repository:** [https://github.com/ltgoslo/norec_fine](https://github.com/ltgoslo/norec_fine)

+
+ ### Model Configuration and Training Hyperparameters
+
+ Samuel et al. (2022) propose three different ways to encode sentiment graphs: "node-centric", "labeled-edge", and "opinion-tuple".
+ The model released here uses the following configuration:
+ - "labeled-edge" graph encoding,
+ - no character-level embeddings,
  - all other hyperparameters are set to [default values](https://github.com/jerbarnes/direct_parsing_to_sent_graph/blob/main/perin/config/edge_norec.yaml),
+ - trained on top of the underlying masked language model [NorBERT 2](https://huggingface.co/ltg/norbert2).

+ ## Evaluation

+ The model achieves the following results on the held-out test set of NoReC_fine (see the paper for a description of the metrics):

+ - Unlabeled sentiment tuple F1: 0.434
+ - Target F1: 0.541
+ - Relative polarity precision: 0.926


+ ## Citation

  If you use this model in your academic work, please quote the following paper:
  ```bibtex
@@ -85,3 +112,6 @@
  address = "Dublin, Ireland"
  }
  ```
+
+ ## Model Card Authors
+ Erik Velldal and Larisa Kolesnichenko