Update README.md
We use the binary formulation of this task (positive vs. negative).
<summary>Method</summary>

* Evaluation setting: zero-shot and few-shot perplexity-based evaluation.
* Prompt: ```"Tekst: {text}\nSentiment:{label}"```, where the ```label``` is either "positiv" or "negativ".
* Few-shot results show the average scores across 5 repetitions.
* Evaluation script: https://github.com/ltgoslo/norallm/blob/main/initial_evaluation/sentiment_analysis.py
* Performance metric: macro-averaged F1-score.
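As a hypothetical illustration (not the repository's actual script), the perplexity-based setup above amounts to filling the prompt template once per candidate label and choosing the lower-loss completion; `nll` below is a stand-in for a model's negative log-likelihood:

```python
# Sketch of perplexity-based binary classification, assuming an `nll`
# callable that returns a model's negative log-likelihood for a string.
PROMPT = "Tekst: {text}\nSentiment:{label}"
LABELS = ("positiv", "negativ")

def classify(text: str, nll) -> str:
    # Score the full prompt with each candidate label filled in and
    # return the label whose completion the model finds more likely
    # (i.e. the one with the lower negative log-likelihood).
    scores = {label: nll(PROMPT.format(text=text, label=label)) for label in LABELS}
    return min(scores, key=scores.get)
```

With a real model, `nll` would sum the token-level log-probabilities of the formatted prompt under the language model.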
<summary>Method</summary>

* Evaluation setting: zero-shot and few-shot settings via natural language generation using the greedy decoding strategy.
* Prompt: ```"Tittel: {title}\n\nTekst: {text}\n\nSpørsmål: {question}\n\nSvar:{answer}"```, based on [Brown et al. (2020)](https://arxiv.org/abs/2005.14165).
* Few-shot results show the average scores across 5 repetitions.
* Evaluation script: https://github.com/ltgoslo/norallm/blob/main/initial_evaluation/norquad.py
* Performance metrics: macro-averaged F1-score and exact match (EM).
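For reference, exact match and the SQuAD-style token-overlap F1 commonly used for extractive QA can be sketched as follows (a simplified stand-in; the linked `norquad.py` script is authoritative and may normalize answers differently):

```python
from collections import Counter

# Simplified SQuAD-style QA metrics; real evaluation scripts typically
# also lowercase and strip punctuation before comparing.
def exact_match(prediction: str, gold: str) -> float:
    # 1.0 if the predicted answer string equals the gold answer exactly.
    return float(prediction.strip() == gold.strip())

def token_f1(prediction: str, gold: str) -> float:
    # Harmonic mean of token-level precision and recall between the
    # predicted and gold answer spans.
    pred_tokens, gold_tokens = prediction.split(), gold.split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Per-example scores like these are then averaged over the dataset (and over the 5 few-shot repetitions) to produce the reported numbers.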