davda54 committed
Commit
d0a30cf
1 Parent(s): 69e469f

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -97,7 +97,7 @@ We use the binary formulation of this task (positive vs. negative).
  <summary>Method</summary>
 
  * Evaluation setting: zero-shot and few-shot perplexity-based evaluation.
- * Prompt: ```"Tekst: {text}\nSentiment:{label}"```, where the ```label``` is either "positiv" or "negativ". Based on [Brown et al. (2020)](https://arxiv.org/abs/2005.14165).
+ * Prompt: ```"Tekst: {text}\nSentiment:{label}"```, where the ```label``` is either "positiv" or "negativ".
  * Few-shot results show the average scores across 5 repetitions
  * Evaluation script: https://github.com/ltgoslo/norallm/blob/main/initial_evaluation/sentiment_analysis.py
  * Performance metric: macro-averaged F1-score.
@@ -124,13 +124,13 @@
 
  ### Reading comprehension
 
- [NorQuAD](https://huggingface.co/datasets/ltg/norquad) ([Ivanova et al., 2023](https://aclanthology.org/2023.nodalida-1.17/)) is a dataset for extractive question answering in Norwegian designed similarly to [SQuAD (Rajpurkar et al., 2016)](https://aclanthology.org/D16-1264/).
+ [NorQuAD](https://huggingface.co/datasets/ltg/norquad) ([Ivanova et al., 2023](https://aclanthology.org/2023.nodalida-1.17/)) is a dataset for extractive question answering in Norwegian designed similarly to [SQuAD (Rajpurkar et al., 2016)](https://aclanthology.org/D16-1264/).
 
  <details>
  <summary>Method</summary>
 
  * Evaluation setting: zero-shot and few-shot settings via natural language generation using the greedy decoding strategy.
- * Prompt: ```"Tittel: {title}\n\nTekst: {text}\n\nSpørsmål: {question}\n\nSvar:{answer}"```
+ * Prompt: ```"Tittel: {title}\n\nTekst: {text}\n\nSpørsmål: {question}\n\nSvar:{answer}"``` Based on [Brown et al. (2020)](https://arxiv.org/abs/2005.14165).
  * Few-shot results show the average scores across 5 repetitions
  * Evaluation script: https://github.com/ltgoslo/norallm/blob/main/initial_evaluation/norquad.py
  * Performance metrics: macro-averaged F1-score and exact match (EM).
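
The sentiment prompt in the first hunk is scored by perplexity rather than by generation: the prompt is completed with each candidate label in turn, and the label whose tokens receive the higher likelihood wins. Below is a minimal sketch of that scoring loop, assuming a Hugging Face causal LM; the model name is only a stand-in, and the linked `sentiment_analysis.py` script remains the authoritative implementation.

```python
# Illustrative perplexity-based classification with the sentiment prompt from the diff.
# "gpt2" is only a placeholder so the snippet runs; substitute the evaluated model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

LABELS = ["positiv", "negativ"]

def label_logprob(text: str, label: str) -> float:
    """Sum of log-probabilities of the label tokens, conditioned on the prompt."""
    prompt = f"Tekst: {text}\nSentiment:"
    prompt_len = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
    input_ids = tokenizer(prompt + label, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(input_ids).logits
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    # position i of the logits predicts token i + 1, so score only the label positions
    return sum(
        log_probs[pos, input_ids[0, pos + 1]].item()
        for pos in range(prompt_len - 1, input_ids.shape[1] - 1)
    )

def classify(text: str) -> str:
    # Higher summed log-probability = lower perplexity of the label continuation.
    return max(LABELS, key=lambda label: label_logprob(text, label))

print(classify("Dette var en fantastisk film!"))
```

In the few-shot setting, labelled demonstrations in the same prompt format would typically be prepended before scoring; the README reports those results averaged over 5 repetitions.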
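The NorQuAD prompt in the second hunk is instead completed by greedy decoding, and the generated string is compared against the gold answer. The following is a rough sketch under the same assumptions: a placeholder model and simplified SQuAD-style exact match and token-level F1, without whatever answer normalization the linked `norquad.py` script applies.

```python
# Illustrative greedy-decoding QA evaluation with the NorQuAD prompt from the diff.
# "gpt2" is again a placeholder; the metrics below are simplified SQuAD-style scores.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def answer(title: str, text: str, question: str, max_new_tokens: int = 32) -> str:
    prompt = f"Tittel: {title}\n\nTekst: {text}\n\nSpørsmål: {question}\n\nSvar:"
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    output_ids = model.generate(
        input_ids,
        max_new_tokens=max_new_tokens,
        do_sample=False,  # greedy decoding, as stated in the README
        pad_token_id=tokenizer.eos_token_id,
    )
    # Keep only the newly generated tokens and stop at the first line break.
    completion = tokenizer.decode(output_ids[0, input_ids.shape[1]:], skip_special_tokens=True)
    return completion.split("\n")[0].strip()

def exact_match(prediction: str, gold: str) -> float:
    return float(prediction.strip().lower() == gold.strip().lower())

def token_f1(prediction: str, gold: str) -> float:
    pred_tokens, gold_tokens = prediction.lower().split(), gold.lower().split()
    overlap = sum(min(pred_tokens.count(t), gold_tokens.count(t)) for t in set(pred_tokens))
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred_tokens), overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Per-example EM and F1 are then aggregated over the dataset; as with the sentiment task, the few-shot numbers are averaged over 5 repetitions.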