KennethTM/bert-base-uncased-danish

@@ -49,20 +49,34 @@ Initially, only the word token embeddings are trained using 1.000.000 samples. F
 The performance of the pretrained model was evaluated using [ScandEval](https://github.com/ScandEval/ScandEval).
-| task                     | dataset      | summary                                                                                    |
-|:-------------------------|:-------------|:-------------------------------------------------------------------------------------------|
-| sentiment-classification | swerec       | mcc = 63.02, mcc_se = 2.16, macro_f1 = 62.2, macro_f1_se = 3.61                            |
-| sentiment-classification | angry-tweets | mcc = 47.21, mcc_se = 0.53, macro_f1 = 64.21, macro_f1_se = 0.53                           |
-| sentiment-classification | norec        | mcc = 42.23, mcc_se = 8.69, macro_f1 = 57.24, macro_f1_se = 7.67                           |
-| named-entity-recognition | suc3         | micro_f1 = 50.03, micro_f1_se = 4.16, micro_f1_no_misc = 53.55, micro_f1_no_misc_se = 4.57 |
-| named-entity-recognition | dane         | micro_f1 = 76.44, micro_f1_se = 1.36, micro_f1_no_misc = 80.61, micro_f1_no_misc_se = 1.11 |
-| named-entity-recognition | norne-nb     | micro_f1 = 68.38, micro_f1_se = 1.72, micro_f1_no_misc = 73.08, micro_f1_no_misc_se = 1.66 |
-| named-entity-recognition | norne-nn     | micro_f1 = 60.45, micro_f1_se = 1.71, micro_f1_no_misc = 64.39, micro_f1_no_misc_se = 1.8  |
-| linguistic-acceptability | scala-sv     | mcc = 5.01, mcc_se = 5.41, macro_f1 = 49.46, macro_f1_se = 3.67                            |
-| linguistic-acceptability | scala-da     | mcc = 54.74, mcc_se = 12.22, macro_f1 = 76.25, macro_f1_se = 6.09                          |
-| linguistic-acceptability | scala-nb     | mcc = 19.18, mcc_se = 14.01, macro_f1 = 55.3, macro_f1_se = 8.85                           |
-| linguistic-acceptability | scala-nn     | mcc = 5.72, mcc_se = 5.91, macro_f1 = 49.56, macro_f1_se = 3.73                            |
-| question-answering       | scandiqa-da  | em = 26.36, em_se = 1.17, f1 = 32.41, f1_se = 1.1                                          |
-| question-answering       | scandiqa-no  | em = 26.14, em_se = 1.59, f1 = 32.02, f1_se = 1.59                                         |
-| question-answering       | scandiqa-sv  | em = 26.38, em_se = 1.1, f1 = 32.33, f1_se = 1.05                                          |
-| speed                    | speed        | speed = 4.55, speed_se = 0.0                                                               |

+| Task                     | Dataset      | Score (±SE)                      |
+|:-------------------------|:-------------|:---------------------------------|
+| sentiment-classification | swerec       | mcc = 63.02 (±2.16)              |
+|                          |              | macro_f1 = 62.2 (±3.61)          |
+| sentiment-classification | angry-tweets | mcc = 47.21 (±0.53)              |
+|                          |              | macro_f1 = 64.21 (±0.53)         |
+| sentiment-classification | norec        | mcc = 42.23 (±8.69)              |
+|                          |              | macro_f1 = 57.24 (±7.67)         |
+| named-entity-recognition | suc3         | micro_f1 = 50.03 (±4.16)         |
+|                          |              | micro_f1_no_misc = 53.55 (±4.57) |
+| named-entity-recognition | dane         | micro_f1 = 76.44 (±1.36)         |
+|                          |              | micro_f1_no_misc = 80.61 (±1.11) |
+| named-entity-recognition | norne-nb     | micro_f1 = 68.38 (±1.72)         |
+|                          |              | micro_f1_no_misc = 73.08 (±1.66) |
+| named-entity-recognition | norne-nn     | micro_f1 = 60.45 (±1.71)         |
+|                          |              | micro_f1_no_misc = 64.39 (±1.8)  |
+| linguistic-acceptability | scala-sv     | mcc = 5.01 (±5.41)               |
+|                          |              | macro_f1 = 49.46 (±3.67)         |
+| linguistic-acceptability | scala-da     | mcc = 54.74 (±12.22)             |
+|                          |              | macro_f1 = 76.25 (±6.09)         |
+| linguistic-acceptability | scala-nb     | mcc = 19.18 (±14.01)             |
+|                          |              | macro_f1 = 55.3 (±8.85)          |
+| linguistic-acceptability | scala-nn     | mcc = 5.72 (±5.91)               |
+|                          |              | macro_f1 = 49.56 (±3.73)         |
+| question-answering       | scandiqa-da  | em = 26.36 (±1.17)               |
+|                          |              | f1 = 32.41 (±1.1)                |
+| question-answering       | scandiqa-no  | em = 26.14 (±1.59)               |
+|                          |              | f1 = 32.02 (±1.59)               |
+| question-answering       | scandiqa-sv  | em = 26.38 (±1.1)                |
+|                          |              | f1 = 32.33 (±1.05)               |
+| speed                    | speed        | speed = 4.55 (±0.0)              |
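
The reformatting in this change, from one comma-separated `summary` string per row to one `metric = value (±SE)` line per metric, can be sketched as follows. This is a minimal illustration, not part of ScandEval or of this repository: `format_summary` is a hypothetical helper name, and it assumes every standard error is stored under the metric's name with an `_se` suffix, as in the old table.

```python
import re


def format_summary(summary: str) -> list[str]:
    """Convert 'mcc = 63.02, mcc_se = 2.16, ...' into ['mcc = 63.02 (±2.16)', ...]."""
    # Collect "name = value" pairs; values stay strings so the digits are unchanged.
    pairs = dict(re.findall(r"(\w+)\s*=\s*([-\d.]+)", summary))

    lines = []
    for name, value in pairs.items():
        if name.endswith("_se"):
            continue  # standard errors are attached to their metric below
        se = pairs.get(f"{name}_se")  # assumed naming convention: <metric>_se
        lines.append(f"{name} = {value} (±{se})" if se is not None else f"{name} = {value}")
    return lines


if __name__ == "__main__":
    old_cell = "mcc = 63.02, mcc_se = 2.16, macro_f1 = 62.2, macro_f1_se = 3.61"
    for line in format_summary(old_cell):
        print(line)
```

Each returned string corresponds to one row of the `Score (±SE)` column; the first string goes on the row that carries the task and dataset names, and the rest go on continuation rows with those two cells left empty.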