nljubesi committed
Commit d4776c4
1 parent: 4801367

Update README.md

Files changed (1):
  1. README.md +7 -3
README.md CHANGED
@@ -15,7 +15,7 @@ license: apache-2.0

This Electra model was trained on more than 6 billion tokens of Bosnian, Croatian, Montenegrin and Serbian text.

- Comparing this model to [multilingual BERT](https://huggingface.co/bert-base-multilingual-cased) and [CroSloEngual BERT](https://huggingface.co/EMBEDDIA/crosloengual-bert) on the tasks of part-of-speech tagging, named entity recognition, geolocation prediction and choice of plausible alternatives shows this model to be superior to the other two.
+ Comparing this model to [multilingual BERT](https://huggingface.co/bert-base-multilingual-cased) and [CroSloEngual BERT](https://huggingface.co/EMBEDDIA/crosloengual-bert) on the tasks of (1) part-of-speech tagging, (2) named entity recognition, (3) geolocation prediction, and (4) commonsense causal reasoning shows the BERTić model to be superior to the other two.

## Part-of-speech tagging
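The Electra model described in the paragraph above can be used through the `transformers` library; below is a minimal sketch of extracting contextual embeddings. The repository id `classla/bertic` is an assumption, as this diff does not name the repository.

```python
# Minimal sketch: loading the Electra model for feature extraction with
# Hugging Face transformers. The repo id "classla/bertic" is an assumption
# (the diff above does not name the repository).
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("classla/bertic")
model = AutoModel.from_pretrained("classla/bertic")

inputs = tokenizer("Zagreb je glavni grad Hrvatske.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, num_tokens, hidden_size)
```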
 
@@ -42,6 +42,8 @@ reldi-sr | Serbian | internet non-standard | - | 81.29 | 82.76 | **87.92*&as

## Geolocation prediction

+ The dataset comes from the VarDial 2020 evaluation campaign's shared task on [Social Media variety Geolocation prediction](https://sites.google.com/view/vardial2020/evaluation-campaign). The task is to predict the latitude and longitude of a tweet given its text.
+
Evaluation metrics are the median and mean of the distance between gold and predicted geolocations (lower is better). No statistical significance is computed due to the large test set (39,723 instances). The centroid baseline predicts each text to be created in the centroid of the training dataset.

System | Median | Mean
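For concreteness, a sketch of how the metrics and the centroid baseline described above can be computed. It assumes great-circle (haversine) distance in kilometres; the exact distance formula used by the shared task is not specified here.

```python
# Sketch of the evaluation metrics above: median and mean great-circle
# distance (km) between gold and predicted coordinates. Haversine distance
# is an assumption; the shared task may define distance differently.
import math
import statistics

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def evaluate(gold, pred):
    """gold, pred: lists of (lat, lon) pairs; returns (median, mean) distance."""
    dists = [haversine_km(g[0], g[1], p[0], p[1]) for g, p in zip(gold, pred)]
    return statistics.median(dists), statistics.mean(dists)

def centroid_baseline(train_coords, n_test):
    """Predict the centroid of the training coordinates for every test text.
    Simple lat/lon averaging is an approximation of the geographic centroid."""
    lat = statistics.mean(c[0] for c in train_coords)
    lon = statistics.mean(c[1] for c in train_coords)
    return [(lat, lon)] * n_test
```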
@@ -51,9 +53,11 @@ mBERT | 42.25 | 82.05
cseBERT | 40.76 | 81.88
BERTić | **37.96** | **79.30**

- ## Choice Of Plausible Alternatives (translation to Croatian)
+ ## Choice of Plausible Alternatives
+
+ The dataset is a translation of the [COPA dataset](https://people.ict.usc.edu/~gordon/copa.html) into Croatian (to-be-released).

- The evaluation metric is accuracy. Best results are presented in bold. Statistical significance is calculated between the two best-performing systems via a two-tailed t-test (&ast; p<=0.05, &ast;&ast; p<=0.01, &ast;&ast;&ast; p<=0.001, &ast;&ast;&ast;&ast; p<=0.0001).
+ The evaluation metric is accuracy. Reported are the means of five runs. Best results are presented in bold. Statistical significance is calculated between the two best-performing systems via a two-tailed t-test (&ast; p<=0.05, &ast;&ast; p<=0.01, &ast;&ast;&ast; p<=0.001, &ast;&ast;&ast;&ast; p<=0.0001).

System | Accuracy
---|---
 
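A sketch of the significance test described above, assuming an independent-samples t-test over the five per-run accuracies of the two best-performing systems (a paired test would use `scipy.stats.ttest_rel` instead). The run-level numbers below are hypothetical placeholders, not results from the README.

```python
# Sketch of the two-tailed t-test above, applied to the five per-run
# accuracies of the two best systems. The values are hypothetical
# placeholders; ttest_ind is two-tailed by default.
from scipy.stats import ttest_ind

bertic_runs = [0.66, 0.65, 0.67, 0.66, 0.64]   # hypothetical five-run accuracies
csebert_runs = [0.62, 0.61, 0.63, 0.62, 0.60]  # hypothetical five-run accuracies

stat, p = ttest_ind(bertic_runs, csebert_runs)
print(f"t = {stat:.3f}, p = {p:.4f}")

# Map the p-value to the README's significance markers,
# checking the strictest threshold first.
for thresh, marker in [(0.0001, "****"), (0.001, "***"), (0.01, "**"), (0.05, "*")]:
    if p <= thresh:
        print(f"significant at p<={thresh} ({marker})")
        break
```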