---
language:
- hr
- sr
- bs
- cnr
- hbs
---
# BERTić [bert-ich] /bɜrtitʃ/ - A BERT model for Bosnian, Croatian, Montenegrin and Serbian

The model was trained on more than 6 billion tokens of Bosnian, Croatian, Montenegrin and Serbian text.

Comparing this model to [multilingual BERT](https://huggingface.co/bert-base-multilingual-cased) and [CroSloEngual BERT](https://huggingface.co/EMBEDDIA/crosloengual-bert) on the tasks of part-of-speech tagging, named entity recognition, geolocation prediction and choice of plausible alternatives shows this model to be superior to the other two.

## Part-of-speech tagging

The evaluation metric is (seqeval) micro-F1. Reported are means of five runs. Best results are presented in bold. Statistical significance is calculated between the two best-performing systems via a two-tailed t-test (&ast; p<=0.05, &ast;&ast; p<=0.01, &ast;&ast;&ast; p<=0.001, &ast;&ast;&ast;&ast; p<=0.0001).

Dataset | Language | Variety | CLASSLA | mBERT | cseBERT | BERTić
---|---|---|---|---|---|---
hr500k | Croatian | standard | 93.87 | 94.60 | 95.74 | **&ast;&ast;&ast;95.81**
reldi-hr | Croatian | internet non-standard | - | 88.87 | 91.63 | **&ast;&ast;&ast;92.28**
SETimes.SR | Serbian | standard | 95.00 | 95.50 | **96.41** | 96.31
reldi-sr | Serbian | internet non-standard | - | 91.26 | 93.54 | **&ast;&ast;&ast;93.90**

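The asterisks above come from a two-tailed t-test over the five run scores of the two best systems. A minimal stdlib sketch of the test statistic (the run scores below are made up for illustration; the resulting |t| value is looked up in a two-tailed t table, or via `scipy.stats`, to obtain the p-value):

```python
from statistics import mean, variance

def two_tailed_t(a, b):
    """Student's pooled two-sample t statistic and degrees of freedom
    for two independent sets of run scores."""
    na, nb = len(a), len(b)
    # pooled (equal-variance) estimate across both samples
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / (sp2 * (1 / na + 1 / nb)) ** 0.5
    return t, na + nb - 2

# hypothetical micro-F1 scores from five runs of two taggers
bertic = [95.78, 95.84, 95.80, 95.83, 95.81]
csebert = [95.70, 95.76, 95.73, 95.75, 95.72]
t, df = two_tailed_t(bertic, csebert)
```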
## Named entity recognition

The evaluation metric is (seqeval) micro-F1. Reported are means of five runs. Best results are presented in bold. Statistical significance is calculated between the two best-performing systems via a two-tailed t-test (&ast; p<=0.05, &ast;&ast; p<=0.01, &ast;&ast;&ast; p<=0.001, &ast;&ast;&ast;&ast; p<=0.0001).

Dataset | Language | Variety | CLASSLA | mBERT | cseBERT | BERTić
---|---|---|---|---|---|---
hr500k | Croatian | standard | 80.13 | 85.67 | 88.98 | **&ast;&ast;&ast;&ast;89.21**
reldi-hr | Croatian | internet non-standard | - | 76.06 | 81.38 | **&ast;&ast;&ast;&ast;83.05**
SETimes.SR | Serbian | standard | 84.64 | **92.41** | 92.28 | 92.02
reldi-sr | Serbian | internet non-standard | - | 81.29 | 82.76 | **&ast;&ast;&ast;&ast;87.92**

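The seqeval-style micro-F1 used in both tagging tables is F1 over exact labeled span matches, not per-token accuracy. A simplified pure-Python sketch of that metric (seqeval's handling of malformed BIO sequences is more involved):

```python
def spans(tags):
    """Extract labeled (start, end, type) spans from a BIO tag sequence."""
    out, start = set(), None
    for i, tag in enumerate(tags + ["O"]):  # "O" sentinel flushes the last span
        if start is not None and (tag == "O" or tag.startswith("B-")
                                  or tag[2:] != tags[start][2:]):
            out.add((start, i, tags[start][2:]))
            start = None
        if tag.startswith("B-"):
            start = i
    return out

def micro_f1(gold, pred):
    """Micro-F1 over exact span matches."""
    g, p = spans(gold), spans(pred)
    tp = len(g & p)
    prec = tp / len(p) if p else 0.0
    rec = tp / len(g) if g else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

gold = ["B-PER", "I-PER", "O", "B-LOC"]
pred = ["B-PER", "I-PER", "O", "B-ORG"]
# one of two spans matched exactly: precision = recall = F1 = 0.5
```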
## Geolocation prediction

The evaluation metrics are the median and the mean of the distance between the gold and the predicted geolocation (lower is better). No statistical significance is computed due to the large test set (39,723 instances). The centroid baseline predicts every text to have been created at the centroid of the training data.

System | Median | Mean
---|---|---
centroid | 107.10 | 145.72
mBERT | 42.25 | 82.05
cseBERT | 40.76 | 81.88
BERTić | **37.96** | **79.30**

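A minimal sketch of how such a metric can be computed, assuming great-circle (haversine) distance in kilometres between gold and predicted coordinates; the coordinates below are made up for illustration:

```python
from math import radians, sin, cos, asin, sqrt
from statistics import mean, median

def haversine_km(p, q):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*p, *q))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))  # 6371 km: mean Earth radius

# hypothetical gold vs. predicted coordinates for three texts
gold = [(45.81, 15.98), (44.79, 20.45), (43.86, 18.41)]
pred = [(45.50, 16.10), (44.00, 21.00), (43.86, 18.41)]
dists = [haversine_km(g, p) for g, p in zip(gold, pred)]
print(round(median(dists), 2), round(mean(dists), 2))
```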
## Choice of plausible alternatives (translated to Croatian)

The evaluation metric is accuracy. Best results are presented in bold. Statistical significance is calculated between the two best-performing systems via a two-tailed t-test (&ast; p<=0.05, &ast;&ast; p<=0.01, &ast;&ast;&ast; p<=0.001, &ast;&ast;&ast;&ast; p<=0.0001).

System | Accuracy
---|---
random | 50.00
mBERT | 54.12
cseBERT | 61.80
BERTić | **&ast;&ast;65.76**

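Accuracy here is simply the fraction of instances where the alternative chosen by the model matches the gold choice, which is why random guessing over two alternatives sits at 50.00. A tiny sketch with made-up choices:

```python
def accuracy(gold, pred):
    """Percentage of instances where the chosen alternative matches gold."""
    return 100.0 * sum(g == p for g, p in zip(gold, pred)) / len(gold)

# hypothetical choices (0 or 1 = first or second alternative) for ten instances
gold = [0, 1, 1, 0, 0, 1, 0, 1, 1, 0]
pred = [0, 1, 0, 0, 1, 1, 0, 1, 1, 1]
# 7 of 10 choices agree -> 70.0
```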