ksirts committed on
Commit 438506b · 1 Parent(s): 082f8b1

Update README.md

Files changed (1)
  1. README.md +17 -3
README.md CHANGED
@@ -27,17 +27,31 @@ should probably proofread and complete it, then remove this comment. -->
  # EstBERT128_Rubric

  This model is a fine-tuned version of [tartuNLP/EstBERT](https://huggingface.co/tartuNLP/EstBERT) on the rubric categories of the [Estonian Valence dataset](http://peeter.eki.ee:5000/valence/paragraphsquery).
- It achieves the following results on the evaluation set:
+ The data was split into train/dev/test parts with 70/10/20 proportions.
+ It achieves the following results on the test set:
  - Loss: 2.0552
  - Accuracy: 0.8329

  ## Model description

- More information needed
+ A single linear layer classifier is fit on top of the last layer [CLS] token representation. The model is fully fine-tuned during training.

  ## Intended uses & limitations

- More information needed
+ This model is intended to be used as it is. It can be used to predict nine rubric categories of Estonian texts. The nine rubric labels in the Estonian Valence dataset are:
+ - ARVAMUS (opinion)
+ - EESTI (domestic)
+ - ELU-O (life)
+ - KOMM-O-ELU (comments)
+ - KOMM-P-EESTI (comments)
+ - KRIMI (crime)
+ - KULTUUR (culture)
+ - SPORT (sports)
+ - VALISMAA (world)
+
+ It probably makes sense to treat the two comments categories (KOMM-O-ELU and KOMM-P-EESTI) as a single category.
+
+ We do not guarantee that the model is useful for anything or that the predictions are accurate on new data.

  ## Training and evaluation data

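
The added README text suggests treating the two comments categories as one. A minimal sketch of how that could be done after prediction, by summing the two class scores, is shown below. Several things here are assumptions rather than facts from this commit: the repository id `tartuNLP/EstBERT128_Rubric`, the merged label name `KOMM`, and the expectation that the model config exposes the rubric names (rather than generic `LABEL_0`-style names) as its labels.

```python
from collections import defaultdict

# The nine rubric labels listed in the README change above.
RUBRIC_LABELS = [
    "ARVAMUS", "EESTI", "ELU-O", "KOMM-O-ELU", "KOMM-P-EESTI",
    "KRIMI", "KULTUUR", "SPORT", "VALISMAA",
]

# Hypothetical merged label name; the README only says the two
# comments categories could be treated as a single category.
MERGED_NAME = {"KOMM-O-ELU": "KOMM", "KOMM-P-EESTI": "KOMM"}


def merge_comment_scores(scores):
    """Sum the scores of the two comments categories into one entry."""
    merged = defaultdict(float)
    for label, score in scores.items():
        merged[MERGED_NAME.get(label, label)] += score
    return dict(merged)


def predict_rubric(text, model_id="tartuNLP/EstBERT128_Rubric"):
    """Classify one text and return merged per-category scores.

    Assumption: `model_id` is the Hub repository of this model and its
    labels are the rubric names listed above.
    """
    # Imported lazily so the merge helper works without transformers.
    from transformers import pipeline

    clf = pipeline("text-classification", model=model_id, top_k=None)
    results = clf([text])
    # Depending on the library version, the result for one input is
    # either a list of dicts or that list nested inside another list.
    rows = results if isinstance(results[0], dict) else results[0]
    return merge_comment_scores({r["label"]: r["score"] for r in rows})
```

Calling `predict_rubric("...")` downloads the model, so the merge step is kept as a separate pure function that can be checked on its own, for example with `merge_comment_scores({"KOMM-O-ELU": 0.25, "KOMM-P-EESTI": 0.5, "SPORT": 0.25})`.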