erikhenriksson committed on
Commit
17e46b3
1 Parent(s): 7f1a62f

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -12,7 +12,7 @@ metrics:
  # Web register classification (multilingual model)
  
  A multilingual web register classifier, fine-tuned from XLM-RoBERTa-large.
- The model is trained with the multilingual CORE corpora across five languages (English, Finnish, French, Swedish, Turkish) to classify documents based on the CORE taxonomy, detailed [here](https://turkunlp.org/register-annotation-docs/abbreviations).
+ The model is trained with the multilingual CORE corpora across five languages (English, Finnish, French, Swedish, Turkish) to classify documents based on the CORE taxonomy, detailed below.
  The model demonstrates state-of-the-art performance in classifying web registers and achieves good zero-shot performance for additional languages.
  It is designed to support the development of open language models and for linguists analyzing register variation.
  ## Model Details
@@ -34,6 +34,8 @@ It is designed to support the development of open language models and for lingui
  
  ## Register labels and their abbreviations
  
+ Below is a list of the register labels predicted by the model. Note that some labels are hierarchical; when a sublabel is predicted, its parent label is also predicted. For a more detailed description, see [here](https://turkunlp.org/register-annotation-docs/).
+
  - **MT:** Machine translated or generated
  - **LY:** Lyrical
  - **SP:** Spoken
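
The note added in this commit says that a predicted sublabel always comes with its parent label. Below is a minimal sketch of that rule, useful for checking or normalizing a label set downstream. It assumes a hypothetical parent map: the names `PARENT_OF`, `add_parent_labels`, `xx1`, `xx2`, and `XX` are placeholders for illustration, not the model's real taxonomy or API. The actual hierarchy is described in the linked annotation docs.

```python
# Hypothetical sublabel -> parent map. "xx1", "xx2" and "XX" are placeholder
# names, NOT real labels from the model's taxonomy.
PARENT_OF = {
    "xx1": "XX",
    "xx2": "XX",
}


def add_parent_labels(predicted, parent_of):
    """Return the predicted labels plus the parent of every predicted sublabel."""
    expanded = set(predicted)
    for label in predicted:
        parent = parent_of.get(label)
        if parent is not None:
            expanded.add(parent)
    # Sort so the result is deterministic and easy to compare.
    return sorted(expanded)


# A sublabel pulls in its parent; a label with no parent (e.g. "MT") is unchanged.
with_parent = add_parent_labels(["xx1"], PARENT_OF)
top_level_only = add_parent_labels(["MT"], PARENT_OF)
```

Because the function only adds labels, applying it to a set that already satisfies the rule returns the same set, so it can double as a consistency check.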