lbourdois committed on
Commit 3c8955f
1 Parent(s): 35c14d2

Add multilingual to the language tag


Hi! This PR adds `multilingual` to the `language` tag to improve referencing.
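For quick reference, here is a sketch of the resulting `language` section of the README's YAML front matter, abridged from the diff below (the elided entries are the remaining codes from the full 23-language list):

```yaml
---
language:
- amh
- orm
# ... remaining language codes ...
- twi
- xho
- zul
- multilingual  # added by this PR
license:
- cc-by-4.0
---
```

On the Hugging Face Hub, the `multilingual` value is a recognized language tag, so the repository should also surface when users filter for multilingual models, which is presumably the improved "referencing" the description mentions.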

Files changed (1)
1. README.md +13 -12
README.md CHANGED
@@ -1,6 +1,4 @@
 ---
- annotations_creators:
- - crowdsourced
 language:
 - amh
 - orm
@@ -25,17 +23,9 @@ language:
 - twi
 - xho
 - zul
- language_creators:
- - crowdsourced
+ - multilingual
 license:
 - cc-by-4.0
- multilinguality:
- - monolingual
- pretty_name: afrolm-dataset
- size_categories:
- - 1M<n<10M
- source_datasets:
- - original
 tags:
 - afrolm
 - active learning
@@ -43,6 +33,17 @@ tags:
 - research papers
 - natural language processing
 - self-active learning
+ annotations_creators:
+ - crowdsourced
+ language_creators:
+ - crowdsourced
+ multilinguality:
+ - monolingual
+ pretty_name: afrolm-dataset
+ size_categories:
+ - 1M<n<10M
+ source_datasets:
+ - original
 task_categories:
 - fill-mask
 task_ids:
@@ -57,7 +58,7 @@ This repository contains the model for our paper [`AfroLM: A Self-Active Learnin
 ![Model](afrolm.png)

 ## Languages Covered
- AfroLM has been pretrained from scratch on 23 African Languages: Amharic, Afan Oromo, Bambara, Ghomalá, Éwé, Fon, Hausa, Ìgbò, Kinyarwanda, Lingala, Luganda, Luo, Mooré, Chewa, Naija, Shona, Swahili, Setswana, Twi, Wolof, Xhosa, Yorùbá, and Zulu.
+ AfroLM has been pretrained from scratch on 23 African Languages: Amharic, Afan Oromo, Bambara, Ghomalá, Éwé, Fon, Hausa, Ìgbò, Kinyarwanda, Lingala, Luganda, Luo, Mooré, Chewa, Naija, Shona, Swahili, Setswana, Twi, Wolof, Xhosa, Yorùbá, and Zulu.

 ## Evaluation Results
 AfroLM was evaluated on the MasakhaNER1.0 (10 African languages) and MasakhaNER2.0 (21 African languages) datasets, as well as on text classification and sentiment analysis. AfroLM outperformed AfriBERTa, mBERT, and XLMR-base, and was very competitive with AfroXLMR. AfroLM is also very data efficient: it was pretrained on a dataset 14x+ smaller than those of its competitors. Below are the average F1 scores of the various models across these datasets. Please consult our paper for language-level performance.