bonadossou committed on
Commit
2d9a5d7
1 Parent(s): 27b518a

Update README.md

Files changed (1)
  1. README.md +2 -1
README.md CHANGED
@@ -1,4 +1,5 @@
 # AfroLM: A Self-Active Learning-based Multilingual Pretrained Language Model for 23 African Languages
+- [GitHub Repository of the Paper](https://github.com/bonaventuredossou/MLM_AL)
 
 This repository contains the model for our paper `AfroLM: A Self-Active Learning-based Multilingual Pretrained Language Model for 23 African Languages`, which will appear at the Third Workshop on Simple and Efficient Natural Language Processing, at EMNLP 2022.
 
@@ -9,7 +10,7 @@ This repository contains the model for our paper `AfroLM: A Self-Active Learning
 AfroLM has been pretrained from scratch on 23 African languages: Amharic, Afan Oromo, Bambara, Ghomalá, Éwé, Fon, Hausa, Ìgbò, Kinyarwanda, Lingala, Luganda, Luo, Mooré, Chewa, Naija, Shona, Swahili, Setswana, Twi, Wolof, Xhosa, Yorùbá, and Zulu.
 
 ## Evaluation Results
-AfroLM was evaluated on the MasakhaNER1.0 (10 African languages) and MasakhaNER2.0 (21 African languages) datasets, as well as on text classification and sentiment analysis. AfroLM outperformed AfriBERTa, mBERT, and XLMR-base, and was very competitive with AfroXLMR. AfroLM is also very data efficient, as it was pretrained on a dataset 14x+ smaller than its competitors' datasets. Below is the average of performance of various models across various datasets. Please consult our paper for language-level performance.
+AfroLM was evaluated on the MasakhaNER1.0 (10 African languages) and MasakhaNER2.0 (21 African languages) datasets, as well as on text classification and sentiment analysis. AfroLM outperformed AfriBERTa, mBERT, and XLMR-base, and was very competitive with AfroXLMR. AfroLM is also very data efficient, as it was pretrained on a dataset 14x+ smaller than its competitors' datasets. Below is the average performance of various models across various datasets. Please consult our paper for language-level performance.
 
 Model | MasakhaNER | MasakhaNER2.0* | Text Classification (Yoruba/Hausa) | Sentiment Analysis (YOSM) | OOD Sentiment Analysis (Twitter -> YOSM) |
 |:---: |:---: |:---: | :---: |:---: | :---: |
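The updated README does not yet include a loading snippet. A minimal sketch of how such a pretrained checkpoint is typically loaded with the 🤗 Transformers `Auto*` classes — the checkpoint ID `bonadossou/afrolm_active_learning` is an assumption based on this card's namespace, and the exact model/tokenizer classes should be verified against the repository linked above:

```python
# Hypothetical usage sketch (not from the card itself): load the AfroLM
# checkpoint for masked-language-model inference via the generic Auto* API.
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Assumed checkpoint ID; confirm it on the Hugging Face Hub before use.
MODEL_ID = "bonadossou/afrolm_active_learning"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForMaskedLM.from_pretrained(MODEL_ID)

# Encode a sentence and run a forward pass.
text = "AfroLM covers 23 African languages."
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)

# Logits have shape (batch, sequence_length, vocab_size); the argmax over
# the vocab axis at a masked position gives the model's token prediction.
print(outputs.logits.shape)
```

For fine-tuning on the downstream tasks reported in the table (NER, text classification, sentiment analysis), the corresponding `AutoModelForTokenClassification` or `AutoModelForSequenceClassification` heads would be used instead; see the GitHub repository for the paper's own training scripts.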