Further info on pretraining

#1 opened by manueltonneau

Hi @Davlan, and thank you very much for this contribution. Could you please provide more information on the pretraining dataset, especially its size? Also, could you say a bit more about how the model was pretrained (is it adaptive fine-tuning, as with AfroXLMR)?

manueltonneau changed discussion title from Size of pretraining dataset to Further info on pretraining

Finally, is there a paper I can cite if I want to reference this model? Thank you!

Yes, there is a paper. Please cite our MasakhaNER paper: https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00416/107614/MasakhaNER-Named-Entity-Recognition-for-African

Table 10 of the paper has the information on the monolingual fine-tuning corpus.
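
If it helps, here is a rough BibTeX sketch assembled from the linked TACL page; the title and DOI come from that link, but the author list and remaining fields should be verified against the official TACL/ACL Anthology listing.

```bibtex
@article{adelani2021masakhaner,
  title   = {{MasakhaNER}: Named Entity Recognition for African Languages},
  author  = {Adelani, David Ifeoluwa and others},
  journal = {Transactions of the Association for Computational Linguistics},
  year    = {2021},
  doi     = {10.1162/tacl_a_00416}
}
```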

manueltonneau changed discussion status to closed
