
ArabicTransformer small model (B8-8-8 with decoder)

Paper: ArabicTransformer: Efficient Large Arabic Language Model with Funnel Transformer and ELECTRA Objective (EMNLP21)

Abstract

Pre-training Transformer-based models such as BERT and ELECTRA on a collection of Arabic corpora, as demonstrated by both AraBERT and AraELECTRA, yields impressive results on downstream tasks. However, pre-training Transformer-based language models is computationally expensive, especially for large-scale models. Recently, Funnel Transformer has addressed the sequential redundancy inside the Transformer architecture by compressing the sequence of hidden states, leading to a significant reduction in the pre-training cost. This paper empirically studies the performance and efficiency of building an Arabic language model with Funnel Transformer and the ELECTRA objective. We find that our model achieves state-of-the-art results on several Arabic downstream tasks despite using fewer computational resources than other BERT-based models.
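To make the compression idea concrete, here is a minimal back-of-the-envelope sketch. It is not taken from the paper's code. It assumes the Funnel Transformer convention that "B8-8-8" means three encoder blocks of 8 layers each, with the sequence length halved (rounded up) between blocks. It counts only the quadratic self-attention term and ignores the decoder and other implementation details. The function names are hypothetical.

```python
import math
from typing import List, Sequence


def funnel_block_lengths(seq_len: int, num_blocks: int = 3) -> List[int]:
    """Sequence length seen by each encoder block.

    Assumption: the length is halved (rounded up) between consecutive blocks.
    """
    lengths = [seq_len]
    for _ in range(num_blocks - 1):
        lengths.append(math.ceil(lengths[-1] / 2))
    return lengths


def relative_attention_cost(
    seq_len: int, layers_per_block: Sequence[int] = (8, 8, 8)
) -> float:
    """Rough self-attention cost relative to an uncompressed encoder of equal depth.

    Only the O(n^2) attention term is counted; this is an estimate, not a benchmark.
    """
    lengths = funnel_block_lengths(seq_len, len(layers_per_block))
    funnel = sum(layers * n * n for layers, n in zip(layers_per_block, lengths))
    full = sum(layers_per_block) * seq_len * seq_len
    return funnel / full


print(funnel_block_lengths(512))     # [512, 256, 128]
print(relative_attention_cost(512))  # 0.4375
```

Under these simplifying assumptions, a 512-token input costs roughly 44% of the attention compute of a same-depth model that keeps the full sequence length in every layer.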

Description

This model was pre-trained on 44GB of Arabic corpora using Funnel Transformer with the ELECTRA objective. More details about the model and our paper, accepted at EMNLP21, will be added later. Check our GitHub page for the latest updates and examples: https://github.com/salrowili/ArabicTransformer
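As a starting point, the checkpoint can probably be loaded with the Hugging Face `transformers` Auto classes. The sketch below makes several assumptions: that the repository id is `sultan/ArabicTransformer-large` (inferred from where this README is hosted), that the repository ships tokenizer files, and that `transformers` and `torch` are installed. The helper names `load_encoder` and `embed` are hypothetical. For fine-tuning recipes, defer to the examples on the GitHub page.

```python
# Assumed repository id; adjust it if the checkpoint lives elsewhere.
MODEL_ID = "sultan/ArabicTransformer-large"


def load_encoder(model_id: str = MODEL_ID):
    """Download and return (tokenizer, model) for the given Hub repository."""
    # Imported lazily so defining this helper does not require transformers.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    return tokenizer, model


def embed(text: str, tokenizer, model):
    """Return the final hidden states for one input string."""
    import torch

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():  # inference only, no gradients needed
        outputs = model(**inputs)
    return outputs.last_hidden_state


# Example usage (requires network access to the Hugging Face Hub):
# tokenizer, model = load_encoder()
# hidden = embed("اللغة العربية جميلة", tokenizer, model)
# print(hidden.shape)
```

For downstream tasks such as classification or question answering, the task-specific Auto classes (for example `AutoModelForSequenceClassification`) would likely be the better entry point than the bare encoder.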