ViDeBERTa: A powerful pre-trained language model for Vietnamese

ViDeBERTa is a new pre-trained monolingual language model for Vietnamese. It comes in three versions, ViDeBERTa_xsmall, ViDeBERTa_base, and ViDeBERTa_large, which are pre-trained on 138 GB of high-quality and diverse Vietnamese text using the DeBERTaV3 architecture.

Please check the official repository for more implementation details and updates.
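
As a rough sketch of how a checkpoint like this is typically loaded with the Hugging Face `transformers` library: the Hub ID `Fsoft-AIC/videberta-xsmall` below is an assumption (this card does not state it), and the helper names `load_videberta` and `encode` are hypothetical, introduced only for illustration.

```python
# Assumed Hub ID; it is not stated in this card. Replace it with the
# repository name shown on the model page you are actually using.
ASSUMED_MODEL_ID = "Fsoft-AIC/videberta-xsmall"


def load_videberta(model_id: str = ASSUMED_MODEL_ID):
    """Return (tokenizer, model) for the given Hub ID (hypothetical helper)."""
    # Imported lazily so that defining this helper does not require
    # transformers to be installed.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    return tokenizer, model


def encode(text: str, tokenizer, model):
    """Return the last hidden states for a single sentence (hypothetical helper)."""
    import torch

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():  # inference only, no gradients needed
        outputs = model(**inputs)
    # Shape: (1, sequence_length, hidden_size)
    return outputs.last_hidden_state
```

Calling `tokenizer, model = load_videberta()` followed by `encode("Hà Nội là thủ đô của Việt Nam.", tokenizer, model)` should return one vector per token. Task-specific heads (for example for POS tagging or NER) would be loaded with the corresponding `AutoModelFor...` class instead of `AutoModel`.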

The DeBERTa V3 xsmall model has 12 layers and a hidden size of 384. It has only 22M backbone parameters, and its vocabulary of 128K tokens adds 48M parameters in the embedding layer. This model was trained on the CC100 dataset, which contains 138 GB of Vietnamese text.
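
The embedding figure can be sanity-checked with a line of arithmetic: an embedding table has one row of `hidden_size` values per vocabulary entry. The sketch below assumes "128K" means exactly 128,000 tokens, which may not match the real tokenizer, so the result is only approximate.

```python
def embedding_params(vocab_size: int, hidden_size: int) -> int:
    """Number of weights in a vocab_size x hidden_size embedding table."""
    return vocab_size * hidden_size


VOCAB_SIZE = 128_000          # assumption: "128K tokens" read as 128,000
HIDDEN_SIZE = 384             # hidden size stated in the card
BACKBONE_PARAMS = 22_000_000  # backbone parameters stated in the card

emb = embedding_params(VOCAB_SIZE, HIDDEN_SIZE)
total = emb + BACKBONE_PARAMS

print(f"embedding parameters: {emb / 1e6:.1f}M")    # 49.2M
print(f"backbone + embedding: {total / 1e6:.1f}M")  # 71.2M
```

Under this assumption the embedding table has about 49M weights, close to the 48M quoted above; the small gap depends on the exact vocabulary size and on rounding. The 22M count covers the backbone only, so the full model is roughly three times that size once the embedding layer is included.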

Fine-tuning on NLU tasks

We present the dev results on the VLSP POS (POS tagging), PhoNER (NER), and ViQuAD (MRC) datasets.

| Model | #Params | POS | NER | MRC |
|---|---|---|---|---|
| XLM-R-base | 125M | 96.2 | - | 82.0 |
| XLM-R-large | 355M | 96.3 | 93.8 | 87.0 |
| PhoBERT-base | 135M | 96.7 | | 80.1 |
| PhoBERT-large | 370M | 96.8 | | 83.5 |
| ViT5-base | 310M | - | 94.5 | - |
| ViT5-large | 866M | - | 93.8 | - |
| ViDeBERTa-xsmall | 22M | 96.4 | 93.6 | 81.3 |
| ViDeBERTa-base | 86M | 96.8 | 94.5 | 85.7 |
| ViDeBERTa-large | 304M | 97.2 | 95.3 | 89.9 |

Citation

If you find ViDeBERTa useful for your work, please cite the following paper:

@article{dao2023videberta,
  title={ViDeBERTa: A powerful pre-trained language model for Vietnamese},
  author={Dao Tran, Cong and Pham, Nhut Huy and Nguyen, Anh and Son Hy, Truong and Vu, Tu},
  journal={arXiv e-prints},
  pages={arXiv--2301},
  year={2023}
}