This is a fine-tuned PhoBERT model for classifying Vietnamese essay sentences by essay category.

Overview

  • At primary levels of education in Vietnam, students are introduced to 5 categories of essays:
    • Argumentative - Nghị luận
    • Expressive - Biểu cảm
    • Descriptive - Miêu tả
    • Narrative - Tự sự
    • Expository - Thuyết minh
  • This model classifies sentences into these 5 categories.

Pretrained model used in this pipeline:

  • This pipeline combines the pre-trained phobert-base model with a multi-label classification head trained on 8,000 manually labeled sentences from sample essays.
  • The dataset can be found on Kaggle
  • Usage of PhoBERT can be found on Huggingface

Citation:

The general architecture and experimental results of PhoBERT can be found in the EMNLP 2020 Findings paper:

@article{phobert,
    title     = {{PhoBERT: Pre-trained language models for Vietnamese}},
    author    = {Dat Quoc Nguyen and Anh Tuan Nguyen},
    journal   = {Findings of EMNLP},
    year      = {2020}
}