---
library_name: transformers
tags: []
pipeline_tag: fill-mask
widget:
- text: "shop làm ăn như cái "
- text: "hag từ Quảng kực nét"
- text: "Set xinh quá, bèo nhèo"
- text: "đúng nhận sai "
---

# 5CD-AI/viso-twhin-bert-large

## Overview

We reduce TwHIN-BERT's vocabulary size to 20k on the UIT dataset and continue pretraining for 10 epochs.

The table below reports results on 4 downstream tasks on Vietnamese social media text: Emotion Recognition (UIT-VSMEC), Hate Speech Detection (UIT-HSD), Spam Reviews Detection (ViSpamReviews), and Hate Speech Spans Detection (ViHOS).
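
| Model | Avg | Emotion Recognition Acc | Emotion Recognition WF1 | Emotion Recognition MF1 | Hate Speech Detection Acc | Hate Speech Detection WF1 | Hate Speech Detection MF1 | Spam Reviews Detection Acc | Spam Reviews Detection WF1 | Spam Reviews Detection MF1 | Hate Speech Spans Detection Acc | Hate Speech Spans Detection WF1 | Hate Speech Spans Detection MF1 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| viBERT | 78.16 | 61.91 | 61.98 | 59.7 | 85.34 | 85.01 | 62.07 | 89.93 | 89.79 | 76.8 | 90.42 | 90.45 | 84.55 |
| vELECTRA | 79.23 | 64.79 | 64.71 | 61.95 | 86.96 | 86.37 | 63.95 | 89.83 | 89.68 | 76.23 | 90.59 | 90.58 | 85.12 |
| PhoBERT-Base | 79.3 | 63.49 | 63.36 | 61.41 | 87.12 | 86.81 | 65.01 | 89.83 | 89.75 | 76.18 | 91.32 | 91.38 | 85.92 |
| PhoBERT-Large | 79.82 | 64.71 | 64.66 | 62.55 | 87.32 | 86.98 | 65.14 | 90.12 | 90.03 | 76.88 | 91.44 | 91.46 | 86.56 |
| ViSoBERT | 81.58 | 68.1 | 68.37 | 65.88 | 88.51 | 88.31 | 68.77 | 90.99 | 90.92 | 79.06 | 91.62 | 91.57 | 86.8 |
| visobert-14gb-corpus | 82.2 | 68.69 | 68.75 | 66.03 | 88.79 | 88.6 | 69.57 | 91.02 | 90.88 | 77.13 | 93.69 | 93.63 | 89.66 |
| viso-twhin-bert-large | 83.87 | 73.45 | 73.14 | 70.99 | 88.86 | 88.8 | 70.81 | 91.6 | 91.47 | 79.07 | 94.08 | 93.96 | 90.22 |
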
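
The Avg column is not defined in the text. From the numbers, it appears to be the unweighted mean of the twelve task scores in each row (Acc, WF1 and MF1 for each of the four tasks). The following is a quick check of that reading on two rows copied from the results above; the helper name `row_average` is ours and is not part of the model's code.

```python
# Twelve task scores per model, copied from the results above
# (Acc, WF1, MF1 for each of the four tasks, in table order).
rows = {
    "viBERT": [61.91, 61.98, 59.7, 85.34, 85.01, 62.07,
               89.93, 89.79, 76.8, 90.42, 90.45, 84.55],
    "viso-twhin-bert-large": [73.45, 73.14, 70.99, 88.86, 88.8, 70.81,
                              91.6, 91.47, 79.07, 94.08, 93.96, 90.22],
}

def row_average(scores):
    """Unweighted mean of a row's scores, rounded to two decimals."""
    return round(sum(scores) / len(scores), 2)

for model, scores in rows.items():
    print(model, row_average(scores))
```

Both values match the reported Avg entries (78.16 and 83.87), which supports this reading but does not confirm how the authors computed the column.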
## Usage (HuggingFace Transformers)

Install the `transformers` package with `pip install transformers`.

Then you can use this model for the fill-mask task like this:

```python
from transformers import pipeline

model_path = "5CD-AI/viso-twhin-bert-large"
mask_filler = pipeline("fill-mask", model=model_path)

# The input must contain the tokenizer's mask token, otherwise the pipeline raises an error.
mask_filler(f"đúng nhận sai {mask_filler.tokenizer.mask_token}", top_k=10)
```

## Fine-tune Configuration

We fine-tune `5CD-AI/viso-twhin-bert-large` on the 4 downstream tasks with the `transformers` library, using the following configuration:

- train_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 4
- weight_decay: 0.01
- optimizer: AdamW with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- training_epochs: 30
- model_max_length: 128
- metric_for_best_model: wf1
- strategy: epoch

Each task uses the following additional configuration:

| Emotion Recognition | Hate Speech Detection | Spam Reviews Detection | Hate Speech Spans Detection |
| --- | --- | --- | --- |
| learning_rate: 1e-5 | learning_rate: 5e-6 | learning_rate: 1e-5 | learning_rate: 5e-6 |
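
As a rough sketch, the configuration above can be collected into keyword arguments for `transformers.TrainingArguments`. The mapping from the listed names to argument names is our assumption: `train_batch_size` is taken to mean `per_device_train_batch_size`, `training_epochs` to mean `num_train_epochs`, and `strategy: epoch` to mean `save_strategy="epoch"`. The task keys and the helper `training_kwargs` are hypothetical names used only for illustration. The block builds plain dictionaries, so it runs without `transformers` installed.

```python
# Shared settings from the list above, renamed to TrainingArguments keywords.
# The renaming is an assumption, not taken from the authors' training script.
COMMON_ARGS = {
    "per_device_train_batch_size": 16,  # assumed meaning of train_batch_size
    "seed": 42,
    "gradient_accumulation_steps": 4,
    "weight_decay": 0.01,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "cosine",
    "num_train_epochs": 30,             # assumed meaning of training_epochs
    "metric_for_best_model": "wf1",     # needs a compute_metrics that returns "wf1"
    "save_strategy": "epoch",           # assumed meaning of "strategy: epoch"
}

# Tokenizer-side setting; it is not a TrainingArguments field.
MODEL_MAX_LENGTH = 128

# Per-task learning rates from the table above (hypothetical task keys).
TASK_LEARNING_RATE = {
    "emotion_recognition": 1e-5,
    "hate_speech_detection": 5e-6,
    "spam_reviews_detection": 1e-5,
    "hate_speech_spans_detection": 5e-6,
}

def training_kwargs(task: str) -> dict:
    """Merge the shared settings with the learning rate of one task."""
    return {**COMMON_ARGS, "learning_rate": TASK_LEARNING_RATE[task]}

kwargs = training_kwargs("hate_speech_detection")
# Sequences per optimizer step on one device: batch size x accumulation steps.
effective_batch = (
    kwargs["per_device_train_batch_size"] * kwargs["gradient_accumulation_steps"]
)
print(kwargs["learning_rate"], effective_batch)
```

With `transformers` installed, these could be passed as `TrainingArguments(output_dir="out", **training_kwargs(task))`, where `"out"` is a placeholder path. The keyword for the evaluation strategy differs between library versions, so it is left out here. Under this reading, each optimizer step covers 16 × 4 = 64 sequences per device.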