MyPoliBERT

Model Overview

MyPoliBERT is a fine-tuned version of bert-base-uncased designed to classify political texts in Malaysia. The model performs multi-task classification: for each of 12 political topics (Democracy, Economy, Race, Leadership, Development, Corruption, Instability, Safety, Administration, Education, Religion, Environment), it predicts one of four sentiment classes (Unknown: 0, Negative: 1, Neutral: 2, Positive: 3). The training data comprises diverse sources, including Malaysian news articles, Reddit posts, and Instagram content.

Intended Uses and Limitations

  • Intended Uses:
    This model is intended for analyzing political texts (such as news articles and social media posts) in a Malaysian context. It identifies which political topics are mentioned and predicts the sentiment (polarity) for each topic.

  • Limitations:

    1. Language and Context Dependency: The model is designed for English texts related to Malaysian politics. Performance on texts in other languages or from different cultural and political contexts may not be reliable.
    2. Offline Learning: The model is trained on static data and does not incorporate trends or events that occurred after training. Periodic retraining or updates may be necessary to reflect the latest political developments.
    3. Interpretability: As with most deep learning models, the predictions are not inherently interpretable. Human review and explainability techniques are recommended for validating results.

Dataset

  • Data Sources:

    • tnwei/ms-newspapers dataset
    • Malaysian political posts from Reddit
    • Malaysian political posts from Instagram

    These sources were combined into a single dataset containing approximately 24,260 records. 80% of the dataset was used for training, and 20% was reserved for validation.

  • Task:
    The model performs multi-task learning, simultaneously predicting 12 topics and their respective sentiment classes.
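The 80/20 split described above can be sketched as follows. The helper name `train_val_split` is illustrative, and the shuffle-with-fixed-seed approach is an assumption — the card does not state exactly how the split was performed:

```python
import random

def train_val_split(records, train_frac=0.8, seed=42):
    """Shuffle and split records into train/validation portions.

    Illustrative sketch of an 80/20 split; the actual split method
    used for MyPoliBERT is not documented in the card.
    """
    rng = random.Random(seed)
    shuffled = records[:]  # copy so the caller's list keeps its order
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]
```

With roughly 24,260 records, this yields about 19,408 training and 4,852 validation examples — consistent with the 1,213 optimizer steps per epoch reported in the training log (1,213 steps × effective batch size 16 = 19,408).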

Model Architecture

  • Base Model: bert-base-uncased
  • Output Layer: The model generates logits for 12 topics, each with four sentiment classes (Unknown, Negative, Neutral, Positive).
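In other words, one forward pass yields a 12 × 4 logit matrix, and each topic's sentiment is the argmax over its four classes. A minimal decoding sketch — the `decode` helper and the row-per-topic logit layout are illustrative assumptions, not the model's actual post-processing code:

```python
TOPICS = ["Democracy", "Economy", "Race", "Leadership", "Development",
          "Corruption", "Instability", "Safety", "Administration",
          "Education", "Religion", "Environment"]
SENTIMENTS = {0: "Unknown", 1: "Negative", 2: "Neutral", 3: "Positive"}

def decode(logits):
    """Map a 12 x 4 logit matrix to {topic: sentiment} via per-topic argmax."""
    out = {}
    for topic, row in zip(TOPICS, logits):
        cls = max(range(len(row)), key=row.__getitem__)  # argmax over 4 classes
        out[topic] = SENTIMENTS[cls]
    return out
```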

Training Procedure

  • Hyperparameters:

    • learning_rate: 5e-05
    • train_batch_size: 8
    • eval_batch_size: 8
    • seed: 42
    • gradient_accumulation_steps: 2
    • total_train_batch_size: 16
    • optimizer: Adam(betas=(0.9,0.999), epsilon=1e-08)
    • lr_scheduler_type: linear
    • num_epochs: 8
    • mixed_precision_training: Native AMP (fp16)
  • Training Configuration (TrainingArguments):

    • evaluation_strategy: "epoch"
    • save_strategy: "epoch"
    • load_best_model_at_end: True
    • metric_for_best_model: "overall_f1"
    • greater_is_better: True
  • Custom Trainer:
    The compute_loss method calculates the cross-entropy loss for each label and averages the losses across all labels.
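The loss averaging described above can be illustrated without the Trainer machinery. A pure-Python sketch of cross-entropy averaged across the 12 topic heads (`multi_task_loss` is an illustrative stand-in for the custom compute_loss, not the actual implementation):

```python
import math

NUM_TOPICS, NUM_CLASSES = 12, 4

def cross_entropy(logits, target):
    """Numerically stable log-sum-exp followed by negative log-likelihood."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[target]

def multi_task_loss(all_logits, labels):
    """Average the per-topic cross-entropy losses over the 12 topic heads,
    mirroring what the custom compute_loss is described as doing."""
    assert len(all_logits) == NUM_TOPICS and len(labels) == NUM_TOPICS
    losses = [cross_entropy(lg, t) for lg, t in zip(all_logits, labels)]
    return sum(losses) / NUM_TOPICS
```

For uniform logits over the four classes, each head contributes ln(4) ≈ 1.386, so the averaged loss is ln(4) as well.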

Evaluation and Performance

The model achieves the following results on the evaluation set:

  • Loss: 0.2776
  • Democracy F1: 0.9521
  • Democracy Accuracy: 0.9530
  • Economy F1: 0.9330
  • Economy Accuracy: 0.9338
  • Race F1: 0.9499
  • Race Accuracy: 0.9503
  • Leadership F1: 0.8086
  • Leadership Accuracy: 0.8065
  • Development F1: 0.8926
  • Development Accuracy: 0.8963
  • Corruption F1: 0.9513
  • Corruption Accuracy: 0.9524
  • Instability F1: 0.9211
  • Instability Accuracy: 0.9250
  • Safety F1: 0.9263
  • Safety Accuracy: 0.9266
  • Administration F1: 0.8900
  • Administration Accuracy: 0.8963
  • Education F1: 0.9584
  • Education Accuracy: 0.9596
  • Religion F1: 0.9408
  • Religion Accuracy: 0.9411
  • Environment F1: 0.9828
  • Environment Accuracy: 0.9829
  • Overall F1: 0.9256
  • Overall Accuracy: 0.9270

These results indicate robust performance across most topics, with high accuracy and F1 scores. Performance on some topics, such as Leadership, is relatively lower, suggesting room for improvement through additional data or model refinement.
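The per-topic scores can be recomputed from raw predictions with standard accuracy and F1 computations. A pure-Python sketch — the support-weighted averaging over the four sentiment classes is an assumption, since the card does not state which F1 variant was used:

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_weighted(y_true, y_pred, classes=(0, 1, 2, 3)):
    """Per-class F1, averaged with weights proportional to class support.
    (Assumed variant; micro or macro averaging would also be plausible.)"""
    total, score = len(y_true), 0.0
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        support = tp + fn
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        score += f1 * support / total
    return score
```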

Training Results

| Training Loss | Epoch | Step | Validation Loss | Democracy F1 | Democracy Accuracy | Economy F1 | Economy Accuracy | Race F1 | Race Accuracy | Leadership F1 | Leadership Accuracy | Development F1 | Development Accuracy | Corruption F1 | Corruption Accuracy | Instability F1 | Instability Accuracy | Safety F1 | Safety Accuracy | Administration F1 | Administration Accuracy | Education F1 | Education Accuracy | Religion F1 | Religion Accuracy | Environment F1 | Environment Accuracy | Overall F1 | Overall Accuracy |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.2991 | 1.0 | 1213 | 0.2534 | 0.9250 | 0.9411 | 0.9189 | 0.9239 | 0.9366 | 0.9396 | 0.7906 | 0.7939 | 0.8519 | 0.8673 | 0.9471 | 0.9503 | 0.9060 | 0.9151 | 0.9130 | 0.9147 | 0.8565 | 0.8788 | 0.9515 | 0.9538 | 0.9342 | 0.9386 | 0.9777 | 0.9777 | 0.9091 | 0.9162 |
| 0.2005 | 2.0 | 2426 | 0.2275 | 0.9433 | 0.9507 | 0.9231 | 0.9225 | 0.9409 | 0.9402 | 0.7996 | 0.8042 | 0.8763 | 0.8844 | 0.9472 | 0.9470 | 0.9102 | 0.9188 | 0.9190 | 0.9186 | 0.8776 | 0.8928 | 0.9603 | 0.9610 | 0.9417 | 0.9437 | 0.9799 | 0.9800 | 0.9183 | 0.9220 |
| 0.1317 | 3.0 | 3639 | 0.2324 | 0.9434 | 0.9507 | 0.9275 | 0.9295 | 0.9443 | 0.9450 | 0.8091 | 0.8116 | 0.8830 | 0.8899 | 0.9531 | 0.9549 | 0.9142 | 0.9202 | 0.9205 | 0.9200 | 0.8800 | 0.8953 | 0.9568 | 0.9573 | 0.9418 | 0.9417 | 0.9810 | 0.9808 | 0.9212 | 0.9248 |
| 0.0932 | 4.0 | 4852 | 0.2584 | 0.9436 | 0.9435 | 0.9250 | 0.9258 | 0.9444 | 0.9450 | 0.7886 | 0.7836 | 0.8810 | 0.8889 | 0.9498 | 0.9501 | 0.9149 | 0.9165 | 0.9196 | 0.9200 | 0.8802 | 0.8829 | 0.9559 | 0.9561 | 0.9377 | 0.9378 | 0.9810 | 0.9808 | 0.9185 | 0.9193 |
| 0.0609 | 5.0 | 6065 | 0.2606 | 0.9483 | 0.9493 | 0.9280 | 0.9283 | 0.9431 | 0.9439 | 0.8079 | 0.8046 | 0.8877 | 0.8912 | 0.9507 | 0.9518 | 0.9152 | 0.9190 | 0.9207 | 0.9209 | 0.8849 | 0.8881 | 0.9570 | 0.9584 | 0.9406 | 0.9400 | 0.9822 | 0.9821 | 0.9222 | 0.9231 |
| 0.0447 | 6.0 | 7278 | 0.2699 | 0.9500 | 0.9524 | 0.9310 | 0.9326 | 0.9485 | 0.9493 | 0.8048 | 0.8048 | 0.8912 | 0.8945 | 0.9510 | 0.9520 | 0.9168 | 0.9223 | 0.9242 | 0.9239 | 0.8906 | 0.8986 | 0.9575 | 0.9586 | 0.9411 | 0.9417 | 0.9820 | 0.9821 | 0.9241 | 0.9261 |
| 0.0333 | 7.0 | 8491 | 0.2745 | 0.9529 | 0.9547 | 0.9324 | 0.9334 | 0.9516 | 0.9522 | 0.8055 | 0.8030 | 0.8902 | 0.8934 | 0.9527 | 0.9545 | 0.9189 | 0.9235 | 0.9254 | 0.9260 | 0.8909 | 0.8949 | 0.9580 | 0.9586 | 0.9425 | 0.9427 | 0.9826 | 0.9827 | 0.9253 | 0.9266 |
| 0.0246 | 8.0 | 9704 | 0.2776 | 0.9521 | 0.9530 | 0.9330 | 0.9338 | 0.9499 | 0.9503 | 0.8086 | 0.8065 | 0.8926 | 0.8963 | 0.9513 | 0.9524 | 0.9211 | 0.9250 | 0.9263 | 0.9266 | 0.8900 | 0.8963 | 0.9584 | 0.9596 | 0.9408 | 0.9411 | 0.9828 | 0.9829 | 0.9256 | 0.9270 |

Future Improvements

  • Incorporating additional data and fine-tuning to adapt to new languages and emerging political trends.
  • Enhancing data balance for underperforming topics, particularly Leadership.
  • Introducing explainable-AI (XAI) techniques to improve model interpretability.

License and Usage Notes

  • The model's predictions should be used as a reference and interpreted within the context of the data sources and Malaysian political environment.
  • Users are encouraged to validate outputs with human review and consider the limitations of the model.
  • Regular updates and retraining are recommended to ensure the model remains relevant and accurate.

Framework versions

  • Transformers 4.18.0
  • PyTorch 2.5.0+cu121
  • Datasets 3.1.0
  • Tokenizers 0.12.1