metadata
license: mit
base_model: vinai/phobert-base-v2
tags:
  - vietnamese
  - hate-speech-detection
  - text-classification
  - offensive-language-detection
datasets:
  - visolex/vihsd
metrics:
  - accuracy
  - macro-f1
  - weighted-f1
model-index:
  - name: phobert-v2-hsd
    results:
      - task:
          type: text-classification
          name: Hate Speech Detection
        dataset:
          name: ViHSD
          type: hate-speech-detection
        metrics:
          - type: accuracy
            value: 0.9341
          - type: macro-f1
            value: 0.8048
          - type: weighted-f1
            value: 0.9326
          - type: macro-precision
            value: 0.8306
          - type: macro-recall
            value: 0.7854

PhoBERT V2: Hate Speech Detection for Vietnamese Text

This model is a fine-tuned version of vinai/phobert-base-v2 on the ViHSD (Vietnamese Hate Speech Detection Dataset) for classifying Vietnamese text into three categories: CLEAN, OFFENSIVE, and HATE.

Model Details

  • Base Model: vinai/phobert-base-v2
  • Description: PhoBERT V2 fine-tuned for Vietnamese hate speech classification
  • Architecture: PhoBERT-V2 (an improved version of PhoBERT with a syllable-based tokenizer)
  • Dataset: ViHSD (Vietnamese Hate Speech Detection Dataset)
  • Fine-tuning Framework: HuggingFace Transformers + PyTorch
  • Task: Hate Speech Classification (3 classes)

Hyperparameters

  • Batch size: 32
  • Learning rate: 2e-5
  • Epochs: 100
  • Max sequence length: 256
  • Weight decay: 0.01
  • Warmup steps: 500
  • Early stopping patience: 5
  • Optimizer: AdamW
  • Learning rate scheduler: Cosine with warmup
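
To make the "cosine with warmup" schedule concrete, here is a minimal sketch in plain Python of how the learning rate could evolve under the settings above. It follows the usual shape of such schedules (linear warmup, then a half-cosine decay to zero). The total number of optimizer steps is not stated in this card, so `TOTAL_STEPS` is an assumed value for illustration only, and the function name is hypothetical.

```python
import math

BASE_LR = 2e-5       # learning rate listed above
WARMUP_STEPS = 500   # warmup steps listed above
TOTAL_STEPS = 5000   # ASSUMPTION for illustration; not stated in this card


def cosine_lr_with_warmup(step, base_lr=BASE_LR,
                          warmup_steps=WARMUP_STEPS, total_steps=TOTAL_STEPS):
    """Return the learning rate at a given optimizer step."""
    if step < warmup_steps:
        # Linear warmup from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Half-cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 250, 500, 2750, 5000):
    print(step, f"{cosine_lr_with_warmup(step):.2e}")
```

With early stopping enabled, training would normally end well before the schedule reaches zero, so the effective learning rate at the final checkpoint depends on when training stopped.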

Dataset

The model was trained on ViHSD (Vietnamese Hate Speech Detection Dataset), which contains approximately 10,000 Vietnamese social media comments.

Label Descriptions:

  • CLEAN (0): Normal content without offensive language
  • OFFENSIVE (1): Mildly offensive or inappropriate content
  • HATE (2): Hate speech, extremist language, severe threats

Evaluation Results

The model was evaluated on the test set, with the following results:

  • Accuracy: 0.9341
  • Macro-F1: 0.8048
  • Weighted-F1: 0.9326
  • Macro-Precision: 0.8306
  • Macro-Recall: 0.7854
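
Macro-F1 averages the per-class F1 scores equally, while Weighted-F1 weights each class by its number of test samples. A Weighted-F1 well above the Macro-F1, as reported here, suggests that the larger classes are classified better than the smaller ones. The sketch below computes both metrics from scratch on a tiny made-up label set (not ViHSD data) to show how such a gap can arise; the function names are hypothetical.

```python
def per_class_f1(y_true, y_pred, num_classes=3):
    """Return a list of (f1, support) tuples, one per class id."""
    results = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append((f1, tp + fn))
    return results


def macro_and_weighted_f1(y_true, y_pred, num_classes=3):
    """Return (macro_f1, weighted_f1)."""
    scores = per_class_f1(y_true, y_pred, num_classes)
    macro = sum(f1 for f1, _ in scores) / num_classes
    total = sum(support for _, support in scores)
    weighted = sum(f1 * support for f1, support in scores) / total
    return macro, weighted


# Toy, imbalanced labels (NOT ViHSD data): 0 = CLEAN, 1 = OFFENSIVE, 2 = HATE.
y_true = [0] * 8 + [1] + [2]
y_pred = [0] * 8 + [0] + [2]   # the single OFFENSIVE sample is missed
macro, weighted = macro_and_weighted_f1(y_true, y_pred)
print(f"macro-F1={macro:.3f}, weighted-F1={weighted:.3f}")
```

On this toy input the missed minority class drags the macro average down much more than the weighted one, which is why Macro-F1 is usually the more informative number for imbalanced hate speech data.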

Basic Usage

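from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model_name = "visolex/phobert-v2-hsd"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()  # inference mode (disables dropout)

# Text to classify (the sample string means "Vietnamese text to be classified")
text = "Văn bản tiếng Việt cần phân loại"
inputs = tokenizer(
    text,
    return_tensors="pt",
    padding=True,
    truncation=True,
    max_length=256,  # same max sequence length as used for fine-tuning
)

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

# Convert logits to class probabilities and pick the most likely class
probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)
predicted_label = torch.argmax(probabilities, dim=-1).item()

# Label mapping (ids as listed under "Label Descriptions")
label_names = {
    0: "CLEAN",
    1: "OFFENSIVE",
    2: "HATE"
}

print(f"Predicted label: {label_names[predicted_label]}")
print(f"Confidence scores: {probabilities[0].tolist()}")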

Training Details

Training Data

  • Dataset: ViHSD (Vietnamese Hate Speech Detection Dataset)
  • Total samples: ~10,000 Vietnamese comments from social media
  • Training split: ~70%
  • Validation split: ~15%
  • Test split: ~15%
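
As a rough illustration of the approximate 70/15/15 proportions above, the following sketch shuffles sample indices and cuts them into three parts. The card does not document how the splits were actually produced, so the random shuffle, the seed, and the function name are assumptions; the sample count of 10,000 is the approximate figure quoted above.

```python
import random


def split_indices(n, train_frac=0.70, val_frac=0.15, seed=42):
    """Shuffle range(n) and cut it into train/validation/test index lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = round(n * train_frac)
    n_val = round(n * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]      # remainder, roughly 15%
    return train, val, test


train_idx, val_idx, test_idx = split_indices(10_000)
print(len(train_idx), len(val_idx), len(test_idx))
```

For an imbalanced dataset, a stratified split that preserves the label proportions in each part may be preferable to a plain random shuffle.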

Training Configuration

  • Framework: PyTorch + HuggingFace Transformers
  • Optimizer: AdamW
  • Learning Rate: 2e-5
  • Batch Size: 32
  • Max Length: 256 tokens
  • Epochs: up to 100 (early stopping with patience 5)
  • Weight Decay: 0.01
  • Warmup Steps: 500
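
The combination of a 100-epoch limit and an early stopping patience of 5 means training ends once the validation metric has failed to improve for 5 consecutive epochs. The sketch below shows that logic in plain Python. The card does not say which metric was monitored, so the per-epoch scores are made-up numbers standing in for a validation metric such as macro-F1, and the function name is hypothetical.

```python
def run_early_stopping(val_scores, max_epochs=100, patience=5):
    """Return (stopped_epoch, best_epoch, best_score); epochs count from 1.

    val_scores stands in for the validation metric measured after each epoch.
    """
    best_score = float("-inf")
    best_epoch = 0
    stopped_epoch = 0
    epochs_without_improvement = 0
    for epoch, score in enumerate(val_scores[:max_epochs], start=1):
        stopped_epoch = epoch
        if score > best_score:
            # New best: remember it and reset the patience counter.
            best_score, best_epoch = score, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # patience exhausted, stop training
    return stopped_epoch, best_epoch, best_score


# Hypothetical validation scores per epoch (made-up numbers).
scores = [0.70, 0.75, 0.78, 0.80, 0.79, 0.80, 0.79, 0.78, 0.79, 0.77, 0.76]
print(run_early_stopping(scores))
```

In this example the best score is reached at epoch 4 and training stops at epoch 9, after five epochs without a strict improvement; the checkpoint from the best epoch would typically be the one kept.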

Contact & Support

License

This model is distributed under the MIT License.

Acknowledgments