Multi-Intent Detection (MID) Model

This model was fine-tuned for Multi-Intent Detection (MID), a multi-label classification task in which each input can be assigned more than one label. The fine-tuning dataset deliberately simplifies the task: every instance carries exactly two labels.

Model Details

  • Base Model: DeBERTa-v3-large
  • Task: Multi-label classification
  • Number of Labels per Instance: 2
  • Fine-tuning Framework: Hugging Face Transformers
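
Because every instance is expected to carry exactly two labels, one plausible way to turn the model's per-label logits into a prediction is to apply a sigmoid to each logit and keep the two highest-scoring labels. The sketch below illustrates that idea only. The card does not document the decoding rule actually used, and the label names, the logit values, and the helper `decode_top_k` are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_top_k(logits, id2label, k=2):
    """Return the k labels with the highest sigmoid score, best first.

    Assumption: k=2 mirrors the two-labels-per-instance setting described
    in this card; it is not a documented property of the model itself.
    """
    probs = [sigmoid(z) for z in logits]
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Hypothetical label set and logits, for illustration only.
id2label = {0: "book_flight", 1: "check_weather", 2: "play_music", 3: "set_alarm"}
logits = [3.1, -2.4, 1.7, -4.0]

predicted = decode_top_k(logits, id2label)
for name, score in predicted:
    print(f"{name}: {score:.3f}")
```

In practice the logits would come from the fine-tuned classification head; a fixed-threshold rule (for example, keeping every label with a score above 0.5) is the usual alternative when the number of labels per instance is not known in advance.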

Training Configuration

  • Training Arguments:
    • Learning Rate: 2e-5
    • Batch Size (Train): 16
    • Batch Size (Eval): 16
    • Gradient Accumulation Steps: 2
    • Number of Epochs: 5
    • Weight Decay: 0.01
    • Warmup Ratio: 0.1 (10% of training steps)
    • Learning Rate Scheduler Type: Cosine (cosine annealing)
    • Mixed Precision Training: Enabled (FP16)
    • Logging Steps: 50
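
The settings above map naturally onto keyword arguments of `transformers.TrainingArguments`. The sketch below collects them in a dictionary and adds a small helper that reproduces the shape of a cosine schedule with linear warmup. It is an illustration under assumptions, not the author's training script: the batch sizes are assumed to be per-device values, the output directory name is hypothetical, and `cosine_lr` is a hand-written approximation of the schedule rather than a library call.

```python
import math

# Keyword arguments mirroring the list above. The key names are those of
# transformers.TrainingArguments; treating the batch sizes as per-device
# values is an assumption.
TRAINING_KWARGS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "gradient_accumulation_steps": 2,
    "num_train_epochs": 5,
    "weight_decay": 0.01,
    "warmup_ratio": 0.1,
    "lr_scheduler_type": "cosine",
    "fp16": True,  # needs a GPU that supports FP16 training
    "logging_steps": 50,
}


def build_training_arguments(output_dir="mid-deberta-v3-large"):
    """Build a TrainingArguments object (output_dir is a hypothetical name)."""
    from transformers import TrainingArguments  # imported lazily on purpose

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)


def cosine_lr(step, total_steps, base_lr=2e-5, warmup_ratio=0.1):
    """Approximate learning rate at `step`: linear warmup, then cosine decay to 0."""
    warmup_steps = round(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Shape of the schedule over the 4500 global steps reported in this card.
for step in (0, 450, 2475, 4500):
    print(step, f"{cosine_lr(step, total_steps=4500):.2e}")
```

With a warmup ratio of 0.1 and 4500 total steps, the learning rate would climb linearly for roughly the first 450 steps, peak at 2e-5, and then decay along a half cosine toward zero.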

Performance Metrics

The following table shows the training loss and evaluation metrics logged at epochs 0, 2, and 4 (epochs are 0-indexed, so epoch 4 is the fifth and final epoch):

Epoch  Training Loss  Validation Loss  Precision  Recall    F1 Score  Accuracy
0      0.052800       0.051748         0.692308   0.011897  0.023392  0.002644
2      0.004800       0.006419         0.983743   0.939855  0.961298  0.881031
4      0.003000       0.005456         0.979877   0.949438  0.964418  0.900198
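
The card does not say how these metrics were computed. One reading that fits the numbers is micro-averaged precision, recall, and F1 over individual labels, combined with exact-match (subset) accuracy, where an instance counts as correct only if its whole label set is right. That would explain why accuracy sits below F1 in every row. The sketch below implements that reading on toy data and checks that each reported F1 is the harmonic mean of the reported precision and recall. The averaging mode, the accuracy definition, and the toy label sets are all assumptions.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2.0 * precision * recall / total if total else 0.0


def multilabel_metrics(y_true, y_pred):
    """Micro-averaged precision/recall/F1 plus exact-match accuracy.

    y_true and y_pred are sequences of label-id sets, one set per instance.
    """
    tp = fp = fn = exact = 0
    for true_labels, pred_labels in zip(y_true, y_pred):
        t, p = set(true_labels), set(pred_labels)
        tp += len(t & p)      # labels predicted and correct
        fp += len(p - t)      # labels predicted but wrong
        fn += len(t - p)      # labels missed
        exact += int(t == p)  # whole label set correct
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
        "accuracy": exact / len(y_true) if y_true else 0.0,
    }


# Toy data: three instances with two gold labels each (hypothetical ids).
y_true = [{0, 2}, {1, 3}, {0, 1}]
y_pred = [{0, 2}, {1, 2}, {0, 1}]
metrics = multilabel_metrics(y_true, y_pred)
print(metrics)

# Consistency check on the table: (precision, recall, reported F1) per row.
rows = [
    (0.692308, 0.011897, 0.023392),
    (0.983743, 0.939855, 0.961298),
    (0.979877, 0.949438, 0.964418),
]
for precision, recall, reported_f1 in rows:
    print(f"{f1_from(precision, recall):.6f} vs reported {reported_f1:.6f}")
```

In the toy data a single wrong label lowers micro F1 only slightly but makes the whole instance count as incorrect for exact-match accuracy, which is the same pattern the table shows.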

Final Evaluation Metrics (Epoch 5):

After 5 epochs of training, the model achieved the following performance on the evaluation set:

  • Evaluation Loss: 0.005456
  • Precision: 0.979877
  • Recall: 0.949438
  • F1 Score: 0.964418
  • Accuracy: 0.900198

Training Output

  • Global Steps: 4500
  • Training Loss (averaged over all training steps): 0.041661
  • Training Runtime: 5399.55 seconds
  • Training Samples per Second: 26.68
  • Training Steps per Second: 0.83
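
These figures can be cross-checked against the training configuration with a little arithmetic. The sketch below assumes training ran on a single device, so that one optimizer step consumes 16 × 2 = 32 samples. The card does not state the device count, and the derived sample counts are estimates, not reported values.

```python
# Values taken from this card.
BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 2
EPOCHS = 5
GLOBAL_STEPS = 4500
RUNTIME_SECONDS = 5399.55

# Assumption: a single training device (not stated in the card).
NUM_DEVICES = 1

# Samples consumed per optimizer step.
effective_batch = BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES

# Optimizer steps per epoch, and the training-set size that implies.
steps_per_epoch = GLOBAL_STEPS // EPOCHS
approx_samples_per_epoch = steps_per_epoch * effective_batch

# Throughput estimates to compare with the reported 26.68 and 0.83.
approx_samples_total = GLOBAL_STEPS * effective_batch
samples_per_second = approx_samples_total / RUNTIME_SECONDS
steps_per_second = GLOBAL_STEPS / RUNTIME_SECONDS

print(f"effective batch size:     {effective_batch}")
print(f"steps per epoch:          {steps_per_epoch}")
print(f"approx. samples / epoch:  {approx_samples_per_epoch}")
print(f"samples per second:       {samples_per_second:.2f}")
print(f"steps per second:         {steps_per_second:.2f}")
print(f"runtime in minutes:       {RUNTIME_SECONDS / 60:.1f}")
```

Under the single-device assumption the estimates land close to the reported throughput, suggesting a training set of roughly 29,000 examples. The small gap in samples per second is what one would expect if the last batch of each epoch is only partially filled.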

Limitations

  • Simplified Multi-Label Setting: This model assumes a fixed number of two labels per instance, which may not generalize to datasets with more complex multi-label settings.
  • Performance on Unseen Data: The model's performance may degrade if applied to data distributions significantly different from the training dataset.