Multi-Intent Detection (MID) Model

This model was fine-tuned for Multi-Intent Detection (MID), a multi-label classification task in which each input can carry more than one intent label. The fine-tuning dataset simplifies the task by limiting each instance to two labels.
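To make the two-labels-per-instance setting concrete, here is a minimal sketch of how per-label scores could be turned into two intents. It assumes the classifier emits one logit per intent and that a sigmoid is applied to each label independently, which is common for multi-label heads but is not confirmed by this card. The label names and logit values are hypothetical.

```python
import math

# Hypothetical intent names; the model card does not list the label set.
INTENT_LABELS = ["book_flight", "check_weather", "play_music", "set_alarm"]


def sigmoid(x: float) -> float:
    """Map a raw logit to an independent per-label probability."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_top_k(logits: list[float], k: int = 2) -> list[tuple[str, float]]:
    """Return the k highest-scoring labels with their sigmoid probabilities."""
    probs = [sigmoid(z) for z in logits]
    ranked = sorted(zip(INTENT_LABELS, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up logits for one utterance (not real model output).
example_logits = [3.1, 2.4, -4.0, -2.2]
predicted = decode_top_k(example_logits)
print([label for label, _ in predicted])
```

Picking the top two scores matches the dataset's simplification. A dataset with a variable number of intents per input would more likely need a probability threshold instead.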

Model Details

  • Base Model: DeBERTa-v3-large
  • Task: Multi-label classification
  • Number of Labels per Instance: 2
  • Fine-tuning Framework: Hugging Face Transformers

Training Configuration

  • Training Arguments:
    • Learning Rate: 2e-5
    • Batch Size (Train): 16
    • Batch Size (Eval): 16
    • Gradient Accumulation Steps: 2
    • Number of Epochs: 5
    • Weight Decay: 0.01
    • Warmup Ratio: 10%
    • Learning Rate Scheduler: Cosine annealing
    • Mixed Precision Training: Enabled (FP16)
    • Logging Steps: 50
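The settings above map fairly directly onto the `TrainingArguments` class in Hugging Face Transformers. The sketch below is one plausible reconstruction, not the author's actual script: the keyword names are real `TrainingArguments` parameters, but the `output_dir` value and the helper function are hypothetical, and any options not listed on this card (evaluation or save strategy, seed, and so on) are left at library defaults.

```python
# Keyword arguments mirroring the values listed on this card.
TRAINING_KWARGS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "gradient_accumulation_steps": 2,
    "num_train_epochs": 5,
    "weight_decay": 0.01,
    "warmup_ratio": 0.1,            # 10% of total steps used for warmup
    "lr_scheduler_type": "cosine",  # cosine annealing after warmup
    "fp16": True,                   # mixed precision; needs a CUDA-capable GPU
    "logging_steps": 50,
}


def build_training_arguments(output_dir: str = "mid-deberta-v3-large"):
    """Build TrainingArguments from the values above.

    The default output_dir is a hypothetical placeholder. The import is done
    lazily so the settings can be inspected without transformers installed.
    """
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```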

Performance Metrics

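The following table shows the training loss and evaluation metrics at the epochs logged during training. The epoch column appears to be zero-indexed, since the row for epoch 4 matches the final metrics reported after 5 epochs.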

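| Epoch | Training Loss | Validation Loss | Precision | Recall   | F1 Score | Accuracy |
|-------|---------------|-----------------|-----------|----------|----------|----------|
| 0     | 0.052800      | 0.051748        | 0.692308  | 0.011897 | 0.023392 | 0.002644 |
| 2     | 0.004800      | 0.006419        | 0.983743  | 0.939855 | 0.961298 | 0.881031 |
| 4     | 0.003000      | 0.005456        | 0.979877  | 0.949438 | 0.964418 | 0.900198 |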
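Accuracy sits well below F1 in every row. One possible explanation, which this card does not confirm, is that accuracy is exact-match (subset) accuracy, where a prediction counts only if the full label set is correct, while precision, recall, and F1 are pooled over individual labels (micro-averaged). The sketch below illustrates the gap on hypothetical toy data.

```python
def multilabel_scores(y_true: list[set], y_pred: list[set]) -> dict:
    """Micro-averaged precision/recall/F1 plus exact-match accuracy."""
    tp = sum(len(t & p) for t, p in zip(y_true, y_pred))  # labels predicted and correct
    fp = sum(len(p - t) for t, p in zip(y_true, y_pred))  # labels predicted but wrong
    fn = sum(len(t - p) for t, p in zip(y_true, y_pred))  # gold labels that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    exact_match = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": exact_match}


# Hypothetical gold and predicted intent sets, two gold labels per instance.
gold = [{"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "d"}]
pred = [{"a", "b"}, {"a"}, {"b", "c"}, {"a", "d"}]

scores = multilabel_scores(gold, pred)
print(scores)
```

A single missed label costs one of eight label decisions for recall, but a full quarter of the instances for exact-match accuracy.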

Final Evaluation Metrics (Epoch 5):

After 5 epochs of training, the model achieved the following performance on the evaluation set:

  • Evaluation Loss: 0.005456
  • Precision: 0.979877
  • Recall: 0.949438
  • F1 Score: 0.964418
  • Accuracy: 0.900198
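As a quick consistency check, the reported F1 scores can be recomputed as the harmonic mean of the reported precision and recall. The values below are copied from this card. Agreement to about six decimal places suggests, but does not prove, that F1 was derived from a single pooled precision/recall pair rather than averaged per label.

```python
# (epoch as logged, precision, recall, reported F1), copied from this card.
REPORTED = [
    (0, 0.692308, 0.011897, 0.023392),
    (2, 0.983743, 0.939855, 0.961298),
    (4, 0.979877, 0.949438, 0.964418),
]


def harmonic_f1(precision: float, recall: float) -> float:
    """F1 as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Absolute gap between recomputed and reported F1 for each logged epoch.
gaps = {epoch: abs(harmonic_f1(p, r) - f1) for epoch, p, r, f1 in REPORTED}
for epoch, gap in gaps.items():
    print(f"epoch {epoch}: |recomputed - reported| = {gap:.2e}")
```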

Training Output

  • Global Steps: 4500
  • Training Loss (average over the run): 0.041661
  • Training Runtime: 5399.55 seconds
  • Training Samples per Second: 26.68
  • Training Steps per Second: 0.83
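These figures can be cross-checked against the training configuration. The arithmetic below assumes a single device, so the effective batch size is the per-device batch size times the gradient accumulation steps. That assumption is not stated on this card, but the numbers are consistent with it.

```python
# Values copied from this card.
per_device_batch = 16
grad_accum_steps = 2
epochs = 5
global_steps = 4500
runtime_s = 5399.55
samples_per_s = 26.68

# Assumption: one device, so no extra multiplier for data parallelism.
effective_batch = per_device_batch * grad_accum_steps    # samples per optimizer step
samples_from_steps = global_steps * effective_batch      # samples implied by step count
samples_from_throughput = samples_per_s * runtime_s      # samples implied by throughput
steps_per_s = global_steps / runtime_s                   # should round to 0.83
steps_per_epoch = global_steps // epochs                 # optimizer steps per epoch
warmup_steps = global_steps * 10 // 100                  # 10% warmup ratio

print(effective_batch, samples_from_steps, round(samples_from_throughput))
print(round(steps_per_s, 2), steps_per_epoch, warmup_steps)
```

Both estimates land near 144,000 samples processed over the run, which corresponds to roughly 28,800 training examples per epoch under the same single-device assumption.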

Limitations

  • Simplified Multi-Label Setting: This model assumes a fixed number of two labels per instance, which may not generalize to datasets with more complex multi-label settings.
  • Performance on Unseen Data: The model's performance may degrade if applied to data distributions significantly different from the training dataset.