Multi-Intent Detection (MID) Model

This model is fine-tuned for Multi-Intent Detection (MID), a multi-label classification task in which a single input can carry more than one intent label. The fine-tuning dataset deliberately simplifies the task: every instance has exactly two labels.
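
The card does not say how predictions are decoded, so the following is only a minimal sketch of one plausible approach: apply a sigmoid to each label's logit and keep the two highest-scoring intents, matching the two-labels-per-instance setting. The function name `decode_intents`, the intent names, and the logits are hypothetical and not taken from this model.

```python
import math

def sigmoid(x: float) -> float:
    """Map a raw logit to an independent per-label probability."""
    return 1.0 / (1.0 + math.exp(-x))

def decode_intents(logits, id2label, k=2):
    """Return the k most probable labels with their sigmoid scores.

    Assumption: every input carries exactly k intents (k=2 here), so
    top-k selection is used instead of a fixed probability threshold.
    """
    probs = [sigmoid(z) for z in logits]
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]

# Hypothetical label set and logits for a single utterance.
id2label = {0: "book_flight", 1: "check_weather", 2: "play_music", 3: "set_alarm"}
logits = [3.1, 2.4, -4.0, -2.2]

predicted = decode_intents(logits, id2label)
print([name for name, _ in predicted])
```

With a real checkpoint, the logits would come from the model's classification head and the label names from its configuration. If inputs may contain a different number of intents, a probability threshold would be the more natural choice than top-k.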

Model Details

Base Model: DeBERTa-v3-large
Task: Multi-label classification
Number of Labels per Instance: 2
Fine-tuning Framework: Hugging Face Transformers

Training Configuration

Training Arguments:
- Learning Rate: 2e-5
- Batch Size (Train): 16
- Batch Size (Eval): 16
- Gradient Accumulation Steps: 2
- Number of Epochs: 5
- Weight Decay: 0.01
- Warmup Ratio: 0.1 (10% of training steps)
- Learning Rate Scheduler Type: Cosine
- Mixed Precision Training: Enabled (FP16)
- Logging Steps: 50
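
As a rough sketch, the settings listed above map onto Hugging Face `TrainingArguments` parameters as follows. The output directory name is a placeholder, and any argument not listed in this card is left at its library default, which may differ from what was actually used.

```python
# Hyperparameters from this card, keyed by TrainingArguments parameter name.
TRAINING_KWARGS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "gradient_accumulation_steps": 2,
    "num_train_epochs": 5,
    "weight_decay": 0.01,
    "warmup_ratio": 0.1,            # 10% of total training steps
    "lr_scheduler_type": "cosine",
    "fp16": True,                   # mixed precision; requires a supported GPU
    "logging_steps": 50,
}

def build_training_arguments(output_dir="mid-deberta-v3-large"):
    """Build TrainingArguments from the values above.

    `output_dir` is a hypothetical placeholder. The import is deferred so
    the dictionary can be inspected without transformers installed.
    """
    from transformers import TrainingArguments
    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```

Calling `build_training_arguments()` on a machine without a GPU is expected to fail because of `fp16=True`; setting that entry to `False` is the usual workaround for a dry run.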

Performance Metrics

The following table shows the model's performance at the epochs reported during training (0, 2, and 4):

Epoch	Training Loss	Validation Loss	Precision	Recall	F1 Score	Accuracy
0	0.052800	0.051748	0.692308	0.011897	0.023392	0.002644
2	0.004800	0.006419	0.983743	0.939855	0.961298	0.881031
4	0.003000	0.005456	0.979877	0.949438	0.964418	0.900198
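
The F1 column can be cross-checked against the precision and recall columns, because F1 is their harmonic mean. The sketch below uses only the values from the table; the helper name `f1_from` is illustrative.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# epoch -> (precision, recall, reported F1), copied from the table.
rows = {
    0: (0.692308, 0.011897, 0.023392),
    2: (0.983743, 0.939855, 0.961298),
    4: (0.979877, 0.949438, 0.964418),
}

# Absolute difference between the recomputed and the reported F1 per epoch.
differences = {
    epoch: abs(f1_from(p, r) - reported)
    for epoch, (p, r, reported) in rows.items()
}
for epoch, diff in differences.items():
    print(f"epoch {epoch}: |recomputed - reported| = {diff:.1e}")
```

All three rows agree with the reported F1 to within rounding of the six printed decimals.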

Final Evaluation Metrics (Epoch 5):

After 5 epochs of training, the model achieved the following performance on the evaluation set. These values match the last row of the table above (listed there as epoch 4):

Evaluation Loss: 0.005456
Precision: 0.979877
Recall: 0.949438
F1 Score: 0.964418
Accuracy: 0.900198
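
The card does not state how these metrics were computed. One reading that would explain accuracy (about 0.90) sitting below F1 (about 0.96) is that accuracy is exact-match (subset) accuracy, where an instance counts as correct only if every label is right, while precision, recall, and F1 are micro-averaged over individual label decisions. The sketch below illustrates that assumption on made-up toy data; it is not the evaluation code used for this model.

```python
def micro_prf_and_exact_match(y_true, y_pred):
    """Micro-averaged precision/recall/F1 plus exact-match accuracy.

    y_true and y_pred are lists of equal-length 0/1 label vectors.
    """
    tp = fp = fn = exact = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            tp += int(t == 1 and p == 1)
            fp += int(t == 0 and p == 1)
            fn += int(t == 1 and p == 0)
        # An instance is correct only if the whole label vector matches.
        exact += int(list(true_row) == list(pred_row))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": exact / len(y_true),
    }

# Toy data: three instances, four candidate intents, two true intents each.
y_true = [[1, 1, 0, 0], [0, 1, 1, 0], [1, 0, 0, 1]]
y_pred = [[1, 1, 0, 0], [0, 1, 0, 0], [1, 0, 0, 1]]  # one intent missed

metrics = micro_prf_and_exact_match(y_true, y_pred)
print(metrics)
```

A single missed intent costs one of six label decisions but one of three instances, so exact-match accuracy drops faster than micro F1.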

Training Output

Global Steps: 4500
Training Loss (average over all training steps): 0.041661
Training Runtime: 5399.55 seconds
Training Samples per Second: 26.68
Training Steps per Second: 0.83
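
These figures are consistent with the training configuration, and they give a rough estimate of the training set size. The arithmetic below assumes training on a single device, which the card does not state; with several devices the effective batch size and the estimate would scale accordingly.

```python
# Values reported in this card.
per_device_batch = 16
grad_accum_steps = 2
epochs = 5
global_steps = 4500
runtime_s = 5399.55
samples_per_s = 26.68

# Assumption: one device, so each optimizer step consumes 16 * 2 examples.
effective_batch = per_device_batch * grad_accum_steps
steps_per_epoch = global_steps // epochs
approx_train_examples = steps_per_epoch * effective_batch

# Independent estimate from the reported throughput.
approx_from_throughput = samples_per_s * runtime_s / epochs
steps_per_s = global_steps / runtime_s

print(effective_batch, steps_per_epoch, approx_train_examples)
print(round(approx_from_throughput), round(steps_per_s, 2))
```

Both routes point to roughly 28,800 training examples per epoch, and the recomputed steps per second rounds to the reported 0.83.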

Limitations

Simplified Multi-Label Setting: This model assumes exactly two labels per instance, so it may not generalize to datasets where the number of labels per input varies.
Performance on Unseen Data: The model's performance may degrade if applied to data distributions significantly different from the training dataset.