---
tags:
- multi-label-classification
- multi-intent-detection
- huggingface
- deberta-v3
- transformers
library_name: transformers
pipeline_tag: text-classification
license: apache-2.0
---
# Multi-Intent Detection (MID) Model

This model was fine-tuned for **Multi-Intent Detection (MID)**, a multi-label classification task in which each input can carry several labels at once. The fine-tuning dataset simplifies the MID setting by limiting each instance to at most two labels.
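Since the card ships no usage code, here is a minimal inference sketch. The decision rule (an independent sigmoid per label with a 0.5 threshold) and the commented-out repository id are assumptions, not taken from the card.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def intents_from_logits(logits, threshold=0.5):
    """Multi-label decision: each logit gets an independent sigmoid,
    and every label whose probability clears the threshold is kept."""
    return [1 if sigmoid(l) > threshold else 0 for l in logits]

# With the fine-tuned checkpoint (repository id is a placeholder):
#
# from transformers import AutoModelForSequenceClassification, AutoTokenizer
# model = AutoModelForSequenceClassification.from_pretrained("<repo-id>")
# tokenizer = AutoTokenizer.from_pretrained("<repo-id>")
# enc = tokenizer(["book a flight and find a hotel"], return_tensors="pt")
# logits = model(**enc).logits[0].tolist()
# print(intents_from_logits(logits))
```

Unlike single-label classification, no softmax is applied across labels: each intent is scored independently, which is what lets two intents fire on one input.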
## Model Details

- **Base Model:** DeBERTa-v3-base
- **Task:** Multi-label classification
- **Labels per Instance:** up to 2
- **Fine-tuning Framework:** Hugging Face Transformers
## Training Configuration

- **Training Arguments:**
  - **Learning Rate:** 2e-5
  - **Batch Size (Train):** 16
  - **Batch Size (Eval):** 16
  - **Gradient Accumulation Steps:** 2
  - **Number of Epochs:** 5
  - **Weight Decay:** 0.01
  - **Warmup Ratio:** 10%
  - **Learning Rate Scheduler:** Cosine annealing
  - **Mixed Precision Training:** Enabled (FP16)
  - **Logging Steps:** 50
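The arguments above map onto Hugging Face `TrainingArguments` roughly as follows; the output directory is a placeholder, and any setting not listed in the card (e.g. evaluation or saving strategy) is left at its default.

```python
from transformers import TrainingArguments

# Placeholder output_dir; the hyperparameters mirror the list above.
training_args = TrainingArguments(
    output_dir="mid-deberta-v3-base",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=2,
    num_train_epochs=5,
    weight_decay=0.01,
    warmup_ratio=0.1,
    lr_scheduler_type="cosine",
    fp16=True,
    logging_steps=50,
)
```

With gradient accumulation of 2 over per-device batches of 16, the effective batch size per device is 32.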
## Performance Metrics

The following table shows the model's performance at selected epochs during training:

| Epoch | Training Loss | Validation Loss | Precision | Recall   | F1 Score | Accuracy |
|-------|---------------|-----------------|-----------|----------|----------|----------|
| 0     | 0.052800      | 0.051748        | 0.692308  | 0.011897 | 0.023392 | 0.002644 |
| 2     | 0.004800      | 0.006419        | 0.983743  | 0.939855 | 0.961298 | 0.881031 |
| 4     | 0.003000      | 0.005456        | 0.979877  | 0.949438 | 0.964418 | 0.900198 |
### Final Evaluation Metrics (Epoch 5)

After 5 epochs of training, the model achieved the following performance on the evaluation set:

- **Evaluation Loss:** 0.005456
- **Precision:** 0.979877
- **Recall:** 0.949438
- **F1 Score:** 0.964418
- **Accuracy:** 0.900198
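The card does not state how the multi-label metrics are aggregated. A common choice, sketched below as an assumption, is micro-averaged precision/recall/F1 over the multi-hot label matrix plus subset (exact-match) accuracy:

```python
def multilabel_metrics(y_true, y_pred):
    """Micro-averaged precision/recall/F1 plus subset accuracy.
    y_true and y_pred are lists of multi-hot rows, e.g. [[1, 0, 1], ...]."""
    tp = fp = fn = exact = 0
    for true_row, pred_row in zip(y_true, y_pred):
        exact += int(true_row == pred_row)   # subset accuracy: whole row must match
        for t, p in zip(true_row, pred_row):
            tp += int(t == 1 and p == 1)
            fp += int(t == 0 and p == 1)
            fn += int(t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": exact / len(y_true)}
```

Under this reading, accuracy (0.9002) trailing F1 (0.9644) is expected: subset accuracy fails an entire example if even one of its labels is wrong, while micro-F1 credits the labels that were right.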
### Training Output

- **Global Steps:** 4500
- **Training Loss:** 0.041661
- **Training Runtime:** 5399.55 seconds
- **Training Samples per Second:** 26.68
- **Training Steps per Second:** 0.83
## Limitations

- **Simplified Multi-Label Setting:** The model assumes at most two labels per instance, which may not generalize to datasets with richer multi-label structure.
- **Performance on Unseen Data:** Performance may degrade on data distributions that differ significantly from the training dataset.