---
license: apache-2.0
---


---

# Incident Impact Classification Model

## Model Description

This model is a fine-tuned version of BERT (bert-base-uncased) designed to classify the impact of incident records based on their short descriptions. The impact is categorized into three levels: low, medium, and high.

## Intended Use

For demo purposes only. The model is intended to help automate the categorization of incident impact and streamline incident management. IT service management teams can use it to get a quick estimate of an incident's severity from its short description.

## How to Use

### Inference

To run inference, use the Hugging Face `transformers` library. The example below loads the model and makes a prediction:

```python
from transformers import pipeline

model_name = "xeroISB/incidentImpactModel"
classifier = pipeline("text-classification", model=model_name)

short_description = "Network outage in building 12"
prediction = classifier(short_description)
print(prediction)
```
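The pipeline returns a list of dictionaries with `label` and `score` keys. If the model's configuration does not define readable label names, the labels appear as the generic `LABEL_0`, `LABEL_1`, and `LABEL_2`. The following is a minimal sketch for turning such output into impact levels. It assumes that default naming and the integer mapping described in the Dataset section; `ID_TO_IMPACT` and `decode_prediction` are hypothetical helper names, and the sample score is made up for illustration.

```python
# Assumption: the model emits the default Hugging Face label names
# ("LABEL_0", "LABEL_1", "LABEL_2") and the ids follow the mapping
# in the Dataset section (0 = low, 1 = medium, 2 = high).
ID_TO_IMPACT = {0: "low", 1: "medium", 2: "high"}


def decode_prediction(prediction):
    """Translate one pipeline result dict into a readable impact level."""
    label_id = int(prediction["label"].split("_")[-1])
    return {"impact": ID_TO_IMPACT[label_id], "score": prediction["score"]}


# Stand-in for a pipeline result; the score is illustrative, not a real output.
sample_output = [{"label": "LABEL_1", "score": 0.52}]
decoded = [decode_prediction(p) for p in sample_output]
print(decoded)
```

If the printed labels from the real pipeline already read `low`, `medium`, or `high`, this step is unnecessary.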

### Training

The model was trained using the following configuration:
- **Model:** BERT (bert-base-uncased)
- **Learning Rate:** 2e-5
- **Batch Size:** 16
- **Epochs:** 3
- **Evaluation Strategy:** Epoch
- **Optimizer:** AdamW
- **Loss Function:** Cross-Entropy Loss
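
For a three-class classifier, the cross-entropy loss for one example is the negative log of the softmax probability assigned to the true class. The sketch below illustrates that formula in plain Python. It is not the training code, and the logits are hypothetical values chosen for illustration.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for one example: -log(softmax(logits)[target])."""
    # Subtract the max logit before exponentiating for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]


# Hypothetical logits for the classes (low, medium, high).
logits = [2.0, 1.0, -1.0]
loss_if_low = cross_entropy(logits, target=0)
loss_if_high = cross_entropy(logits, target=2)
print(loss_if_low, loss_if_high)
```

The loss is small when the true class has the largest logit and grows as the model assigns that class less probability.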

### Dataset

The dataset used for training includes the following columns:
- `short_description`: A brief description of the incident.
- `impact`: The impact level categorized into three classes: low (3), medium (2), and high (1).

The `impact` values were mapped to integer labels as follows:
- 3 (low) -> 0
- 2 (medium) -> 1
- 1 (high) -> 2
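
The mapping above can be sketched as a small lookup. The names `IMPACT_TO_LABEL`, `LABEL_TO_NAME`, and `encode_impact` are hypothetical, and the example rows are invented for illustration rather than taken from the dataset.

```python
# Raw impact codes from the dataset mapped to the integer training labels.
IMPACT_TO_LABEL = {3: 0, 2: 1, 1: 2}
LABEL_TO_NAME = {0: "low", 1: "medium", 2: "high"}


def encode_impact(impact):
    """Convert a raw impact code (1, 2, or 3) to a training label."""
    if impact not in IMPACT_TO_LABEL:
        raise ValueError(f"unexpected impact value: {impact!r}")
    return IMPACT_TO_LABEL[impact]


# Hypothetical rows; the impact values here are illustrative only.
rows = [
    {"short_description": "Network outage in building 12", "impact": 1},
    {"short_description": "Password reset request", "impact": 3},
]
labels = [encode_impact(row["impact"]) for row in rows]
print(labels)  # [2, 0]
```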

### Tokenization

The `short_description` field was tokenized with the BERT tokenizer, with padding to the maximum length and truncation enabled.
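
The effect of padding and truncation can be illustrated without loading the tokenizer. The sketch below works directly on token ids and uses the special token ids of `bert-base-uncased` (`[CLS]` = 101, `[SEP]` = 102, `[PAD]` = 0). The maximum length of 8 is an assumption chosen for readability, since the value used in training is not stated, and `pad_or_truncate` is a hypothetical helper. In practice the tokenizer handles this when called with `padding="max_length"` and `truncation=True`.

```python
CLS_ID, SEP_ID, PAD_ID = 101, 102, 0  # special token ids in bert-base-uncased


def pad_or_truncate(token_ids, max_length):
    """Mimic padding="max_length" with truncation=True for one sequence."""
    # Reserve two positions for [CLS] and [SEP]; drop tokens from the end.
    body = token_ids[: max_length - 2]
    input_ids = [CLS_ID] + body + [SEP_ID]
    attention_mask = [1] * len(input_ids)
    # Fill remaining positions with [PAD]; the mask marks them as ignorable.
    pad_count = max_length - len(input_ids)
    return input_ids + [PAD_ID] * pad_count, attention_mask + [0] * pad_count


# Placeholder word-piece ids; real ids come from the tokenizer's vocabulary.
short_ids, short_mask = pad_or_truncate([2001, 2002, 2003], max_length=8)
long_ids, long_mask = pad_or_truncate(list(range(2001, 2011)), max_length=8)
print(short_ids, short_mask)
print(long_ids, long_mask)
```

Every sequence comes out with the same length, which lets descriptions of different sizes be batched together.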

## Performance

### Confusion Matrix

The confusion matrix on the validation set is as follows:

| Actual | Predicted Low | Predicted Medium | Predicted High |
|--------|---------------|------------------|----------------|
| Low    | 414           | 194              | 0              |
| Medium | 463           | 220              | 0              |
| High   | 33            | 13               | 0              |

### Classification Report

```
Precision: 47%
Accuracy:  47%
Recall:    47%
F1:        45%
```
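
The reported figures can be cross-checked against the confusion matrix. The sketch below recomputes accuracy and per-class precision, recall, and F1 from the counts above, then averages them weighted by class support. That weighted averaging is an assumption about how the report was produced; the results it gives are consistent with the 47% and 45% figures.

```python
# Rows are actual classes, columns are predicted classes (low, medium, high).
confusion = [
    [414, 194, 0],
    [463, 220, 0],
    [33, 13, 0],
]
classes = ["low", "medium", "high"]
total = sum(sum(row) for row in confusion)
accuracy = sum(confusion[i][i] for i in range(len(classes))) / total


def safe_div(numerator, denominator):
    """Return 0.0 when a class has no predictions or no examples."""
    return numerator / denominator if denominator else 0.0


per_class = {}
weighted = {"precision": 0.0, "recall": 0.0, "f1": 0.0}
for i, name in enumerate(classes):
    support = sum(confusion[i])                 # actual examples of the class
    predicted = sum(row[i] for row in confusion)  # times the class was predicted
    precision = safe_div(confusion[i][i], predicted)
    recall = safe_div(confusion[i][i], support)
    f1 = safe_div(2 * precision * recall, precision + recall)
    per_class[name] = {"precision": precision, "recall": recall, "f1": f1}
    for key, value in per_class[name].items():
        weighted[key] += value * support / total

print(f"accuracy: {accuracy:.2f}")
for key, value in weighted.items():
    print(f"weighted {key}: {value:.2f}")
for name, scores in per_class.items():
    print(name, {k: round(v, 2) for k, v in scores.items()})
```

The per-class output makes visible what the aggregate numbers hide: recall for the high class is zero, because the model never predicted it on the validation set.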

## Limitations

- The model's performance depends on the quality and representativeness of the training data.
- On the validation set the model never predicted the high class: all 46 high-impact incidents were classified as low or medium, so it should not be relied on to flag high-impact incidents.
- It may not perform well on incident descriptions that differ significantly from the training data.
- Predictions are based only on the short description and do not take other contextual information into account.

## Ethical Considerations

- Ensure the model is used in an ethical manner, considering the potential impact of misclassifications on incident management and prioritization.
- Regularly monitor the model's performance and update it with new data to maintain accuracy and reliability.

## License

This model is released under the [Apache 2.0 License](https://opensource.org/licenses/Apache-2.0).

## Contact Information

For questions or issues, please contact [Tushar Mishra](mailto:tushar_mishra2023@ampba.isb.edu).

---