---
datasets:
- hipnologo/churn_textual_label
- hipnologo/telecom_churn
language:
- en
library_name: transformers
pipeline_tag: text-classification
tags:
- churn
- gpt2
- sentiment-analysis
- fine-tuned
widget:
- text: "Can you tell me about a customer from MD with an account length of 189 and area code 415 who does not have an international plan, and has a voice mail plan and that have made 1 customer service calls."
- text: "Can you tell me about a customer from SC with an account length of 87 and area code 408 who does not have an international plan, and does not have a voice mail plan and that have made 2 customer service calls."
---

# Fine-tuned GPT-2 Model for Telecom Churn Analysis

## Model Description

This is a GPT-2 model fine-tuned on the Telecom Churn dataset for churn analysis. It classifies a customer description into one of two classes: "churn" or "no churn".

## Intended Uses & Limitations

This model is intended for binary churn analysis of English customer texts: given a description of a customer, it predicts whether that customer is likely to churn. It should not be used for languages other than English, or for text with ambiguous churn indications.

## How to Use

Here is a simple way to use this model:

```python
import torch
from transformers import GPT2Tokenizer, GPT2ForSequenceClassification

tokenizer = GPT2Tokenizer.from_pretrained("hipnologo/gpt2-churn-finetune")
model = GPT2ForSequenceClassification.from_pretrained("hipnologo/gpt2-churn-finetune")

text = "Your customer text here!"

# Encode the input text
input_ids = tokenizer.encode(text, return_tensors="pt")

# Move the input_ids tensor to the same device as the model
input_ids = input_ids.to(model.device)

# Get the logits without tracking gradients (inference only)
with torch.no_grad():
    logits = model(input_ids).logits

# Get the predicted class (0 = "No Churn", 1 = "Churn")
predicted_class = logits.argmax(-1).item()

print(f"The churn prediction by the model is: {'Churn' if predicted_class == 1 else 'No Churn'}")
```

## Training Procedure

The model was trained using the `Trainer` class from the Transformers library, with a learning rate of `2e-5`, a batch size of 1, and 3 training epochs.
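
Below is a minimal sketch of what that training setup might look like. It is not the exact script used to produce this checkpoint: the base model choice, dataset loading, tokenization, and the `train_dataset`/`eval_dataset` variables are assumptions for illustration.

```python
from transformers import (GPT2Tokenizer, GPT2ForSequenceClassification,
                          Trainer, TrainingArguments)

# Assumed setup: tokenized datasets with "input_ids", "attention_mask", and "label" columns.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

model = GPT2ForSequenceClassification.from_pretrained("gpt2", num_labels=2)
model.config.pad_token_id = tokenizer.pad_token_id

training_args = TrainingArguments(
    output_dir="gpt2-churn-finetune",
    learning_rate=2e-5,            # as reported above
    per_device_train_batch_size=1, # batch size of 1
    num_train_epochs=3,            # 3 training epochs
    evaluation_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,  # hypothetical tokenized train split
    eval_dataset=eval_dataset,    # hypothetical tokenized eval split
)
trainer.train()
```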

## Evaluation

The fine-tuned model was evaluated on the test dataset. Here are the results:

- Evaluation Loss: 0.28965
- Evaluation Accuracy: 0.9
- Evaluation F1 Score: 0.90239
- Evaluation Precision: 0.85970
- Evaluation Recall: 0.94954

These metrics indicate that the model achieves high accuracy and a good precision-recall balance on the churn classification task.

## How to Reproduce

The evaluation results can be reproduced by loading the model and tokenizer from the Hugging Face Model Hub and running the model on the evaluation dataset with the `Trainer` class from the Transformers library, using a `compute_metrics` function that reports accuracy, F1, precision, and recall.
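
A minimal sketch of such a `compute_metrics` function, assuming scikit-learn is available and that labels follow the 0 = "No Churn", 1 = "Churn" convention used above:

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    # eval_pred is a (logits, labels) pair supplied by the Trainer during evaluation
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, predictions, average="binary", pos_label=1
    )
    return {
        "accuracy": accuracy_score(labels, predictions),
        "f1": f1,
        "precision": precision,
        "recall": recall,
    }
```

Passing this function to the `Trainer` via `compute_metrics=compute_metrics` and calling `trainer.evaluate()` on the test split should yield metrics in the same format as the list above.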

The evaluation loss is the cross-entropy loss of the model on the evaluation dataset, a measure of how well the model's predicted probabilities match the actual labels. The closer it is to zero, the better.

The evaluation accuracy is the proportion of predictions the model got right. It ranges from 0 to 1, with 1 meaning every prediction was correct.

The F1 score combines precision (the fraction of predicted positives that are actually positive) and recall (the fraction of actual positives that the model identifies as positive) into a single number. It reaches its best value at 1 (perfect precision and recall) and its worst at 0.

The evaluation precision measures how many of the samples classified as positive were actually positive. The closer it is to 1, the better.

The evaluation recall measures how many of the actual positives the model captured by labeling them as positive. The closer it is to 1, the better.
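
In formula terms, with TP = true positives, FP = false positives, and FN = false negatives:

- Precision = TP / (TP + FP)
- Recall = TP / (TP + FN)
- F1 = 2 * Precision * Recall / (Precision + Recall)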

## Fine-tuning Details

The model was fine-tuned using the Telecom Churn dataset.