---
license: apache-2.0
tags:
- generated_from_trainer
metrics:
- f1
- auc
model-index:
- name: pretrained_model
  results:
  - task:
      name: Text Classification
      type: text-classification
    metrics:
    - name: F1
      type: f1
      value: 0.6356
    - name: AUC
      type: auc
      value: 0.7643
widget:
 - text: "I have trouble understanding what other people think or feel. I also like numbers, and finding patterns in numbers."
---
This model is a hybrid fine-tuned version of distilbert-base-uncased, trained on a Reddit dataset of users' posts about their mental health. It predicts mental health disorders from text.

It achieves the following results on the validation set:

* Loss: 0.1873
* F1: 0.6356
* AUC: 0.7643
* Precision: 0.7671

# Description
This model builds on DistilBERT, a lighter variant of BERT, to predict different mental disorders.   
* It uses combined sentiment and emotion features (from distilbert-base-uncased-finetuned-sst-2-english and roberta-base-go_emotions).   
* It is trained on a custom dataset of Reddit posts about users' general experiences with mental health problems.   
* All direct mentions of the disorder names were removed from the texts.     
    
It includes the following classes:   

* Borderline
* Anxiety
* Depression
* Bipolar
* OCD
* ADHD
* Schizophrenia
* Asperger
* PTSD
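
The card does not say how the model's output scores map to these nine classes, so the following is only a minimal sketch. It assumes one probability per class, in the order listed above, and a hypothetical 0.5 decision threshold; the `decode` helper and the example scores are made up for illustration.

```python
# Class names in the order listed on this card.
# Assumption: the model's output indices follow this same order.
LABELS = ["Borderline", "Anxiety", "Depression", "Bipolar", "OCD",
          "ADHD", "Schizophrenia", "Asperger", "PTSD"]


def decode(probabilities, threshold=0.5):
    """Return (label, probability) pairs at or above the threshold, highest first."""
    if len(probabilities) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} scores, got {len(probabilities)}")
    picked = [(label, p) for label, p in zip(LABELS, probabilities) if p >= threshold]
    return sorted(picked, key=lambda pair: pair[1], reverse=True)


# Made-up scores, not real model output.
example_scores = [0.05, 0.62, 0.71, 0.10, 0.03, 0.02, 0.01, 0.08, 0.30]
print(decode(example_scores))
```

Lowering the threshold trades precision for recall, which may matter here because the reported precision is noticeably higher than the reported recall for most classes.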

# Training
Train size: 90%   
Val size: 10%   
   
Class counts (text samples) after balancing:

| Class         | Training set | Validation set |
|---------------|--------------|----------------|
| Borderline    | 10398        | 1180           |
| Anxiety       | 10393        | 1185           |
| Depression    | 10400        | 1178           |
| Bipolar       | 10359        | 1219           |
| OCD           | 10413        | 1165           |
| ADHD          | 10412        | 1166           |
| Schizophrenia | 10447        | 1131           |
| Asperger      | 10470        | 1108           |
| PTSD          | 10489        | 1089           |
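
As a quick arithmetic check of the class counts above, the snippet below adds the training and validation counts per class and computes the validation share. The counts are copied from this card; the variable names are illustrative only.

```python
# Class counts copied from this card.
TRAIN = {"Borderline": 10398, "Anxiety": 10393, "Depression": 10400,
         "Bipolar": 10359, "OCD": 10413, "ADHD": 10412,
         "Schizophrenia": 10447, "Asperger": 10470, "PTSD": 10489}
VAL = {"Borderline": 1180, "Anxiety": 1185, "Depression": 1178,
       "Bipolar": 1219, "OCD": 1165, "ADHD": 1166,
       "Schizophrenia": 1131, "Asperger": 1108, "PTSD": 1089}

# Per-class total across both splits.
totals = {name: TRAIN[name] + VAL[name] for name in TRAIN}

train_total = sum(TRAIN.values())
val_total = sum(VAL.values())
val_fraction = val_total / (train_total + val_total)

print(totals)
print(f"train={train_total} val={val_total} val_fraction={val_fraction:.4f}")
```

Every class sums to 11578 samples across the two splits, and the validation share comes out at about 10%, which agrees with the stated 90/10 split.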

   
Base model for fine-tuning: distilbert/distilbert-base-uncased   
   
Additional features (30 scores) come from two auxiliary models:

* Sentiment (SST-2, distilbert/distilbert-base-uncased-finetuned-sst-2-english): negative, positive
* Emotions (GoEmotions, SamLowe/roberta-base-go_emotions): admiration, amusement, anger, annoyance, approval, caring, confusion, curiosity, desire, disappointment, disapproval, disgust, embarrassment, excitement, fear, gratitude, grief, joy, love, nervousness, optimism, pride, realization, relief, remorse, sadness, surprise, neutral
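
The card does not describe how these feature scores are joined with the text representation. One common approach, shown here purely as an assumption, is to order the scores into a fixed-length vector and concatenate it with the encoder's sentence embedding (DistilBERT's hidden size is 768). The helpers `build_feature_vector` and `fuse`, the lowercase score dictionaries, and the placeholder embedding are all hypothetical.

```python
SENTIMENT = ["negative", "positive"]
EMOTIONS = [
    "admiration", "amusement", "anger", "annoyance", "approval", "caring",
    "confusion", "curiosity", "desire", "disappointment", "disapproval",
    "disgust", "embarrassment", "excitement", "fear", "gratitude", "grief",
    "joy", "love", "nervousness", "optimism", "pride", "realization",
    "relief", "remorse", "sadness", "surprise", "neutral",
]
FEATURE_NAMES = SENTIMENT + EMOTIONS  # fixed order, as listed on this card


def build_feature_vector(sentiment_scores, emotion_scores):
    """Order the scores by FEATURE_NAMES; names that are missing default to 0.0."""
    merged = {**sentiment_scores, **emotion_scores}
    return [float(merged.get(name, 0.0)) for name in FEATURE_NAMES]


def fuse(text_embedding, features):
    """Concatenate a text embedding with the extra features (assumed fusion step)."""
    return list(text_embedding) + list(features)


# Made-up scores and a zero placeholder embedding, for illustration only.
features = build_feature_vector(
    {"negative": 0.9, "positive": 0.1},
    {"sadness": 0.7, "neutral": 0.2},
)
fused = fuse([0.0] * 768, features)
print(len(FEATURE_NAMES), len(fused))
```

Under this assumption a classifier head would receive 798 inputs (768 + 30) instead of 768.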

The following hyperparameters were used during training:   
   
learning_rate: 1e-5   
train_batch_size: 64   
val_batch_size: 64   
weight_decay: 0.01   
optimizer: AdamW   
num_epochs: 2-3   

# Training results
| Epoch | Training Loss | Validation Loss |
|-------|---------------|-----------------|
| 1.0   | 0.2660        | 0.2031          |
| 2.0   | 0.1891        | 0.1872          |

F1 Score: 0.6355   
AUC Score: 0.7642   

## Classification Report
| Class         | Precision | Recall | F1-score |
|---------------|-----------|--------|----------|
| Borderline    | 0.7606    | 0.4525 | 0.5674   |
| Anxiety       | 0.7063    | 0.5459 | 0.6158   |
| Depression    | 0.7286    | 0.4626 | 0.5659   |
| Bipolar       | 0.7997    | 0.4487 | 0.5748   |
| OCD           | 0.8222    | 0.5957 | 0.6908   |
| ADHD          | 0.8856    | 0.5711 | 0.6944   |
| Schizophrenia | 0.7540    | 0.6153 | 0.6777   |
| Asperger      | 0.6743    | 0.6335 | 0.6533   |
| PTSD          | 0.7724    | 0.6235 | 0.6900   |
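
The per-class F1 scores in the classification report follow from precision and recall as their harmonic mean, F1 = 2PR / (P + R). The sketch below recomputes them from the reported values and also prints the unweighted mean of the per-class F1 scores. The card does not say how the overall F1 is averaged, so that last figure is for comparison only.

```python
# (precision, recall, reported F1) copied from the classification report.
REPORT = {
    "Borderline":    (0.7606, 0.4525, 0.5674),
    "Anxiety":       (0.7063, 0.5459, 0.6158),
    "Depression":    (0.7286, 0.4626, 0.5659),
    "Bipolar":       (0.7997, 0.4487, 0.5748),
    "OCD":           (0.8222, 0.5957, 0.6908),
    "ADHD":          (0.8856, 0.5711, 0.6944),
    "Schizophrenia": (0.7540, 0.6153, 0.6777),
    "Asperger":      (0.6743, 0.6335, 0.6533),
    "PTSD":          (0.7724, 0.6235, 0.6900),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


for name, (p, r, reported) in REPORT.items():
    print(f"{name:<14} recomputed={f1_score(p, r):.4f} reported={reported:.4f}")

# Unweighted mean of the reported per-class F1 scores.
mean_f1 = sum(f1 for _, _, f1 in REPORT.values()) / len(REPORT)
print(f"unweighted mean of per-class F1: {mean_f1:.4f}")
```

The recomputed values agree with the reported ones to within rounding of the four-decimal inputs.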