NiGuLa committed on
Commit
77cff9e
1 Parent(s): 2536ef2

Create README.md

Files changed (1)
  1. README.md +24 -0
README.md ADDED
@@ -0,0 +1,24 @@
+ ---
+ language:
+ - ru
+
+ tags:
+ - toxic comments classification
+
+ licenses:
+ - cc-by-nc-sa
+ ---
+
+ BERT-based classifier trained on a merge of the Russian Language Toxic Comments [dataset](https://www.kaggle.com/blackmoon/russian-language-toxic-comments/metadata), collected from 2ch.hk, and the Toxic Russian Comments [dataset](https://www.kaggle.com/alexandersemiletov/toxic-russian-comments), collected from ok.ru.
+
+ The datasets were merged, shuffled, and split into train, dev, and test sets in an 80-10-10 proportion.
+ The metrics obtained on the test set are as follows:
+
+ | | precision | recall | f1-score | support |
+ |:------------:|:---------:|:------:|:--------:|:-------:|
+ | 0 | 0.98 | 0.99 | 0.98 | 21384 |
+ | 1 | 0.94 | 0.92 | 0.93 | 4886 |
+ | accuracy | | | 0.97 | 26270 |
+ | macro avg | 0.96 | 0.96 | 0.96 | 26270 |
+ | weighted avg | 0.97 | 0.97 | 0.97 | 26270 |
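
The summary rows of the table can be cross-checked against the two per-class rows. The sketch below recomputes the macro and weighted averages from the reported per-class values; the function names are illustrative, and because the inputs are already rounded to two decimals, the results can only be expected to agree with the table to within about 0.01.

```python
# Per-class rows copied from the table: (precision, recall, f1-score, support).
rows = {
    0: (0.98, 0.99, 0.98, 21384),
    1: (0.94, 0.92, 0.93, 4886),
}


def macro_avg(rows):
    """Unweighted mean of each metric across classes."""
    n = len(rows)
    return tuple(sum(r[i] for r in rows.values()) / n for i in range(3))


def weighted_avg(rows):
    """Mean of each metric across classes, weighted by class support."""
    total = sum(r[3] for r in rows.values())
    return tuple(sum(r[i] * r[3] for r in rows.values()) / total for i in range(3))


total_support = sum(r[3] for r in rows.values())
macro = macro_avg(rows)
weighted = weighted_avg(rows)

print("total support:", total_support)
print("macro avg (precision, recall, f1):", macro)
print("weighted avg (precision, recall, f1):", weighted)
```

The support-weighted average of per-class recall is, by definition, the overall accuracy, so the weighted-average recall and the accuracy figure should coincide.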
+
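
The merge, shuffle, and 80-10-10 split described in the README could look roughly like the following. This is a minimal sketch using only the standard library, not the author's actual preprocessing: the function name `merge_shuffle_split`, the fixed seed, and the toy `(text, label)` examples are all assumptions made for illustration.

```python
import random


def merge_shuffle_split(datasets, percents=(80, 10, 10), seed=0):
    """Concatenate datasets, shuffle them, and cut into train/dev/test.

    `percents` and `seed` are illustrative defaults, not values from the README
    beyond the stated 80-10-10 proportion.
    """
    merged = [example for dataset in datasets for example in dataset]
    rng = random.Random(seed)  # local RNG so the shuffle is reproducible
    rng.shuffle(merged)

    n = len(merged)
    n_train = n * percents[0] // 100
    n_dev = n * percents[1] // 100

    train = merged[:n_train]
    dev = merged[n_train:n_train + n_dev]
    test = merged[n_train + n_dev:]  # remainder goes to the test split
    return train, dev, test


# Toy stand-ins for the two comment collections: (text, label) pairs.
source_a = [(f"a-{i}", i % 2) for i in range(60)]
source_b = [(f"b-{i}", i % 2) for i in range(40)]

train, dev, test = merge_shuffle_split([source_a, source_b])
print(len(train), len(dev), len(test))
```

Integer arithmetic is used for the cut points to avoid floating-point rounding surprises, and any leftover examples fall into the test split.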