amarkv committed on
Commit
0139ea5
·
1 Parent(s): 57dfe7c

Update README.md

Files changed (1)
  1. README.md +22 -8
README.md CHANGED
@@ -14,28 +14,42 @@ widget:
14
  example_title: "Dialog example 3"
15
  ---
16
 
17
- # dialog-inapropriate-messages-classifier
18
 
19
  [BERT classifier from Skoltech](https://huggingface.co/Skoltech/russian-inappropriate-messages), finetuned on contextual data with 4 labels.
20
 
21
  # Training
22
 
23
- *Skoltech/russian-inappropriate-messages* was finetuned on multiclass data with four classes
24
 
25
  1) OK label -- the message is OK in context and does not intend to offend or otherwise harm the speaker's reputation.
26
  2) Toxic label -- the message might be seen as offensive in the given context.
27
  3) Severe toxic label -- the message is offensive, full of anger, and was written to provoke a fight or cause other discomfort.
28
  4) Risks label -- the message touches on sensitive topics and can harm the speaker's reputation (e.g. religion, politics).
29
 
30
- The model was finetuned on DATASET_LINK.
31
 
32
  # Evaluation results
33
 
34
- Model achieves the following results:
35
 
36
- | | OK - F1-score | TOXIC - F1-score | SEVERE TOXIC - F1-score | RISKS - F1-score |
37
- |-------------------------|-------------------------|-------------------|----------------|------------------|
38
- | DATASET_TWITTER val.csv | 0.896 | 0.348 | 0.490 | 0.591 |
39
- | DATASET_GENA val.csv | 0.940 | 0.295 | 0.729 | 0.46 |
40
 
41
  The work was done during an internship at Tinkoff by [Nikita Stepanov](https://huggingface.co/nikitast).
 
14
  example_title: "Dialog example 3"
15
  ---
16
 
17
+ # response-toxicity-classifier-base
18
 
19
  [BERT classifier from Skoltech](https://huggingface.co/Skoltech/russian-inappropriate-messages), finetuned on contextual data with 4 labels.
20
 
21
  # Training
22
 
23
+ *Skoltech/russian-inappropriate-messages* was finetuned on multiclass data with four classes (*check the exact mapping between index and label in* `model.config`).
24
 
25
  1) OK label -- the message is OK in context and does not intend to offend or otherwise harm the speaker's reputation.
26
  2) Toxic label -- the message might be seen as offensive in the given context.
27
  3) Severe toxic label -- the message is offensive, full of anger, and was written to provoke a fight or cause other discomfort.
28
  4) Risks label -- the message touches on sensitive topics and can harm the speaker's reputation (e.g. religion, politics).
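
The note about checking `model.config` can be illustrated with a small helper that turns a vector of per-class scores into a label name. This is only a sketch: `top_label` is a hypothetical helper, and the `example_id2label` dictionary below is an assumed ordering used for illustration. The real mapping should be read from `model.config.id2label` after loading the model.

```python
from typing import Dict, Sequence, Tuple


def top_label(probas: Sequence[float], id2label: Dict[int, str]) -> Tuple[str, float]:
    """Return the name and score of the highest-scoring class."""
    if len(probas) != len(id2label):
        raise ValueError("expected exactly one score per label")
    best_idx = max(range(len(probas)), key=lambda i: probas[i])
    return id2label[best_idx], float(probas[best_idx])


# ASSUMPTION: this ordering is made up for the example.
# In practice, pass model.config.id2label instead.
example_id2label = {0: "OK", 1: "Toxic", 2: "Severe toxic", 3: "Risks"}

label, score = top_label([0.05, 0.80, 0.10, 0.20], example_id2label)
print(label, score)
```

Passing the mapping in as an argument, rather than hard-coding it, keeps the helper correct even if the index order in the published config differs from the order of the list above.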
29
 
30
+ The model was finetuned on a soon-to-be-posted dataset of dialogs.
31
 
32
  # Evaluation results
33
 
34
+ The model achieves the following results on the validation datasets (to be posted soon):
35
+
36
+ | OK - F1-score | TOXIC - F1-score | SEVERE TOXIC - F1-score | RISKS - F1-score |
37
+ |-------------------------|-------------------|----------------|------------------|
38
+ | 0.896 | 0.348 | 0.490 | 0.591 |
39
+ | 0.940 | 0.295 | 0.729 | 0.46 |
40
+
41
+ # Use in transformers
42
+
43
+ ```python
44
+ import torch
45
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
46
+ tokenizer = AutoTokenizer.from_pretrained('tinkoff-ai/response-toxicity-classifier-base')
47
+ model = AutoModelForSequenceClassification.from_pretrained('tinkoff-ai/response-toxicity-classifier-base')
48
+ inputs = tokenizer('[CLS]привет[SEP]привет![SEP]как дела?[RESPONSE_TOKEN]норм, у тя как?', max_length=128, truncation=True, add_special_tokens=False, return_tensors='pt')
49
+ with torch.inference_mode():
50
+     logits = model(**inputs).logits
51
+     probas = torch.sigmoid(logits)[0].cpu().detach().numpy()
52
+ ```
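
The input string in the snippet above is written by hand. A minimal sketch of assembling it from a list of dialog turns follows. The layout (`[CLS]`, context turns joined by `[SEP]`, then `[RESPONSE_TOKEN]` and the response) is inferred from that single example, and `build_input` is a hypothetical helper, not part of the model's API. How an empty context should be encoded is not stated in the README, so the sketch does not special-case it.

```python
from typing import List

# Marker strings copied from the README example.
CLS = "[CLS]"
SEP = "[SEP]"
RESPONSE_TOKEN = "[RESPONSE_TOKEN]"


def build_input(context: List[str], response: str) -> str:
    """Join context turns with [SEP] and append the response after [RESPONSE_TOKEN]."""
    return CLS + SEP.join(context) + RESPONSE_TOKEN + response


text = build_input(["привет", "привет!", "как дела?"], "норм, у тя как?")
print(text)
```

The resulting string can be passed to the tokenizer with `add_special_tokens=False`, as in the README snippet, since the markers are already present in the text.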
53
 
54
 
55
  The work was done during an internship at Tinkoff by [Nikita Stepanov](https://huggingface.co/nikitast).