---
language: ["ru"]
tags:
- russian
- pretraining
license: mit
widget:
- text: "[CLS] привет [SEP] привет! [SEP] как дела? [RESPONSE_TOKEN] норм"
  example_title: "Dialog example 1"
- text: "[CLS] привет [SEP] привет! [SEP] как дела? [RESPONSE_TOKEN] соси вола"
  example_title: "Dialog example 2"
- text: "[CLS] здравствуйте товарищ [RESPONSE_TOKEN] что это за говно на тебе надето?))"
  example_title: "Dialog example 3"
---

# dialog-inapropriate-messages-classifier

[BERT classifier from Skoltech](https://huggingface.co/Skoltech/russian-inappropriate-messages), finetuned on contextual data with 4 labels.
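
The widget examples in this card's metadata suggest that the input is one string: context messages separated by `[SEP]`, followed by the response after `[RESPONSE_TOKEN]`. Below is a minimal sketch of building such a string. The helper name `build_dialog_input` is hypothetical, and the exact spacing around the special tokens is an assumption copied from the widget examples.

```python
def build_dialog_input(context, response):
    """Format a dialog the way the widget examples in this card do.

    `context` is a list of earlier messages, `response` is the message
    to classify. The helper name and spacing are assumptions.
    """
    if not context:
        raise ValueError("at least one context message is required")
    # Context turns are joined with [SEP]; the response follows [RESPONSE_TOKEN].
    return "[CLS] " + " [SEP] ".join(context) + " [RESPONSE_TOKEN] " + response


text = build_dialog_input(["привет", "привет!", "как дела?"], "норм")
print(text)
```

Since the string already starts with `[CLS]`, it is probably worth checking that the tokenizer does not add a second one (for Hugging Face tokenizers, the `add_special_tokens=False` argument controls this).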

# Training

*Skoltech/russian-inappropriate-messages* was finetuned on multiclass data with four classes:

1) OK label -- the message is OK in context and does not intend to offend or otherwise harm the reputation of the speaker.
2) Toxic label -- the message might be seen as offensive in the given context.
3) Severe toxic label -- the message is offensive, full of anger, and was written to provoke a fight or cause other discomfort.
4) Risks label -- the message touches on sensitive topics (e.g. religion, politics) and can harm the reputation of the speaker.
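
As a rough illustration of how four class scores could be turned into one of the labels above, here is a minimal sketch. The label order in `LABELS` is an assumption taken from the numbered list, not from the model's configuration, and `pick_label` is a hypothetical helper. The scores are made-up numbers, not model output.

```python
# Assumed label order, taken from the numbered list above.
# Verify it against the model's own configuration before relying on it.
LABELS = ["OK", "Toxic", "Severe toxic", "Risks"]


def pick_label(scores):
    """Return the label whose score is the highest."""
    if len(scores) != len(LABELS):
        raise ValueError("expected exactly one score per label")
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return LABELS[best_index]


example_scores = [0.1, 2.3, -0.5, 0.4]  # made-up numbers for illustration
predicted = pick_label(example_scores)
print(predicted)
```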

The model was finetuned on DATASET_LINK. 

# Evaluation results

The model achieves the following per-class F1-scores on the validation sets:

|                         | OK - F1-score | TOXIC - F1-score | SEVERE TOXIC - F1-score | RISKS - F1-score |
|-------------------------|---------------|------------------|-------------------------|------------------|
| DATASET_TWITTER val.csv | 0.896         | 0.348            | 0.490                   | 0.591            |
| DATASET_GENA val.csv    | 0.940         | 0.295            | 0.729                   | 0.46             |
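
For readers unfamiliar with the metric, per-class F1 treats each label in turn as the positive class. The sketch below uses the standard definition, F1 = 2·TP / (2·TP + FP + FN); the card does not state how its scores were computed, so this is an assumption. The helper `per_class_f1` and the toy labels are hypothetical and unrelated to the datasets in the table.

```python
def per_class_f1(y_true, y_pred, labels):
    """Compute the F1-score of each label, treating it as the positive class."""
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A label that never occurs and is never predicted gets 0.0 by convention here.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


# Toy data for illustration only.
y_true = ["OK", "OK", "Toxic", "Risks"]
y_pred = ["OK", "Toxic", "Toxic", "OK"]
f1 = per_class_f1(y_true, y_pred, ["OK", "Toxic", "Severe toxic", "Risks"])
print(f1)
```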

The work was done during an internship at Tinkoff by [Nikita Stepanov](https://huggingface.co/nikitast).