---
language:
- ru
tags:
- russian
- pretraining
license: mit
widget:
- text: '[CLS] привет [SEP] привет! [SEP] как дела? [RESPONSE_TOKEN] норм'
  example_title: Dialog example 1
- text: '[CLS] привет [SEP] привет! [SEP] как дела? [RESPONSE_TOKEN] соси вола'
  example_title: Dialog example 2
- text: >-
    [CLS] здравствуйте товарищ [RESPONSE_TOKEN] что это за говно на тебе
    надето?))
  example_title: Dialog example 3
---
# dialog-inapropriate-messages-classifier

A BERT classifier from Skoltech, fine-tuned on contextual dialog data with four labels.
## Training
Skoltech/russian-inappropriate-messages was fine-tuned on multiclass data with four classes:
- OK label -- the message is acceptable in context and does not intend to offend or otherwise harm the reputation of the speaker.
- Toxic label -- the message might be seen as offensive in the given context.
- Severe toxic label -- the message is offensive, full of anger, and was written to provoke a fight or other discomfort.
- Risks label -- the message touches on sensitive topics (e.g. religion, politics) and can harm the reputation of the speaker.
The model was fine-tuned on DATASET_LINK.
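
Below is a minimal inference sketch using the standard transformers sequence-classification API. The `MODEL_ID` placeholder, the tokenization call, and the OK / Toxic / Severe toxic / Risks label order are assumptions, not confirmed by this card; check the model's `config.json` (`id2label`) for the actual mapping. The input string follows the format of the widget examples above.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_ID = "MODEL_LINK"  # placeholder: put the repo id of this model here

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
model.eval()

# Dialog context and the response to classify, joined the same way as in the
# widget examples: "[CLS] ctx [SEP] ctx [SEP] ctx [RESPONSE_TOKEN] response".
text = "[CLS] привет [SEP] привет! [SEP] как дела? [RESPONSE_TOKEN] норм"

# The string already contains the special tokens, so they are not added again;
# this preprocessing detail is an assumption, verify against the widget behavior.
inputs = tokenizer(text, add_special_tokens=False, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits, dim=-1).squeeze(0)

labels = ["OK", "TOXIC", "SEVERE TOXIC", "RISKS"]  # assumed order, see id2label
for label, p in zip(labels, probs.tolist()):
    print(f"{label}: {p:.3f}")
```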
## Evaluation results
The model achieves the following results:
| Validation set | OK F1-score | TOXIC F1-score | SEVERE TOXIC F1-score | RISKS F1-score |
|---|---|---|---|---|
| DATASET_TWITTER val.csv | 0.896 | 0.348 | 0.490 | 0.591 |
| DATASET_GENA val.csv | 0.940 | 0.295 | 0.729 | 0.46 |
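
For reference, per-class F1 scores like those in the table above can be computed with scikit-learn. This is only an illustrative sketch with toy labels, not the evaluation script behind these numbers; the integer class ids and their order are assumptions.

```python
from sklearn.metrics import f1_score

# Toy gold labels and predictions, with assumed ids 0..3 for
# OK, TOXIC, SEVERE TOXIC, RISKS.
y_true = [0, 0, 1, 3, 2, 0]
y_pred = [0, 1, 1, 3, 2, 0]

# average=None returns one F1 value per class, in the order given by `labels`.
per_class_f1 = f1_score(y_true, y_pred, labels=[0, 1, 2, 3], average=None)
for name, score in zip(["OK", "TOXIC", "SEVERE TOXIC", "RISKS"], per_class_f1):
    print(f"{name}: {score:.3f}")
```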
The work was done by Nikita Stepanov during an internship at Tinkoff.