---
license: cc-by-sa-4.0
datasets:
- kubota/defamation-japanese-twitter
language:
- ja
pipeline_tag: text-classification
widget:
- text: お前のことを殺すぞ
- text: 本当に不細工だなぁ
- text: あの人は殺人を犯した犯罪者らしい
---
# luke-large-defamation-detection-japanese
# Japanese Defamation Detector (日本語誹謗中傷検出器)

This model is a fine-tuned version of [studio-ousia/luke-japanese-large](https://huggingface.co/studio-ousia/luke-japanese-large) for automatic defamation detection in Japanese text.
The base model was fine-tuned on a balanced dataset built by merging two datasets:
- [![Generic badge](https://img.shields.io/badge/Dataset-DefamationJapaneseTwitter-red.svg)](https://huggingface.co/datasets/kubota/defamation-japanese-twitter)
- `DefamationJapaneseYouTube` : TBA

Labels:\
0 -> "中傷性のない発言" (non-defamatory statement)\
1 -> "脅迫的な発言" (threatening statement)\
2 -> "侮蔑的な発言" (insulting statement)\
3 -> "名誉を低下させる発言" (statement that damages someone's reputation)

## Example Pipeline
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/kubotaissei/defamation_japanese_twitter/blob/master/notebooks/pipeline_example.ipynb)
```python
# !pip install transformers==4.26 sentencepiece
from transformers import pipeline

pipe = pipeline(model="kubota/luke-large-defamation-detection-japanese")
# Input means: "I hear that person is a criminal who committed murder."
pipe("あの人は殺人を犯した犯罪者らしい")
```
```
[{'label': '名誉を低下させる発言', 'score': 0.8889994621276855}]
```

## Training Scripts
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/kubotaissei/defamation_japanese_twitter/blob/master/notebooks/train_example.ipynb)

## Licenses
The fine-tuned model and all attached files are licensed under [CC BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/), the Creative Commons Attribution-ShareAlike 4.0 International License.
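
## Post-processing Sketch
The pipeline returns a list of `{'label', 'score'}` dictionaries, as in the example output above. The following is a minimal sketch of how such output might be reduced to a yes/no flag. The names `ID2LABEL` and `flag_defamation` and the default `threshold` of 0.5 are illustrative assumptions, not part of the model or of `transformers`; the label strings are copied from the Labels list in this card.

```python
from typing import Dict, List

# Label strings as listed in this model card (English glosses in comments).
ID2LABEL: Dict[int, str] = {
    0: "中傷性のない発言",      # non-defamatory statement
    1: "脅迫的な発言",          # threatening statement
    2: "侮蔑的な発言",          # insulting statement
    3: "名誉を低下させる発言",  # statement that damages someone's reputation
}
NON_DEFAMATORY = ID2LABEL[0]


def flag_defamation(predictions: List[dict], threshold: float = 0.5) -> List[bool]:
    """Return True for each prediction whose label is one of the three
    defamation classes and whose score is at least `threshold`.

    `threshold` is a hypothetical cut-off chosen for illustration only.
    """
    return [
        p["label"] != NON_DEFAMATORY and p["score"] >= threshold
        for p in predictions
    ]


# Pipeline output copied from the example in this card, so the sketch
# can be tried without downloading the model.
sample = [{"label": "名誉を低下させる発言", "score": 0.8889994621276855}]
print(flag_defamation(sample))
```

In practice `predictions` would be the return value of `pipe(...)`, and a suitable threshold would have to be chosen against held-out data rather than taken from this sketch.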