library_name: transformers
language: zh
tags:
  - violence-detection
  - sequence-classification
license: apache-2.0
widget:
  - text: 在事件前，他公然向行凶暴徒的反华集会，发表煽动反华排华的讲话
  - text: 今天天气那么好，我们出去玩吧！
  - text: 公共交通工具使用率越来越高，人们更愿意选择绿色出行方式。

BERT Violence Detection Model

This BERT model is fine-tuned to classify Chinese text as violent or non-violent. It is based on the research presented in the paper "Vectors of Violence: Legitimation and Distribution of State Power in the People's Liberation Army Daily (Jiefangjun Bao), 1956-1989" by Aaron Gilkison and Maciej Kurzynski.

Model Description

The model is based on bert-base-chinese and fine-tuned for sequence classification with two labels: non-violent (0) and violent (1). The fine-tuning corpus included a large number of articles from the People's Liberation Army Daily (Jiefangjun Bao, or JFJB), one of the official newspapers of the People's Republic of China. Further updates to the model are expected.

Usage

The following example loads the model with the Transformers library and classifies a single sentence:

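import torch
from transformers import BertTokenizer, BertForSequenceClassification

# Load the fine-tuned tokenizer and classifier from the Hugging Face Hub
model_name = "qhchina/BERT-JFJB-violence-0.1"
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForSequenceClassification.from_pretrained(model_name)
model.eval()  # inference mode (disables dropout)

sentence = "今天天气那么好，我们出去玩吧"

# Tokenize, truncating to BERT's 512-token input limit
inputs = tokenizer(sentence, return_tensors="pt", truncation=True, max_length=512)

# Run the forward pass without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits

# Get the predicted label: 0 = non-violent, 1 = violent
predicted_label = logits.argmax(-1).item()
label_mapping = {0: "non-violent", 1: "violent"}
predicted_label_name = label_mapping[predicted_label]

print(f"Sentence: {sentence}")
print(f"Predicted label: {predicted_label_name}")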
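
The example above returns only a hard label. If a confidence score is also useful, the two logits can be turned into probabilities with a softmax. The following is a minimal sketch in plain Python, not part of the model's own API: the helper names `softmax` and `classify` and the `threshold` parameter are assumptions introduced for illustration, and the logits in the example call are made up. In practice, one would pass `logits[0].tolist()` from the usage example.

```python
import math

# Label ids as documented in this model card
LABELS = {0: "non-violent", 1: "violent"}


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, threshold=0.5):
    """Return (label, violent_probability) for one sentence's logits.

    `threshold` is a hypothetical cut-off on the "violent" probability;
    with two classes, 0.5 reproduces the argmax decision.
    """
    p_violent = softmax(logits)[1]
    label = LABELS[1] if p_violent > threshold else LABELS[0]
    return label, p_violent


# Made-up logits for illustration only
label, p_violent = classify([2.0, -1.0])
print(f"Predicted label: {label} (P(violent) = {p_violent:.3f})")
```

Raising `threshold` above 0.5 trades recall for precision on the "violent" class; a suitable value would have to be chosen on labeled data and is not specified by the model authors.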

Citing the Paper

If you use this model in your research, please cite the following paper:

Gilkison, Aaron, and Maciej Kurzynski. "Vectors of Violence: Legitimation and Distribution of State Power in the People's Liberation Army Daily (Jiefangjun Bao), 1956-1989." Journal of Cultural Analytics, vol. 9, no. 1, May 2024, https://doi.org/10.22148/001c.115481.