BanglaBERT Crime Tagging — Multi-Task Model

A multi-task model fine-tuned on Bangla news articles for automated crime analysis. Built on top of csebuetnlp/banglabert (a Bangla ELECTRA-based encoder), it jointly predicts six structured outputs from a single article (headline + body).


Model Architecture

BanglaBERT (ELECTRA encoder, hidden=768)
│
├── [CLS] token → Dropout(0.1)
│   ├── crime_head    Linear(768 → 2)      is_crime
│   ├── event_head    Linear(768 → 49)    event type
│   ├── direct_head   Linear(768 → 2)      is_direct
│   └── origin_head   Linear(768 → 3)      origin
│
└── All tokens → Dropout(0.1)
    ├── loc_ner_head   Linear(768 → 3)     location BIO tags
    └── time_ner_head  Linear(768 → 3)     time-span BIO tags

Input: [CLS] headline [SEP] article content [SEP], max 256 tokens.
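
To give a sense of how small the task heads are next to the shared encoder, the sketch below counts the parameters implied by the head sizes in the diagram above. It assumes each head is a single linear layer with a bias term; `linear_params` is a hypothetical helper introduced only for this illustration.

```python
# Hypothetical helper: parameter count of a Linear(in_features -> out_features)
# layer, assuming a bias term is present.
def linear_params(in_features: int, out_features: int) -> int:
    return in_features * out_features + out_features

HIDDEN = 768  # encoder hidden size from the diagram

# Output sizes copied from the architecture diagram.
HEAD_SIZES = {
    "crime_head": 2,
    "event_head": 49,
    "direct_head": 2,
    "origin_head": 3,
    "loc_ner_head": 3,
    "time_ner_head": 3,
}

head_params = {name: linear_params(HIDDEN, n) for name, n in HEAD_SIZES.items()}
total_head_params = sum(head_params.values())

for name, count in head_params.items():
    print(f"{name:14s} {count:>6d}")
print(f"{'total':14s} {total_head_params:>6d}")
```

Under that assumption the six heads add fewer than 50,000 parameters in total, so almost all of the model's capacity sits in the shared encoder.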


Output Heads

Head        Task type                     Output
is_crime    Binary classification         True / False
event       Multi-class (49 classes)      event type string or null
is_direct   Binary classification         True / False
origin      3-class classification        local / international / null
loc_ner     Token classification (BIO)    O, B-LOC, I-LOC
time_ner    Token classification (BIO)    O, B-TIME, I-TIME

Training Details

Parameter             Value
Base model            csebuetnlp/banglabert
Max sequence length   256
Batch size            16
Learning rate         2e-5
Warmup ratio          0.1
Weight decay          0.01
Loss (event head)     Focal loss (γ=2.0)
Loss (other heads)    Class-weighted cross-entropy
Best val avg-F1       N/A
Early-stop epoch      N/A
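
The two loss types in the table can be compared on a single example. The sketch below uses the standard focal loss form, -(1 - p_t)^γ · log(p_t), with γ = 2.0 as listed above. The model card does not state whether an additional alpha term is used, what the class weights are, or how the losses are reduced and combined, so those details are assumptions here, and the logits are made up.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_cross_entropy(logits, target, class_weights=None):
    """Cross-entropy for one example, scaled by the weight of the true class."""
    p_t = softmax(logits)[target]
    weight = 1.0 if class_weights is None else class_weights[target]
    return -weight * math.log(p_t)

def focal_loss(logits, target, gamma=2.0):
    """Focal loss for one example: -(1 - p_t)^gamma * log(p_t)."""
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

easy_logits = [4.0, 0.0, 0.0]  # made-up: confident and correct for class 0
hard_logits = [0.0, 1.0, 0.0]  # made-up: class 0 gets low probability

for name, logits in [("easy", easy_logits), ("hard", hard_logits)]:
    ce = weighted_cross_entropy(logits, target=0)
    fl = focal_loss(logits, target=0)
    print(f"{name}: CE={ce:.4f}  focal={fl:.4f}  ratio={fl / ce:.4f}")
```

The focal term shrinks the loss of well-classified examples far more than that of hard ones, which is the usual motivation for using it on a long-tailed label set such as the 49 event classes.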

Evaluation Results (Test Set)

Overall average macro-F1 across all heads: 0.8413

is_crime (Binary)

Class          Precision  Recall  F1    Support
not-crime
crime

origin (3-class)

Class          Precision  Recall  F1    Support
none           0.90       0.82    0.86  907
international  0.80       0.91    0.85  338
local          0.94       0.95    0.94  1873
macro avg      0.88       0.89    0.88  3118
weighted avg   0.91       0.91    0.91  3118

Accuracy: 0.91
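
The "macro avg" and "weighted avg" rows can be reproduced from the per-class rows. The sketch below recomputes F1 from the precision and recall values in the origin table, then averages it without and with support weighting. Because the table values are rounded to two decimals, the results should only be expected to agree with the table to two decimals.

```python
# (precision, recall, support) per class, copied from the origin table.
ORIGIN_REPORT = {
    "none":          (0.90, 0.82, 907),
    "international": (0.80, 0.91, 338),
    "local":         (0.94, 0.95, 1873),
}

def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

per_class_f1 = {name: f1(p, r) for name, (p, r, _) in ORIGIN_REPORT.items()}
total_support = sum(s for _, _, s in ORIGIN_REPORT.values())

# Macro average: every class counts equally.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

# Weighted average: every class counts in proportion to its support.
weighted_f1 = sum(
    per_class_f1[name] * s for name, (_, _, s) in ORIGIN_REPORT.items()
) / total_support

print(f"macro F1    = {macro_f1:.2f}")
print(f"weighted F1 = {weighted_f1:.2f}")
```

Macro-F1 treats a rare class the same as a frequent one, which is why it is the stricter number on imbalanced heads such as event.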

event (49-class, including none) — macro-F1: 0.7003 | weighted-F1: 0.7810

Event               Precision  Recall  F1    Support
none                0.91       0.79    0.85  724
armed_attack        0.60       0.60    0.60  40
arms_trafficking    0.75       0.55    0.63  11
arrest              0.80       0.84    0.82  187
arson               0.79       0.93    0.86  29
assault             0.80       0.81    0.80  122
attempted_murder    1.00       0.80    0.89  5
blockade            0.80       0.88    0.84  32
bribery             0.43       0.75    0.55  4
burglary            0.50       0.50    0.50  2
child_abuse         0.83       0.71    0.77  7
corruption          0.74       0.82    0.78  131
cybercrime          0.71       0.75    0.73  61
data_breach         0.71       0.71    0.71  7
drug_trafficking    0.71       0.77    0.74  31
fraud               0.76       0.73    0.75  119
gang_crime          0.80       0.50    0.62  16
hacking             0.25       0.20    0.22  5
human_chain         0.62       0.89    0.73  9
human_trafficking   0.88       0.70    0.78  20
identity_theft      0.00       0.00    0.00  0
kidnapping          0.86       0.86    0.86  21
legal_proceedings   0.80       0.79    0.79  200
looting             0.62       0.57    0.59  14
movement            0.23       1.00    0.38  3
murder              0.84       0.81    0.82  228
online_scam         0.29       0.25    0.27  8
organized_crime     0.00       0.00    0.00  0
other_crime         0.44       0.55    0.49  128
phishing            0.83       1.00    0.91  5
police_action       0.66       0.68    0.67  164
procession          0.75       0.67    0.71  9
protest_unrest      0.79       0.79    0.79  276
raid                0.83       0.91    0.87  22
rally               0.67       0.62    0.65  16
ransomware          1.00       0.50    0.67  2
rape                0.80       0.96    0.87  47
riot                0.41       0.47    0.44  15
robbery             0.80       0.88    0.84  58
sexual_harassment   0.70       0.83    0.76  52
shooting            0.65       0.85    0.74  41
sit_in              0.00       0.00    0.00  0
smuggling           0.83       0.94    0.88  36
snatching           0.85       0.85    0.85  33
stabbing            0.77       0.86    0.81  28
strike              0.92       0.96    0.94  25
terrorism           1.00       0.40    0.57  5
theft               0.78       0.84    0.81  45
vandalism           0.85       0.75    0.79  75
macro avg           0.68       0.69    0.67  3118
weighted avg        0.79       0.78    0.78  3118

Accuracy: 0.78

loc_ner (Location BIO) — macro-F1: 0.7743

Tag           Precision  Recall  F1    Support
O             1.00       0.99    0.99  164,441
B-LOC         0.61       0.90    0.72  1,691
I-LOC         0.49       0.79    0.60  661
macro avg     0.70       0.89    0.77  166,793
weighted avg  0.99       0.99    0.99  166,793

Accuracy: 0.99

time_ner (Time-span BIO) — macro-F1: 0.8499

Tag           Precision  Recall  F1    Support
O             1.00       1.00    1.00  164,825
B-TIME        0.67       0.90    0.77  795
I-TIME        0.69       0.90    0.78  1,173
macro avg     0.79       0.93    0.85  166,793
weighted avg  1.00       0.99    0.99  166,793

Accuracy: 0.99
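
The 0.99 token accuracy on both NER heads is easier to interpret next to a trivial baseline. The sketch below uses only the support counts from the two tables above and computes the accuracy of a hypothetical tagger that labels every token as O.

```python
# Token counts per tag, copied from the loc_ner and time_ner tables.
SUPPORT = {
    "loc_ner":  {"O": 164_441, "B-LOC": 1_691, "I-LOC": 661},
    "time_ner": {"O": 164_825, "B-TIME": 795, "I-TIME": 1_173},
}

def all_o_accuracy(tag_counts):
    """Accuracy of a trivial tagger that predicts O for every token."""
    return tag_counts["O"] / sum(tag_counts.values())

baseline = {head: all_o_accuracy(counts) for head, counts in SUPPORT.items()}

for head, acc in baseline.items():
    print(f"{head}: all-O baseline accuracy = {acc:.4f}")
```

Since roughly 98.6% (loc_ner) and 98.8% (time_ner) of test tokens are O, accuracy and weighted averages say little about span quality here; the B-/I- rows and the macro-F1 are the more informative figures.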


Event Labels (49 classes)

[
  "none",
  "armed_attack",
  "arms_trafficking",
  "arrest",
  "arson",
  "assault",
  "attempted_murder",
  "blockade",
  "bribery",
  "burglary",
  "child_abuse",
  "corruption",
  "cybercrime",
  "data_breach",
  "drug_trafficking",
  "fraud",
  "gang_crime",
  "hacking",
  "human_chain",
  "human_trafficking",
  "identity_theft",
  "kidnapping",
  "legal_proceedings",
  "looting",
  "movement",
  "murder",
  "online_scam",
  "organized_crime",
  "other_crime",
  "phishing",
  "police_action",
  "procession",
  "protest_unrest",
  "raid",
  "rally",
  "ransomware",
  "rape",
  "riot",
  "robbery",
  "sexual_harassment",
  "shooting",
  "sit_in",
  "smuggling",
  "snatching",
  "stabbing",
  "strike",
  "terrorism",
  "theft",
  "vandalism"
]

Origin Labels

[
  "none",
  "international",
  "local"
]
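
Because index 0 is "none" in both label lists, a prediction of index 0 is reported as a null value rather than a label string. The sketch below illustrates that mapping together with a top-k ranking over softmax probabilities. It is not part of the released code: `top_k` is a hypothetical helper and the logits are made up.

```python
import math

# Origin labels as listed above; index 0 ("none") is reported as None.
ORIGIN_LABELS = ["none", "international", "local"]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(logits, labels, k=2):
    """Return the k most probable (label, probability) pairs, mapping index 0 to None."""
    probs = softmax(logits)
    ranked_idx = sorted(range(len(labels)), key=lambda i: probs[i], reverse=True)[:k]
    return [(None if i == 0 else labels[i], probs[i]) for i in ranked_idx]

example_logits = [0.2, 1.1, 3.0]  # made-up logits for one article
ranked = top_k(example_logits, ORIGIN_LABELS)

for label, prob in ranked:
    print(f"{label}: {prob:.3f}")
```

The same helper would apply to the event head by passing the 49-entry event label list instead.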

Usage

Installation

pip install torch transformers huggingface_hub

Download model files

from huggingface_hub import hf_hub_download

# Download the checkpoint (contains all head weights + label metadata)
ckpt_path = hf_hub_download(repo_id="arafatfahim/crime-event-detection", filename="checkpoint.pt")

Full inference example

import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel
from huggingface_hub import hf_hub_download

REPO_ID = "arafatfahim/crime-event-detection"
DEVICE  = torch.device("cuda" if torch.cuda.is_available() else "cpu")
MAX_LEN = 256


# ── 1. Recreate the model class ──────────────────────────────────────────────
class BanglaBertMultiTask(nn.Module):
    def __init__(self, bert, num_events, num_origins):
        super().__init__()
        self.bert    = bert
        hidden       = self.bert.config.hidden_size
        self.dropout = nn.Dropout(0.1)
        self.crime_head    = nn.Linear(hidden, 2)
        self.event_head    = nn.Linear(hidden, num_events)
        self.direct_head   = nn.Linear(hidden, 2)
        self.origin_head   = nn.Linear(hidden, num_origins)
        self.loc_ner_head  = nn.Linear(hidden, 3)
        self.time_ner_head = nn.Linear(hidden, 3)

    def forward(self, input_ids, attention_mask):
        out     = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        seq_out = self.dropout(out.last_hidden_state)
        cls     = seq_out[:, 0, :]
        return {
            "is_crime" : self.crime_head(cls),
            "event"    : self.event_head(cls),
            "is_direct": self.direct_head(cls),
            "origin"   : self.origin_head(cls),
            "loc_ner"  : self.loc_ner_head(seq_out),
            "time_ner" : self.time_ner_head(seq_out),
        }


# ── 2. Load checkpoint & tokenizer ──────────────────────────────────────────
ckpt_path = hf_hub_download(repo_id=REPO_ID, filename="checkpoint.pt")
ckpt      = torch.load(ckpt_path, map_location=DEVICE)

event_labels  = ckpt["event_labels"]   # list of str
origin_labels = ckpt["origin_labels"]  # ["none", "international", "local"]

tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
bert      = AutoModel.from_pretrained(REPO_ID)

model = BanglaBertMultiTask(bert, ckpt["num_events"], ckpt["num_origins"]).to(DEVICE)
model.load_state_dict(ckpt["model_state_dict"])
model.eval()


# ── 3. Predict ───────────────────────────────────────────────────────────────
headline = "ঢাকায় ছিনতাইয়ের ঘটনায় যুবক গ্রেপ্তার"
content  = "রাতে একটি মোটরসাইকেল থামিয়ে যাত্রীর মোবাইল ও টাকা ছিনিয়ে নেয় দুর্বৃত্তরা।"

enc = tokenizer(
    headline, content,
    max_length=MAX_LEN,
    padding="max_length",
    truncation=True,
    return_tensors="pt",
    return_offsets_mapping=True,
    return_token_type_ids=True,
)
offset_mapping = enc.pop("offset_mapping").squeeze(0).tolist()
token_type_ids = enc["token_type_ids"].squeeze(0).tolist()

with torch.no_grad():
    logits = model(enc["input_ids"].to(DEVICE), enc["attention_mask"].to(DEVICE))

# Classification heads
is_crime  = bool(logits["is_crime"].argmax(dim=-1).item())
is_direct = bool(logits["is_direct"].argmax(dim=-1).item())

event_idx  = logits["event"].argmax(dim=-1).item()
event      = event_labels[event_idx] if event_idx != 0 else None
event_conf = F.softmax(logits["event"], dim=-1).squeeze()[event_idx].item()

origin_idx = logits["origin"].argmax(dim=-1).item()
origin     = origin_labels[origin_idx] if origin_idx != 0 else None

# NER heads — decode BIO spans from token predictions
def decode_bio(preds, offsets, type_ids, texts):
    spans, current = [], []
    for pred, (s, e), tid in zip(preds, offsets, type_ids):
        text = texts[tid] if tid < len(texts) else ""
        if pred == 1:
            if current: spans.append("".join(current))
            current = [] if (s == 0 and e == 0) else [text[s:e]]
        elif pred == 2 and current and not (s == 0 and e == 0):
            current.append(text[s:e])
        else:
            if current: spans.append("".join(current)); current = []
    if current: spans.append("".join(current))
    return list(dict.fromkeys(spans))   # deduplicate, preserve order

loc_preds  = logits["loc_ner"].squeeze(0).argmax(dim=-1).tolist()
time_preds = logits["time_ner"].squeeze(0).argmax(dim=-1).tolist()
locations  = decode_bio(loc_preds,  offset_mapping, token_type_ids, [headline, content])
time_spans = decode_bio(time_preds, offset_mapping, token_type_ids, [headline, content])

print({
    "is_crime"      : is_crime,
    "event"         : event,
    "event_conf"    : round(event_conf, 4),
    "is_direct"     : is_direct,
    "origin"        : origin,
    "locations"     : locations,
    "event_occurred": time_spans[0] if time_spans else None,
})

Expected output structure (values are illustrative; Python prints True/None where this JSON rendering shows true/null)

{
  "is_crime"      : true,
  "event"         : "theft",
  "event_conf"    : 0.9132,
  "is_direct"     : true,
  "origin"        : "local",
  "locations"     : ["ঢাকা"],
  "event_occurred": null
}
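
One caveat with `decode_bio` in the example above: it joins the token substrings directly, so a span covering several whitespace-separated words comes back without the spaces between them. A possible alternative, sketched below, tracks the character range of each span and slices the original text once. `decode_bio_by_offsets` is a hypothetical name, the tag indices (0 = O, 1 = B, 2 = I) follow the usage code, and the English tokens and offsets are a toy input rather than real tokenizer output.

```python
def decode_bio_by_offsets(preds, offsets, type_ids, texts):
    """Decode BIO tag ids (0=O, 1=B, 2=I) into substrings of the original texts."""
    spans = []
    current = None  # (text_index, start_char, end_char) of the open span

    def close():
        nonlocal current
        if current is not None:
            tid, start, end = current
            spans.append(texts[tid][start:end])
            current = None

    for pred, (s, e), tid in zip(preds, offsets, type_ids):
        is_special = (s == 0 and e == 0)  # [CLS], [SEP], padding have empty offsets
        if is_special or pred == 0:
            close()
        elif pred == 1:
            close()
            current = (tid, s, e)
        elif pred == 2 and current is not None and current[0] == tid:
            current = (tid, current[1], e)  # extend the open span to this token
        else:
            close()  # stray I- tag with no open span in the same text
    close()
    return list(dict.fromkeys(spans))  # deduplicate, preserve order


# Toy input: [CLS] Robbery in New York [SEP] Police arrived [SEP]
headline, content = "Robbery in New York", "Police arrived"
offsets  = [(0, 0), (0, 7), (8, 10), (11, 14), (15, 19), (0, 0), (0, 6), (7, 14), (0, 0)]
type_ids = [0, 0, 0, 0, 0, 0, 1, 1, 1]
preds    = [0, 0, 0, 1, 2, 0, 0, 0, 0]

locations = decode_bio_by_offsets(preds, offsets, type_ids, [headline, content])
print(locations)
```

On this toy input the offset-based version returns the span with its internal space intact, whereas joining the two token substrings would give "NewYork".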

Files in this repository

File                                      Description
config.json                               BERT encoder config (ELECTRA architecture)
model.safetensors                         BERT encoder weights
tokenizer_config.json / tokenizer.json    Tokenizer files
checkpoint.pt                             Full model weights (all 6 heads) + label metadata

Note: checkpoint.pt is required to restore the classification/NER heads. The config.json + model.safetensors files only contain the shared BERT encoder.
