
News Category Classification for IPTC NewsCodes

This model is a fine-tuned version of KB/bert-base-swedish-cased on a private dataset.

Trained on a limited set of English, Swedish, and Norwegian news titles, it classifies news content into 16 categories as specified by the IPTC NewsCodes.

The training dataset is heavily skewed across categories; it was lightly augmented to stabilize training.

Model description

The model is intended to categorize Norwegian, Swedish, and English news content into the 16 categories above, but it is a test model for demonstration purposes. Several categories need more training data before the model is fully reliable; even so, it outperforms Claude Haiku and GPT-3.5 on this use case.

Intended uses & limitations

Use it to categorize news texts. Only assign a category when the predicted score for the top label is at least 0.60; below that threshold the model should be treated as uncertain, as in the sketch below.
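A minimal usage sketch, assuming the model is loaded from a hypothetical local path (substitute the actual repository id) and applying the 0.60 threshold above:

```python
from transformers import pipeline

# Hypothetical path; replace with the actual model repository id.
classifier = pipeline("text-classification", model="./news-category-model")

def categorize(text: str, threshold: float = 0.60) -> str | None:
    """Return the predicted IPTC category, or None when the model is uncertain."""
    prediction = classifier(text)[0]  # e.g. {"label": "politics", "score": 0.93}
    return prediction["label"] if prediction["score"] >= threshold else None

print(categorize("Tre døde i kioskbrann i Tyskland"))
```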

Test examples

Input: Mann siktet for drapsforsøk på Slovakias statsministeren (Norwegian: "Man charged with attempted murder of Slovakia's prime minister")

Output: politics

Input: Tre døde i kioskbrann i Tyskland (Norwegian: "Three dead in kiosk fire in Germany")

Output: disaster, accident, and emergency incident

Input: Kultfilm får Netflix-oppfølger. Kultfilmen «Happy Gilmore» fra 1996 får en oppfølger på Netflix. Det røper strømmetjenesten selv på X, tidligere Twitter. –Happy Gilmore er tilbake! (Norwegian: "Cult film gets Netflix sequel. The cult film «Happy Gilmore» from 1996 is getting a sequel on Netflix. The streaming service itself revealed this on X, formerly Twitter. – Happy Gilmore is back!")

Output: arts, culture, entertainment and media

Performance

It achieves the following results on the evaluation set:

  • Loss: 0.8030
  • Accuracy: 0.7431
  • F1: 0.7474
  • Precision: 0.7695
  • Recall: 0.7431
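Recall matches accuracy here because the metrics use weighted averaging, where per-class recall weighted by support reduces to overall accuracy. A sketch of a typical compute_metrics callback that would produce these numbers (the exact implementation used for this model is an assumption):

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    """Compute weighted-average metrics from Trainer predictions."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="weighted", zero_division=0
    )
    # With weighted averaging, recall reduces to overall accuracy.
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1,
        "precision": precision,
        "recall": recall,
    }
```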

See the performance (accuracy) for each label below:

  • Arts, culture, entertainment and media: 0.6842
  • Conflict, war and peace: 0.7351
  • Crime, law and justice: 0.8918
  • Disaster, accident, and emergency incident: 0.8699
  • Economy, business, and finance: 0.6893
  • Environment: 0.4483
  • Health: 0.7222
  • Human interest: 0.3182
  • Labour: 0.5000
  • Lifestyle and leisure: 0.5556
  • Politics: 0.7909
  • Religion: 0.0000
  • Science and technology: 0.4583
  • Society: 0.3538
  • Sport: 0.9615
  • Weather: 1.0000

Training and evaluation data

The model was trained with the Hugging Face Trainer, using a learning rate of 2e-05 and a batch size of 16 for 3 epochs; see the full hyperparameters below.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 3
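A sketch of the corresponding TrainingArguments; output_dir is an assumed name, and the Adam betas/epsilon listed above are the Trainer defaults, so they need no explicit setting:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="news-category-model",   # assumed name
    learning_rate=2e-05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=2,      # effective train batch size: 32
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=3,
)
```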

Training results

The per-label columns give accuracy for each IPTC category, abbreviated from the full names listed above.

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall | Arts | Conflict | Crime | Disaster | Economy | Environment | Health | Human interest | Labour | Lifestyle | Politics | Religion | Science/tech | Society | Sport | Weather |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1.9761 | 0.2907 | 200 | 1.4046 | 0.6462 | 0.6164 | 0.6057 | 0.6462 | 0.3158 | 0.8315 | 0.7629 | 0.7055 | 0.5437 | 0.0 | 0.5 | 0.0 | 0.0 | 0.3333 | 0.4843 | 0.0 | 0.0833 | 0.0 | 0.9615 | 0.0 |
| 1.2153 | 0.5814 | 400 | 1.0225 | 0.6894 | 0.6868 | 0.7652 | 0.6894 | 0.7895 | 0.6554 | 0.8196 | 0.8562 | 0.6408 | 0.2414 | 0.8333 | 0.1364 | 0.0 | 0.6667 | 0.8467 | 0.0 | 0.375 | 0.0154 | 0.9615 | 1.0 |
| 0.954 | 0.8721 | 600 | 0.8858 | 0.7231 | 0.7138 | 0.7309 | 0.7231 | 0.7368 | 0.7795 | 0.8918 | 0.8699 | 0.6214 | 0.3448 | 0.8889 | 0.1818 | 1.0 | 0.5556 | 0.6899 | 0.0 | 0.25 | 0.0462 | 0.9615 | 1.0 |
| 0.6662 | 1.1628 | 800 | 0.9381 | 0.6881 | 0.7009 | 0.7618 | 0.6881 | 0.7895 | 0.6126 | 0.8454 | 0.8630 | 0.6505 | 0.4483 | 0.7222 | 0.2273 | 1.0 | 0.4444 | 0.8293 | 0.0 | 0.5417 | 0.2308 | 0.9615 | 1.0 |
| 0.5554 | 1.4535 | 1000 | 0.8791 | 0.7025 | 0.7124 | 0.7628 | 0.7025 | 0.7368 | 0.6478 | 0.9021 | 0.8562 | 0.6602 | 0.3103 | 0.7778 | 0.3636 | 0.5 | 0.5556 | 0.8084 | 0.0 | 0.5 | 0.1846 | 0.9615 | 1.0 |
| 0.4396 | 1.7442 | 1200 | 0.8275 | 0.7175 | 0.7280 | 0.7686 | 0.7175 | 0.7895 | 0.6631 | 0.8196 | 0.8836 | 0.6893 | 0.3793 | 0.8333 | 0.4091 | 0.5 | 0.5556 | 0.8362 | 0.0 | 0.4167 | 0.3692 | 0.9615 | 1.0 |
| 0.383 | 2.0349 | 1400 | 0.7929 | 0.745 | 0.7501 | 0.7653 | 0.745 | 0.6842 | 0.7841 | 0.8866 | 0.8767 | 0.7087 | 0.4483 | 0.7778 | 0.4091 | 0.5 | 0.5556 | 0.6899 | 0.0 | 0.4167 | 0.2923 | 0.9615 | 0.0 |
| 0.3418 | 2.3256 | 1600 | 0.8042 | 0.7438 | 0.7440 | 0.7686 | 0.7438 | 0.7895 | 0.7351 | 0.9072 | 0.8493 | 0.7864 | 0.4483 | 0.7778 | 0.3182 | 0.5 | 0.5556 | 0.7909 | 0.0 | 0.4167 | 0.1846 | 0.9615 | 0.0 |
| 0.248 | 2.6163 | 1800 | 0.8387 | 0.7275 | 0.7325 | 0.7610 | 0.7275 | 0.6842 | 0.6891 | 0.8814 | 0.8699 | 0.7573 | 0.4138 | 0.8333 | 0.4091 | 0.5 | 0.5556 | 0.8014 | 0.0 | 0.4167 | 0.2769 | 0.9615 | 0.0 |
| 0.2525 | 2.9070 | 2000 | 0.8137 | 0.735 | 0.7413 | 0.7697 | 0.735 | 0.6842 | 0.7106 | 0.8763 | 0.8699 | 0.6796 | 0.4483 | 0.7222 | 0.3636 | 0.5 | 0.5556 | 0.8153 | 0.0 | 0.4583 | 0.3385 | 0.9615 | 0.0 |

Framework versions

  • Transformers 4.40.2
  • Pytorch 2.2.1+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1