---
language: "en"
thumbnail: "https://pbs.twimg.com/profile_images/1092721745994440704/d6R-AHzj_400x400.jpg"
tags:
- propaganda
- bert
license: "MIT"
datasets:
-
metrics:
-
---
Propaganda Techniques Analysis BERT
----
This is a BERT-based model that predicts propaganda techniques in
English-language news articles. The model is described in
[this paper](https://propaganda.qcri.org/papers/EMNLP_2019__Fine_Grained_Propaganda_Detection.pdf).
## Model description
Definitions of the propaganda techniques are available at
https://propaganda.qcri.org/annotations/definitions.html

You can also try the model in action at https://www.tanbih.org/prta
### How to use
```python
>>> import torch
>>> from transformers import BertTokenizerFast
>>> # Assumes the repository's model.py is importable from the current package
>>> from .model import BertForTokenAndSequenceJointClassification
>>>
>>> tokenizer = BertTokenizerFast.from_pretrained('bert-base-cased')
>>> model = BertForTokenAndSequenceJointClassification.from_pretrained(
...     "QCRI/PropagandaTechniquesAnalysis-en-BERT",
...     revision="v0.1.0",
... )
>>>
>>> inputs = tokenizer.encode_plus("Hello, my dog is cute", return_tensors="pt")
>>> outputs = model(**inputs)
>>> # Sequence-level prediction: one label for the whole input
>>> sequence_class_index = torch.argmax(outputs.sequence_logits, dim=-1)
>>> sequence_class = model.sequence_tags[sequence_class_index[0]]
>>> # Token-level prediction: one tag per token, dropping [CLS] and [SEP]
>>> token_class_index = torch.argmax(outputs.token_logits, dim=-1)
>>> tokens = tokenizer.convert_ids_to_tokens(inputs.input_ids[0][1:-1])
>>> tags = [model.token_tags[i] for i in token_class_index[0].tolist()[1:-1]]
```
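The snippet above leaves you with two parallel lists, `tokens` and `tags`. A minimal sketch of one way to turn them into readable fragments follows. It is not part of the model's code: the helper names `merge_wordpieces` and `group_spans` are hypothetical, and it assumes that non-propaganda tokens carry a single "outside" tag (`"O"` here), so check `model.token_tags` for the actual label strings.

```python
from typing import List, Tuple


def merge_wordpieces(tokens: List[str]) -> str:
    """Join WordPiece tokens into text; a leading '##' marks a continuation piece."""
    text = ""
    for tok in tokens:
        if tok.startswith("##"):
            text += tok[2:]          # glue continuation onto the previous piece
        elif text:
            text += " " + tok        # new word: separate with a space
        else:
            text = tok               # first piece
    return text


def group_spans(
    tokens: List[str], tags: List[str], outside_tag: str = "O"
) -> List[Tuple[str, str]]:
    """Group consecutive tokens that share a tag into (tag, text) spans.

    `outside_tag` is an assumption about how untagged tokens are labelled.
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_tag = None
    for tok, tag in zip(tokens, tags):
        if tag != current_tag:
            # The tag changed: close the running span unless it was "outside"
            if current_tokens and current_tag != outside_tag:
                spans.append((current_tag, merge_wordpieces(current_tokens)))
            current_tokens, current_tag = [], tag
        current_tokens.append(tok)
    # Close the final span, if any
    if current_tokens and current_tag != outside_tag:
        spans.append((current_tag, merge_wordpieces(current_tokens)))
    return spans
```

With the variables from the usage example, `group_spans(tokens, tags)` returns a list of `(tag, text)` pairs, one per contiguous tagged fragment. Punctuation is joined with spaces, so the text is an approximation of the original string rather than an exact substring.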
### BibTeX entry and citation info
```bibtex
@inproceedings{da-san-martino-etal-2019-fine,
title = "Fine-Grained Analysis of Propaganda in News Article",
author = "Da San Martino, Giovanni and
Yu, Seunghak and
Barr{\'o}n-Cede{\~n}o, Alberto and
Petrov, Rostislav and
Nakov, Preslav",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
month = nov,
year = "2019",
address = "Hong Kong, China",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/D19-1565",
doi = "10.18653/v1/D19-1565",
pages = "5636--5646",
abstract = "Propaganda aims at influencing people{'}s mindset with the purpose of advancing a specific agenda. Previous work has addressed propaganda detection at document level, typically labelling all articles from a propagandistic news outlet as propaganda. Such noisy gold labels inevitably affect the quality of any learning system trained on them. A further issue with most existing systems is the lack of explainability. To overcome these limitations, we propose a novel task: performing fine-grained analysis of texts by detecting all fragments that contain propaganda techniques as well as their type. In particular, we create a corpus of news articles manually annotated at fragment level with eighteen propaganda techniques and propose a suitable evaluation measure. We further design a novel multi-granularity neural network, and we show that it outperforms several strong BERT-based baselines.",
}
```