---
title: multilingual-persuasion-detection-from-text
app_file: inference.py
pinned: false
license: gpl-3.0
language:
- multilingual
tags:
- mbart-50
- text-classification
- multi-label-classification
- persuasion-detection
- meme-analysis
- social-media-analysis
- propaganda-detection
- hierarchical-classification
- multilingual
pipeline_tag: text-classification
inference: true
widget:
- text: "THIS IS WHY YOU NEED. A SHARPIE WITH YOU AT ALL TIMES."
example_title: "Sentence 1"
- text: "WHEN YOU'RE THE FBI, THEY LET YOU DO IT."
example_title: "Sentence 2"
- text: "Move your ships away!\n\noooook\n\nMove your ships away!\n\nNo, and I just added 10 more"
example_title: "Sentence 3"
- text: "Let's Make America Great Again!"
example_title: "Sentence 4"
---
# Multilingual Persuasion Detection in Memes
Given only the “textual content” of a meme, the goal is to identify which of the 20 persuasion techniques, organized in a hierarchy, it uses. Predicting only an ancestor of a technique earns partial credit. This is a hierarchical multi-label classification problem based on [SemEval 2024 Task 4 Subtask 1, "Multilingual Detection of Persuasion Techniques in Memes"](https://propaganda.math.unipd.it/semeval2024task4/index.html).
The source code for training the model, along with additional implementations, can be found [here](https://github.com/nishan-chatterjee/what-do-you-meme). The paper describing our method was accepted at SemEval 2024; a link will be added soon.
### Hierarchy
<img src="images/persuasion_techniques_hierarchy_graph.png" alt="Hierarchy of the 20 persuasion techniques" width="622" height="350">
### Usage Example
- **Input:** "I HATE TRUMP\n\nMOST TERRORIST DO"
- **Outputs:**
- Child-only Label List: ['Name calling/Labeling', 'Loaded Language']
  - Complete Hierarchical Label List: ['Ethos', 'Ad Hominem', 'Name calling/Labeling', 'Pathos', 'Loaded Language'] (the child labels expanded over their ancestors, as sketched below)
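
The complete hierarchical list is obtained by closing the predicted child labels over their ancestors in the technique hierarchy. Below is a minimal sketch of that expansion, assuming a hypothetical `PARENT` map that encodes the hierarchy (only a fragment of the taxonomy is shown):

```python
# Hypothetical parent map encoding a fragment of the persuasion hierarchy;
# the full 20-technique taxonomy is shown in the graph above.
PARENT = {
    "Name calling/Labeling": "Ad Hominem",
    "Ad Hominem": "Ethos",
    "Loaded Language": "Pathos",
    # ... remaining techniques and their ancestors
}

def expand_to_hierarchy(child_labels):
    """Add every ancestor of each predicted child label."""
    labels = set()
    for label in child_labels:
        while label is not None:
            labels.add(label)
            label = PARENT.get(label)  # returns None at a root node
    return sorted(labels)

print(expand_to_hierarchy(["Name calling/Labeling", "Loaded Language"]))
# ['Ad Hominem', 'Ethos', 'Loaded Language', 'Name calling/Labeling', 'Pathos']
```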
Note:
- Make sure the dependencies from requirements.txt are installed in your environment.
- Make sure the trained model and tokenizer are in the same directory as inference.py.
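
For orientation, here is a minimal sketch of what multi-label inference with this checkpoint typically looks like; the model directory, the decision threshold, and the use of the standard `transformers` auto classes are assumptions, and inference.py remains the authoritative entry point:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_DIR = "."    # assumed: checkpoint and tokenizer sit next to inference.py
THRESHOLD = 0.5    # assumed per-label decision threshold

tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_DIR)
model.eval()

text = "WHEN YOU'RE THE FBI, THEY LET YOU DO IT."
inputs = tokenizer(text, truncation=True, max_length=256, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Multi-label head: an independent sigmoid per technique, then a threshold.
probs = torch.sigmoid(logits)[0]
predicted = [model.config.id2label[i] for i, p in enumerate(probs) if p > THRESHOLD]
print(predicted)
```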
## Training Hyperparameters
- Base Model: "facebook/mbart-large-50-many-to-many-mmt"
- Learning Rate: 5e-05
- Max Length: 256
- Batch Size: 64
- Epochs: 3
- Seed: 42
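
As a hedged illustration, these hyperparameters map onto a standard `transformers` fine-tuning setup roughly as follows; the actual training script lives in the linked repository, and the argument names here are the stock `TrainingArguments` ones, not necessarily those used there:

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above; everything else stays at its default.
training_args = TrainingArguments(
    output_dir="mbart50-persuasion",  # hypothetical output path
    learning_rate=5e-5,
    per_device_train_batch_size=64,
    num_train_epochs=3,
    seed=42,
)

# The max length (256) is applied at tokenization time, e.g.:
# tokenizer(texts, truncation=True, max_length=256, padding="max_length")
```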
## Model Statistics
The model obtained the following metrics on the Development Set as of March 31st, 2024:
- Hierarchical F1: 63.58%
- Hierarchical Precision: 58.3%
- Hierarchical Recall: 69.9%
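
These metrics are computed over the ancestor-closed label sets rather than the leaf predictions, which is what makes predicting only an ancestor worth partial credit. A minimal sketch of the standard micro-averaged definition (hP = |P∩T| / |P|, hR = |P∩T| / |T|), assuming predictions and gold labels are already expanded over the hierarchy as above:

```python
def hierarchical_prf1(pred_sets, true_sets):
    """Micro-averaged hierarchical precision/recall/F1 over
    ancestor-closed label sets (one set per example)."""
    overlap = sum(len(p & t) for p, t in zip(pred_sets, true_sets))
    hp = overlap / sum(len(p) for p in pred_sets)
    hr = overlap / sum(len(t) for t in true_sets)
    return hp, hr, 2 * hp * hr / (hp + hr)

# Predicting only the ancestor 'Pathos' when the gold set also contains the
# child 'Loaded Language' still earns credit for the shared ancestor.
pred = [{"Pathos"}]
true = [{"Pathos", "Loaded Language"}]
print(hierarchical_prf1(pred, true))  # (1.0, 0.5, 0.666...)
```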
## Licensing
The model is available under the GNU General Public License v3.0 (GPL-3.0), which permits free use, modification, and distribution under the same license. However, it is intended strictly for research purposes and must not be used for malicious activities, including but not limited to manipulation, targeted harassment, hate speech, deception, and discrimination.
The dataset is available on the [competition website](https://propaganda.math.unipd.it/semeval2024task4/). Users must accept an online agreement before downloading and using the data. This agreement stipulates that the data is for research purposes only and cannot be redistributed or used for malicious purposes as outlined above.