---
title: multilingual-persuasion-detection-from-text
app_file: inference.py
pinned: false
license: gpl-3.0
language:
- multilingual
tags:
- mbart-50
- text-classification
- multi-label-classification
- persuasion-detection
- meme-analysis
- social-media-analysis
- propaganda-detection
- hierarchical-classification
- multilingual
pipeline_tag: text-classification
inference: true
widget:
- text: "THIS IS WHY YOU NEED. A SHARPIE WITH YOU AT ALL TIMES."
example_title: "Sentence 1"
- text: "WHEN YOU'RE THE FBI, THEY LET YOU DO IT."
example_title: "Sentence 2"
- text: "Move your ships away!\n\noooook\n\nMove your ships away!\n\nNo, and I just added 10 more"
example_title: "Sentence 3"
- text: "Let's Make America Great Again!"
example_title: "Sentence 4"
---
# Multilingual Persuasion Detection in Memes
Given only the “textual content” of a meme, the goal is to identify which of the 20 persuasion techniques, organized in a hierarchy, it uses. Predicting only an ancestor of a technique earns partial credit. This is a hierarchical multi-label classification problem based on [SemEval 2024 Task 4 Subtask 1, "Multilingual Detection of Persuasion Techniques in Memes"](https://propaganda.math.unipd.it/semeval2024task4/index.html).
The source code for training the model, along with additional implementations, can be found [here](https://github.com/nishan-chatterjee/what-do-you-meme). The paper describing our method was accepted at SemEval 2024; a link will be added soon.
### Hierarchy
<img src="images/persuasion_techniques_hierarchy_graph.png" alt="Hierarchy of the 20 persuasion techniques" width="622" height="350">
### Usage Example
- **Input:** "I HATE TRUMP\n\nMOST TERRORIST DO"
- **Outputs:**
- Child-only Label List: ['Name calling/Labeling', 'Loaded Language']
  - Complete Hierarchical Label List: ['Ethos', 'Ad Hominem', 'Name calling/Labeling', 'Pathos', 'Loaded Language'] (the child labels expanded over their ancestors, as sketched below)
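
The complete hierarchical list is obtained by closing the predicted child labels over their ancestors in the technique hierarchy. Below is a minimal sketch of that expansion, assuming a hypothetical `PARENT` map that encodes the hierarchy (only a fragment of the taxonomy is shown):

```python
# Hypothetical parent map encoding a fragment of the persuasion hierarchy;
# the full 20-technique taxonomy is shown in the graph above.
PARENT = {
    "Name calling/Labeling": "Ad Hominem",
    "Ad Hominem": "Ethos",
    "Loaded Language": "Pathos",
    # ... remaining techniques and their ancestors
}

def expand_to_hierarchy(child_labels):
    """Add every ancestor of each predicted child label."""
    labels = set()
    for label in child_labels:
        while label is not None:
            labels.add(label)
            label = PARENT.get(label)  # returns None at a root node
    return sorted(labels)

print(expand_to_hierarchy(["Name calling/Labeling", "Loaded Language"]))
# ['Ad Hominem', 'Ethos', 'Loaded Language', 'Name calling/Labeling', 'Pathos']
```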
Note:
- Make sure the dependencies from requirements.txt are installed in your environment.
- Make sure the trained model and tokenizer are in the same directory as inference.py.
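
For orientation, here is a minimal sketch of what multi-label inference with this checkpoint typically looks like; the model directory, the decision threshold, and the use of the standard `transformers` auto classes are assumptions, and inference.py remains the authoritative entry point:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_DIR = "."    # assumed: checkpoint and tokenizer sit next to inference.py
THRESHOLD = 0.5    # assumed per-label decision threshold

tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_DIR)
model.eval()

text = "WHEN YOU'RE THE FBI, THEY LET YOU DO IT."
inputs = tokenizer(text, truncation=True, max_length=256, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Multi-label head: an independent sigmoid per technique, then a threshold.
probs = torch.sigmoid(logits)[0]
predicted = [model.config.id2label[i] for i, p in enumerate(probs) if p > THRESHOLD]
print(predicted)
```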
## Training Hyperparameters
- Base Model: "facebook/mbart-large-50-many-to-many-mmt"
- Learning Rate: 5e-05
- Max Length: 256
- Batch Size: 64
- Epochs: 3
- Seed: 42
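
As a hedged illustration, these hyperparameters map onto a standard `transformers` fine-tuning setup roughly as follows; the actual training script lives in the linked repository, and the argument names here are the stock `TrainingArguments` ones, not necessarily those used there:

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above; everything else stays at its default.
training_args = TrainingArguments(
    output_dir="mbart50-persuasion",  # hypothetical output path
    learning_rate=5e-5,
    per_device_train_batch_size=64,
    num_train_epochs=3,
    seed=42,
)

# The max length (256) is applied at tokenization time, e.g.:
# tokenizer(texts, truncation=True, max_length=256, padding="max_length")
```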
## Model Statistics
The model obtained the following metrics on the Development Set as of March 31st, 2024:
- Hierarchical F1: 63.58%
- Hierarchical Precision: 58.3%
- Hierarchical Recall: 69.9%
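
These metrics are computed over the ancestor-closed label sets rather than the leaf predictions, which is what makes predicting only an ancestor worth partial credit. A minimal sketch of the standard micro-averaged definition (hP = |P∩T| / |P|, hR = |P∩T| / |T|), assuming predictions and gold labels are already expanded over the hierarchy as above:

```python
def hierarchical_prf1(pred_sets, true_sets):
    """Micro-averaged hierarchical precision/recall/F1 over
    ancestor-closed label sets (one set per example)."""
    overlap = sum(len(p & t) for p, t in zip(pred_sets, true_sets))
    hp = overlap / sum(len(p) for p in pred_sets)
    hr = overlap / sum(len(t) for t in true_sets)
    return hp, hr, 2 * hp * hr / (hp + hr)

# Predicting only the ancestor 'Pathos' when the gold set also contains the
# child 'Loaded Language' still earns credit for the shared ancestor.
pred = [{"Pathos"}]
true = [{"Pathos", "Loaded Language"}]
print(hierarchical_prf1(pred, true))  # (1.0, 0.5, 0.666...)
```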
## Licensing
The model is available under the GNU General Public License v3.0 (GPL-3.0), which permits free use, modification, and distribution under the same license. However, it is intended strictly for research purposes and must not be used for malicious activities, including but not limited to manipulation, targeted harassment, hate speech, deception, and discrimination.
The dataset is available on the [competition website](https://propaganda.math.unipd.it/semeval2024task4/). Users must accept an online agreement before downloading and using the data. This agreement stipulates that the data is for research purposes only and cannot be redistributed or used for malicious purposes as outlined above.