metrics:
- roc_auc
pipeline_tag: text-classification
---

# Pretrained Language Model for Fake News Detection

> This repository contains a pretrained language model for fake news detection. The model was developed with PyTorch and the Hugging Face Transformers library, and was fine-tuned on a dataset of news articles to classify each article as either "Fake" or "Satire".

# Usage

To use the pretrained model for fake news detection, follow these steps:

> 1. Install the required dependencies: PyTorch, Transformers, and scikit-learn.
>
> 2. Load the pretrained model and tokenizer with the `from_pretrained()` method from the Transformers library.
>
> 3. Tokenize your input text with `tokenizer.encode_plus()`.
>
> 4. Pass the tokenized inputs to the model (which calls its `forward()` method) to get a prediction.

Here's an example code snippet that demonstrates how to use the model:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the pretrained model and tokenizer
model_name = "Karim-Gamal/Roberta_finetuned_fake_news_english.pt"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Tokenize the input text
input_text = "This is a fake news article"
inputs = tokenizer.encode_plus(input_text, padding=True, truncation=True, max_length=128, return_tensors="pt")

# Get the model's prediction as class probabilities
outputs = model(inputs["input_ids"], attention_mask=inputs["attention_mask"])
predictions = torch.softmax(outputs.logits, dim=1).detach().numpy()
```
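
The `predictions` array holds one probability per class. To turn it into a human-readable label, you can look up the index of the highest-scoring class in the model's `id2label` mapping. The continuation below is a minimal sketch that reuses `model` and `predictions` from the snippet above and assumes the model config populates `id2label` (Transformers falls back to generic `LABEL_0`/`LABEL_1` names when it is not set):

```python
import numpy as np

# Index of the highest-probability class for the first (only) input
predicted_class = int(np.argmax(predictions, axis=1)[0])

# Look up the class name; fall back to a generic name if unset
label = model.config.id2label.get(predicted_class, f"LABEL_{predicted_class}")
print(f"Predicted: {label} (p={predictions[0][predicted_class]:.3f})")
```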

# Performance

> The model was evaluated on a test set of news articles and achieved a ROC AUC of `95%`, indicating that it can effectively distinguish between fake and real news articles.

> | Model name | ROC AUC |
> |:---:|:---:|
> | jy46604790/Fake-News-Bert-Detect | 88% |
> | ghanashyamvtatti/roberta-fake-news | 95% |
> | gpt2 / reward_model | 82% |
> | gpt2 / imdb-sentiment-classifier | 79% |
> | microsoft/Multilingual-MiniLM-L12-H384 | 86% |
> | hamzab/roberta-fake-news-classification | 84% |
> | mainuliitkgp/ROBERTa_fake_news_classification | 86% |
> | ghanashyamvtatti/roberta-fake-news after cleaning | 82% |

> Based on these results, the `ghanashyamvtatti/roberta-fake-news` model performed best, with a ROC AUC of `95%`. It was designed specifically for fake news detection, which explains its strong performance on this task.

> Additionally, the `microsoft/Multilingual-MiniLM-L12-H384` model achieved a respectable ROC AUC of `86%` while being lightweight. It was therefore used in [our paper, "Federated Learning Based Multilingual Emoji Prediction"](https://github.com/kareemgamalmahmoud/FEDERATED-LEARNING-BASED-MULTILINGUAL-EMOJI-PREDICTION-IN-CLEAN-AND-ATTACK-SCENARIOS), despite its slightly lower performance than the best model.

> On the other hand, the GPT-2 based models (`gpt2 / reward_model` and `gpt2 / imdb-sentiment-classifier`) performed worse than the others, likely because they were trained for different tasks and not specifically designed for fake news detection.

> It is worth noting that even though the `ghanashyamvtatti/roberta-fake-news after cleaning` model scored lower (`82%`) than the original `ghanashyamvtatti/roberta-fake-news` model (`95%`), it might still be useful in certain scenarios, especially when the input data has been cleaned in the same way.

Finally, it is important to test the performance of the selected model after loading it from Hugging Face to make sure it functions properly in the target environment.
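
A quick sanity check could look like the sketch below, reusing `tokenizer` and `model` from the snippet above. The two example texts and the assumption that class index 1 corresponds to the positive ("fake") class are placeholders; substitute a real held-out test set and the label convention of the model you load:

```python
from sklearn.metrics import roc_auc_score
import torch

# Hypothetical labeled examples (1 = fake, 0 = not fake) -- replace
# with a real test set for a meaningful score.
texts = ["Aliens endorse presidential candidate", "Local council approves new budget"]
labels = [1, 0]

# Tokenize the batch and score it without tracking gradients
enc = tokenizer(texts, padding=True, truncation=True, max_length=128, return_tensors="pt")
with torch.no_grad():
    logits = model(**enc).logits
scores = torch.softmax(logits, dim=1)[:, 1].numpy()  # probability of class index 1

print("ROC AUC:", roc_auc_score(labels, scores))
```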

# Model Card

> For more information about the model's architecture, see [the original model](https://huggingface.co/ghanashyamvtatti/roberta-fake-news).