---
license: mit
widget:
- text: "Some ninja attacked the White House."
  example_title: "Fake example 1"
language:
- en
tags:
- classification
datasets:
- "https://www.kaggle.com/datasets/clmentbisaillon/fake-and-real-news-dataset"
---

## Overview

The model is a `roberta-base` fine-tuned on the [fake-and-real-news-dataset](https://www.kaggle.com/datasets/clmentbisaillon/fake-and-real-news-dataset). It reaches 100% accuracy on that dataset.

The model takes a news article and predicts whether it is real or fake.

The input should be formatted as:

```
<title> TITLE HERE <content> CONTENT HERE <end>
```

## Using this model in your code

To use this model, first download it from the Hugging Face Hub:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("hamzab/roberta-fake-news-classification")
model = AutoModelForSequenceClassification.from_pretrained("hamzab/roberta-fake-news-classification")
```

Then make a prediction as follows:

```python
import torch

def predict_fake(title, text):
    # Wrap the headline and body in the tags the model expects.
    input_str = "<title>" + title + "<content>" + text + "<end>"
    inputs = tokenizer(input_str, max_length=512, padding="max_length", truncation=True, return_tensors="pt")
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)
    with torch.no_grad():
        output = model(inputs["input_ids"].to(device), attention_mask=inputs["attention_mask"].to(device))
    # Softmax over the class dimension: index 0 is "Fake", index 1 is "Real".
    probs = torch.softmax(output.logits, dim=-1)[0]
    return dict(zip(["Fake", "Real"], probs.tolist()))

# Replace the placeholder strings with a real headline and article body.
print(predict_fake("HEADLINE HERE", "CONTENT HERE"))
```

You can also use Gradio to test the model interactively:

```python
import gradio as gr

# gr.Textbox replaces the older gr.inputs.Textbox, which recent Gradio releases no longer provide.
iface = gr.Interface(
    fn=predict_fake,
    inputs=[gr.Textbox(lines=1, label="headline"), gr.Textbox(lines=6, label="content")],
    outputs="label",
)
iface.launch(share=True)
```
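
The input format and the `Fake`/`Real` mapping used by `predict_fake` can be sketched without downloading the model. This is only an illustration: the helper names `format_input` and `logits_to_labels` are hypothetical and not part of the model repository, and the label order (`Fake` first, `Real` second) is assumed from the prediction snippet above.

```python
import math

# Label order assumed from predict_fake: index 0 is "Fake", index 1 is "Real".
LABELS = ["Fake", "Real"]

def format_input(title, text):
    """Build the tagged input string (hypothetical helper)."""
    return "<title>" + title + "<content>" + text + "<end>"

def logits_to_labels(logits):
    """Apply a softmax to raw scores and return a label -> probability dict."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return {label: e / total for label, e in zip(LABELS, exps)}

print(format_input("Some headline", "Some article body"))
print(logits_to_labels([2.0, -1.0]))
```

The probabilities always sum to 1, so the larger of the two values can be read directly as the model's confidence in its predicted label.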