---
language:
- Python
tags:
- NLP
- Fake News Detection
- XLM RoBERTa
datasets:
- https://www.kaggle.com/datasets/clmentbisaillon/fake-and-real-news-dataset
metrics:
- Accuracy
- F1-score
---

# Write-up:

## Link to Hugging Face model:

https://huggingface.co/Sajib-006/fake_news_detection_xlmRoberta

## Model Description:

* Uses the pretrained XLM-RoBERTa base model.
* A classifier layer is added after the pretrained encoder.
* For tokenization, I used a maximum text length of 512 tokens (the maximum that BERT-style models can handle).
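
To illustrate what a fixed maximum length of 512 means in practice, here is a minimal pure-Python sketch of truncating and padding a sequence of token ids. It is not the tokenizer's own implementation. The function name `pad_or_truncate` and the ids in the usage lines are hypothetical, and the special-token ids (`<pad>` = 1, `</s>` = 2) are assumed to follow the XLM-RoBERTa vocabulary.

```python
from typing import Dict, List

MAX_LEN = 512  # maximum sequence length used in this write-up
PAD_ID = 1     # assumed id of <pad> in the XLM-RoBERTa vocabulary
EOS_ID = 2     # assumed id of </s> in the XLM-RoBERTa vocabulary


def pad_or_truncate(token_ids: List[int], max_len: int = MAX_LEN) -> Dict[str, List[int]]:
    """Force a token-id sequence to exactly max_len entries.

    Longer inputs are cut and closed with </s>; shorter inputs are
    right-padded with <pad>. The attention mask is 1 for real tokens
    and 0 for padding.
    """
    if len(token_ids) > max_len:
        # Keep the first max_len - 1 ids and re-append the end-of-sequence id.
        token_ids = token_ids[: max_len - 1] + [EOS_ID]
    n_real = len(token_ids)
    n_pad = max_len - n_real
    return {
        "input_ids": token_ids + [PAD_ID] * n_pad,
        "attention_mask": [1] * n_real + [0] * n_pad,
    }


# Hypothetical ids for a long article: <s> (0), 700 word pieces, </s> (2).
long_article = [0] + [100] * 700 + [EOS_ID]
encoded = pad_or_truncate(long_article)
print(len(encoded["input_ids"]), sum(encoded["attention_mask"]))
```

Anything beyond the 512th token is discarded, so for long news articles the classifier only sees the beginning of the text.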

## Result:

* Using the BERT base uncased English model, the accuracy was near 85% (on all samples).
* Using the XLM-RoBERTa base model, the accuracy was almost 100% (on only 2k samples).
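
For reference, the two reported metrics can be computed from true and predicted labels as in the sketch below. The label lists are made up purely for illustration and are not the model's predictions. The encoding (1 = fake, 0 = real) and the function names are also assumptions.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1 for the `positive` class: harmonic mean of precision and recall."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels for illustration only (1 = fake, 0 = real); not the model's output.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
print(f"accuracy={accuracy(y_true, y_pred):.3f}  f1={f1_score(y_true, y_pred):.3f}")
```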

## Limitations:

* Pretrained XLM-RoBERTa is a heavy model. Training it on the full dataset (44k+ samples) was not possible with the free version of Google Colab, so I had to take a small sample of 2k articles for my experiment.
* Because accuracy and F1-score are almost 100% on the 2,000-sample dataset, I have not looked for misclassified examples.
* I could not run the model on the whole dataset because the free version of Google Colab has RAM and disk restrictions. XLM-RoBERTa is a heavy model, so training it on the full dataset takes a very long time, and Colab does not provide a GPU for long sessions.
* Because a single epoch took a very long time, I saved a checkpoint after the first epoch, then reloaded the weights and trained for a second epoch. After 2 epochs the model showed almost 100% accuracy, so I did not train further.
* Running on the full dataset would have given a clearer picture. I thought of some ideas for a better model but could not implement them because of the hardware restrictions mentioned above and time constraints. My ideas are given below.
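
The write-up does not say how the 2k subset was drawn. One reasonable option, sketched below as an assumption rather than the method actually used, is a seeded stratified sample that keeps the fake/real proportions of the full data. The function name `stratified_subsample`, the seed, and the toy data are all hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

Sample = Tuple[str, int]  # (text, label)


def stratified_subsample(data: List[Sample], n: int, seed: int = 42) -> List[Sample]:
    """Draw about n samples while keeping the original label proportions."""
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    by_label: Dict[int, List[Sample]] = defaultdict(list)
    for sample in data:
        by_label[sample[1]].append(sample)

    subset: List[Sample] = []
    for label, items in sorted(by_label.items()):
        # Each class contributes in proportion to its share of the full data.
        k = round(n * len(items) / len(data))
        subset.extend(rng.sample(items, min(k, len(items))))
    rng.shuffle(subset)
    return subset


# Toy stand-in for the 44k+ articles (label 1 = fake, 0 = real; hypothetical).
full_data = [(f"article {i}", i % 2) for i in range(44000)]
small_data = stratified_subsample(full_data, n=2000)
print(len(small_data), sum(label for _, label in small_data))
```

Fixing the seed means the same 2k articles are selected on every Colab session, which matters when training is split across a saved checkpoint and a later resume.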

## Ideas to improve on full dataset:

* Using XLM-RoBERTa large instead of base may improve results.
* Adding a dense layer and a dropout layer to reduce overfitting (though my result shows almost 100% accuracy on the hold-out test set, so there does not seem to be any overfitting).
* Adding a convolutional layer after the encoder may work even better.
* Combinations of different, more complex convolutional layers could be added to check whether accuracy increases further.
* Hyperparameter tuning of the layers to ensure the best result.
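
As a starting point for the hyperparameter tuning idea, the sketch below enumerates a small grid of layer settings. The search space is entirely hypothetical: none of the names or values come from the experiment, and they only stand in for the dropout, dense, and convolutional layers proposed above.

```python
from itertools import product
from typing import Dict, Iterator, List

# Hypothetical search space; none of these values come from the experiment.
SEARCH_SPACE: Dict[str, List] = {
    "dropout": [0.1, 0.3, 0.5],
    "dense_units": [128, 256],
    "conv_kernel_size": [3, 5],
    "learning_rate": [1e-5, 2e-5],
}


def iter_configs(space: Dict[str, List]) -> Iterator[Dict[str, object]]:
    """Yield every combination of the listed values as one config dict."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


configs = list(iter_configs(SEARCH_SPACE))
print(f"{len(configs)} candidate configurations")
print(configs[0])
```

Each configuration would then be used to train and evaluate one model variant, keeping the one with the best validation score.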