# BreitBot

Timothy W. Dooley

___________________________________________________

## GitHub

The GitHub repository for the project can be found [here](https://github.com/twdooley/election_news).

## Model

This model was trained on about 16,000 headlines from Breitbart.com spanning March 2019 to 11 November 2020. The purpose of the project was to use Natural Language Processing to better understand how strongly polarized news crafts a narrative. The BreitBot model was created specifically to capture the 'clickbaity' nature of a Breitbart headline. Many of the results are 'reasonable' within the scope of Breitbart's output; I will leave further interpretation to the user. The full project found that over 70% of Breitbart's articles from month to month have a negative sentiment score, and, subjectively, I believe this shows through in the generated headlines.
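
As a quick illustration, here is a minimal generation sketch using the `transformers` library. It assumes the model is published on the Hugging Face Hub as `twdooley/breitbot` (an assumption based on this repository's name); the prompt and sampling parameters are likewise illustrative.

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

model_id = "twdooley/breitbot"  # assumed Hub id for this repository
tokenizer = GPT2Tokenizer.from_pretrained(model_id)
model = GPT2LMHeadModel.from_pretrained(model_id)

# Seed with the first words of a headline and sample several completions.
inputs = tokenizer("Poll:", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_length=50,                        # headlines were capped near 50 tokens
    do_sample=True,                       # sampling keeps the output varied
    top_k=50,
    top_p=0.95,
    num_return_sequences=3,
    pad_token_id=tokenizer.eos_token_id,  # silence the missing-pad-token warning
)
for seq in outputs:
    print(tokenizer.decode(seq, skip_special_tokens=True))
```

Sampling (rather than greedy decoding) is what gives the varied, headline-like output.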

## Training

BreitBot is a fine-tuned GPT-2 model trained on about 16,000 headlines. The maximum length allowed by the tokenizer was the length of the longest headline (~50 tokens). Huge credit goes to Richard Bownes, PhD, whose article ["Fine Tuning GPT-2 for Magic the Gathering Flavour Text Generation"](https://medium.com/swlh/fine-tuning-gpt-2-for-magic-the-gathering-flavour-text-generation-3bafd0f9bb93) provided invaluable direction and help in training this model. The model was trained on a GPU in Google Colab.
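
The exact training script is not included here, but the following is a minimal sketch of a fine-tuning loop consistent with the description above. The file name `headlines.txt`, batch size, learning rate, and epoch count are illustrative assumptions, not values from the original run.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import GPT2LMHeadModel, GPT2Tokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = GPT2LMHeadModel.from_pretrained("gpt2").to(device)

# Hypothetical input file: one scraped headline per line.
with open("headlines.txt") as f:
    headlines = [line.strip() for line in f if line.strip()]

# Cap sequences at the longest headline (~50 tokens), as described above.
enc = tokenizer(headlines, truncation=True, max_length=50,
                padding="max_length", return_tensors="pt")

dataset = TensorDataset(enc["input_ids"], enc["attention_mask"])
loader = DataLoader(dataset, batch_size=8, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

model.train()
for epoch in range(3):
    for input_ids, attention_mask in loader:
        input_ids = input_ids.to(device)
        attention_mask = attention_mask.to(device)
        # For causal LM fine-tuning the labels are the inputs themselves;
        # padded positions are set to -100 so they are ignored by the loss.
        labels = input_ids.clone()
        labels[attention_mask == 0] = -100
        loss = model(input_ids, attention_mask=attention_mask,
                     labels=labels).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```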