---
language: en
license: apache-2.0
datasets:
- tweets
widget:
- text: "COVID-19 vaccines are safe and effective."
---

# Disclaimer: This page is under maintenance. Please DO NOT refer to the information on this page to make any decision yet.

# Vaccinating COVID tweets

An English-language model built on BERTweet (see [this repository](https://github.com/VinAIResearch/BERTweet)). It was further pre-trained with a masked language modeling (MLM) objective and then fine-tuned to classify the factuality of statements about COVID-19 and COVID-19 vaccines.

## Intended uses & limitations

#### How to use

```python
# You can include sample code which will be formatted
```

#### Limitations and bias

Provide examples of latent issues and potential remediations.

## Training data & Procedure

#### Pre-trained baseline model

- Pre-trained model: [BERTweet](https://github.com/VinAIResearch/BERTweet)
  - Trained following the RoBERTa pre-training procedure
  - 850M general English tweets (Jan 2012 to Aug 2019)
  - 23M COVID-19 English tweets
  - Size of the model: >134M parameters
- Further training
  - Pre-training with recent COVID-19/vaccine tweets and fine-tuning for fact classification

#### 1) Pre-training language model

- Tweets with the trending #CovidVaccine hashtag: 207,000 tweets posted from Aug 2020 to Apr 2021 [kaggle](https://www.kaggle.com/kaushiksuresh147/covidvaccine-tweets)
- Tweets about all COVID-19 vaccines: 78,000 tweets posted from Dec 2020 to May 2021 [kaggle](https://www.kaggle.com/gpreda/all-covid19-vaccines-tweets)
- COVID-19 Twitter chatter dataset: 590,000 tweets posted from Mar 2021 to May 2021 [github](https://github.com/thepanacealab/covid19_twitter)

#### 2) Fine-tuning for fact classification

- Statements scraped from Poynter and Snopes with Selenium: 14,000 fact-checked statements from Jan 2020 to May 2021
- Original labels grouped into 3 categories
  - False: false, no evidence, manipulated, fake, not true, unproven, unverified
  - Misleading: misleading, exaggerated, out of context, needs context
  - True: true, correct

## Eval results

# Contributors

- This page is part of the final team project for the MLDL for DS class at SNU
  - Team BIBI - Vaccinating COVID-NineTweets
  - Team members: Ahn, Hyunju; An, Jiyong; An, Seungchan; Jeong, Seokho; Kim, Jungmin; Kim, Sangbeom
  - Advisor: Prof. Wen-Syan Li

# ![GSDS](https://gsds.snu.ac.kr/sites/gsds.snu.ac.kr/files/GSDS_logo.png)
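
#### Example: grouping fact-check labels

The grouping of original fact-check labels into three categories, described under "2) Fine-tuning for fact classification", could be sketched roughly as follows. This is a minimal illustration, not the team's actual preprocessing code. The names `LABEL_GROUPS`, `RAW_TO_CATEGORY`, and `map_label` are hypothetical. The lowercasing and whitespace normalization, and the choice to return `None` for unlisted labels, are assumptions.

```python
from typing import Optional

# Category -> original labels, copied from the list in this card.
LABEL_GROUPS = {
    "False": ["false", "no evidence", "manipulated", "fake",
              "not true", "unproven", "unverified"],
    "Misleading": ["misleading", "exaggerated", "out of context",
                   "needs context"],
    "True": ["true", "correct"],
}

# Inverted lookup: original label -> category.
RAW_TO_CATEGORY = {
    raw: category
    for category, raws in LABEL_GROUPS.items()
    for raw in raws
}


def map_label(raw_label: str) -> Optional[str]:
    """Map an original fact-check label to one of the three categories.

    Hypothetical helper: normalizes case and whitespace (an assumption),
    and returns None for labels not listed in the card.
    """
    key = " ".join(raw_label.lower().split())
    return RAW_TO_CATEGORY.get(key)


if __name__ == "__main__":
    for label in ["No Evidence", "out of context", "Correct", "satire"]:
        print(label, "->", map_label(label))
```

Labels outside the three listed groups map to `None` here so that they can be inspected or dropped explicitly. The card does not say how such labels were handled.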