
Model Specification

  • This is the Democratic community GPT-2 language model, fine-tuned on 4.7M tweets (~100M tokens) from Democratic Twitter users posted between 2019-01-01 and 2020-04-10.
  • For more details about the CommunityLM project, please refer to our paper and GitHub page.
  • In the paper, this model is referred to as the Fine-tuned CommunityLM for the Democratic Twitter community.

How to use the model

  • PRE-PROCESSING: when applying the model to tweets, please make sure the tweets are first preprocessed with the TweetTokenizer to get the best performance; a minimal preprocessing sketch follows.
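For example, the preprocessing can be done with NLTK's TweetTokenizer (a minimal sketch, assuming the NLTK tokenizer is the one intended; the exact pipeline used in the paper may differ):

from nltk.tokenize import TweetTokenizer

# Tokenize the raw tweet and re-join the tokens with single spaces
# before passing the text to the model.
tweet_tokenizer = TweetTokenizer()

def preprocess(tweet: str) -> str:
    return " ".join(tweet_tokenizer.tokenize(tweet))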
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and the fine-tuned GPT-2 model for the Democratic community.
tokenizer = AutoTokenizer.from_pretrained("CommunityLM/democrat-twitter-gpt2")
model = AutoModelForCausalLM.from_pretrained("CommunityLM/democrat-twitter-gpt2")
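Once loaded, the model can be prompted like any other GPT-2 checkpoint. The snippet below is a minimal generation sketch; the prompt and decoding parameters are illustrative and not the settings used in the paper.

# Generate a short continuation for an opinion-style prompt.
prompt = "The Democratic Party is"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))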

References

If you use this repository in your research, please cite our paper:

@inproceedings{jiang-etal-2022-communitylm,
    title = "CommunityLM: Probing Partisan Worldviews from Language Models",
    author = "Jiang, Hang and Beeferman, Doug and Roy, Brandon and Roy, Deb",
    booktitle = "Proceedings of the 29th International Conference on Computational Linguistics",
    year = "2022",
    publisher = "International Committee on Computational Linguistics",
}