metadata
language:
  - en
widget:
  - text: >-
      Ninna Gay is an exceptional photographer who has been exhibiting her work
      since 1996 in Ireland, Northern Ireland, and France. She is a dominant
      figure in the world of photography, and her photographs are a testament to
      her outstanding talent and forceful personality.

Language Agency Classifier

The Language Agency Classifier was created by Wan et al. (2023) and classifies sentences by the level of agency they express. Classifying sentence agency can help expose latent gender bias, where women may be described with more communal (community-oriented) words and men with more agentic (self- or leadership-oriented) words.

The Language Agency Classifier uses a BERT model architecture and was trained on an 80/10/10 train/dev/test split of the dataset. After a hyperparameter search, we settled on a learning rate of 2e-5, 10 training epochs, and a batch size of 16.
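As a rough illustration of the setup described above, the sketch below records the reported hyperparameters and shows one way to produce an 80/10/10 split. The card does not say how the split was drawn, so the random shuffle, the `seed` argument, and the names `HYPERPARAMS` and `split_80_10_10` are assumptions made for this example only.

```python
import random

# Hyperparameters as reported in this card.
HYPERPARAMS = {"learning_rate": 2e-5, "num_epochs": 10, "batch_size": 16}


def split_80_10_10(examples, seed=0):
    """Shuffle examples and split them into train/dev/test partitions.

    The shuffle and the seed are assumptions; the card only states the ratio.
    """
    items = list(examples)
    random.Random(seed).shuffle(items)

    # Integer arithmetic avoids floating-point rounding at the boundaries.
    n_train = len(items) * 8 // 10
    n_dev = len(items) // 10

    train = items[:n_train]
    dev = items[n_train:n_train + n_dev]
    test = items[n_train + n_dev:]  # the remainder goes to the test set
    return train, dev, test
```

Fixing the seed keeps the three partitions reproducible across runs, which matters when dev-set results are used to pick hyperparameters.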

In the Language Agency Classifier Dataset, each initial biography is sampled from the Bias in Bios dataset (De-Arteaga et al., 2019a), which is sourced from online biographies in the Common Crawl corpus. We prompt ChatGPT to rephrase each initial biography into two versions: one leaning towards an agentic language style and the other towards a communal language style.

The snippet below loads the tokenizer and model from the Hugging Face Hub.

# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("emmatliu/language-agency-classifier")
model = AutoModelForSequenceClassification.from_pretrained("emmatliu/language-agency-classifier")
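The loading code above stops before inference. The following is a minimal sketch of how a sentence might be scored once the model is loaded. The helper names (`softmax`, `score_labels`, `classify`) are hypothetical, and the label names are read from the checkpoint's `config.id2label` because this card does not document them.

```python
import math

MODEL_ID = "emmatliu/language-agency-classifier"


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def score_labels(logits, id2label):
    """Pair each label name with its probability, highest first."""
    probs = softmax(logits)
    pairs = [(id2label[i], p) for i, p in enumerate(probs)]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def classify(sentence, model_id=MODEL_ID):
    """Download the checkpoint and score a single sentence."""
    # Imported here so the helpers above run without torch/transformers.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()  # disable dropout for inference

    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()

    # Label names come from the checkpoint's config, not from this card.
    return score_labels(logits, model.config.id2label)
```

Calling `classify("She is a dominant figure in the world of photography.")` should return a list of `(label, probability)` pairs sorted by probability. Which label corresponds to agentic or communal language depends on the checkpoint's configuration and is worth checking before interpreting results.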

Model Sources

Citation

@misc{wan2023kelly,
      title={"Kelly is a Warm Person, Joseph is a Role Model": Gender Biases in LLM-Generated Reference Letters}, 
      author={Yixin Wan and George Pu and Jiao Sun and Aparna Garimella and Kai-Wei Chang and Nanyun Peng},
      year={2023},
      eprint={2310.09219},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Model Card Authors

This repository is organized by Miri Liu (GitHub: emmatliu).