Language Agency Classifier

The Language Agency Classifier was created by Wan et al. (2023) and classifies sentences by the level of agency they express. Classifying sentence agency can help expose latent gender bias, where women tend to be described with more communal (community-oriented) words and men with more agentic (self- or leadership-oriented) words.

The Language Agency Classifier is implemented with a BERT model architecture and trained on an 80/10/10 train/dev/test split of the dataset. After a hyperparameter search, we used a learning rate of 2e-5, 10 training epochs, and a batch size of 16.
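As a rough illustration of the setup described above, the sketch below performs an 80/10/10 split and collects the reported hyperparameters in one place. The function name `split_80_10_10`, the fixed seed, and the shuffle-then-slice strategy are assumptions made for this example; the card does not say how the actual split was produced.

```python
import random

# Hyperparameters reported on this card, gathered for reference.
HPARAMS = {"learning_rate": 2e-5, "num_epochs": 10, "batch_size": 16}


def split_80_10_10(examples, seed=0):
    """Shuffle examples and split them 80/10/10 into train/dev/test.

    Hypothetical helper: the seed and the shuffle-then-slice strategy
    are illustrative choices, not the authors' documented procedure.
    """
    items = list(examples)
    random.Random(seed).shuffle(items)
    n_train = len(items) * 8 // 10
    n_dev = len(items) // 10
    train = items[:n_train]
    dev = items[n_train:n_train + n_dev]
    test = items[n_train + n_dev:]  # remainder goes to the test set
    return train, dev, test


train, dev, test = split_80_10_10(range(1000))
print(len(train), len(dev), len(test))
```

Integer arithmetic is used for the split sizes so that rounding never drops or duplicates an example; any leftover items fall into the test set.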

Each initial biography in the Language Agency Classifier Dataset is sampled from the Bias in Bios dataset (De-Arteaga et al., 2019a), which is sourced from online biographies in the Common Crawl corpus. We prompt ChatGPT to rephrase the initial biography into two versions: one leaning towards an agentic language style and another leaning towards a communal language style.
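One way to picture the resulting data is as paired rephrasings that are flattened into labeled examples. The sketch below is only an assumed layout: the `RephrasedBio` record, its field names, and the sample texts are invented for illustration and are not taken from the dataset. The label ids follow the mapping in the usage example on this card (1 for agentic, 0 for communal).

```python
from dataclasses import dataclass

AGENTIC, COMMUNAL = 1, 0  # label ids as used in this card's usage example


@dataclass
class RephrasedBio:
    """Hypothetical record: one source biography and its two rephrasings."""
    original: str
    agentic: str
    communal: str


def to_labeled_examples(bios):
    """Flatten paired rephrasings into (text, label) training examples."""
    examples = []
    for bio in bios:
        examples.append((bio.agentic, AGENTIC))
        examples.append((bio.communal, COMMUNAL))
    return examples


# Invented placeholder texts, not drawn from the actual dataset.
sample = RephrasedBio(
    original="Alex is a nurse at a regional hospital.",
    agentic="Alex leads patient care decisions at a regional hospital.",
    communal="Alex supports patients and colleagues at a regional hospital.",
)
examples = to_labeled_examples([sample])
print(examples)
```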

An example usage of the model is below.

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the tokenizer and the fine-tuned classifier from the Hugging Face Hub.
model_name = "emmatliu/language-agency-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

sentence = "She is a decisive leader in her field."

# Tokenize the sentence and run inference without tracking gradients.
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)

# Pick the most probable class for the single input sentence.
predicted_class = torch.argmax(probabilities, dim=-1).item()

# Map class ids to human-readable labels.
labels = {
    1: 'agentic',
    0: 'communal'
}

print(f"Predicted class: {labels[predicted_class]}")
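The post-processing step in the example (softmax, then argmax, then label lookup) can be tried without downloading the model. The sketch below does this in plain Python; the helper names `softmax` and `decode` are hypothetical, and the logits are made-up values in `[communal, agentic]` order, not real model output.

```python
import math

# Label mapping taken from the usage example on this card.
LABELS = {1: "agentic", 0: "communal"}


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return (label, probability) for one sentence's two logits."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


# Hypothetical logits for illustration only; index 1 is the agentic class.
label, confidence = decode([-1.2, 2.3])
print(f"{label} ({confidence:.3f})")
```

Returning the probability alongside the label makes it possible to filter out low-confidence predictions before aggregating results over many sentences.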

Model Sources

Citation

@misc{wan2023kelly,
      title={"Kelly is a Warm Person, Joseph is a Role Model": Gender Biases in LLM-Generated Reference Letters}, 
      author={Yixin Wan and George Pu and Jiao Sun and Aparna Garimella and Kai-Wei Chang and Nanyun Peng},
      year={2023},
      eprint={2310.09219},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Model Card Authors

This repository is organized by Miri Liu (github: emmatliu).
