Schwartz Value Classifier

This classifier predicts whether a given Schwartz value is present in a text snippet.

Disclaimer: this is not the official repository published by the authors of the paper, and it may not exactly replicate the performance reported in the original study.

Value dimensions

This model follows the 10-dimensional categorization of the Schwartz values:

  1. security – safety, harmony, and stability of society, of relationships, and of self
  2. power – social status and prestige, control or dominance over people and resources
  3. achievement – personal success through demonstrating competence according to social standards
  4. hedonism – pleasure or sensuous gratification for oneself
  5. stimulation – excitement, novelty and challenge in life
  6. self-direction – independent thought and action—choosing, creating, exploring
  7. universalism – understanding, appreciation, tolerance, and protection for the welfare of all people and for nature
  8. benevolence – preserving and enhancing the welfare of those with whom one is in frequent personal contact (the 'in-group')
  9. conformity – restraint of actions, inclinations, and impulses likely to upset or harm others and violate social expectations or norms
  10. tradition – respect, commitment, and acceptance of the customs and ideas that one's culture or religion provides

Datasets

This model is fine-tuned on two datasets: ValueNet (A New Dataset for Human Value Driven Dialogue System, Qiu et al., 2021) and Touché23-ValueEval (The Touché23-ValueEval Dataset for Identifying Human Values behind Arguments, Mirzakhmedova et al., 2023).

Following the original paper (see References), we convert both datasets into a binary classification task for each dimension.

  • ValueNet
    • A sentence is assigned a positive label (1) if its original label is 1 (positive) or -1 (negative), and a negative label (0) if its original label is 0.
  • ValueEval
    • A sentence is assigned a positive label if the original label vector is marked 1 for that dimension.
    • Since the ValueEval dataset uses a refined 20-category scheme, we map its categories back to the 10 dimensions. Each sentence therefore appears ten times, once for each dimension.
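
The conversion rules above can be sketched roughly as follows. This is a minimal illustration, not the preprocessing code used for this model: the function names are hypothetical, the `FINE_TO_COARSE` mapping is a partial and illustrative guess (the card does not list the actual 20-to-10 mapping), and treating a dimension as positive when any of its sub-categories is marked 1 is an assumption.

```python
SCHWARTZ_VALUES = (
    "security", "power", "achievement", "hedonism", "stimulation",
    "self-direction", "universalism", "benevolence", "conformity", "tradition",
)

def binarize_valuenet(label: int) -> int:
    """ValueNet rule: 1 or -1 becomes positive (1), 0 stays negative (0)."""
    if label not in (-1, 0, 1):
        raise ValueError(f"unexpected ValueNet label: {label}")
    return int(label != 0)

# ASSUMPTION: partial, illustrative mapping only. The model card does not
# state how the 20 refined categories were collapsed into 10 dimensions.
FINE_TO_COARSE = {
    "Self-direction: thought": "self-direction",
    "Self-direction: action": "self-direction",
    "Power: dominance": "power",
    "Power: resources": "power",
    "Security: personal": "security",
    "Security: societal": "security",
}

def expand_valueeval(sentence, fine_labels, mapping=FINE_TO_COARSE):
    """Return ten (dimension, sentence, binary_label) rows for one sentence.

    fine_labels maps a refined category name to 0 or 1.
    ASSUMPTION: a dimension is positive if any mapped sub-category is 1.
    """
    positive = {
        mapping[name]
        for name, flag in fine_labels.items()
        if flag == 1 and name in mapping
    }
    return [(value, sentence, int(value in positive)) for value in SCHWARTZ_VALUES]
```

Each sentence yields exactly ten rows, which matches the "once for each dimension" description above.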

How to use

Start your sentence with a label that indicates which dimension to measure. An example would be:

  • <power> [SEP] staying out late after telling my girlfriend I could be home early

Please make sure to follow the exact format "<value_name>" at the beginning of the sentence as this is a special token in the tokenizer: any spaces or different formats will not be encoded correctly.
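
A small helper can keep the input format consistent. The sketch below is illustrative rather than an official usage snippet. It assumes that the special tokens are spelled exactly like the value names listed above (for example `<self-direction>`), that the repository id is the one shown in this page's footer, and that the checkpoint loads as a standard sequence-classification head. The order of the output classes is not documented here, so the function returns all class probabilities instead of picking one.

```python
SCHWARTZ_VALUES = (
    "security", "power", "achievement", "hedonism", "stimulation",
    "self-direction", "universalism", "benevolence", "conformity", "tradition",
)

def format_input(value: str, text: str) -> str:
    """Build '<value> [SEP] text' with no extra spaces inside the value token."""
    if value not in SCHWARTZ_VALUES:
        raise ValueError(f"unknown value dimension: {value!r}")
    return f"<{value}> [SEP] {text.strip()}"

def score(value: str, text: str,
          repo_id: str = "devnote5676/schwartz-values-classifier"):
    """Return the class probabilities for one (value, text) pair.

    ASSUMPTION: repo_id is taken from the page footer, and the checkpoint is
    a standard sequence-classification model. Requires torch and transformers.
    """
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForSequenceClassification.from_pretrained(repo_id)
    model.eval()
    inputs = tokenizer(format_input(value, text), return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0].tolist()
```

For instance, `format_input("power", "staying out late after telling my girlfriend I could be home early")` reproduces the example string shown above.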

Performance

  • Macro F1 score
    • on ValueNet: 0.648
    • on ValueEval: 0.744
    • Combined: 0.759
  • ROC-AUC
    • on ValueNet: 0.736
    • on ValueEval: 0.847
    • Combined: 0.855

Training details

  • Base model: bert-base-uncased
  • Epochs: up to 10, with early stopping if F1 does not improve for 3 consecutive epochs
  • Learning rate: 5e-5, with a warmup ratio of 0.03 followed by linear decay
  • Batch size: 32
  • Upsampled training set to maintain 1:1 balance for pos:neg labels.
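
Two of the settings above, the learning-rate schedule and the 1:1 upsampling, can be sketched in plain Python. This is an illustration under stated assumptions, not the training code for this model: it reads "0.03" as the fraction of total training steps spent on linear warmup, and it assumes that upsampling means randomly duplicating minority-class examples with replacement. The function names are hypothetical.

```python
import math
import random

def lr_at(step, total_steps, peak_lr=5e-5, warmup_ratio=0.03):
    """Linear warmup to peak_lr, then linear decay to zero.

    ASSUMPTION: warmup_ratio is a fraction of total_steps.
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    decay_steps = max(1, total_steps - warmup_steps)
    return peak_lr * max(0.0, (total_steps - step) / decay_steps)

def upsample_to_balance(examples, seed=0):
    """Duplicate minority-class examples until pos:neg is 1:1.

    examples is a list of (text, label) pairs with label in {0, 1}.
    ASSUMPTION: duplicates are drawn at random with replacement.
    """
    pos = [e for e in examples if e[1] == 1]
    neg = [e for e in examples if e[1] == 0]
    if not pos or not neg:
        return list(examples)  # nothing to balance against
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    rng = random.Random(seed)
    extra = rng.choices(minority, k=len(majority) - len(minority))
    balanced = majority + minority + extra
    rng.shuffle(balanced)
    return balanced
```

Upsampling would normally be applied to the training split only, so that evaluation scores reflect the original label distribution.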

References

  • Do Differences in Values Influence Disagreements in Online Discussions? (EMNLP'23)

Model size

  • 109M params, stored as Safetensors with tensor type F32

Dataset used to train devnote5676/schwartz-values-classifier