Schwartz Value Classifier

This classifier predicts whether a given social value is present in a text snippet.

Disclaimer: this is not the official repo published by the authors of the paper, and it may not fully replicate the performance described in the original study.

Value dimensions

This model follows the 10-dimensional categorization of the Schwartz values (link):

security – safety, harmony, and stability of society, of relationships, and of self
power – social status and prestige, control or dominance over people and resources
achievement – personal success through demonstrating competence according to social standards
hedonism – pleasure or sensuous gratification for oneself
stimulation – excitement, novelty and challenge in life
self-direction – independent thought and action—choosing, creating, exploring
universalism – understanding, appreciation, tolerance, and protection for the welfare of all people and for nature
benevolence – preserving and enhancing the welfare of those with whom one is in frequent personal contact (the 'in-group')
conformity – restraint of actions, inclinations, and impulses likely to upset or harm others and violate social expectations or norms
tradition – respect, commitment, and acceptance of the customs and ideas that one's culture or religion provides

Datasets

This model is fine-tuned on two datasets: ValueNet (A New Dataset for Human Value Driven Dialogue System, Qiu et al., 2021) and Touché23-ValueEval (The Touché23-ValueEval Dataset for Identifying Human Values behind Arguments, Mirzakhmedova et al., 2023).

Following the original paper, we convert both datasets into a binary classification task for each dimension.

ValueNet
- A sentence is assigned a positive label (1) if its original label is 1 (positive) or -1 (negative), and a negative label (0) if its original label is 0.
ValueEval
- A sentence is assigned a positive label if the original label vector is marked 1 for that dimension.
- Since the original dataset uses a refined 20-dimension categorization, we map its labels back to 10 dimensions. Each sentence then appears ten times, once for each dimension.
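The conversion rules above can be sketched roughly as follows. This is a minimal illustration, not the preprocessing code used for this model: the function names are hypothetical, and the ValueEval label vector is assumed to have already been mapped from 20 to 10 dimensions (that mapping itself is not reproduced here).

```python
# Hypothetical sketch of the binary label conversion described above.
VALUES = [
    "security", "power", "achievement", "hedonism", "stimulation",
    "self-direction", "universalism", "benevolence", "conformity", "tradition",
]


def binarize_valuenet(original_label: int) -> int:
    """ValueNet: 1 or -1 means the value is present; 0 means it is absent."""
    if original_label not in (-1, 0, 1):
        raise ValueError(f"unexpected ValueNet label: {original_label}")
    return 1 if original_label != 0 else 0


def expand_valueeval(sentence: str, labels_by_value: dict) -> list:
    """ValueEval: emit one (value, sentence, label) row per dimension.

    `labels_by_value` is assumed to map each of the 10 value names to 0 or 1,
    i.e. the 20-to-10 mapping has already been applied upstream.
    """
    return [
        (value, sentence, 1 if labels_by_value.get(value, 0) == 1 else 0)
        for value in VALUES
    ]


rows = expand_valueeval("We should protect forests.", {"universalism": 1})
```

Each sentence from ValueEval therefore contributes ten rows, one per value, while each ValueNet sentence contributes a single row for the value it was annotated with.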

How to use

Start your sentence with a label that indicates which dimension to measure. An example would be:

<power> [SEP] staying out late after telling my girlfriend I could be home early

Please make sure to follow the exact format "<value_name>" at the beginning of the sentence as this is a special token in the tokenizer: any spaces or different formats will not be encoded correctly.
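A small helper can make the input format harder to get wrong. The sketch below is only an illustration: `format_input` is a hypothetical name, and it assumes the special tokens are spelled exactly like the ten value names listed under "Value dimensions" (for example `<self-direction>`), which should be checked against the tokenizer's vocabulary.

```python
# Hypothetical helper that builds the "<value_name> [SEP] sentence" input string.
VALUES = {
    "security", "power", "achievement", "hedonism", "stimulation",
    "self-direction", "universalism", "benevolence", "conformity", "tradition",
}


def format_input(value_name: str, sentence: str) -> str:
    """Prefix the sentence with the value token, with no spaces inside the brackets."""
    if value_name not in VALUES:
        raise ValueError(f"unknown value dimension: {value_name!r}")
    return f"<{value_name}> [SEP] {sentence.strip()}"


example = format_input(
    "power",
    "staying out late after telling my girlfriend I could be home early",
)
```

The resulting string is what gets passed to the tokenizer; to score a sentence on all ten dimensions, build one such string per value.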

Performances

macro F1 score
- on ValueNet: 0.648
- on ValueEval: 0.744
- Combined: 0.759
ROC-AUC
- on ValueNet: 0.736
- on ValueEval: 0.847
- Combined: 0.855
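For readers unfamiliar with the metric, macro F1 is the unweighted mean of the per-class F1 scores. The sketch below computes it for binary labels in plain Python. It assumes the average is taken over the positive and negative classes; the card does not say whether the reported scores were averaged that way or across the ten dimensions, and this is not the evaluation script used to produce the numbers above.

```python
# Minimal macro F1 for binary labels (assumption: averaged over classes 0 and 1).
def f1_for_class(y_true, y_pred, cls):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred, classes=(0, 1)):
    """Unweighted mean of the F1 score of each class."""
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


score = macro_f1([1, 0, 1, 0], [1, 0, 0, 0])
```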

Training details

Base model: bert-base-uncased
Epochs: 10, with early stopping after 3 epochs without an F1 increase
Learning rate: 5e-5, with warmup over the first 0.03 of training steps (warmup ratio) followed by linear decay
Batch size: 32
Upsampling: the training set was upsampled to maintain a 1:1 balance of positive to negative labels.
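Two of the settings above can be illustrated in plain Python. This is a sketch under stated assumptions, not the training code for this model: the function names are hypothetical, the 0.03 warmup value is read as a fraction of total training steps, and upsampling is assumed to mean resampling the minority class with replacement.

```python
import random


def upsample_to_balance(examples, seed=0):
    """Resample the minority class with replacement until pos:neg is 1:1.

    `examples` is a list of (text, label) pairs with label in {0, 1}.
    """
    rng = random.Random(seed)
    pos = [e for e in examples if e[1] == 1]
    neg = [e for e in examples if e[1] == 0]
    if not pos or not neg:
        return list(examples)  # nothing to balance against
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    balanced = majority + minority + extra
    rng.shuffle(balanced)
    return balanced


def lr_at(step, total_steps, base_lr=5e-5, warmup_ratio=0.03):
    """Linear warmup to base_lr, then linear decay to 0 at total_steps."""
    warmup_steps = max(1, round(total_steps * warmup_ratio))
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


data = [("a", 1), ("b", 0), ("c", 0), ("d", 0)]
balanced = upsample_to_balance(data)
```

With 1,000 total steps, this schedule would reach the peak learning rate at step 30 and decay linearly to zero by the final step.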

References

Do Differences in Values Influence Disagreements in Online Discussions? (EMNLP'23) link
