
This is a fine-tuned DeBERTa model for detecting human values in arguments. It is part of the ensemble that was the best-performing system in SemEval-2023 Task 4: Identification of Human Values behind Arguments. The model was trained and tested on a dataset of 9,324 annotated arguments. The full ensemble achieved an F1-score of 0.56 in the competition; this single model achieves an F1-score of 0.55. Code for retraining the ensemble is available in this repo.

Model Usage

This model is built on custom code, so the Inference API cannot be used directly. To use the model, follow the steps below.


from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load the tokenizer and the model; trust_remote_code is required because the model ships custom code
tokenizer = AutoTokenizer.from_pretrained("tum-nlp/Deberta_Human_Value_Detector")
trained_model = AutoModelForSequenceClassification.from_pretrained("tum-nlp/Deberta_Human_Value_Detector", trust_remote_code=True)

example_text = 'We should ban whaling because whales are a species at the risk of extinction'

# Tokenize the argument (inputs longer than 512 tokens are truncated)
encoding = tokenizer.encode_plus(
    example_text,
    add_special_tokens=True,
    max_length=512,
    truncation=True,
    return_token_type_ids=False,
    padding="max_length",
    return_attention_mask=True,
    return_tensors='pt',
)

# Run the forward pass; the custom model returns a dict whose "output" key holds the per-label scores
with torch.no_grad():
    test_prediction = trained_model(encoding["input_ids"], encoding["attention_mask"])
    test_prediction = test_prediction["output"].flatten().numpy()
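At this point test_prediction is a flat array of 20 scores, one per value category, in the order of LABEL_COLUMNS defined below.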

Prediction

To make a prediction, map the outputs to the correct labels. During the competition, a threshold of 0.25 was used to binarize the output.

THRESHOLD = 0.25
# The 20 value categories, in the order of the model's outputs
LABEL_COLUMNS = ['Self-direction: thought','Self-direction: action','Stimulation','Hedonism','Achievement','Power: dominance','Power: resources','Face','Security: personal',
                 'Security: societal','Tradition','Conformity: rules','Conformity: interpersonal','Humility','Benevolence: caring','Benevolence: dependability','Universalism: concern','Universalism: nature','Universalism: tolerance','Universalism: objectivity']

# Print every value whose score reaches the decision threshold
print("Predictions:")
for label, prediction in zip(LABEL_COLUMNS, test_prediction):
    if prediction < THRESHOLD:
        continue
    print(f"{label}: {prediction}")
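To score many arguments at once, the same steps can be combined into a batched helper. The following is a minimal sketch, reusing tokenizer, trained_model, LABEL_COLUMNS, and THRESHOLD from the snippets above; the function name predict_values and the batch_size parameter are illustrative, not part of the released code.

def predict_values(texts, batch_size=16):
    # Hypothetical helper: returns, for each input text, the list of value labels
    # whose score reaches THRESHOLD. Assumes tokenizer, trained_model,
    # LABEL_COLUMNS, and THRESHOLD as defined above.
    results = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        encoding = tokenizer(
            batch,
            add_special_tokens=True,
            max_length=512,
            truncation=True,
            padding="max_length",
            return_token_type_ids=False,
            return_attention_mask=True,
            return_tensors='pt',
        )
        with torch.no_grad():
            output = trained_model(encoding["input_ids"], encoding["attention_mask"])["output"]
        for scores in output.numpy():
            results.append([label for label, score in zip(LABEL_COLUMNS, scores) if score >= THRESHOLD])
    return results

print(predict_values([example_text]))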

Citation

@inproceedings{schroter-etal-2023-adam,
    title = "{A}dam-Smith at {S}em{E}val-2023 Task 4: Discovering Human Values in Arguments with Ensembles of Transformer-based Models",
    author = "Schroter, Daniel  and
      Dementieva, Daryna  and
      Groh, Georg",
    editor = {Ojha, Atul Kr.  and
      Do{\u{g}}ru{\"o}z, A. Seza  and
      Da San Martino, Giovanni  and
      Tayyar Madabushi, Harish  and
      Kumar, Ritesh  and
      Sartori, Elisa},
    booktitle = "Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.semeval-1.74",
    doi = "10.18653/v1/2023.semeval-1.74",
    pages = "532--541",
    abstract = "This paper presents the best-performing approach alias {``}Adam Smith{''} for the SemEval-2023 Task 4: {``}Identification of Human Values behind Arguments{''}. The goal of the task was to create systems that automatically identify the values within textual arguments. We train transformer-based models until they reach their loss minimum or f1-score maximum. Ensembling the models by selecting one global decision threshold that maximizes the f1-score leads to the best-performing system in the competition. Ensembling based on stacking with logistic regressions shows the best performance on an additional dataset provided to evaluate the robustness ({``}Nahj al-Balagha{''}). Apart from outlining the submitted system, we demonstrate that the use of the large ensemble model is not necessary and that the system size can be significantly reduced.",
}