---
language: en
datasets:
- ValueNet
tags:
- regression
- classification
- stance detection
- DeBERTa
license: mit
---

## Study Overview

In this study, we employ Microsoft's DeBERTa v3 model, which adds an embedding for positional indexing, enhancing the architecture used in Qiu et al. (2022). To date, no prior work has empirically validated the impact of this positional index on regression or classification tasks. For training and fine-tuning, we developed a custom trainer that evaluates Mean Squared Error (MSE) as outlined in Qiu et al. (2022). However, that paper does not clearly specify how its sigmoid activation maps onto the label range [-1, 1], where -1 indicates a personal stance against the value in question, 1 indicates alignment with it, and 0 denotes neutrality (the value is not present).

To model this range effectively, we opted for the tanh activation function, whose output naturally spans (-1, 1). We therefore implemented an MSE loss over tanh-activated logits, followed by rounding to the nearest integer label for evaluation purposes.
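Concretely, the loss described above can be sketched as follows. This is an illustrative reconstruction rather than the exact trainer code, and the function name `tanh_mse_loss` is ours:

```python
import torch
import torch.nn.functional as F

def tanh_mse_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """MSE between tanh-squashed regression logits and stance labels in {-1, 0, 1}."""
    preds = torch.tanh(logits.squeeze(-1))  # map the single regression logit into (-1, 1)
    return F.mse_loss(preds, labels.float())
```

At evaluation time, the tanh output is rounded to the nearest label, as shown in the Usage section.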

Using this approach, we demonstrated improvements on the regression task of evaluating the stance in each test scenario. While the overall MSE did not improve significantly, we observed higher accuracy, recall, and precision on the regression tasks. Note that the classification task in Qiu et al. (2022) only determines the presence or absence of the value in question, without considering the specific stance expressed in the text; our regression task, which assesses that stance, should therefore not be compared directly with their classification task.



## Usage

```python

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the model and tokenizer
model_path = 'nharrel/Valuesnet_DeBERTa_v3'
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(model_path)
model.eval()

# Define maximum length for padding and truncation
max_length = 128

def custom_round(x):
    if x >= 0.50:
        return 1
    elif x < -0.50:
        return -1
    else:
        return 0

def predict(text):
    inputs = tokenizer(text, padding='max_length', truncation=True, max_length=max_length, return_tensors='pt')
    with torch.no_grad():
        outputs = model(**inputs)

    # Squash the regression logit into (-1, 1) and extract a Python float,
    # so custom_round receives a scalar rather than a numpy array.
    prediction = torch.tanh(outputs.logits).squeeze().item()
    return custom_round(prediction)

def test_sentence(sentence):
    prediction = predict(sentence)
    label_map = {-1: 'Against', 0: 'Not Present', 1: 'Supports'}
    predicted_label = label_map.get(prediction, 'unknown')
    print(f"Sentence: {sentence}")
    print(f"Predicted Label: {predicted_label}")

# Define Schwartz's 10 values
schwartz_values = [
    "BENEVOLENCE", "UNIVERSALISM", "SELF-DIRECTION", "STIMULATION", "HEDONISM",
    "ACHIEVEMENT", "POWER", "SECURITY", "CONFORMITY", "TRADITION"
]

for value in schwartz_values:
    print("Values stance is: " + value)
    test_sentence(f"[{value}] You are a very pleasant person to be around.")


```
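For scoring many scenarios at once, the same pipeline can be batched. This is a sketch under the same assumptions as above; `round_stance` and `predict_batch` are our own helper names, not part of the released model:

```python
import torch

def round_stance(score: float) -> int:
    """Map a (-1, 1) score to a stance label, mirroring custom_round above."""
    if score >= 0.5:
        return 1
    if score < -0.5:
        return -1
    return 0

def predict_batch(texts, model, tokenizer, batch_size=16, max_length=128):
    """Return a stance label in {-1, 0, 1} for each input text."""
    labels = []
    for i in range(0, len(texts), batch_size):
        batch = tokenizer(texts[i:i + batch_size], padding=True, truncation=True,
                          max_length=max_length, return_tensors='pt')
        with torch.no_grad():
            # Logits have shape (batch, 1); squeeze to a flat list of scores.
            scores = torch.tanh(model(**batch).logits).squeeze(-1)
        labels.extend(round_stance(s) for s in scores.tolist())
    return labels
```

Pass in the `model` and `tokenizer` loaded in the snippet above.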


## Results from Qiu et al. (2022)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/654e897ddd2482b0592bfffa/M3d-1h_6XvUA11kHd-ge1.png)

## Our Results on the Same Dataset Using DeBERTa v3

### Main



![image/png](https://cdn-uploads.huggingface.co/production/uploads/654e897ddd2482b0592bfffa/EqqRYWJKBFY96cy2v_QPD.png)

### Augmented



![image/png](https://cdn-uploads.huggingface.co/production/uploads/654e897ddd2482b0592bfffa/AIlkSlnipt0gnykyTe4_X.png)

### Balanced


![image/png](https://cdn-uploads.huggingface.co/production/uploads/654e897ddd2482b0592bfffa/5XaAMBbZGmAGt5gcjZDdZ.png)

### Classification Tasks


![image/png](https://cdn-uploads.huggingface.co/production/uploads/654e897ddd2482b0592bfffa/dR9oN4kFbPsnimJVV2ePe.png)

## Interpretation

Our classification task can only be compared with the BART model, the best-performing classifier in Qiu et al. (2022). That model only predicts whether the value is present (1) or not (0). Qiu et al. (2022) report their highest accuracy with BART on the main dataset at 67%. Using DeBERTa v3, we achieve an accuracy of 73% (0.7283). DeBERTa's disentangled attention appears to enable a significant improvement in classifying human values.

We also see a very noticeable improvement on the regression tasks. This is a harder task, because the model must first determine whether the value in question is present, then determine whether the agent's perspective supports or opposes that value. Here, DeBERTa v3 outperforms BERT by 4% (65% vs. 61%). Note that we simply replicated Qiu et al. (2022) and did not attempt to improve their design.

## Future Work

We are currently developing an ensemble model that leverages text generation to create multiple stance positions for each value. We hypothesize that if the model can differentiate between different stance positions on the same topic associated with the target value, it can more accurately predict an agent's value stance.

## Acknowledgements

We would like to acknowledge the authors of the ValueNet dataset for their valuable contribution to this work.

Please give them credit if you use this model, because this model would not be possible without their work.

```bibtex
@article{Qiu_Zhao_Li_Lu_Peng_Gao_Zhu_2022, 
    title={ValueNet: A New Dataset for Human Value Driven Dialogue System}, 
    volume={36}, 
    url={https://ojs.aaai.org/index.php/AAAI/article/view/21368}, 
    DOI={10.1609/aaai.v36i10.21368}, 
    abstractNote={Building a socially intelligent agent involves many challenges, one of which is to teach the agent to speak guided by its value like a human. However, value-driven chatbots are still understudied in the area of dialogue systems. Most existing datasets focus on commonsense reasoning or social norm modeling. In this work, we present a new large-scale human value dataset called ValueNet, which contains human attitudes on 21,374 text scenarios. The dataset is organized in ten dimensions that conform to the basic human value theory in intercultural research. We further develop a Transformer-based value regression model on ValueNet to learn the utility distribution. Comprehensive empirical results show that the learned value model could benefit a wide range of dialogue tasks. For example, by teaching a generative agent with reinforcement learning and the rewards from the value model, our method attains state-of-the-art performance on the personalized dialog generation dataset: Persona-Chat. With values as additional features, existing emotion recognition models enable capturing rich human emotions in the context, which further improves the empathetic response generation performance in the EmpatheticDialogues dataset. To the best of our knowledge, ValueNet is the first large-scale text dataset for human value modeling, and we are the first one trying to incorporate a value model into emotionally intelligent dialogue systems. The dataset is available at https://liang-qiu.github.io/ValueNet/.}, 
    number={10}, 
    journal={Proceedings of the AAAI Conference on Artificial Intelligence}, 
    author={Qiu, Liang and Zhao, Yizhou and Li, Jinchao and Lu, Pan and Peng, Baolin and Gao, Jianfeng and Zhu, Song-Chun}, 
    year={2022}, 
    month={Jun.}, 
    pages={11183-11191}
}
```

If you like my model, please give me credit:

```bibtex
@misc{nick_h_2024,
	author       = {Nicholas Harrell},
	title        = {Valuesnet_DeBERTa_v3 (Revision 7214723)},
	year         = 2024,
	url          = {https://huggingface.co/nharrel/Valuesnet_DeBERTa_v3},
	doi          = {10.57967/hf/2873},
	publisher    = {Hugging Face}
}
```