Prompted Emotion Regression Transformer (PERT) fine-tuned on the EmoPair dataset for Valence–Arousal–Dominance (VAD) regression.
The model is a roberta-large encoder with a linear regression head that predicts all three emotion
dimensions simultaneously. Input texts are wrapped in a prompt before tokenization.
Base model: roberta-large
Input format:
Context: '<text>' - Valence: The emotional valence of the context is <mask>.
- Arousal: The level of arousal of the context is <mask>.
- Dominance: The perceived dominance associated with the context is <mask>.
Please predict the missing value for each dimension using the content provided.

Training hyperparameters:

| Parameter | Value |
|---|---|
| Epochs | 15 |
| Batch size | 16 (effective: 64) |
| Learning rate | 3e-05 |
| Max sequence length | N/A |
| Optimizer | AdamW |
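
The table lists a batch size of 16 and an effective batch size of 64, a ratio of 4. The card does not say how that ratio was reached. Gradient accumulation over four micro-batches is one common way to get it (data parallelism across four devices is another). The sketch below shows the accumulation variant only, as an assumption: a toy linear layer and random tensors stand in for the real encoder and dataset, and the MSE loss is also assumed, since the card does not name a loss.

```python
import torch
import torch.nn as nn

MICRO_BATCH = 16                               # batch size from the table
EFFECTIVE_BATCH = 64                           # effective batch size from the table
ACCUM_STEPS = EFFECTIVE_BATCH // MICRO_BATCH   # 4 (assumed mechanism)

torch.manual_seed(0)
toy_model = nn.Linear(8, 3)                    # toy stand-in for encoder + 3-output head
optimizer = torch.optim.AdamW(toy_model.parameters(), lr=3e-5)
loss_fn = nn.MSELoss()                         # assumed loss, not stated in the card

# Eight random micro-batches of (features, VAD targets).
batches = [(torch.randn(MICRO_BATCH, 8), torch.rand(MICRO_BATCH, 3)) for _ in range(8)]

optimizer_steps = 0
optimizer.zero_grad()
for i, (x, y) in enumerate(batches, start=1):
    # Scale the loss so the accumulated gradient is an average over 64 examples.
    loss = loss_fn(toy_model(x), y) / ACCUM_STEPS
    loss.backward()                            # gradients add up in .grad
    if i % ACCUM_STEPS == 0:
        optimizer.step()                       # one update per 4 micro-batches
        optimizer.zero_grad()
        optimizer_steps += 1
```

With eight micro-batches of 16 examples, the loop performs two optimizer updates, each based on 64 examples.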

Evaluation results:

| Metric | Valence | Arousal | Dominance | Overall |
|---|---|---|---|---|
| MAE | 0.1527 | 0.1773 | 0.2338 | 0.1879 |
| R² | 0.6579 | 0.7362 | 0.5273 | 0.6404 |
| Pearson | 0.8338 | 0.8616 | 0.7313 | 0.9864 |
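
For readers who want to compute the same kind of per-dimension scores on their own predictions, here is a minimal sketch using the standard definitions of MAE, R² and Pearson correlation. It is not the authors' evaluation script; `column_metrics` and `vad_metrics` are hypothetical names and the sample values are made up for illustration. The Overall values for MAE and R² in the table match the mean of the three per-dimension values up to rounding. The aggregation behind the overall Pearson value is not stated, so the sketch reports per-dimension scores only.

```python
import math

DIMS = ("valence", "arousal", "dominance")


def column_metrics(true, pred):
    """MAE, R^2 and Pearson r for one emotion dimension."""
    n = len(true)
    mean_t = sum(true) / n
    mean_p = sum(pred) / n
    mae = sum(abs(t - p) for t, p in zip(true, pred)) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(true, pred))   # residual sum of squares
    ss_tot = sum((t - mean_t) ** 2 for t in true)            # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(true, pred))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    pearson = cov / math.sqrt(ss_tot * var_p)
    return {"mae": mae, "r2": r2, "pearson": pearson}


def vad_metrics(y_true, y_pred):
    """Per-dimension metrics for rows of [valence, arousal, dominance]."""
    return {
        name: column_metrics([row[i] for row in y_true], [row[i] for row in y_pred])
        for i, name in enumerate(DIMS)
    }


# Made-up gold labels and predictions, three examples each.
y_true = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.9, 0.2]]
y_pred = [[0.2, 0.2, 0.3], [0.4, 0.6, 0.5], [0.6, 0.9, 0.3]]
scores = vad_metrics(y_true, y_pred)
print(scores)
```

The functions assume each dimension has non-constant gold labels and predictions; otherwise R² and Pearson are undefined and the division fails.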

Usage:

import torch
import torch.nn as nn
from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer, RobertaModel


class RoBERTaForVAD(nn.Module):
    """RoBERTa encoder with a linear head that outputs [valence, arousal, dominance]."""

    def __init__(self, base_model_name):
        super().__init__()
        self.base_model = RobertaModel.from_pretrained(base_model_name)
        self.dropout = nn.Dropout(p=0.1)
        self.regressor = nn.Linear(self.base_model.config.hidden_size, 3)

    def forward(self, input_ids, attention_mask):
        outputs = self.base_model(input_ids=input_ids, attention_mask=attention_mask)
        # Use the hidden state of the first token (<s>) as the sequence representation.
        cls = self.dropout(outputs.last_hidden_state[:, 0, :])
        return self.regressor(cls)


repo_id = "edsi-umd/PERT-EmoPair"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = RoBERTaForVAD("roberta-large")

# Download the checkpoint from the HF Hub
# (assumes it is stored as pytorch_model.bin in this repository).
weights_path = hf_hub_download(repo_id=repo_id, filename="pytorch_model.bin")
state = torch.load(weights_path, map_location="cpu", weights_only=True)
model.load_state_dict(state["model_state_dict"])
model.eval()  # disables dropout for inference

PROMPT = (
    "- Valence: The emotional valence of the context is <mask>. "
    "- Arousal: The level of arousal of the context is <mask>. "
    "- Dominance: The perceived dominance associated with the context is <mask>. "
    "Please predict the missing value for each dimension using the content provided."
)

text = "I just got a promotion at work!"
prompted = f"Context: '{text}' {PROMPT}"
enc = tokenizer(prompted, return_tensors="pt", truncation=True, max_length=256)

with torch.no_grad():
    vad = model(**enc).squeeze(0).tolist()

print(dict(zip(["valence", "arousal", "dominance"], vad)))
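
The usage code scores a single text. For several texts, the same prompt wrapping can be applied to each one and the batch padded to a common length. The sketch below assumes `tokenizer` and `model` are the objects loaded in the usage code and that `model.eval()` has been called. `wrap_prompt` and `predict_vad` are hypothetical helper names, not part of the released code, and the default `batch_size` of 8 is an arbitrary choice.

```python
import torch

PROMPT = (
    "- Valence: The emotional valence of the context is <mask>. "
    "- Arousal: The level of arousal of the context is <mask>. "
    "- Dominance: The perceived dominance associated with the context is <mask>. "
    "Please predict the missing value for each dimension using the content provided."
)


def wrap_prompt(text):
    """Wrap raw text in the prompt template the model was fine-tuned with."""
    return f"Context: '{text}' {PROMPT}"


def predict_vad(texts, tokenizer, model, batch_size=8, max_length=256):
    """Return one {valence, arousal, dominance} dict per input text."""
    rows = []
    for start in range(0, len(texts), batch_size):
        batch = [wrap_prompt(t) for t in texts[start:start + batch_size]]
        # Right padding (the RoBERTa default) keeps <s> at position 0,
        # which is the token the regression head reads.
        enc = tokenizer(
            batch,
            return_tensors="pt",
            padding=True,
            truncation=True,
            max_length=max_length,
        )
        with torch.no_grad():
            out = model(input_ids=enc["input_ids"], attention_mask=enc["attention_mask"])
        rows.extend(out.tolist())
    return [dict(zip(("valence", "arousal", "dominance"), row)) for row in rows]
```

A call such as `predict_vad(["I just got a promotion at work!", "The meeting was cancelled again."], tokenizer, model)` returns a list with one dictionary per text, in input order.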

If you use this model, please cite the EmoPair paper and dataset using the most recent citation listed in the GitHub repository: https://github.com/EDSI-UMD-College-Park/EMOPAIR