Emotional GPT-2 Large

emotional-gpt2-large is a GPT-2 Large causal language model fine-tuned on DailyDialog-derived data for emotion-conditioned dialogue generation.

GitHub repository: Mario-RC/emotional-gpt

Model Details

  • Base model: gpt2-large
  • Architecture: GPT2LMHeadModel
  • Task: text generation
  • Context length: 1024 tokens
  • Parameters: 774.0M
  • Evaluation perplexity: 7.4115
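
Perplexity for a causal language model is conventionally the exponential of the mean per-token cross-entropy loss, measured in nats. The card does not state how the figure above was computed, so the sketch below assumes that convention and converts the reported perplexity back into an implied loss as a quick sanity check:

```python
import math

# Reported evaluation perplexity from the model card.
eval_perplexity = 7.4115

# Assumption: perplexity = exp(mean per-token cross-entropy, in nats).
eval_loss = math.log(eval_perplexity)

print(f"implied eval loss: {eval_loss:.4f} nats/token")
```

Under that assumption, the implied evaluation loss is roughly 2.0 nats per token.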

Training

The fine-tuning run used the following setup:

  • Framework: Hugging Face Transformers
  • Data: data/gpt-dialogues/train.txt (training) and data/gpt-dialogues/dev.txt (evaluation), built from DailyDialog CSV resources
  • Epochs: 4
  • Train/eval batch size per GPU: 1 / 1
  • Gradient accumulation steps: 6
  • Effective training batch size: 6
  • Learning rate: 1e-5
  • Max gradient norm: 1.0
  • Objective: line-by-line causal language modeling
  • Seed: 42
  • Checkpointing/logging: every 5000 optimizer steps; last checkpoint kept
  • Memory optimization: gradient checkpointing enabled
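
The effective batch size follows from the per-GPU batch size and the gradient accumulation steps. The sketch below reproduces that arithmetic and shows how the listed values might map onto Hugging Face TrainingArguments keyword names. Two things here are assumptions rather than facts from the card: the single-GPU setting (num_gpus = 1), which is inferred from the reported effective batch size, and the use of TrainingArguments at all, since the repository's actual training script may be organized differently.

```python
# Hyperparameters as listed in the model card.
per_gpu_train_batch_size = 1
gradient_accumulation_steps = 6
num_gpus = 1  # assumption: inferred from the reported effective batch size of 6

# Number of examples contributing to each optimizer step.
effective_batch_size = per_gpu_train_batch_size * gradient_accumulation_steps * num_gpus

# Hypothetical mapping onto TrainingArguments keyword names (not the author's script).
training_kwargs = {
    "num_train_epochs": 4,
    "per_device_train_batch_size": per_gpu_train_batch_size,
    "per_device_eval_batch_size": 1,
    "gradient_accumulation_steps": gradient_accumulation_steps,
    "learning_rate": 1e-5,
    "max_grad_norm": 1.0,
    "seed": 42,
    "save_steps": 5000,
    "logging_steps": 5000,
    "save_total_limit": 1,  # keep only the most recent checkpoint
    "gradient_checkpointing": True,
}

print(effective_batch_size)  # 6
```

If the run did go through the Trainer API, these values could be passed as TrainingArguments(output_dir=..., **training_kwargs), where output_dir is a placeholder chosen by the user.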

Training Format

Training examples use adjacent DailyDialog utterance pairs with explicit source and target emotion labels:

<bos><source_emotion>source utterance<sep><target_emotion>target utterance<|endoftext|>
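
As an illustration of the format above, the following sketch turns one dialogue into training lines by pairing each utterance with the one that follows it. The function name build_training_lines and the sample dialogue are hypothetical, and the repository's own preprocessing may differ in details such as whitespace handling.

```python
from typing import List, Tuple


def build_training_lines(dialogue: List[Tuple[str, str]]) -> List[str]:
    """Turn one dialogue into training lines from adjacent utterance pairs.

    `dialogue` is a list of (emotion_label, utterance) tuples in turn order.
    """
    lines = []
    # zip(dialogue, dialogue[1:]) yields each utterance with its successor.
    for (src_emo, src_text), (tgt_emo, tgt_text) in zip(dialogue, dialogue[1:]):
        lines.append(
            f"<bos><{src_emo}>{src_text.strip()}"
            f"<sep><{tgt_emo}>{tgt_text.strip()}<|endoftext|>"
        )
    return lines


# Illustrative dialogue (made up, not taken from DailyDialog).
dialogue = [
    ("fear", "I just started a new job and I am a bit nervous."),
    ("happiness", "That is great news, you will do well!"),
    ("no emotion", "Thanks. I start on Monday."),
]

for line in build_training_lines(dialogue):
    print(line)
```

A dialogue with n utterances yields n - 1 lines, one per adjacent pair, which fits the line-by-line training objective.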

Prompt Format

At generation time, the prompt should include the source utterance and the desired target emotion:

<bos><source_emotion>source utterance<sep><target_emotion>

Prompt and training tags:

  • <bos> marks the beginning of one formatted dialogue example.
  • <source_emotion> is a placeholder for one emotion label describing the input/source utterance, for example <fear>.
  • source utterance is the user/input text.
  • <sep> separates the source side from the response side.
  • <target_emotion> is a placeholder for the emotion you want the generated response to follow, for example <happiness>.
  • target utterance is the response text generated by the model.
  • <|endoftext|> marks the end of one example. GPT-2 uses this as its native end-of-text/eos token, and generation can stop when this token is produced.
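
To make the role of each tag concrete, here is a minimal sketch that splits a formatted line back into its four fields with a regular expression. The pattern and the parse_example helper are illustrative only and are not part of the model or its repository; the pattern assumes utterances do not themselves contain the literal text <sep>.

```python
import re
from typing import Dict, Optional

# One formatted example: <bos><emotion>source<sep><emotion>target<|endoftext|>
# The trailing <|endoftext|> is optional so that bare prompts also parse.
EXAMPLE_RE = re.compile(
    r"^<bos><(?P<source_emotion>[^<>]+)>(?P<source>.*?)"
    r"<sep><(?P<target_emotion>[^<>]+)>(?P<target>.*?)(?:<\|endoftext\|>)?$",
    re.DOTALL,
)


def parse_example(line: str) -> Optional[Dict[str, str]]:
    """Return the four fields of a formatted line, or None if it does not match."""
    match = EXAMPLE_RE.match(line.strip())
    return match.groupdict() if match else None


parsed = parse_example(
    "<bos><fear>I am nervous.<sep><happiness>You will do great!<|endoftext|>"
)
print(parsed)
```

For a prompt that ends right after the target emotion tag, the target field comes back as an empty string.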

Emotion conditioning: replace <source_emotion> and <target_emotion> in the template with one of the model's literal emotion tokens in each position.

Supported emotion labels:

  • <no emotion>
  • <anger>
  • <disgust>
  • <fear>
  • <happiness>
  • <sadness>
  • <surprise>

For example:

<bos><fear>I just started a new job and I am a bit nervous.<sep><happiness>

This means: the source utterance expresses fear, and the requested response should be conditioned toward happiness.
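
Building prompts by hand makes it easy to mistype a tag, so a small helper can assemble the string and reject labels outside the supported list. This is a minimal sketch; build_prompt and SUPPORTED_EMOTIONS are hypothetical names introduced here, not part of the model's code.

```python
# Emotion labels from the supported list above, without the angle brackets.
SUPPORTED_EMOTIONS = (
    "no emotion", "anger", "disgust", "fear", "happiness", "sadness", "surprise",
)


def build_prompt(source_emotion: str, source_utterance: str, target_emotion: str) -> str:
    """Assemble a generation prompt in the documented format."""
    for emotion in (source_emotion, target_emotion):
        if emotion not in SUPPORTED_EMOTIONS:
            raise ValueError(f"unsupported emotion label: {emotion!r}")
    return f"<bos><{source_emotion}>{source_utterance.strip()}<sep><{target_emotion}>"


prompt = build_prompt(
    "fear", "I just started a new job and I am a bit nervous.", "happiness"
)
print(prompt)
# <bos><fear>I just started a new job and I am a bit nervous.<sep><happiness>
```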

How to Use

from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "mario-rc/emotional-gpt2-large"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# Fall back to the EOS token for padding if the tokenizer defines no pad token.
pad_token_id = tokenizer.pad_token_id
if pad_token_id is None:
    pad_token_id = tokenizer.eos_token_id
model.config.pad_token_id = pad_token_id

# Prompt: source emotion and utterance, then the desired target emotion.
prompt = "<bos><fear>I just started a new job and I am a bit nervous.<sep><happiness>"
inputs = tokenizer(prompt, return_tensors="pt")

# Sample a response; generation stops at <|endoftext|> or after 80 new tokens.
outputs = model.generate(
    **inputs,
    do_sample=True,
    max_new_tokens=80,
    temperature=0.8,
    top_p=0.95,
    pad_token_id=pad_token_id,
    eos_token_id=tokenizer.eos_token_id,
)

# Keep only the newly generated tokens and cut at the first EOS token.
generated = outputs[0][inputs["input_ids"].shape[-1]:]
response = tokenizer.decode(generated, skip_special_tokens=False)
response = response.split(tokenizer.eos_token, 1)[0].strip()

# Strip a leading emotion tag in case the model repeats one.
emotion_labels = [
    "<no emotion>",
    "<anger>",
    "<disgust>",
    "<fear>",
    "<happiness>",
    "<sadness>",
    "<surprise>",
]

for label in emotion_labels:
    if response.startswith(label):
        response = response[len(label):].strip()
        break

print(response)

Limitations

The model is intended for experimental dialogue/text generation. Generated text may be inaccurate, biased, repetitive, or emotionally inappropriate, and should be reviewed before user-facing use.
