HumorGen
An Open-Weight Ecosystem for Computational Humor Generation
Carnegie Mellon University
View on Hugging Face · Read the Paper · CLEF 2026 Paper
1. The Problem: Why are LLMs so unfunny?
Large Language Models (LLMs) possess vast amounts of world knowledge, yet they notoriously struggle to generate genuinely funny content. The reason isn't a lack of data; it's a fundamental flaw in how they are trained.
Standard LLM training minimizes perplexity—it teaches the model to predict the most probable (i.e., average) next token. However, in the context of humor, "average" means boring, generic, or cliché. Comedy relies on subverting expectations. It lives at the extreme edges of the probability distribution: in the unexpected turn of phrase, the precise word that collapses two meanings at once, or the observation that highlights the absurdity of a mundane situation.
When you prompt a standard LLM to "be funny," it suffers from Mode Collapse in creative generation. It defaults to the safest, most statistically common "joke-like" structures, producing text that scans as a joke but completely fails to land.
2. Our Approach: The Cognitive Synergy Framework (CSF)
To solve this, we cannot simply tell a model to "be funny." Instead, we introduce the Cognitive Synergy Framework (CSF), which represents the first application of Mixture-of-Thought (MoT) specifically designed for creative divergence.
- Traditional MoT: Typically used for logic or math to converge on one correct answer.
- Cognitive Synergy: Used for creativity to diverge into many distinct, valid humorous answers.
CSF structures humor generation not as a single task, but as an ensemble of specific cognitive processes. We force the model to adopt distinct "Cognitive Personas." Each persona acts as a Latent Expert, grounded in a specific psychological theory of humor. By forcing generation through these specific lenses, we push the model away from the generic mean and into those creative "tails" of the distribution.
The Six Latent Experts
Instead of one generic generation path, CSF produces six parallel candidates per input—one from each persona. This guarantees a diverse pool of comedic angles by construction.
| Persona | Psychological Theory | Comedic Lens & Mechanism |
|---|---|---|
| The Neurotic | Superiority (Self-Deprecation) | Focuses on anxiety, overthinking, and personal insecurity. Finds humor in vulnerability and self-inflicted struggles. |
| The Cynic | Superiority (Mockery) | Cuts through polite society. Focuses on hypocrisy, social contradictions, and the dark, unspoken truths we all secretly acknowledge. |
| The Observer | Incongruity (Relatability) | The "Seinfeld" lens. Highlights the absurdity hiding in mundane, everyday life and unspoken social rules. |
| The Wordsmith | Incongruity (Linguistic) | Treats jokes as linguistic puzzles. Focuses on phonology, double entendres, puns, and structural ambiguity. |
| The Optimist | Benign Violation | Focuses on wholesome misinterpretation. Finds a ridiculous, forced silver lining where there absolutely shouldn't be one. |
| The Absurdist | Incongruity (Surprise) | Abandons reality. Focuses on non-sequiturs, surreal logic, and completely violating causal reasoning. |
The Training Pipeline: Baking Synergy into the Weights
Our framework is a full training pipeline designed to distill this multi-persona capability into a compact open-weight model.
- Phase 1: MoT Generation (Teacher) — We use a powerful teacher LLM to generate six independent joke candidates (one per persona) for a given prompt (e.g., a news headline).
- Phase 2: Structure Distillation (SFT) — A compact 7B student model is trained on this diverse pool. The crucial lesson here is that a single input can support multiple, entirely valid, and distinct comedic interpretations.
- Phase 3: Quality Alignment (DPO / O-GRPO) — We don't just train on the "best" joke. We use HumorRank—a Bradley-Terry pairwise evaluation framework—to rank the six persona outputs against each other. Through Direct Preference Optimization (DPO) and Offline Group Relative Policy Optimization (O-GRPO), the model learns: "In this specific context, The Cynic's approach was funnier than The Wordsmith's."
3. Extending to Constrained Humor: CLEF 2026 JOKER Task 4
The Cognitive Synergy Framework isn't just for open-ended headline jokes. To prove its robustness, we extended it to a significantly harder problem: Constrained Multilingual Pun Generation, specifically for the CLEF 2026 JOKER Task 4.
The Challenge: The task requires the model to generate a pun-brief—a single sentence that simultaneously satisfies a specific pun word and two distinct semantic senses. For example, given the word "bark," sense 1 ("the sound a dog makes"), and sense 2 ("the outer covering of a tree"), the model must weave both meanings into a natural, genuinely funny sentence. This requires strict lexical constraint adherence alongside creative humor generation.
The Solution: To scale CSF to this multilingual, constrained environment, we utilized a two-stage cross-lingual LoRA curriculum:
- Domain-Agnostic Pretraining (HumorGen Base): We first trained foundational humor models at the 14B and 32B scale on the SemEval MWAHAHA corpus across all languages (English, French, Spanish). These serve as incredibly strong, general-purpose multilingual humor generators.
- Task-Specific Adaptation (JOKER Suite): We branched from these foundational bases, applying per-language LoRA fine-tuning specifically on the CLEF-JOKER constraint data.
In our CLEF study, we explicitly test whether the persona-based CSF prompting (API teacher) outperforms a single "creativity-first" prompt, and how well our task-adapted open weights (HumorGen-JOKER 14B/32B) compete against those frontier models.
4. Model Collection
All models are released as LoRA adapters on Hugging Face. View the full collection here: Jayi2424/HumorGen.
Core HumorGen Suite — 7B
Open-ended headline humor generation. This suite represents a full ablation across post-training paradigms (SFT, DPO, O-GRPO), tested with and without explicit Chain-of-Thought (CoT) reasoning traces. Base model: Qwen2.5-7B-Instruct.
| Model | Training Stage | CoT | Hugging Face Link |
|---|---|---|---|
| HumorGen_SFT_7B | Supervised Fine-Tuning | — | Jayi2424/HumorGen_SFT_7B |
| HumorGen_SFT_Think_7B | Supervised Fine-Tuning | Yes | Jayi2424/HumorGen_SFT_Think_7B |
| HumorGen_DPO_7B | Direct Preference Optimization | — | Jayi2424/HumorGen_DPO_7B |
| HumorGen_DPO_Think_7B | Direct Preference Optimization | Yes | Jayi2424/HumorGen_DPO_Think_7B |
| HumorGen_GRPO_7B | Offline GRPO | — | Jayi2424/HumorGen_GRPO_7B |
| HumorGen_GRPO_Think_7B | Offline GRPO | Yes | Jayi2424/HumorGen_GRPO_Think_7B |
Multilingual Base Models — 14B & 32B
Domain-agnostic humor pretraining on the SemEval MWAHAHA corpus across all languages. Released independently as general-purpose multilingual humor generators and used as the foundation for the JOKER suite. Base models: Qwen3-14B / Qwen3-32B.
| Model | Scale | Hugging Face Link |
|---|---|---|
| HumorGen_SFT_14B | 14B | Jayi2424/HumorGen_SFT_14B |
| HumorGen_SFT_32B | 32B | Jayi2424/HumorGen_SFT_32B |
CLEF 2026 JOKER Suite — Constrained Pun Generation
Task: dual-sense pun-brief generation. Two-stage cross-lingual LoRA curriculum: multilingual pretraining → per-language JOKER fine-tuning. Available in English, French, and Spanish.
| Model | Language | Scale | Hugging Face Link |
|---|---|---|---|
| HumorGen_JOKER_EN_14B | English | 14B | Jayi2424/HumorGen_JOKER_EN_14B |
| HumorGen_JOKER_EN_32B | English | 32B | Jayi2424/HumorGen_JOKER_EN_32B |
| HumorGen_JOKER_FR_14B | French | 14B | Jayi2424/HumorGen_JOKER_FR_14B |
| HumorGen_JOKER_FR_32B | French | 32B | Jayi2424/HumorGen_JOKER_FR_32B |
| HumorGen_JOKER_ES_14B | Spanish | 14B | Jayi2424/HumorGen_JOKER_ES_14B |
| HumorGen_JOKER_ES_32B | Spanish | 32B | Jayi2424/HumorGen_JOKER_ES_32B |
5. Papers
HumorGen: Cognitive Synergy for Humor Generation in Large Language Models via Persona-Based Distillation
Ajayi, E. et al. · arXiv 2026
arxiv.org/abs/2604.09629
Cross-Lingual Cognitive Synergy for Constrained Humor Generation in LLMs: SaLT Lab at the CLEF 2026 JOKER Track
Ajayi, E. et al. · Working Notes of CLEF 2026
edwardajayi.github.io/assets/papers/HumorGen-JOKER.pdf
6. Citation
@misc{ajayi2026humorgen,
title = {HumorGen: Cognitive Synergy for Humor Generation in Large Language
Models via Persona-Based Distillation},
author = {Ajayi, Edward and others},
year = {2026},
eprint = {2604.09629},
archivePrefix = {arXiv},
primaryClass = {cs.CL},
url = {https://arxiv.org/abs/2604.09629}
}
@inproceedings{ajayi2026joker,
title = {Cross-Lingual Cognitive Synergy for Constrained Humor Generation in LLMs: SaLT Lab at the CLEF 2026 JOKER Track},
author = {Ajayi, Edward and others},
booktitle = {Working Notes of CLEF 2026},
year = {2026},
url = {https://edwardajayi.github.io/assets/papers/HumorGen-JOKER.pdf}
}
Carnegie Mellon University