
Opinerium: Fine-Tuned flan-T5 for Generating Subjective Inquiries

Abstract

This model is the culmination of extensive research into generating subjective inquiries to enhance public interaction with media content. Our approach diverges from the norm by shifting the focus from objective to subjective question generation, aiming to elicit personal preferences and opinions based on a given text. We fine-tuned flan-T5 and GPT-3 models for sequence-to-sequence generation and evaluated them rigorously against a custom dataset of 40,000 news articles supplemented with human-generated questions. The comparative analysis highlights Opinerium's superiority as measured by a suite of lexical and semantic metrics.

Introduction

Opinerium is a groundbreaking model fine-tuned from the flan-T5-large architecture, designed to generate poll or opinion-based questions from textual content. This innovation aims to foster public engagement by inviting personal perspectives on various topics, primarily focusing on news media posts. Unlike traditional models that target factual questions with definitive answers, Opinerium delves into the realm of subjective questioning, enabling a deeper interaction with trending media topics.

Model Training

Opinerium was fine-tuned from the flan-T5 variants available on the Hugging Face platform, specifically for the task of generating subjective questions. The fine-tuning process was designed to address the unique challenges of subjective question generation, such as capturing nuances in tone, understanding context deeply, and producing engaging, open-ended questions that prompt personal reflection.

Training Details

The training was conducted on a Tesla P100 (16 GB) GPU using the Hugging Face Transformers library with PyTorch. We adopted a comprehensive approach to hyperparameter optimization, exploring various configurations to balance model performance and computational efficiency. Key training parameters included (a minimal fine-tuning sketch follows this list):

  • Batch size: 32 for training, ensuring robust gradient estimates while maintaining a manageable computational load.
  • Gradient accumulation: Set to 64, this technique allowed us to effectively simulate a larger batch size, enhancing the stability and quality of the model updates.
  • Learning rate: Initially set to 3e-4, with careful adjustments based on performance metrics to ensure steady and effective learning.
  • Optimizer: AdaFactor was chosen for its efficiency and effectiveness in handling sparse data and adapting learning rates dynamically.
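The snippet below is a minimal fine-tuning sketch that reflects the parameters above using the Transformers Seq2SeqTrainer. The dataset files, the column names ("text", "question"), and the number of epochs are illustrative assumptions and are not specified in this card.

```python
# Minimal fine-tuning sketch reflecting the hyperparameters listed above.
# Dataset paths and column names ("text", "question") are assumptions for illustration.
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model_name = "google/flan-t5-large"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Hypothetical article/question pairs; replace with your own data.
dataset = load_dataset("csv", data_files={"train": "train.csv", "validation": "val.csv"})

prompt = "generate an opinion-based question from the text: "

def preprocess(batch):
    # Prefix each article with the task prompt and tokenize the paired question as the target.
    inputs = tokenizer([prompt + t for t in batch["text"]], max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["question"], max_length=64, truncation=True)
    inputs["labels"] = labels["input_ids"]
    return inputs

tokenized = dataset.map(preprocess, batched=True, remove_columns=dataset["train"].column_names)

args = Seq2SeqTrainingArguments(
    output_dir="opinerium-flan-t5-large",
    per_device_train_batch_size=32,   # batch size from the list above
    gradient_accumulation_steps=64,   # simulates a larger effective batch
    learning_rate=3e-4,
    optim="adafactor",                # AdaFactor optimizer
    num_train_epochs=3,               # assumption: epoch count is not stated in the card
    predict_with_generate=True,
    evaluation_strategy="epoch",
    save_strategy="epoch",
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```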

Dataset

The training dataset comprised 40,000 news articles spanning a wide array of topics, ensuring the model's exposure to diverse content and question formats. Each article was paired with binary subjective questions, providing a rich ground for learning how to formulate inquiries that elicit personal opinions. The multilingual nature of the original articles added an extra layer of complexity, which was mitigated by translating all content into English to leverage the extensive training data available for English-centric models.
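As an illustration of how an article and its paired subjective question can be turned into a training example, the sketch below assembles a single (source, target) pair. The field names and record layout are assumptions for illustration; the actual dataset schema is not described in this card.

```python
# Hypothetical record layout; the real dataset schema may differ.
record = {
    "title": "Standard charging socket for all devices from 2024",
    "body": "Negotiators from the EU states and the European Parliament agreed on USB-C ...",
    "poll_question": "Do you think a universal charging socket helps sustainability?",
}

# Source text carries the task prompt; the paired poll question is the target.
source = ("generate an opinion-based question from the text: "
          f"{record['title']}. {record['body']}")
target = record["poll_question"]
```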

Usage

To generate subjective inquiries with Opinerium, prefix your input text with the prompt "generate an opinion-based question from the text:". This signals the model to analyze the content and craft a question designed to engage users in sharing their perspectives (see the sketch after the example below).

Example

Title: Standard charging socket for all devices from 2024
Context: Electrical devices will have to have a standard charging socket in the EU from mid-2024. Negotiators from the EU states and the European Parliament agreed on USB-C as the standard charging socket, in order to prevent the thousands of tons of electrical waste caused by charging sockets alone.
Poll: Do you think a universal charging socket helps sustainability?
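The example above can be reproduced with a short Transformers snippet. The repository id below is a placeholder, since this card does not state the exact model id, and the generation settings are illustrative.

```python
# Minimal inference sketch; the repository id is a placeholder.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "<username>/opinerium"  # placeholder repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = ("Electrical devices will have to have a standard charging socket in the EU "
        "from mid-2024. Negotiators from the EU states and the European Parliament "
        "agreed on USB-C as the standard charging socket.")
prompt = "generate an opinion-based question from the text: " + text

inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512)
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```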

Conclusion

Opinerium stands at the forefront of subjective question generation, offering a novel tool for engaging with content across multiple domains. By fostering the creation of opinion-based inquiries, it encourages more interactive and thought-provoking discussions, contributing to a richer public discourse.


Hugging Face Web UI Usage

For optimal results when using Opinerium on the Hugging Face web UI, prefix your text with:

generate an opinion-based question from the text:

This prompt is essential for directing the model to generate subjective questions from your input.
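If you prefer to call the hosted Inference API directly rather than the web UI, a minimal request could look like the sketch below. The repository id and access token are placeholders for illustration.

```python
# Querying the hosted Inference API; repository id and token are placeholders.
import requests

API_URL = "https://api-inference.huggingface.co/models/<username>/opinerium"
headers = {"Authorization": "Bearer <your-hf-token>"}

payload = {
    "inputs": ("generate an opinion-based question from the text: "
               "Electrical devices will have to have a standard charging socket "
               "in the EU from mid-2024.")
}
response = requests.post(API_URL, headers=headers, json=payload)
print(response.json())
```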

