Spaces:
Runtime error
Runtime error
A newer version of the Gradio SDK is available:
5.29.0
metadata
title: Controlled Text Summarization
emoji: π
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 5.20.1
app_file: main.py
pinned: false
license: mit
Creative Text Summarization with Style Control
A machine learning system that summarizes text in different stylistic variations (formal, informal, humorous, poetic) while preserving the content.
Overview
This project creates an AI-powered text summarization system that not only condenses text but adapts its output to different stylistic preferences. It uses transformer models fine-tuned with style-specific summaries to generate summaries that match requested styles while maintaining accuracy.
Features
- Multiple Summary Styles: Generate summaries in formal, informal, humorous, or poetic styles
- Pre-trained Models: Based on BART and other transformer architectures
- User-friendly Interface: Simple Gradio UI for interactive summary generation
- Evaluation Metrics: ROUGE and BLEU scores to evaluate summary quality
Installation
- Clone this repository:
git clone https://github.com/AriachAmine/controlled-text-summarization.git
cd controlled-text-summarization
- Install dependencies:
pip install -r requirements.txt
- Set up the Gemini API key:
# Linux/MacOS
export GEMINI_API_KEY="your-api-key-here"
# Windows
set GEMINI_API_KEY="your-api-key-here"
Usage
Running the Application
python main.py
This will:
- Load the base model or a fine-tuned model (if available)
- Prepare a dataset with stylized summaries (if needed)
- Fine-tune the model on the prepared dataset (if no fine-tuned model exists)
- Launch the Gradio interface for interactive summarization
Using the Gradio Interface
- Enter the text you want to summarize in the text box
- Select your desired summary style from the dropdown (formal, informal, humorous, poetic)
- Click "Submit" to generate the stylized summary
How It Works
- Base Model: Starts with a pre-trained text summarization model (BART)
- Style Training: Fine-tunes the model on summaries with specific styles
- Style Control: Uses style tokens to control output style during generation
- Evaluation: Measures quality using ROUGE and BLEU metrics
Project Structure
controlled-text-summarization/
βββ main.py # Main script to run the application
βββ model.py # Model loading and summarization functions
βββ data.py # Data preparation and processing
βββ evaluation.py # Metrics for evaluating summaries
βββ ui.py # Gradio interface
βββ requirements.txt # Project dependencies
βββ summarization_model/ # Directory for fine-tuned models (created after training)
Dependencies
- torch, transformers: For model loading and fine-tuning
- gradio: For the user interface
- google-generativeai: For generating style-specific training data
- datasets, rouge_score, nltk: For data handling and evaluation
Example
Input:
Scientists have discovered a new species of deep-sea fish that can withstand extreme pressure. The fish, found at depths of over 8,000 meters, has unique adaptations including specialized cell membranes and pressure-resistant proteins. This discovery may lead to new applications in biotechnology and materials science.
Output (Formal Style):
Researchers have identified a novel deep-sea fish species capable of surviving extreme pressures at depths exceeding 8,000 meters. The species exhibits specialized adaptations in cell membrane structure and pressure-resistant proteins, potentially offering valuable insights for biotechnology and materials science applications.
Output (Humorous Style):
Talk about a fish out of water... or rather, a fish VERY deep IN water! Scientists just found a super fish that laughs in the face of crushing ocean pressure. This deep-sea champion, chilling at 8,000 meters down, has fancy cell membranes and proteins that basically say "pressure, what pressure?" Scientists are already dreaming up ways to copy these deep-sea survival tricks for cool new tech!
Future Improvements
- Add more styles (technical, narrative, etc.)
- Implement user feedback collection to improve models
- Add style strength control (slightly humorous vs. very humorous)
- Create a web API for integration with other applications