---
title: PerplexityViewer
emoji: 📈
colorFrom: gray
colorTo: blue
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: gpl-3.0
short_description: Simple inspection of perplexity using color-gradients
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# PerplexityViewer 📈
A Gradio-based web application for visualizing text perplexity using color-coded gradients. Perfect for understanding how confident language models are about different parts of your text.
## Features
- **Dual Model Support**: Works with both decoder models (GPT, DialoGPT) and encoder models (BERT, RoBERTa)
- **Interactive Visualization**: Color-coded per-token perplexity using spaCy's displaCy
- **Configurable Analysis**: Adjustable MLM probability for encoder models
- **Fast Processing**: Models are cached after the first load, so subsequent analyses run faster
- **Multiple Model Types**:
- **Decoder Models**: Calculate true perplexity for causal language models
- **Encoder Models**: Calculate pseudo-perplexity using masked language modeling
## How It Works
- **Red tokens**: High perplexity (model is uncertain about this token)
- **Green tokens**: Low perplexity (model is confident about this token)
- **Gradient colors**: Show varying degrees of model confidence
## Installation
1. Clone this repository or download the files
2. Install dependencies:
```bash
pip install -r requirements.txt
```
## Quick Start
### Option 1: Using the startup script (recommended)
```bash
python run.py
```
### Option 2: Direct launch
```bash
python app.py
```
### Option 3: With dependency installation and testing
```bash
python run.py --install --test
```
## Usage
1. **Enter your text** in the input box
2. **Select a model** from the dropdown or enter a custom HuggingFace model name
3. **Choose model type**:
- **Decoder**: For GPT-like models (true perplexity)
- **Encoder**: For BERT-like models (pseudo-perplexity via MLM)
4. **Adjust settings** (optional), such as the MLM probability for encoder models
5. **Click "Analyze"** to see the results
## Supported Models
### Decoder Models (Causal LM)
- `gpt2`, `distilgpt2`
- `microsoft/DialoGPT-small`, `microsoft/DialoGPT-medium`
- `openai-gpt`
- Any HuggingFace causal language model
### Encoder Models (Masked LM)
- `bert-base-uncased`, `bert-base-cased`
- `distilbert-base-uncased`
- `roberta-base`
- `albert-base-v2`
- Any HuggingFace masked language model
## Understanding the Results
### Perplexity Interpretation
- **Lower perplexity**: Model is more confident (text is more predictable)
- **Higher perplexity**: Model is less confident (text is more surprising)
### Color Coding
- **Green**: Low perplexity (≤ 2.0) - very predictable
- **Yellow/Orange**: Medium perplexity (2.0-10.0) - somewhat predictable
- **Red**: High perplexity (≥ 10.0) - surprising or difficult to predict
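The mapping from perplexity to color roughly follows the thresholds above. As a hypothetical sketch (the function name and exact hex values are illustrative, not taken from `app.py`):
```python
def perplexity_to_color(ppl: float) -> str:
    """Map a per-token perplexity onto a green-to-red gradient."""
    if ppl <= 2.0:       # very predictable -> solid green
        return "#2ecc71"
    if ppl >= 10.0:      # surprising -> solid red
        return "#e74c3c"
    # Linearly interpolate between green and red across the 2.0-10.0 band
    t = (ppl - 2.0) / 8.0
    r = int(0x2E + t * (0xE7 - 0x2E))
    g = int(0xCC + t * (0x4C - 0xCC))
    b = int(0x71 + t * (0x3C - 0x71))
    return f"#{r:02x}{g:02x}{b:02x}"
```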
## Technical Details
### Decoder Models (True Perplexity)
- Uses next-token prediction to calculate perplexity
- Formula: `PPL = exp(average_cross_entropy_loss)`
- Each token's perplexity reflects how well the model predicted that token given the preceding context (see the sketch below)
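For reference, here is a minimal sketch of this per-token computation using the HuggingFace `transformers` API; the model choice and variable names are illustrative, and the app's actual code may differ:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

enc = tokenizer("The quick brown fox jumps over the lazy dog.", return_tensors="pt")
with torch.no_grad():
    logits = model(**enc).logits             # (1, seq_len, vocab_size)

# Shift so that position i predicts token i+1, then score each prediction
shift_logits = logits[:, :-1, :].squeeze(0)  # (seq_len-1, vocab_size)
shift_labels = enc["input_ids"][:, 1:].squeeze(0)
nll = torch.nn.functional.cross_entropy(shift_logits, shift_labels, reduction="none")

per_token_ppl = torch.exp(nll)           # one perplexity per predicted token
sentence_ppl = torch.exp(nll.mean())     # PPL = exp(average cross-entropy loss)
```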
### Encoder Models (Pseudo-Perplexity)
- Uses masked language modeling (MLM)
- Masks each token individually and measures prediction confidence
- Pseudo-perplexity approximates true perplexity for bidirectional models
- All content tokens are analyzed for comprehensive results (see the sketch below)
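A minimal sketch of this mask-one-token-at-a-time scheme, again assuming the `transformers` API (the app may batch or filter tokens differently):
```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

input_ids = tokenizer("The quick brown fox.", return_tensors="pt")["input_ids"][0]
nlls = []
# Mask each content token in turn (skip [CLS] and [SEP]) and score the original
for i in range(1, input_ids.size(0) - 1):
    masked = input_ids.clone()
    masked[i] = tokenizer.mask_token_id
    with torch.no_grad():
        logits = model(masked.unsqueeze(0)).logits
    log_probs = torch.log_softmax(logits[0, i], dim=-1)
    nlls.append(-log_probs[input_ids[i]])

pseudo_ppl = torch.exp(torch.stack(nlls).mean())  # sentence-level pseudo-perplexity
```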
## Testing
Run the test suite to verify everything works:
```bash
python test_app.py
```
Or use the startup script with testing:
```bash
python run.py --test
```
## Configuration
The app uses sensible defaults but can be customized via `config.py`:
- Default model lists
- Processing settings
- Visualization colors and settings
- UI configuration
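The exact contents of `config.py` are not reproduced here; as a purely hypothetical illustration, the entries might look something like this:
```python
# Hypothetical config.py layout -- all names are illustrative, not guaranteed
DEFAULT_DECODER_MODELS = ["gpt2", "distilgpt2", "microsoft/DialoGPT-small"]
DEFAULT_ENCODER_MODELS = ["bert-base-uncased", "distilbert-base-uncased"]
MLM_PROBABILITY = 0.15        # masking probability for encoder analysis
LOW_PPL_THRESHOLD = 2.0       # green at or below this value
HIGH_PPL_THRESHOLD = 10.0     # red at or above this value
```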
## Requirements
- Python 3.7+
- PyTorch
- Transformers
- Gradio 4.0+ (this Space pins `sdk_version: 5.49.1`)
- spaCy
- pandas
- numpy
## GPU Support
The app automatically uses GPU acceleration when available, falling back to CPU processing otherwise.
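This typically amounts to the standard PyTorch device-selection pattern, sketched here for illustration (not necessarily `app.py` verbatim):
```python
import torch
from transformers import AutoModelForCausalLM

device = "cuda" if torch.cuda.is_available() else "cpu"
model = AutoModelForCausalLM.from_pretrained("distilgpt2").to(device)
```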
## Troubleshooting
### Common Issues
1. **Model loading errors**: Ensure you have an internet connection for first-time model downloads
2. **Memory issues**: Try smaller models like `distilgpt2` or `distilbert-base-uncased`
3. **CUDA out of memory**: Reduce text length or use CPU-only mode
4. **Encoder models slow**: This is normal - each token is analyzed individually for accuracy
5. **Single analysis**: The app performs one comprehensive analysis per run, so no iteration setting is required
### Getting Help
If you encounter issues:
1. Check the console output for error messages
2. Try running the test suite: `python test_app.py`
3. Ensure all dependencies are installed: `pip install -r requirements.txt`
## Examples
Try these example texts to see the app in action:
- **"The quick brown fox jumps over the lazy dog."** (Common phrase - should show low perplexity)
- **"Quantum entanglement defies classical intuition."** (Technical content - may show higher perplexity)
- **"Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo."** (Grammatically complex - interesting perplexity patterns)