MLM Probability Fix - Complete Documentation
Issue Identified
The user correctly observed that changing the MLM probability did not affect the results at all in the encoder model visualization. This was a significant bug in how the MLM probability parameter was being used.
Root Cause Analysis
What Was Wrong
The MLM probability setting had two separate effects that were not properly connected:
Average Perplexity Calculation ✅ (Working correctly)
- Used random masking with the specified MLM probability
- Affected the summary statistic shown to the user
Per-Token Visualization ❌ (Bug was here)
- Always masked each token individually
- Completely ignored the MLM probability setting
- This meant changing MLM probability had no visual effect
The Disconnect
# OLD CODE - MLM probability was ignored for visualization
for i in range(len(tokens)):
    if not special_token:
        # ALWAYS calculated detailed perplexity for every token
        masked_input[0, i] = tokenizer.mask_token_id
        # ... calculate perplexity
The Fix
1. Made MLM Probability Affect Visualization
Now the MLM probability controls which tokens get detailed analysis:
# NEW CODE - MLM probability affects visualization
for i in range(len(tokens)):
    if not special_token:
        if torch.rand(1).item() < mlm_probability:  # Now respects MLM prob
            # Calculate detailed perplexity for this token
            masked_input[0, i] = tokenizer.mask_token_id
            # ... calculate detailed perplexity
        else:
            # Use baseline perplexity for non-analyzed tokens
            token_perplexities.append(2.0)  # Neutral baseline
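For context, a more complete, runnable sketch of the same idea, assuming a Hugging Face AutoModelForMaskedLM and its tokenizer; the function and variable names (compute_token_perplexities, analyzed_flags, BASELINE_PERPLEXITY) are illustrative, not the app's exact code:

import torch

BASELINE_PERPLEXITY = 2.0  # neutral value assigned to tokens that are not analyzed

def compute_token_perplexities(text, model, tokenizer, mlm_probability=0.15):
    """Mask randomly selected tokens one at a time and record the model's
    perplexity for the true token at each masked position."""
    input_ids = tokenizer(text, return_tensors="pt")["input_ids"]
    special_ids = set(tokenizer.all_special_ids)

    token_perplexities, analyzed_flags = [], []
    for i in range(input_ids.shape[1]):
        token_id = input_ids[0, i].item()
        # Skip special tokens and tokens not selected by the MLM probability
        if token_id in special_ids or torch.rand(1).item() >= mlm_probability:
            token_perplexities.append(BASELINE_PERPLEXITY)
            analyzed_flags.append(False)
            continue

        masked_input = input_ids.clone()
        masked_input[0, i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(input_ids=masked_input).logits
        log_probs = torch.log_softmax(logits[0, i], dim=-1)
        # Single-token perplexity = exp(negative log-likelihood of the true token)
        token_perplexities.append(torch.exp(-log_probs[token_id]).item())
        analyzed_flags.append(True)

    return token_perplexities, analyzed_flags

Returning the per-token flags alongside the perplexities is what lets the visualization color analyzed tokens and gray out the rest.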
2. Visual Distinction
- Analyzed tokens: Colored by actual perplexity (green/yellow/red)
- Non-analyzed tokens: Gray color with baseline perplexity
- Tooltip: Shows whether token was analyzed or not
3. Clear User Feedback
- Summary now shows: "MLM Probability: 0.15 (3/8 tokens analyzed in detail)" (see the sketch below)
- Legend updated: 🟢 Low · 🟡 Medium · 🔴 High · ⚫ Not analyzed
- Improved help text: "Probability of detailed analysis per token"
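As a rough, self-contained illustration (the flag values below are hypothetical), the summary line can be assembled from the per-token analyzed flags:

mlm_probability = 0.15
analyzed_flags = [False, True, False, True, True, False, False, False]  # hypothetical per-token flags
summary = f"MLM Probability: {mlm_probability} ({sum(analyzed_flags)}/{len(analyzed_flags)} tokens analyzed in detail)"
print(summary)  # MLM Probability: 0.15 (3/8 tokens analyzed in detail)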
How It Works Now
Low MLM Probability (0.15)
Input: "The capital of France is Paris"
Result: Only ~15% of tokens get detailed analysis
Visualization: Mostly gray tokens with a few colored ones
Effect: Fast analysis, matches BERT training conditions
High MLM Probability (0.5)
Input: "The capital of France is Paris"
Result: ~50% of tokens get detailed analysis
Visualization: More colored tokens, fewer gray ones
Effect: More comprehensive but slower analysis
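Because each token is selected independently, the analyzed count varies between runs; a quick expected-value check (the token count of 6 for the example sentence is an assumption about its tokenization):

non_special_tokens = 6  # assumed token count for "The capital of France is Paris"
for p in (0.15, 0.5):
    print(f"mlm_probability={p}: about {non_special_tokens * p:.1f} tokens analyzed on average")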
User Experience Improvements
Before the Fix
- User changes MLM probability from 0.15 → 0.5
- No visual change in token colors
- Only summary statistic changed (confusing!)
After the Fix
- User changes MLM probability from 0.15 → 0.5
- More tokens become colored (analyzed)
- Fewer tokens remain gray (non-analyzed)
- Summary shows token count: "(3/8 tokens analyzed)"
- Clear visual feedback of the parameter's effect
Testing the Fix
1. Quick Test
Try the same text with different MLM probabilities (a scripted version of this check follows the list):
- Text: "Machine learning algorithms require computational resources"
- MLM 0.2: Few colored tokens
- MLM 0.8: Most tokens colored
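A self-contained sketch of that quick test, assuming the transformers library and a BERT-style tokenizer (bert-base-uncased here is an assumption, not necessarily the model the app loads); it only counts how many tokens would be selected, which is what drives the visual difference:

from transformers import AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
text = "Machine learning algorithms require computational resources"
ids = tokenizer(text, return_tensors="pt")["input_ids"][0].tolist()
special_ids = set(tokenizer.all_special_ids)

torch.manual_seed(0)  # fixed seed so the two runs are comparable
for p in (0.2, 0.8):
    selected = sum(1 for t in ids if t not in special_ids and torch.rand(1).item() < p)
    print(f"mlm_probability={p}: {selected} of {len(ids)} tokens selected for detailed analysis")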
2. Demo Script
python mlm_demo.py
Shows exactly how MLM probability affects analysis.
3. Visual Examples
The app now includes example pairs:
- Same text with MLM 0.2 vs 0.8
- Shows clear visual difference
Technical Details
Randomness Handling
- Uses torch.rand() for consistency with PyTorch
- Each token gets an independent random chance
- Reproducible with manual seeds for testing (sketched below)
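A minimal sketch of the reproducibility point: fixing the seed makes the per-token coin flips, and therefore the set of analyzed tokens, identical across runs (the seed value is arbitrary):

import torch

def selected_tokens(mlm_probability, num_tokens, seed):
    torch.manual_seed(seed)
    return [torch.rand(1).item() < mlm_probability for _ in range(num_tokens)]

# Same seed -> same selection; different seeds generally differ
print(selected_tokens(0.15, 8, seed=42) == selected_tokens(0.15, 8, seed=42))  # True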
Baseline Perplexity
- Non-analyzed tokens get perplexity = 2.0
- This represents "neutral" confidence (see the note below)
- Avoids misleading very low/high values
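As a sanity check on 2.0 as the "neutral" value (assuming, as in the sketch earlier, that per-token perplexity is the exponential of the negative log-likelihood): a perplexity of 2.0 corresponds to the model giving the token a probability of 0.5, i.e. coin-flip confidence.

import math

baseline_perplexity = 2.0
implied_probability = math.exp(-math.log(baseline_perplexity))  # equals 1 / perplexity
print(implied_probability)  # 0.5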
Color Mapping
- Analyzed tokens: Full color spectrum based on actual perplexity
- Non-analyzed tokens: Gray (rgb(200, 200, 200))
- Tooltips distinguish: "Perplexity: 5.2" vs "Not analyzed" (the color mapping is sketched below)
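A minimal sketch of the mapping described above; the perplexity thresholds (5 and 20) are illustrative guesses, not the app's actual cutoffs:

def perplexity_to_color(perplexity, analyzed):
    """Map a token's perplexity and analyzed flag to an RGB string for the HTML view."""
    if not analyzed:
        return "rgb(200, 200, 200)"  # gray: token was not analyzed
    if perplexity < 5:
        return "rgb(0, 170, 0)"      # green: low perplexity, model is confident
    if perplexity < 20:
        return "rgb(220, 180, 0)"    # yellow: medium perplexity
    return "rgb(220, 0, 0)"          # red: high perplexity, model is uncertain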
Performance Implications
Lower MLM Probability (0.15)
- Pros: Faster, matches BERT training, realistic
- Cons: Sparse analysis, some tokens not evaluated
Higher MLM Probability (0.8)
- Pros: Comprehensive analysis, more visual information
- Cons: Slower computation, unrealistic for MLM
Recommendation
- Default 0.15: Standard BERT-like analysis
- Increase to 0.3-0.5: For more detailed exploration
- Avoid >0.8: Diminishing returns, very slow
Impact on Model Types
Decoder Models (GPT, etc.)
- No change: MLM probability only affects encoder models
- Always analyze all tokens for next-token prediction
Encoder Models (BERT, etc.)
- Major improvement: MLM probability now has clear visual effect
- Users can explore different analysis depths
- Better understanding of model confidence patterns
User Guidance
When to Use Different MLM Probabilities
0.15 (Standard)
- Quick analysis
- Matches BERT training
- Good for initial exploration
0.3-0.4 (Detailed)
- More comprehensive view
- Better for understanding difficult texts
- Reasonable computation time
0.5+ (Comprehensive)
- Maximum detail
- Research/analysis purposes
- Slower but thorough
Future Enhancements
Possible Improvements
- Adaptive MLM: Adjust probability based on text difficulty
- Token importance: Prioritize content words over function words
- Interactive selection: Let users click tokens to analyze
- Batch analysis: Process multiple MLM probabilities simultaneously
Configuration Options
The fix is fully configurable via config.py (see the sketch after this list):
- Default MLM probability
- Min/max ranges
- Baseline perplexity value
- Color scheme for non-analyzed tokens
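A hedged sketch of what the relevant entries in config.py might look like; the names and values below are assumptions, not the file's actual contents:

# config.py (illustrative excerpt)
MLM_PROBABILITY_DEFAULT = 0.15             # standard BERT-style masking rate
MLM_PROBABILITY_MIN = 0.05                 # slider lower bound
MLM_PROBABILITY_MAX = 0.8                  # slider upper bound
BASELINE_PERPLEXITY = 2.0                  # neutral perplexity for non-analyzed tokens
NON_ANALYZED_COLOR = "rgb(200, 200, 200)"  # gray used for tokens that were skipped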
Conclusion
This fix transforms the MLM probability from a "hidden parameter" that only affected summary statistics into a visible, interactive control that directly impacts the visualization. Users now get immediate visual feedback when adjusting MLM probability, making the parameter's purpose clear and the analysis more engaging.
The fix maintains backward compatibility while significantly improving the user experience for encoder model analysis.