nicer model card
README.md
CHANGED
|
@@ -8,7 +8,99 @@ sdk_version: 5.39.0
|
|
| 8 |
app_file: app.py
|
| 9 |
pinned: false
|
| 10 |
license: mit
|
| 11 |
-
short_description:
|
| 12 |
---
|
| 13 |
|
| 14 |
-
|
| 8 |
app_file: app.py
|
| 9 |
pinned: false
|
| 10 |
license: mit
|
| 11 |
+
short_description: Interactive analyzer for modular refactoring opportunities in HuggingFace Transformers
|
| 12 |
---
|
| 13 |
|
| 14 |
+
# Transformers Modular Refactor Analyzer
|
| 15 |
+
|
| 16 |
+
This interactive tool helps analyze modular refactoring opportunities in the HuggingFace Transformers library by visualizing model relationships, similarity patterns, and the impact of modularization on code maintainability.
|
| 17 |
+
|
| 18 |
+
## Features Overview
|
| 19 |
+
|
| 20 |
+
### **Tab 1: Chronological Timeline**
|
| 21 |
+
Interactive timeline showing the evolution of transformer models with modular dependencies positioned by their creation dates.
|
| 22 |
+
|
| 23 |
+
**Key Features:**
|
| 24 |
+
- Models positioned chronologically by git history
|
| 25 |
+
- Modular dependency connections between models
|
| 26 |
+
- Similarity scores between candidate models (red dashed edges)
|
| 27 |
+
- Timeline axis with year/month markers
|
| 28 |
+
- **Modular Logic Milestone**: May 31, 2024 marker showing when modular logic was introduced
|
| 29 |
+
- Search functionality to highlight specific models and their connections
|
| 30 |
+
- Zoom and pan to explore the full timeline
|
| 31 |
+
|
| 32 |
+
**Visual Legend:**
|
| 33 |
+
- 🟡 **Base models**: Foundation models that others depend on
|
| 34 |
+
- 🔵 **Modular models**: Models with existing `modular_*.py` implementations
|
| 35 |
+
- 🔴 **Candidate models**: Models without modular implementations (refactoring opportunities)
|
| 36 |
+
- **Blue edges**: Import dependencies between modular implementations
|
| 37 |
+
- **Red dashed edges**: High similarity scores indicating refactoring potential
|
| 38 |
+
|
| 39 |
+
### **Tab 2: LOC Growth**
|
| 40 |
+
Chart visualizing how modular refactoring impacts Lines of Code (LOC) over time in the transformers repository.
|
| 41 |
+
|
| 42 |
+
**Metrics Tracked:**
|
| 43 |
+
- **Effective LOC**: Total maintainable code, i.e. modeling LOC of models without a modular version plus modular LOC
|
| 44 |
+
- **Modular LOC**: Lines of code in `modular_*.py` files
|
| 45 |
+
- **Modeling LOC (all)**: Total lines in all `modeling_*.py` files
|
| 46 |
+
- **Modeling LOC (included)**: Lines in `modeling_*.py` files for models without modular versions
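
The relationship between these four metrics can be sketched as follows. This is a minimal illustration, not the tool's actual code: the `ModelLoc` record and `loc_metrics` function are hypothetical names, and a model is assumed to be "modular" whenever its modular line count is non-zero.

```python
from dataclasses import dataclass


@dataclass
class ModelLoc:
    """Line counts for one model (hypothetical record, for illustration only)."""
    name: str
    modeling_loc: int      # lines in the model's modeling_*.py
    modular_loc: int = 0   # lines in modular_*.py; 0 is assumed to mean "no modular file"


def loc_metrics(models):
    """Aggregate the four LOC metrics described in the list above."""
    modular = sum(m.modular_loc for m in models)
    modeling_all = sum(m.modeling_loc for m in models)
    # Only models without a modular version still count their modeling file.
    modeling_included = sum(m.modeling_loc for m in models if m.modular_loc == 0)
    return {
        "effective": modeling_included + modular,
        "modular": modular,
        "modeling_all": modeling_all,
        "modeling_included": modeling_included,
    }


metrics = loc_metrics([ModelLoc("a", 1000), ModelLoc("b", 800, modular_loc=200)])
```

Under these assumptions, a model with a modular file contributes only its modular lines to the effective total, which is why effective LOC can fall while total modeling LOC keeps growing.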
|
| 47 |
+
|
| 48 |
+
**Key Insights:**
|
| 49 |
+
- Shows the trajectory toward reduced code duplication
|
| 50 |
+
- Demonstrates how modular refactoring can reduce total maintainable code
|
| 51 |
+
- May 31, 2024 annotation marks the introduction of modular logic
|
| 52 |
+
- Interactive chart with time-series data from git history
|
| 53 |
+
|
| 54 |
+
### **Tab 3: Dependency Graph**
|
| 55 |
+
Static network visualization focusing on model relationships and similarity patterns without chronological constraints.
|
| 56 |
+
|
| 57 |
+
**Features:**
|
| 58 |
+
- Force-directed graph layout optimized for relationship visibility
|
| 59 |
+
- Toggle to show/hide candidate models and similarity edges
|
| 60 |
+
- Node sizes reflect connection degree (more connected = larger)
|
| 61 |
+
- Interactive drag-and-drop for graph exploration
|
| 62 |
+
- Zoom and pan capabilities
|
| 63 |
+
|
| 64 |
+
**Analysis Capabilities:**
|
| 65 |
+
- Identify clusters of highly similar models (refactoring targets)
|
| 66 |
+
- Understand modular dependency patterns
|
| 67 |
+
- Spot potential consolidation opportunities
|
| 68 |
+
- Explore the current modular architecture
|
| 69 |
+
|
| 70 |
+
## 🛠️ Technical Details
|
| 71 |
+
|
| 72 |
+
### Similarity Methods
|
| 73 |
+
- **Jaccard Similarity**: Token-based similarity using identifier overlap in source code
|
| 74 |
+
- **Embedding Similarity**: CodeBERT-based semantic similarity (when available)
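
As a rough illustration of the Jaccard variant, the sketch below compares the sets of identifiers found in two source strings. It assumes identifiers are collected with Python's standard `tokenize` module and that keywords are excluded; the tool's own tokenization may differ, and the function names `identifiers` and `jaccard` are hypothetical.

```python
import io
import keyword
import tokenize


def identifiers(source: str) -> set:
    """Collect the non-keyword NAME tokens of a Python source string."""
    names = set()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            names.add(tok.string)
    return names


def jaccard(a: set, b: set) -> float:
    """|A ∩ B| / |A ∪ B|, defined here as 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


score = jaccard(identifiers("x = foo(y)"), identifiers("z = foo(y)"))
```

Set-based overlap ignores how often an identifier appears and in what order, so it is cheap to compute across many model pairs but blind to structural differences, which is presumably where the embedding-based score is meant to help.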
|
| 75 |
+
|
| 76 |
+
### Data Sources
|
| 77 |
+
- **Git History**: Model creation dates from transformers repository commits
|
| 78 |
+
- **Source Analysis**: AST parsing of `modeling_*.py` and `modular_*.py` files
|
| 79 |
+
- **Dependency Tracking**: Import analysis to build modular dependency graphs
|
| 80 |
+
- **Cached Embeddings**: Pre-computed similarity matrices for performance
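
The import analysis step might look like the sketch below, which uses the standard `ast` module to list which other models a modular file imports from. It assumes imports follow the pattern `from ..<model>.modeling_<model> import ...`; the function name `modular_dependencies` is hypothetical and the tool's real parser may handle more cases.

```python
import ast


def modular_dependencies(source: str) -> set:
    """Return the model names a modular file imports modeling/modular code from."""
    deps = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom) and node.module:
            parts = node.module.split(".")
            # Expect "<model>.modeling_<model>" or "<model>.modular_<model>";
            # same-package imports such as ".configuration_x" have one part only.
            if len(parts) >= 2 and parts[-1].startswith(("modeling_", "modular_")):
                deps.add(parts[-2])
    return deps


example = (
    "from ..llama.modeling_llama import LlamaAttention\n"
    "from .configuration_foo import FooConfig\n"
    "import torch\n"
)
deps = modular_dependencies(example)
```

Each returned name would become an edge from the importing model to the model it depends on in the dependency graph.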
|
| 81 |
+
|
| 82 |
+
### Filtering Options
|
| 83 |
+
- **Similarity Threshold**: Adjustable cutoff for showing similarity edges (0.5-0.95)
|
| 84 |
+
- **Multimodal Filter**: Focus on multimodal models, identified as those that mention `pixel_values`
|
| 85 |
+
- **Show/Hide Candidates**: Toggle visibility of non-modular models and their similarities
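
The first two filters can be approximated as below. This is a sketch under stated assumptions: similarity scores are held in a dict keyed by model pairs, the cutoff is inclusive, the default of 0.7 is arbitrary within the documented 0.5-0.95 range, and the multimodal check is a plain substring test on the model source. The names `similarity_edges` and `is_multimodal` are hypothetical.

```python
def similarity_edges(scores, threshold=0.7):
    """Keep pairs whose similarity is at or above the threshold.

    scores: dict mapping (model_a, model_b) -> similarity in [0, 1].
    Returns (model_a, model_b, similarity) tuples in sorted order.
    """
    return sorted((a, b, s) for (a, b), s in scores.items() if s >= threshold)


def is_multimodal(source: str) -> bool:
    """Heuristic from the README: the model mentions `pixel_values`."""
    return "pixel_values" in source


edges = similarity_edges({("a", "b"): 0.9, ("a", "c"): 0.4}, threshold=0.5)
```

Raising the threshold thins out the red dashed edges, leaving only the strongest refactoring candidates visible.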
|
| 86 |
+
|
| 87 |
+
## 🎯 Use Cases
|
| 88 |
+
|
| 89 |
+
1. **Refactoring Planning**: Identify which models would benefit most from modularization
|
| 90 |
+
2. **Architecture Analysis**: Understand current modular dependencies and patterns
|
| 91 |
+
3. **Code Reduction**: Quantify the impact of modular refactoring on maintainability
|
| 92 |
+
4. **Timeline Analysis**: See how the transformers library evolved toward modular architecture
|
| 93 |
+
|
| 94 |
+
## How to Use
|
| 95 |
+
|
| 96 |
+
1. **Chronological Timeline**: Use the search box to find specific models, zoom to explore different time periods, click nodes to highlight connections
|
| 97 |
+
2. **LOC Growth**: Hover over data points to see exact metrics, observe the trend toward code reduction
|
| 98 |
+
3. **Dependency Graph**: Drag nodes to reorganize the layout, toggle candidates on/off, use zoom for detailed exploration
|
| 99 |
+
|
| 100 |
+
## 🔬 Research Context
|
| 101 |
+
|
| 102 |
+
This tool supports analysis of modular refactoring in large-scale ML libraries, helping identify code duplication patterns and measure the effectiveness of architectural improvements in reducing maintenance burden.
|
| 103 |
+
|
| 104 |
+
---
|
| 105 |
+
|
| 106 |
+
*Built with Gradio, D3.js, and ApexCharts for interactive data visualization*
|