FlexChunk / README.md
DanielSwift's picture
Update README with 300M capability and clearer description
77f0110
---
title: FlexChunk SpMV Demo
emoji:
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 4.39.0
app_file: app.py
pinned: false
---
# FlexChunk: Enabling Massive Out-of-Core SpMV
This interactive demo showcases **FlexChunk**, an algorithm for performing Sparse Matrix-Vector Multiplication (SpMV) on matrices too large to fit in RAM.
**Key Benefit:** Process matrices up to 300M×300M using only ~1.7 GB RAM by dividing them into manageable horizontal chunks.
## Interactive Demo
The application above provides two modes:
- **Standard Mode**: For matrices up to 200K×200K with optional SciPy comparison
- **Advanced Mode**: For matrices up to 300M×300M using FlexChunk only
You can adjust parameters including:
- Matrix size
- Density
- Number of chunks
- Matrix type (standard or challenging with extreme values)
## Performance Highlights
FlexChunk demonstrates near-linear scaling in both time and memory usage:
| Matrix Size | Non-zero Elements | Total Time | Peak RAM Usage |
|--------------------|-------------------|-----------------|----------------|
| 1.0M × 1.0M | 1.2M | 1.07 s | 17.00 MB |
| 10.0M × 10.0M | 12.0M | 10.21 s | 170.00 MB |
| 50.0M × 50.0M | 62.5M | 55.27 s | 850.00 MB |
| 100.0M × 100.0M | 120.0M | 1 min 47.1 s | 1.70 GB |
Our algorithm scales linearly to even larger matrices (up to 300M×300M) with proportional increases in processing time and memory.
## How It Works
FlexChunk operates in three main stages:
1. **Matrix Division**: Splits the matrix into horizontal chunks and saves to disk
2. **Sequential Processing**: Loads one chunk at a time to minimize memory usage
3. **Result Accumulation**: Combines partial results into the final vector
This approach makes it possible to multiply vectors with matrices that would otherwise exceed available RAM.
## Links
- **Source Code**: [GitHub Repository](https://github.com/DanielSwift1992/FlexChunk)
- **Full Article**: [FlexChunk: Enabling 100M×100M Out-of-Core SpMV](https://www.lesswrong.com/posts/zpRhsdDkWygTDScxb/flexchunk-enabling-100m-100m-out-of-core-spmv-1-8-min-1-7-gb)