Spaces:
Sleeping
A newer version of the Streamlit SDK is available:
1.50.0
title: English Dialect Classifier
emoji: π
colorFrom: red
colorTo: red
sdk: streamlit
app_file: app.py
app_port: 8501
tags:
- streamlit
pinned: false
short_description: Predicting English Dialect Using Speech Brain and Streamlit
license: apache-2.0
Welcome to Streamlit!
Edit /src/streamlit_app.py
to customize this app to your heart's desire. :heart:
If you have any questions, checkout our documentation and community forums.
π€ English Accent Analyzer Streamlit App PyTorch
A tool to identify English accents from audio/video sources with optimized processing for large files.
π Features Supports local files, direct media URLs, and Loom videos
Automatically splits large files into 1-minute chunks
Early stopping for faster analysis
Confidence-based predictions
Interactive Streamlit dashboard
βοΈ Installation Clone the repository:
bash git clone https://github.com/your-username/accent-analyzer.git cd accent-analyzer Install dependencies:
bash pip install -r requirements.txt Install FFmpeg (required for audio processing):
bash
On Ubuntu/Debian
sudo apt install ffmpeg
On macOS
brew install ffmpeg π₯οΈ Usage Run the Streamlit app:
bash streamlit run app.py The app will open in your browser at http://localhost:8501
π₯ Input Options
- Upload a file Supported formats:
Video: .mp4, .webm, .avi, .mov, .mkv, .m4v
Audio: .mp3, .wav, .m4a, .aac, .ogg, .flac
- Provide a URL Loom videos: https://www.loom.com/share/...
Direct media links: https://example.com/video.mp4
π§ Optimizations for Large Files The system automatically handles large files using these techniques:
Diagram Code
Chunk Processing:
Audio is split into 1-minute segments
Only segments >10 seconds are processed
Enables parallel processing (future implementation)
Early Stopping:
Stops processing when 3 consecutive chunks agree with high confidence
Saves processing time for long files
Efficient Extraction:
Uses FFmpeg for fast audio extraction
Torchaudio fallback for compatibility
Direct streaming for URL sources
Confidence Threshold:
Only predictions >60% confidence are considered
Reduces false positives from noisy segments
π Example Output Example Dashboard
The dashboard shows:
Predicted accent with confidence percentage
Confidence scores per minute
Accent distribution charts
Processing time metrics