title: Accent Classifier + Transcriber
emoji: 🎙️
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: 4.20.0
app_file: app.py
pinned: false
Accent Classifier + Speech Transcriber
This Gradio app allows you to:
- Upload or link to audio/video files
- Automatically transcribe the speech (via OpenAI Whisper)
- Detect the speaker's accent (28-class Wav2Vec2 model)
- View a top-5 ranked list of likely accents with confidence scores
How to Use
Option 1: Upload an audio file
- Supported formats: .mp3, .wav
Option 2: Upload a video file
- Supported format: .mp4 (audio will be extracted automatically)
Option 3: Paste a direct .mp4 video URL
- Must be a direct video file URL (not a webpage)
- Example: a file hosted on archive.org or a CDN
Not Supported
- Loom, YouTube, Dropbox, or other webpage links (these return an HTML page, not the video file itself)
- Workaround: download the video manually, then upload the file
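The difference between a direct file URL and a webpage link can be checked before any download is attempted. The following is a minimal sketch, not the app's actual validation: the function name `looks_like_direct_mp4` and the `WEBPAGE_HOSTS` tuple are hypothetical, and the check is only a heuristic based on the URL's host and path.

```python
from urllib.parse import urlparse

# Hosts this README lists as unsupported. The exact domain list is an
# assumption made for illustration; extend it as needed.
WEBPAGE_HOSTS = ("loom.com", "youtube.com", "youtu.be", "dropbox.com")


def looks_like_direct_mp4(url: str) -> bool:
    """Heuristic: True if the URL is http(s), its path ends in .mp4,
    and its host is not one of the known webpage hosts."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = (parsed.hostname or "").lower()
    # Match the bare domain and any subdomain (e.g. www.youtube.com).
    if any(host == h or host.endswith("." + h) for h in WEBPAGE_HOSTS):
        return False
    # Query strings are ignored; only the path is inspected.
    return parsed.path.lower().endswith(".mp4")
```

A URL that passes this check can still fail to download (for example, if the server returns an HTML error page), so the response's content type is worth checking as well.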
Models Used
Transcription:
- openai/whisper-tiny: https://huggingface.co/openai/whisper-tiny
Accent Classification:
- ylacombe/accent-classifier: https://huggingface.co/ylacombe/accent-classifier
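Both models can be loaded through the `transformers` `pipeline` helper. The sketch below assumes that the accent model works with the standard `audio-classification` pipeline; the helper names `load_pipelines`, `format_predictions`, and `analyze` are hypothetical and are not taken from `app.py`.

```python
ASR_MODEL = "openai/whisper-tiny"
ACCENT_MODEL = "ylacombe/accent-classifier"


def load_pipelines():
    """Build both pipelines. Model weights are downloaded on first use."""
    from transformers import pipeline  # imported lazily: heavy dependency

    asr = pipeline("automatic-speech-recognition", model=ASR_MODEL)
    # Assumption: the accent model is compatible with this pipeline task.
    accent = pipeline("audio-classification", model=ACCENT_MODEL)
    return asr, accent


def format_predictions(preds):
    """Turn [{'label': ..., 'score': ...}, ...] into 'label: 12.3%'
    strings, highest score first."""
    ranked = sorted(preds, key=lambda p: p["score"], reverse=True)
    return [f"{p['label']}: {p['score'] * 100:.1f}%" for p in ranked]


def analyze(wav_path, asr, accent):
    """Transcribe one audio file and rank its five most likely accents."""
    text = asr(wav_path)["text"]
    top5 = accent(wav_path, top_k=5)
    return text, format_predictions(top5)
```

Typical usage would be `asr, accent = load_pipelines()` followed by `analyze("speech.wav", asr, accent)`, where `speech.wav` is a placeholder file name. Passing a file path to a pipeline relies on ffmpeg being installed.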
Dependencies
Dependencies are installed automatically on Hugging Face Spaces. For local testing, install them with pip:
pip install gradio transformers torch moviepy requests safetensors soundfile scipy
You must also install ffmpeg:
- macOS: brew install ffmpeg
- Ubuntu: sudo apt install ffmpeg
- Windows: Download from https://ffmpeg.org/
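Before launching the app locally, it can help to confirm that ffmpeg is actually on the `PATH`. A minimal sketch using only the standard library (the function name `ffmpeg_available` is hypothetical):

```python
import shutil


def ffmpeg_available() -> bool:
    """True if an `ffmpeg` executable can be found on PATH."""
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found at", shutil.which("ffmpeg"))
    else:
        print("ffmpeg not found; install it before running the app locally")
```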
How It Works
- Audio is extracted (if input is a video)
- Audio is converted to .wav and resampled to 16 kHz
- Speech is transcribed using Whisper
- Accent is classified using a Wav2Vec2 model
- Output includes:
  - Top accent prediction
  - Confidence score
  - Top-5 accent list
  - Full transcription
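The extraction and resampling steps above can be sketched with a single ffmpeg call. This is an illustration under stated assumptions, not the app's implementation: it calls the ffmpeg CLI directly instead of going through moviepy, it downmixes to mono (an assumption; the README only specifies the sample rate), and the helper names are hypothetical.

```python
import subprocess
from pathlib import Path


def wav_path_for(src: str) -> str:
    """Output path next to the input, e.g. talk.mp4 -> talk_16k.wav."""
    p = Path(src)
    return str(p.with_name(p.stem + "_16k.wav"))


def build_ffmpeg_cmd(src: str, dst: str, sample_rate: int = 16000) -> list:
    """ffmpeg arguments: overwrite output (-y), drop any video stream (-vn),
    downmix to one channel (-ac 1), resample (-ar)."""
    return ["ffmpeg", "-y", "-i", src, "-vn", "-ac", "1",
            "-ar", str(sample_rate), dst]


def to_wav_16k(src: str) -> str:
    """Convert an audio or video file to 16 kHz mono WAV and return its path.
    Raises subprocess.CalledProcessError if ffmpeg exits with an error."""
    dst = wav_path_for(src)
    subprocess.run(build_ffmpeg_cmd(src, dst), check=True, capture_output=True)
    return dst
```

The same command handles .mp3, .wav, and .mp4 inputs, since `-vn` is simply a no-op when the input has no video stream.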