title: Accent Classifier + Transcriber
emoji: 🎙️
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: 4.20.0
app_file: app.py
pinned: false

Accent Classifier + Speech Transcriber

This Gradio app allows you to:

  • Upload or link to audio/video files
  • Automatically transcribe the speech (via OpenAI Whisper)
  • Detect the speaker's accent (28-class Wav2Vec2 model)
  • View a top-5 ranked list of likely accents with confidence scores

How to Use

Option 1: Upload an audio file

  • Supported formats: .mp3, .wav

Option 2: Upload a video file

  • Supported format: .mp4 (audio will be extracted automatically)

Option 3: Paste a direct .mp4 video URL

  • Must be a direct video file URL (not a webpage)
  • Example: a file hosted on archive.org or a CDN

Not Supported

  • Loom, YouTube, Dropbox, or other webpage links (they don't serve the video file itself)
  • Workaround: download the video manually, then upload the file
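
The difference between a direct file URL and a webpage link often shows in the URL path alone. The sketch below is not the app's actual check. `classify_input` is a hypothetical helper that routes a filename or URL by extension, using only the formats listed above, and it assumes URLs are accepted for .mp4 only. A URL that passes can still return an HTML page, so a fuller check would also look at the response's Content-Type header.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Formats taken from the options above; the helper itself is hypothetical.
AUDIO_EXTENSIONS = {".mp3", ".wav"}
VIDEO_EXTENSIONS = {".mp4"}


def classify_input(source: str) -> str:
    """Return "audio", "video", or "unsupported" for a file path or URL."""
    parsed = urlparse(source)
    is_url = parsed.scheme in ("http", "https")
    # For URLs, look only at the path so a query string cannot hide the extension.
    path = parsed.path if is_url else source
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    # Assumption: audio is accepted as an upload only, not as a URL.
    if suffix in AUDIO_EXTENSIONS and not is_url:
        return "audio"
    return "unsupported"


if __name__ == "__main__":
    for candidate in ("talk.wav", "https://www.youtube.com/watch?v=abc"):
        print(candidate, "->", classify_input(candidate))
```

A YouTube watch link has no file extension in its path, so it falls through to "unsupported", which matches the list above.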

Models Used

Transcription: OpenAI Whisper

Accent Classification: 28-class Wav2Vec2 model

Dependencies

Handled automatically in Hugging Face Spaces. For local testing:

pip install gradio transformers torch moviepy requests safetensors soundfile scipy

You must also install ffmpeg:

  • macOS: brew install ffmpeg
  • Ubuntu: sudo apt install ffmpeg
  • Windows: Download from https://ffmpeg.org/
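
Before running the app locally, it can help to confirm that ffmpeg is reachable. This is a small sketch using only the standard library; `ffmpeg_available` is a hypothetical helper name, and it checks only that an `ffmpeg` executable is on PATH, not which version is installed.

```python
import shutil


def ffmpeg_available() -> bool:
    """Return True if an ffmpeg executable is found on PATH."""
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found")
    else:
        print("ffmpeg missing: install it as described above")
```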

How It Works

  1. Audio is extracted (if input is a video)
  2. Audio is converted to .wav and resampled to 16 kHz
  3. Speech is transcribed using Whisper
  4. Accent is classified using a Wav2Vec2 model
  5. Output includes:
    • Top accent prediction
    • Confidence score
    • Top-5 accent list
    • Full transcription
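
Steps 1, 2, and 5 can be sketched without loading any model. The code below is illustrative only and is not taken from app.py. `ffmpeg_to_wav_command` and `top_k_accents` are hypothetical helpers, the mono downmix is an assumption, and so is the use of a softmax to turn raw classifier scores into confidences. The Whisper and Wav2Vec2 calls are left out because this README does not name the exact checkpoints.

```python
import math


def ffmpeg_to_wav_command(src: str, dst: str, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg command that drops video and writes WAV at sample_rate."""
    return [
        "ffmpeg", "-y",           # overwrite dst if it already exists
        "-i", src,                # input audio or video file
        "-vn",                    # ignore any video stream
        "-ac", "1",               # downmix to mono (assumption)
        "-ar", str(sample_rate),  # resample, 16 kHz by default
        dst,
    ]


def top_k_accents(logits: dict[str, float], k: int = 5) -> list[tuple[str, float]]:
    """Softmax raw scores into confidences and return the k most likely labels."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(score - peak) for label, score in logits.items()}
    total = sum(exps.values())
    ranked = sorted(exps.items(), key=lambda item: item[1], reverse=True)
    return [(label, value / total) for label, value in ranked[:k]]


if __name__ == "__main__":
    print(" ".join(ffmpeg_to_wav_command("input.mp4", "audio.wav")))
    # Made-up labels and scores, for illustration only.
    example_scores = {"accent_a": 2.1, "accent_b": 0.4, "accent_c": -1.3}
    for label, confidence in top_k_accents(example_scores, k=2):
        print(f"{label}: {confidence:.2%}")
```

The first entry of the returned list corresponds to the top accent prediction and its confidence score. The command list can be passed to `subprocess.run` once ffmpeg is installed.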