metadata
title: Wave App
emoji: 💻
colorFrom: pink
colorTo: indigo
sdk: gradio
sdk_version: 5.17.1
app_file: app.py
pinned: false
license: mit
thumbnail: >-
  https://cdn-uploads.huggingface.co/production/uploads/63bc1e06b8c61b8aa4963cf2/cPoSjWQgu66JrLyQ8HVM3.png
short_description: Built using Gradio, Librosa, and Resemblyzer. This applicati

πŸŽ™οΈ Wave: Voice Recognition with Similarity Testing

Welcome to Wave, a voice recognition application with similarity testing, built using Gradio, Librosa, and Resemblyzer. The application compares uploaded voice samples against reference embeddings to determine similarity, which makes it well suited to voice authentication and verification tasks.


🚀 Key Features

  • Real-time Voice Verification: Instantly compares a test voice against reference samples.
  • Multi-File Training: Upload up to 50 audio samples for robust training.
  • Similarity Scoring: Generates a similarity score, with results interpreted as a match or mismatch.
  • User-Friendly Interface: Powered by Gradio, ensuring a seamless and interactive experience.

πŸ› οΈ Technology Stack

  • Framework: Gradio
  • Audio Processing: Librosa
  • Voice Embeddings: Resemblyzer
  • Numerical Computations: NumPy
  • Audio File Handling: SoundFile

πŸ“ File Structure

voice-recognition-app
│
├── app.ipynb             # Main application notebook
├── requirements.txt      # Required packages
└── README.md             # Project documentation

💾 Usage

  1. Train the Model: Upload up to 50 .wav files as reference samples.
  2. Test a Voice: Upload a single .wav file and receive a similarity score.
  3. Interpret Results: Scores above 0.80 indicate a close match.
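
As a minimal sketch of step 3, the snippet below labels a score using the 0.80 threshold stated above. The helper name `interpret_score` and the "match"/"mismatch" labels are assumptions for illustration and are not taken from the application code; the comparison is strict because the text says "above 0.80".

```python
MATCH_THRESHOLD = 0.80  # threshold stated in the Usage steps


def interpret_score(score: float, threshold: float = MATCH_THRESHOLD) -> str:
    """Label a similarity score (hypothetical helper, not the app's own code)."""
    # Strictly greater than the threshold counts as a close match.
    return "match" if score > threshold else "mismatch"


if __name__ == "__main__":
    for example in (0.91, 0.80, 0.42):
        print(example, interpret_score(example))
```

Keeping the threshold as a parameter makes it easy to tighten or relax the decision for a given use case.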

📦 Dependencies

gradio
librosa
resemblyzer
numpy
soundfile

🧠 How It Works

  1. Audio Loading: Files are loaded and resampled to 16 kHz.
  2. Voice Embeddings: The Resemblyzer extracts embeddings that represent vocal characteristics.
  3. Similarity Calculation: The dot product of the normalized embeddings, which equals their cosine similarity, produces the similarity score.
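
The similarity step can be sketched as follows. This is a dependency-free illustration, not the application's actual code: it uses plain Python lists where the app would presumably use NumPy arrays, the function names are hypothetical, and averaging the reference embeddings into a single vector is an assumption about how multiple reference samples are combined.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def mean_embedding(embeddings):
    """Element-wise mean of several equal-length embeddings."""
    count = len(embeddings)
    return [sum(column) / count for column in zip(*embeddings)]


def similarity(reference_embeddings, test_embedding):
    """Dot product of normalized vectors, i.e. cosine similarity.

    Assumption: the references are averaged into one vector first.
    """
    reference = l2_normalize(mean_embedding(reference_embeddings))
    test = l2_normalize(test_embedding)
    return sum(r * t for r, t in zip(reference, test))


if __name__ == "__main__":
    # Toy 3-dimensional vectors standing in for real voice embeddings.
    references = [[0.9, 0.1, 0.0], [1.0, 0.0, 0.1]]
    print(round(similarity(references, [0.95, 0.05, 0.05]), 3))
```

Because both vectors are normalized to unit length, the score always falls between -1 and 1, with values near 1 indicating very similar vectors.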

🌐 Access Live Demo

🔗 wave-app


πŸ“ License

This project is licensed under the MIT License.


🤝 Contributing

Feel free to contribute! Fork the repository, create a new branch, and submit a pull request.


📧 Contact

For inquiries or support, please reach out to arjunjagdale14@gmail.com.


Author: Arjun Jagdale
GitHub: ArjunJagdale
Project: Voice Recognition with Similarity Testing