---
title: Wave App
emoji: π»
colorFrom: pink
colorTo: indigo
sdk: gradio
sdk_version: 5.17.1
app_file: app.py
pinned: false
license: mit
thumbnail: >-
  https://cdn-uploads.huggingface.co/production/uploads/63bc1e06b8c61b8aa4963cf2/cPoSjWQgu66JrLyQ8HVM3.png
short_description: Built using Gradio, Librosa, and Resemblyzer. This applicati
---

# 🎙️ Wave: Voice Recognition with Similarity Testing

**Wave** is a voice recognition application with similarity testing, built with **Gradio**, **Librosa**, and **Resemblyzer**. It compares an uploaded voice sample against embeddings of reference recordings and reports how similar they are, which makes it suited to voice authentication and verification tasks.

---

## **Key Features**
- **Real-time Voice Verification:** Compares a test voice against the reference samples as soon as it is uploaded.
- **Multi-File Training:** Upload up to **50** audio samples for robust training.
- **Similarity Scoring:** Generates a similarity score and interprets it as a match or a mismatch.
- **User-Friendly Interface:** Built on **Gradio**, so uploads and results are handled in a simple interactive interface.

---

## 🛠️ **Technology Stack**
- **Framework:** Gradio
- **Audio Processing:** Librosa
- **Voice Embeddings:** Resemblyzer
- **Numerical Computations:** NumPy
- **Audio File Handling:** SoundFile

---

## **File Structure**
```plaintext
voice-recognition-app
│
├── app.ipynb            # Main application notebook
├── requirements.txt     # Required packages
└── README.md            # Project documentation
```

---

## 💾 **Usage**
1. **Train the Model:** Upload up to **50** `.wav` files as reference samples.
2. **Test a Voice:** Upload a single `.wav` file and receive a similarity score.
3. **Interpret Results:** Scores above **0.80** indicate a close match.
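
As a rough illustration of the limits quoted in the steps above, the following sketch checks an upload list against the 50-file, `.wav`-only rule and labels a score using the 0.80 cutoff. The names `check_reference_files`, `interpret_score`, and both constants are hypothetical and are not taken from `app.py`. Treating a score of exactly 0.80 as a mismatch is an assumption based on the word "above".

```python
from pathlib import Path
from typing import List, Sequence

MAX_REFERENCE_FILES = 50  # upload limit stated in step 1
MATCH_THRESHOLD = 0.80    # cutoff stated in step 3


def check_reference_files(paths: Sequence[str]) -> List[str]:
    """Return the paths unchanged if they satisfy the stated upload rules."""
    if not paths:
        raise ValueError("at least one reference file is required")
    if len(paths) > MAX_REFERENCE_FILES:
        raise ValueError(f"at most {MAX_REFERENCE_FILES} reference files are allowed")
    not_wav = [p for p in paths if Path(p).suffix.lower() != ".wav"]
    if not_wav:
        raise ValueError(f"not .wav files: {not_wav}")
    return list(paths)


def interpret_score(score: float, threshold: float = MATCH_THRESHOLD) -> str:
    """Label a similarity score; only scores strictly above the cutoff match."""
    return "match" if score > threshold else "mismatch"


if __name__ == "__main__":
    # Example: two valid reference files and a score just over the cutoff.
    print(check_reference_files(["alice_01.wav", "alice_02.wav"]))
    print(interpret_score(0.87))
```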

---

## 📦 **Dependencies**
```plaintext
gradio
librosa
resemblyzer
numpy
soundfile
```

---

## **How It Works**
1. **Audio Loading:** Files are loaded and resampled to **16 kHz**.
2. **Voice Embeddings:** **Resemblyzer** extracts embeddings that represent the speaker's vocal characteristics.
3. **Similarity Calculation:** The dot product of the normalized embeddings (their cosine similarity) gives the similarity score.
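
The three steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `app.py`. The helpers `l2_normalize`, `similarity`, and `embed_file` are hypothetical names, and the similarity is written in plain Python so it runs without the model, whereas the app itself presumably uses NumPy. `preprocess_wav` and `VoiceEncoder.embed_utterance` are Resemblyzer functions. How the app combines several reference embeddings is not described here, so the sketch only compares two vectors.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; an all-zero vector is rejected."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of the two normalized vectors, i.e. cosine similarity."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    unit_a, unit_b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(unit_a, unit_b))


def embed_file(path: str):
    """Hypothetical helper: turn one audio file into a speaker embedding."""
    # Imported lazily so the functions above work without Resemblyzer installed.
    from resemblyzer import VoiceEncoder, preprocess_wav

    wav = preprocess_wav(path)  # loads the file and resamples it to 16 kHz
    return VoiceEncoder().embed_utterance(wav)  # fixed-size embedding vector


if __name__ == "__main__":
    # Toy vectors stand in for real embeddings in this example.
    print(similarity([3.0, 4.0], [6.0, 8.0]))   # same direction, close to 1
    print(similarity([1.0, 0.0], [0.0, 1.0]))   # orthogonal vectors
```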

---

## **Access Live Demo**
[wave-app](https://huggingface.co/spaces/ajnx014/wave-app)

---

## **License**
This project is licensed under the **MIT License**.

---

## 🤝 **Contributing**
Contributions are welcome. Fork the repository, create a new branch, and submit a pull request.

---

## 📧 **Contact**
For inquiries or support, please reach out to **[arjunjagdale14@gmail.com](mailto:arjunjagdale14@gmail.com)**.

---

> **Author:** Arjun Jagdale
> **GitHub:** [ArjunJagdale](https://github.com/ArjunJagdale)
> **Project:** Voice Recognition with Similarity Testing