---
title: Wave App
emoji: 💻
colorFrom: pink
colorTo: indigo
sdk: gradio
sdk_version: 5.17.1
app_file: app.py
pinned: false
license: mit
thumbnail: >-
  https://cdn-uploads.huggingface.co/production/uploads/63bc1e06b8c61b8aa4963cf2/cPoSjWQgu66JrLyQ8HVM3.png
short_description: Built using Gradio, Librosa, and Resemblyzer. This applicati
---

# 🎙️ Wave: Voice Recognition with Similarity Testing

**Wave** is a voice recognition application with similarity testing, built using **Gradio**, **Librosa**, and **Resemblyzer**. It compares uploaded voice samples against reference embeddings to determine similarity, which suits voice authentication and verification tasks.

---

## 🚀 **Key Features**
- **Real-time Voice Verification:** Instantly compares a test voice against reference samples.
- **Multi-File Training:** Upload up to **50** audio samples for robust training.
- **Similarity Scoring:** Generates a similarity score, with results interpreted as a match or mismatch.
- **User-Friendly Interface:** Powered by **Gradio**, ensuring a seamless and interactive experience.

---

## 🛠️ **Technology Stack**
- **Framework:** Gradio
- **Audio Processing:** Librosa
- **Voice Embeddings:** Resemblyzer
- **Numerical Computations:** NumPy
- **Audio File Handling:** SoundFile

---

## 📁 **File Structure**
```plaintext
voice-recognition-app
│
├── app.ipynb             # Main application notebook
├── requirements.txt      # Required packages
└── README.md             # Project documentation
```

---

## 💾 **Usage**
1. **Train the Model:** Upload up to **50** `.wav` files as reference samples.
2. **Test a Voice:** Upload a single `.wav` file and receive a similarity score.
3. **Interpret Results:** Scores above **0.80** indicate a close match.
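
The interpretation step can be sketched as a simple threshold check. This is a minimal illustration, not the app's actual code: the function name `interpret_score` is hypothetical, and treating a score of exactly 0.80 as a mismatch is an assumption based on the wording "above 0.80".

```python
def interpret_score(score, threshold=0.80):
    """Label a similarity score as 'match' or 'mismatch'.

    Assumption: only scores strictly above the threshold count as a match.
    """
    return "match" if score > threshold else "mismatch"


if __name__ == "__main__":
    for s in (0.91, 0.80, 0.42):
        print(f"{s:.2f} -> {interpret_score(s)}")
```

A stricter or looser threshold may suit different recording conditions, so it is kept as a parameter here.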

---

## 📦 **Dependencies**
```plaintext
gradio
librosa
resemblyzer
numpy
soundfile
```

---

## 🧠 **How It Works**
1. **Audio Loading:** Files are loaded and resampled to **16 kHz**.
2. **Voice Embeddings:** The **Resemblyzer** extracts embeddings that represent vocal characteristics.
3. **Similarity Calculation:** The dot product of the L2-normalized embeddings, which equals their cosine similarity, produces the similarity score.
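
The three steps above can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the code in the app itself: the helper names (`embed_file`, `l2_normalize`, `similarity`) are hypothetical, and averaging the reference embeddings into a single vector is an assumed way of combining multiple reference samples. The Resemblyzer import is done lazily so the similarity helpers run with NumPy alone.

```python
import numpy as np


def embed_file(path, encoder=None):
    """Return a speaker embedding for one audio file (needs resemblyzer)."""
    # Imported lazily so the helpers below work without the package installed.
    from resemblyzer import VoiceEncoder, preprocess_wav

    wav = preprocess_wav(path)  # loads the audio and resamples it to 16 kHz
    encoder = encoder or VoiceEncoder()
    return encoder.embed_utterance(wav)


def l2_normalize(vec):
    """Scale a vector to unit length."""
    vec = np.asarray(vec, dtype=np.float64)
    norm = np.linalg.norm(vec)
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return vec / norm


def similarity(reference_embeddings, test_embedding):
    """Cosine similarity between the mean reference embedding and a test one.

    Assumption: the reference samples are combined by simple averaging.
    """
    reference = l2_normalize(np.mean(reference_embeddings, axis=0))
    return float(np.dot(reference, l2_normalize(test_embedding)))
```

With real audio, the reference list would be built by calling `embed_file` on each uploaded `.wav` file and the result passed to `similarity` along with the test file's embedding.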

---

## 🌐 **Access Live Demo**
πŸ”— [wave-app](https://huggingface.co/spaces/ajnx014/wave-app)

---

## 📝 **License**
This project is licensed under the **MIT License**.

---

## 🀝 **Contributing**
Feel free to contribute! Fork the repository, create a new branch, and submit a pull request.

---

## 📧 **Contact**
For inquiries or support, please reach out to **[arjunjagdale14@gmail.com](mailto:arjunjagdale14@gmail.com)**.

---

> **Author:** Arjun Jagdale  
> **GitHub:** [ArjunJagdale](https://github.com/ArjunJagdale)  
> **Project:** Voice Recognition with Similarity Testing