---
license: apache-2.0
---

## UK & Ireland Accent Classification Model

This model classifies the following UK & Ireland accents of spoken English:

* Irish English
* Midlands English
* Northern English
* Scottish English
* Southern English
* Welsh English

The model implements transfer-learning feature extraction using the [Yamnet](https://tfhub.dev/google/yamnet/1) model.

### Yamnet Model
Yamnet is an audio event classifier trained on the AudioSet dataset to predict audio events from the AudioSet ontology. It is available on TensorFlow Hub.
As output, the model returns a 3-tuple:
- Scores of shape `(N, 521)` with the class predictions for each of the 521 AudioSet classes.
- Embeddings of shape `(N, 1024)`.
- The log-mel spectrogram of the entire audio frame.

We will use the embeddings, which are the features extracted from the audio samples, as the input to our dense model.

For more detailed information about Yamnet, please refer to its [TensorFlow Hub](https://tfhub.dev/google/yamnet/1) page.

### Dense Model
The dense model that we used consists of:
- An input layer which is the embedding output of the Yamnet classifier.
- 4 dense hidden layers and 4 dropout layers.
- An output dense layer.

<details>

</details>
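The architecture described above can be sketched in Keras as follows. The hidden-layer sizes and the dropout rate here are illustrative assumptions, not the authors' exact configuration; the collapsed model summary above has the real one.

```python
# Hedged sketch of the dense head: 1024-d Yamnet embeddings in,
# 4 dense hidden layers each followed by dropout, softmax over 6 accents out.
import tensorflow as tf
from tensorflow.keras import layers

NUM_ACCENTS = 6  # Irish, Midlands, Northern, Scottish, Southern, Welsh

inputs = tf.keras.Input(shape=(1024,), name="yamnet_embedding")
x = inputs
for units in (512, 256, 128, 64):      # assumed sizes for the 4 hidden layers
    x = layers.Dense(units, activation="relu")(x)
    x = layers.Dropout(0.3)(x)         # assumed dropout rate
outputs = layers.Dense(NUM_ACCENTS, activation="softmax")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```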

---
## Results
The model achieved the following results:

Results | Training | Validation

The dataset consists of recordings by native speakers of Southern England, Midlands, Northern England, Wales, Scotland and Ireland.

For more info, please refer to the above link or to the following paper:
[Open-source Multi-speaker Corpora of the English Accents in the British Isles](https://aclanthology.org/2020.lrec-1.804.pdf)

---
## How to use
---
## Demo
A demo is available in HuggingFace Spaces ...