Update README.md
README.md
CHANGED
@@ -26,19 +26,19 @@ It achieves the following results on the evaluation set:
 
 Training and evaluation data were augmented with audiomentations [GitHub: iver56/audiomentations](https://github.com/iver56/audiomentations) library and the following augmentation methods have been performed based on previous experiments [Elliott et al.: Tiny transformers for audio classification at the edge](https://arxiv.org/pdf/2103.12157.pdf):
 
-
+**Gain**
 - each audio sample is amplified/attenuated by a random factor between 0.5 and 1.5 with a 0.3 probability
 
-
+**Noise**
 - a random amount of Gaussian noise with a relative amplitude between 0.001 and 0.015 is added to each audio sample with a 0.5 probability
 
-
+**Speed adjust**
 - duration of each audio sample is extended by a random amount between 0.5 and 1.5 with a 0.3 probability
 
-
+**Pitch shift**
 - pitch of each audio sample is shifted by a random amount of semitones selected from the closed interval [-4,4] with a 0.3 probability
 
-
+**Time masking**
 - a random fraction of length of each audio sample in the range of (0,0.02] is erased with a 0.3 probability
 
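
To make the probabilities and ranges listed in the README concrete, here is a minimal standard-library sketch of three of the five augmentations (gain, noise, time masking). It is not the audiomentations implementation and not the author's training code: the function names, the use of plain Python lists for audio, and the reading of "relative amplitude" as the noise standard deviation on a [-1, 1] sample scale are all assumptions made for illustration. Speed adjust and pitch shift are left out because they need real resampling or a phase vocoder.

```python
import random


def random_gain(samples, rng, min_factor=0.5, max_factor=1.5, p=0.3):
    """With probability p, scale the whole clip by one random factor."""
    if rng.random() >= p:
        return list(samples)
    factor = rng.uniform(min_factor, max_factor)
    return [s * factor for s in samples]


def add_gaussian_noise(samples, rng, min_amplitude=0.001,
                       max_amplitude=0.015, p=0.5):
    """With probability p, add zero-mean Gaussian noise.

    Assumption: the amplitude is used as the noise standard deviation.
    """
    if rng.random() >= p:
        return list(samples)
    amplitude = rng.uniform(min_amplitude, max_amplitude)
    return [s + amplitude * rng.gauss(0.0, 1.0) for s in samples]


def time_mask(samples, rng, max_fraction=0.02, p=0.3):
    """With probability p, zero out one contiguous span of the clip.

    The span covers a random fraction of the length in (0, max_fraction].
    """
    if not samples or rng.random() >= p:
        return list(samples)
    # rng.random() is in [0, 1), so 1 - rng.random() is in (0, 1].
    fraction = max_fraction * (1.0 - rng.random())
    width = max(1, int(len(samples) * fraction))
    start = rng.randint(0, len(samples) - width)
    out = list(samples)
    out[start:start + width] = [0.0] * width
    return out


def augment(samples, rng):
    """Apply the three sketched augmentations in the README's order."""
    for step in (random_gain, add_gaussian_noise, time_mask):
        samples = step(samples, rng)
    return samples


if __name__ == "__main__":
    rng = random.Random(0)      # fixed seed so runs are repeatable
    clip = [0.1] * 16000        # hypothetical 1 s clip at 16 kHz
    print(len(augment(clip, rng)))
```

Each step draws its own coin flip, so a given clip may receive any subset of the augmentations, which matches the independent per-method probabilities stated in the list.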