xpariz10 committed on
Commit
460f417
1 Parent(s): 93cf0ee

Update README.md

  1. README.md +23 -14
README.md CHANGED
@@ -12,9 +12,6 @@ model-index:
  results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
  # ast-finetuned-audioset-10-10-0.4593_ft_ESC-50_aug_0-1

  This model is a fine-tuned version of [MIT/ast-finetuned-audioset-10-10-0.4593](https://huggingface.co/MIT/ast-finetuned-audioset-10-10-0.4593) on a subset of the [ashraq/esc50](https://huggingface.co/datasets/ashraq/esc50) dataset.
@@ -25,19 +22,20 @@ It achieves the following results on the evaluation set:
  - Recall: 0.9286
  - F1: 0.9244

- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
  ## Training and evaluation data

- More information needed

- ## Training procedure

  ### Training hyperparameters

@@ -68,6 +66,18 @@ The following hyperparameters were used during training:
  | 0.4237 | 9.0 | 252 | 0.6443 | 0.9286 | 0.9449 | 0.9286 | 0.9244 |
  | 0.3709 | 10.0 | 280 | 0.6304 | 0.9286 | 0.9449 | 0.9286 | 0.9244 |


  ### Framework versions

@@ -75,4 +85,3 @@ The following hyperparameters were used during training:
  - Pytorch 2.0.0
  - Datasets 2.10.1
  - Tokenizers 0.13.2
-
 
  results: []
  ---

  # ast-finetuned-audioset-10-10-0.4593_ft_ESC-50_aug_0-1

  This model is a fine-tuned version of [MIT/ast-finetuned-audioset-10-10-0.4593](https://huggingface.co/MIT/ast-finetuned-audioset-10-10-0.4593) on a subset of the [ashraq/esc50](https://huggingface.co/datasets/ashraq/esc50) dataset.
 
  - Recall: 0.9286
  - F1: 0.9244

  ## Training and evaluation data

+ Training and evaluation data were augmented with the audiomentations library ([GitHub: iver56/audiomentations](https://github.com/iver56/audiomentations)). The following augmentation methods were applied, based on previous experiments ([Elliott et al.: Tiny transformers for audio classification at the edge](https://arxiv.org/pdf/2103.12157.pdf)):

+ Gain
+ - each audio sample is amplified or attenuated by a random factor between 0.5 and 1.5, with a probability of 0.3
+ Noise
+ - Gaussian noise with a random relative amplitude between 0.001 and 0.015 is added to each audio sample, with a probability of 0.5
+ Speed adjust
+ - the duration of each audio sample is changed by a random factor between 0.5 and 1.5, with a probability of 0.3
+ Pitch shift
+ - the pitch of each audio sample is shifted by a random number of semitones from the closed interval [-4, 4], with a probability of 0.3
+ Time masking
+ - a random fraction of each audio sample's length, in the range (0, 0.02], is erased, with a probability of 0.3
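As a rough illustration of how per-transform probabilities like those listed above behave, the following pure-Python sketch applies gain, Gaussian noise, and time masking to a list of float samples. It is not the audiomentations implementation: the function names (`gain`, `add_gaussian_noise`, `time_mask`, `augment`), the order of the transforms, and the reading of "relative amplitude" as the standard deviation of the added noise are all assumptions made for illustration. Speed adjustment and pitch shifting are left out because they require resampling.

```python
import random


def gain(samples, rng, low=0.5, high=1.5):
    """Scale every sample by one random factor drawn from [low, high]."""
    factor = rng.uniform(low, high)
    return [s * factor for s in samples]


def add_gaussian_noise(samples, rng, low=0.001, high=0.015):
    """Add zero-mean Gaussian noise whose amplitude is drawn from [low, high]."""
    amplitude = rng.uniform(low, high)
    return [s + amplitude * rng.gauss(0.0, 1.0) for s in samples]


def time_mask(samples, rng, max_fraction=0.02):
    """Zero out one contiguous stretch covering at most max_fraction of the clip."""
    n = len(samples)
    # 1 - rng.random() lies in (0, 1], so the fraction lies in (0, max_fraction].
    fraction = (1.0 - rng.random()) * max_fraction
    width = int(n * fraction)
    start = rng.randint(0, n - width)
    return samples[:start] + [0.0] * width + samples[start + width:]


def augment(samples, rng):
    """Apply each transform independently with its own probability (assumed order)."""
    out = list(samples)
    if rng.random() < 0.3:  # gain, p = 0.3
        out = gain(out, rng)
    if rng.random() < 0.5:  # Gaussian noise, p = 0.5
        out = add_gaussian_noise(out, rng)
    if rng.random() < 0.3:  # time masking, p = 0.3
        out = time_mask(out, rng)
    return out


# Hypothetical usage on a synthetic 1000-sample clip.
rng = random.Random(0)
clip = [0.1 * ((i % 20) - 10) for i in range(1000)]
augmented = augment(clip, rng)
```

Because every transform is gated separately, a given clip may receive none, some, or all of the augmentations. Passing a seeded `random.Random` instance keeps the sketch reproducible.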

  ### Training hyperparameters

 
  | 0.4237 | 9.0 | 252 | 0.6443 | 0.9286 | 0.9449 | 0.9286 | 0.9244 |
  | 0.3709 | 10.0 | 280 | 0.6304 | 0.9286 | 0.9449 | 0.9286 | 0.9244 |

+ ### Test results
+ | Parameter                | Value              |
+ |:------------------------:|:------------------:|
+ | test_loss                | 0.5829914808273315 |
+ | test_accuracy            | 0.9285714285714286 |
+ | test_precision           | 0.9446428571428571 |
+ | test_recall              | 0.9285714285714286 |
+ | test_f1                  | 0.930292723149866  |
+ | test_runtime (s)         | 4.1488             |
+ | test_samples_per_second  | 6.749              |
+ | test_steps_per_second    | 3.374              |
+ | epoch                    | 10.0               |
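The throughput rows make it possible to estimate the size of the test run. The sketch below is a back-of-the-envelope check, assuming that samples per second and steps per second were computed as a count divided by `test_runtime`; the variable names are illustrative, and the derived counts are inferences rather than values reported in the model card.

```python
# Values copied from the test results table.
test_runtime_s = 4.1488
samples_per_second = 6.749
steps_per_second = 3.374
test_accuracy = 0.9285714285714286

# Count = rate * runtime, rounded to the nearest integer.
n_samples = round(test_runtime_s * samples_per_second)
n_steps = round(test_runtime_s * steps_per_second)

# Average number of samples processed per evaluation step.
samples_per_step = n_samples / n_steps

# Number of correctly classified samples implied by the accuracy.
n_correct = round(test_accuracy * n_samples)

print(n_samples, n_steps, samples_per_step, n_correct)
```

Under that assumption, the figures are consistent with a test set of 28 clips evaluated in 14 steps, of which 26 were classified correctly.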

  ### Framework versions

 
  - Pytorch 2.0.0
  - Datasets 2.10.1
  - Tokenizers 0.13.2