This duplicates the work of
[wav2vec2-base-Speech_Emotion_Recognition](https://huggingface.co/DunnBC22/wav2vec2-base-Speech_Emotion_Recognition).

*Only minor changes were made so it runs successfully on Google Colab.*
### My Version of metrics

| Epoch | Training Loss | Validation Loss | Accuracy | Weighted F1 | Micro F1 | Macro F1 | Weighted Recall | Micro Recall | Macro Recall | Weighted Precision | Micro Precision | Macro Precision |
|----|----|----|----|----|----|----|----|----|----|----|----|----|
| 0 | 1.789200 | 1.548816 | 0.382590 | 0.287415 | 0.382590 | 0.289045 | 0.382590 | 0.382590 | 0.379768 | 0.473585 | 0.382590 | 0.467116 |
| 1 | 1.789200 | 1.302810 | 0.529823 | 0.511868 | 0.529823 | 0.511619 | 0.529823 | 0.529823 | 0.523766 | 0.552868 | 0.529823 | 0.560496 |
| 2 | 1.789200 | 1.029921 | 0.672757 | 0.668108 | 0.672757 | 0.669246 | 0.672757 | 0.672757 | 0.676383 | 0.674857 | 0.672757 | 0.673698 |
| 3 | 1.789200 | 0.968154 | 0.677055 | 0.671986 | 0.677055 | 0.674074 | 0.677055 | 0.677055 | 0.676891 | 0.701300 | 0.677055 | 0.705734 |
| 4 | 1.789200 | 0.850912 | 0.717894 | 0.714321 | 0.717894 | 0.716527 | 0.717894 | 0.717894 | 0.722476 | 0.716772 | 0.717894 | 0.716698 |
| 5 | 1.789200 | 0.870916 | 0.710371 | 0.706013 | 0.710371 | 0.708563 | 0.710371 | 0.710371 | 0.713853 | 0.710966 | 0.710371 | 0.712245 |
| 6 | 1.789200 | 0.827148 | 0.729178 | 0.725336 | 0.729178 | 0.726744 | 0.729178 | 0.729178 | 0.732127 | 0.735935 | 0.729178 | 0.736041 |
| 7 | 1.789200 | 0.798354 | 0.729715 | 0.727086 | 0.729715 | 0.728847 | 0.729715 | 0.729715 | 0.732476 | 0.729932 | 0.729715 | 0.730688 |
| 8 | 1.789200 | 0.799373 | 0.735626 | 0.732981 | 0.735626 | 0.735058 | 0.735626 | 0.735626 | 0.738147 | 0.741482 | 0.735626 | 0.742782 |
| 9 | 1.789200 | 0.810692 | 0.728103 | 0.724754 | 0.728103 | 0.726852 | 0.728103 | 0.728103 | 0.731083 | 0.731919 | 0.728103 | 0.732869 |
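The table reports three averaging schemes for each metric. For a single-label multiclass task, micro-averaged F1 equals accuracy (visible above: the Micro columns match the Accuracy column), macro averaging treats every class equally, and weighted averaging weights each class by its support. A minimal sketch with scikit-learn, using made-up labels for a 3-class task (not this model's actual outputs):

```python
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical true/predicted labels, only to illustrate the averaging modes.
y_true = [0, 0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 0, 1, 1, 0]

acc = accuracy_score(y_true, y_pred)
micro = f1_score(y_true, y_pred, average="micro")        # global TP/FP/FN counts
macro = f1_score(y_true, y_pred, average="macro",        # unweighted mean over classes
                 zero_division=0)
weighted = f1_score(y_true, y_pred, average="weighted",  # mean weighted by class support
                    zero_division=0)

print(micro == acc)  # True: micro F1 coincides with accuracy here
```

Per-class F1 works out to 0.75 (class 0), 0.8 (class 1), and 0 (class 2, never predicted), so macro and weighted diverge while micro tracks accuracy.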
```
***** Running Evaluation *****
Num examples = 1861  Batch size = 32  [59/59 08:38]
{'eval_loss': 0.8106924891471863,
 'eval_accuracy': 0.7281031703385277,
 'eval_Weighted F1': 0.7247543780750472,
 'eval_Micro F1': 0.7281031703385277,
 'eval_Macro F1': 0.7268519957485492,
 'eval_Weighted Recall': 0.7281031703385277,
 'eval_Micro Recall': 0.7281031703385277,
 'eval_Macro Recall': 0.7310833557439055,
 'eval_Weighted Precision': 0.7319188411210771,
 'eval_Micro Precision': 0.7281031703385277,
 'eval_Macro Precision': 0.732869407033253,
 'eval_runtime': 83.3066,
 'eval_samples_per_second': 22.339,
 'eval_steps_per_second': 0.708,
 'epoch': 9.98}
```
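The `[59/59]` progress count in the log follows directly from the example count and batch size: the evaluation loop runs one step per batch, rounding up for the final partial batch. A quick check:

```python
import math

num_examples = 1861  # from the evaluation log above
batch_size = 32
eval_steps = math.ceil(num_examples / batch_size)  # 58 full batches + 1 partial
print(eval_steps)  # 59, matching [59/59] in the log
```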
### Model description

This model predicts the emotion of the person speaking in the audio sample.

For more information on how it was created, check out the following link: https://github.com/DunnBC22/Vision_Audio_and_Multimodal_Projects/tree/main/Audio-Projects/Emotion%20Detection/Speech%20Emotion%20Detection

### Training and evaluation data

Dataset Source: https://www.kaggle.com/datasets/dmitrybabko/speech-emotion-recognition-en