---
title: wav2vec2-ser
emoji: 🦀
colorFrom: indigo
colorTo: green
sdk: gradio
sdk_version: 2.8.13
app_file: app.py
pinned: false
---

Wav2Vec2 For Speech Emotion Recognition 

Emotion is an important aspect of human nature, and understanding it is critical to serving people better in this era of digital communication, where speech has been transformed by texts, messages, and calls. Speech Emotion Recognition classifies the emotions embedded in speech through careful analysis of lexical, visual, and acoustic features.

Link to the main reference: https://github.com/m3hrdadfi/soxan

Evaluation Scores

| Emotion | Precision | Recall | F1-score |
|---|---|---|---|
| anger | 0.82 | 1.00 | 0.81 |
| disgust | 0.85 | 0.96 | 0.85 |
| fear | 0.78 | 0.88 | 0.80 |
| happiness | 0.84 | 0.71 | 0.78 |
| sadness | 0.86 | 1.00 | 0.79 |

Overall accuracy: 0.806 (80.6%)
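
For readers unfamiliar with these metrics, the sketch below shows how per-class precision, recall, F1-score, and overall accuracy are conventionally computed from true and predicted labels. The function name `per_class_scores` and the toy labels are hypothetical and are not the data or code behind the scores above.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return ({label: (precision, recall, f1)}, overall_accuracy)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class was wrong
            fn[truth] += 1   # true class was missed

    scores = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)

    accuracy = sum(tp.values()) / len(y_true) if y_true else 0.0
    return scores, accuracy


# Hypothetical toy labels, for illustration only.
y_true = ["anger", "anger", "fear", "sadness", "fear", "happiness"]
y_pred = ["anger", "fear", "fear", "sadness", "fear", "anger"]
scores, accuracy = per_class_scores(y_true, y_pred)

for label, (p, r, f1) in scores.items():
    print(f"{label:10s} precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
print(f"accuracy={accuracy:.3f}")
```

F1 is the harmonic mean of precision and recall, so it always lies between the two values for a given class.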

Wav2Vec2.0 is a pretrained model for Automatic Speech Recognition (ASR). The Wav2Vec2 variant used for speech recognition is fine-tuned with Connectionist Temporal Classification (CTC), an algorithm for training neural networks on sequence problems, mainly ASR.
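
To make the CTC idea concrete, here is a minimal sketch of greedy CTC decoding: the model emits one token id per audio frame, repeated ids are collapsed, and a special blank token is dropped. The vocabulary, the blank id of 0, and the function name `ctc_greedy_decode` are assumptions for illustration; this shows only the decoding rule, not the CTC training loss, and it is not taken from this Space's code.

```python
def ctc_greedy_decode(frame_ids, blank_id=0):
    """Collapse consecutive repeats, then drop blanks (greedy CTC decoding)."""
    decoded = []
    previous = None
    for token_id in frame_ids:
        # Keep a token only when it differs from the previous frame
        # and is not the blank symbol.
        if token_id != previous and token_id != blank_id:
            decoded.append(token_id)
        previous = token_id
    return decoded


# Hypothetical vocabulary; id 0 is assumed to be the CTC blank.
vocab = {0: "<blank>", 1: "c", 2: "a", 3: "t", 4: "o"}

# Per-frame argmax ids as a model might emit them (made-up values).
frames_cat = [1, 1, 0, 2, 2, 2, 0, 0, 3, 3]
frames_too = [3, 3, 0, 4, 4, 0, 4]  # the blank separates the two "o"s

word_cat = "".join(vocab[i] for i in ctc_greedy_decode(frames_cat))
word_too = "".join(vocab[i] for i in ctc_greedy_decode(frames_too))
print(word_cat, word_too)
```

The blank token is what lets CTC represent genuinely repeated characters: without a blank between them, the two "o" frames would collapse into one.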

Google Colab Link: https://colab.research.google.com/github/m3hrdadfi/soxan/blob/main/notebooks/Emotion_recognition_in_Greek_speech_using_Wav2Vec2.ipynb#scrollTo=y0xJwDkA3QQR

Competition board for Common Voice: https://paperswithcode.com/dataset/common-voice

---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces#reference