SentimentSound

Overview

This is a deep learning model for Speech Emotion Recognition that classifies audio clips of speech into emotional states. The model is trained on a dataset of labeled speech samples and recognizes eight emotions: neutral, calm, happy, sad, angry, fearful, disgust, and surprised.

Model Details

  • Model Type: Hybrid Neural Network (CNN + LSTM)
  • Input: Audio features extracted from 3-second WAV files
  • Output: One of eight emotion classes
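
For orientation, here is a minimal sketch of what a CNN + LSTM hybrid of this kind can look like in PyTorch. The layer sizes, kernel widths, and pooling layout are illustrative assumptions, not the repository's exact architecture:

# Illustrative CNN + LSTM hybrid for speech emotion recognition.
# Layer sizes and layout are assumptions, not the repo's exact model.
import torch
import torch.nn as nn

class EmotionCNNLSTM(nn.Module):
    def __init__(self, n_features=40, n_classes=8):
        super().__init__()
        # 1D convolutions over time capture local acoustic patterns
        self.conv = nn.Sequential(
            nn.Conv1d(n_features, 64, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(64, 128, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool1d(2),
        )
        # A bidirectional LSTM models longer-range temporal dynamics
        self.lstm = nn.LSTM(128, 64, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * 64, n_classes)

    def forward(self, x):            # x: (batch, n_features, time)
        x = self.conv(x)             # (batch, 128, time // 4)
        x = x.transpose(1, 2)        # (batch, time // 4, 128)
        _, (h, _) = self.lstm(x)     # h: (2, batch, 64)
        h = torch.cat([h[0], h[1]], dim=1)
        return self.fc(h)            # (batch, n_classes) logits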

Supported Emotions

  • Neutral
  • Calm
  • Happy
  • Sad
  • Angry
  • Fearful
  • Disgust
  • Surprised

Installation

Clone the Repository

git clone https://github.com/Vishal-Padia/SentimentSound.git
cd SentimentSound

Dependencies

pip install -r requirements.txt

Usage Example

python emotion_predictor.py
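
This runs the repository's prediction script. If you prefer to call the model from your own code, the sketch below shows one plausible inference path; the names extract_features, model.pt, and sample.wav are placeholders for illustration (EmotionCNNLSTM refers to the architecture sketch above), not the actual API of emotion_predictor.py:

# Hypothetical programmatic inference; file and function names are placeholders.
import librosa
import torch

EMOTIONS = ["neutral", "calm", "happy", "sad",
            "angry", "fearful", "disgust", "surprised"]

def extract_features(path, sr=22050, duration=3.0, n_mfcc=40):
    # Load exactly 3 seconds of audio, padding shorter clips with silence
    y, sr = librosa.load(path, sr=sr, duration=duration)
    y = librosa.util.fix_length(y, size=int(sr * duration))
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return torch.from_numpy(mfcc).float().unsqueeze(0)  # (1, n_mfcc, time)

model = EmotionCNNLSTM()                       # from the sketch above
model.load_state_dict(torch.load("model.pt"))  # placeholder checkpoint name
model.eval()

with torch.no_grad():
    logits = model(extract_features("sample.wav"))
print(EMOTIONS[logits.argmax(dim=1).item()])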

Model Performance

  • Accuracy: 85%
  • Evaluation Metrics: see the confusion matrix below

[Confusion matrix of the model's predictions]

Training Details

  • Feature Extraction (a librosa sketch follows this list):
    • MFCC
    • Spectral Centroid
    • Chroma Features
    • Spectral Contrast
    • Zero Crossing Rate
    • Spectral Rolloff
  • Augmentation: Random noise and amplitude scaling (sketched after this list)
  • Training Techniques (sketched after this list):
    • Class weighted loss
    • AdamW optimizer
    • Learning rate scheduling
    • Gradient clipping
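
All six of the listed features are available in librosa. A minimal extraction sketch follows; averaging each feature over time frames into one fixed-length vector is an assumption here, not necessarily the repository's exact pipeline:

# Sketch of the six listed features via librosa; the mean-over-frames
# aggregation is an assumed simplification.
import librosa
import numpy as np

def extract_feature_vector(path, sr=22050, duration=3.0):
    y, sr = librosa.load(path, sr=sr, duration=duration)
    feats = [
        librosa.feature.mfcc(y=y, sr=sr, n_mfcc=40),
        librosa.feature.spectral_centroid(y=y, sr=sr),
        librosa.feature.chroma_stft(y=y, sr=sr),
        librosa.feature.spectral_contrast(y=y, sr=sr),
        librosa.feature.zero_crossing_rate(y),
        librosa.feature.spectral_rolloff(y=y, sr=sr),
    ]
    # Average each feature over frames and concatenate into one vector
    return np.concatenate([f.mean(axis=1) for f in feats])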
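The augmentation step is equally compact to sketch; the noise level and scaling range below are assumed values:

# Sketch of the listed augmentations: additive random noise and amplitude
# scaling. The 0.005 noise level and 0.8-1.2 scale range are assumptions.
import numpy as np

def augment(y, rng=np.random.default_rng()):
    y = y + 0.005 * rng.standard_normal(len(y))  # random additive noise
    return y * rng.uniform(0.8, 1.2)             # random amplitude scaling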
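Finally, a sketch of how the listed training techniques fit together; every hyperparameter here is an assumed placeholder, and the class weights would normally be derived from the training set's label frequencies:

# Sketch of class-weighted loss, AdamW, LR scheduling, and gradient clipping.
# All hyperparameters are assumed placeholders.
import torch
import torch.nn as nn

class_weights = torch.ones(8)  # replace with inverse-frequency class weights
criterion = nn.CrossEntropyLoss(weight=class_weights)

model = EmotionCNNLSTM()       # from the architecture sketch above
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-2)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, factor=0.5, patience=3)

def train_step(x, y):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    # Clip gradient norms to stabilize training
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
    return loss.item()

# After each validation pass: scheduler.step(val_loss)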

Limitations

  • Works best with clear speech recordings
  • Optimized for 3-second audio clips
  • Performance may vary with different audio sources
