🍼 Baby Cry Analyzer

A machine learning model that classifies short baby audio clips into 11 categories (eight cry causes plus laughter, background noise, and silence) to help parents understand what their baby needs.

📊 Model Performance

Metric	Score
Accuracy	86.38%
F1 Score	85.43%
Precision	86.07%
ROC AUC	0.9864

🏷️ Classes (11 labels)

Label	Emoji	Meaning
belly pain	🤢	Baby has stomach pain
burping	💨	Baby needs to burp
cold_hot	🌡️	Baby is too cold or hot
discomfort	😣	Baby is uncomfortable
hungry	🍼	Baby is hungry
laugh	😄	Baby is happy/laughing
lonely	🥺	Baby wants attention
noise	🔊	Background noise detected
scared	😨	Baby is startled/scared
silence	🤫	No crying detected
tired	😴	Baby is sleepy
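
For display purposes, the table above can be kept as a small lookup structure. The sketch below copies the labels, emoji, and meanings from the table; the names `CLASS_INFO` and `describe` are hypothetical helpers and are not part of the published model files.

```python
# Label -> (emoji, meaning), copied from the class table in this card.
CLASS_INFO = {
    "belly pain": ("🤢", "Baby has stomach pain"),
    "burping":    ("💨", "Baby needs to burp"),
    "cold_hot":   ("🌡️", "Baby is too cold or hot"),
    "discomfort": ("😣", "Baby is uncomfortable"),
    "hungry":     ("🍼", "Baby is hungry"),
    "laugh":      ("😄", "Baby is happy/laughing"),
    "lonely":     ("🥺", "Baby wants attention"),
    "noise":      ("🔊", "Background noise detected"),
    "scared":     ("😨", "Baby is startled/scared"),
    "silence":    ("🤫", "No crying detected"),
    "tired":      ("😴", "Baby is sleepy"),
}


def describe(label):
    """Return a short display string for a predicted label (hypothetical helper)."""
    emoji, meaning = CLASS_INFO[label]
    return f"{emoji} {meaning}"


print(describe("hungry"))  # prints: 🍼 Baby is hungry
```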

🧠 Model Architecture

Type: Tuned Stacking Ensemble
Base learners: SVM + KNN + ExtraTrees + XGBoost + MLP
Meta learner: SVM (RBF kernel)
Features: MFCC (40) + Chroma (12) + Mel Spectrogram (128) = 180 features
Preprocessing: StandardScaler + SMOTE balancing
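
A stack of this shape can be sketched with scikit-learn's `StackingClassifier`. This is only an illustration under stated assumptions: every hyperparameter below is a placeholder (the tuned values are not given in this card), the XGBoost base learner is left out so the sketch depends on scikit-learn alone, SMOTE is omitted, and the scaler is placed in a pipeline here although the published files ship it separately as `scaler.pkl`. The name `build_pipeline` is hypothetical.

```python
from sklearn.ensemble import ExtraTreesClassifier, StackingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def build_pipeline(random_state=0):
    """Scaler + stacking ensemble with an RBF-SVM meta learner (placeholder settings)."""
    base_learners = [
        ("svm", SVC(kernel="rbf", probability=True, random_state=random_state)),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
        ("extratrees", ExtraTreesClassifier(n_estimators=100, random_state=random_state)),
        ("mlp", MLPClassifier(hidden_layer_sizes=(64,), max_iter=500,
                              random_state=random_state)),
        # The card also lists XGBoost; it is omitted here to avoid an extra dependency.
    ]
    stack = StackingClassifier(
        estimators=base_learners,
        # probability=True so the stacked model exposes predict_proba
        final_estimator=SVC(kernel="rbf", probability=True, random_state=random_state),
        cv=3,  # out-of-fold base predictions are used to train the meta learner
    )
    return make_pipeline(StandardScaler(), stack)
```

The meta learner is trained on cross-validated predictions of the base learners rather than on the raw 180 features, which is what distinguishes stacking from simple voting.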

📁 Dataset

Source: Baby Cry Pattern Archive
Total samples: 1,450 audio files (after SMOTE: 4,367)
Format: WAV, OGG, 3GP audio files
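
The post-SMOTE total of 4,367 equals 11 × 397, which is consistent with each of the 11 classes being oversampled to 397 samples, although the card does not state the per-class counts. The core SMOTE idea, creating synthetic samples on the line segment between a minority sample and one of its nearest same-class neighbours, can be sketched in NumPy as follows. This is a simplified illustration, not the imbalanced-learn implementation, and `smote_like` is a hypothetical name.

```python
import numpy as np


def smote_like(X_class, n_target, k=5, seed=0):
    """Oversample one class to n_target rows by interpolating between neighbours."""
    rng = np.random.default_rng(seed)
    X_class = np.asarray(X_class, dtype=float)
    n = len(X_class)
    if n_target <= n:
        return X_class.copy()
    if n < 2:
        raise ValueError("need at least two samples to interpolate")
    k = min(k, n - 1)

    # Pairwise distances within the class; a point is never its own neighbour.
    dists = np.linalg.norm(X_class[:, None, :] - X_class[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)
    neighbours = np.argsort(dists, axis=1)[:, :k]

    # Each synthetic row: a base sample, one of its k neighbours, and a random
    # position on the segment between the two.
    n_new = n_target - n
    base = rng.integers(0, n, size=n_new)
    chosen = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    synthetic = X_class[base] + gap * (X_class[chosen] - X_class[base])
    return np.vstack([X_class, synthetic])


assert 11 * 397 == 4367  # the arithmetic noted above
```

Oversampling of this kind is normally applied to the training split only, so that synthetic samples do not leak into evaluation data.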

🚀 How to Use

import joblib
import librosa
import numpy as np
from huggingface_hub import hf_hub_download

model  = joblib.load(hf_hub_download("Nerdy37/baby-cry-analyzer", "best_model.pkl"))
scaler = joblib.load(hf_hub_download("Nerdy37/baby-cry-analyzer", "scaler.pkl"))

label_names = [
    'belly pain', 'burping', 'cold_hot', 'discomfort',
    'hungry', 'laugh', 'lonely', 'noise', 'scared', 'silence', 'tired'
]

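def predict(audio_path):
    """Return (label, confidence string) for the first 5 seconds of an audio file."""
    # Load at most 5 s of audio as mono, resampled to 22,050 Hz
    audio, sr = librosa.load(audio_path, sr=22050, duration=5, mono=True)
    # Average each feature over time: 40 MFCC + 12 chroma + 128 mel bands = 180 values
    mfcc   = np.mean(librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=40).T, axis=0)
    chroma = np.mean(librosa.feature.chroma_stft(y=audio, sr=sr).T, axis=0)
    mel    = np.mean(librosa.feature.melspectrogram(y=audio, sr=sr).T, axis=0)
    # Apply the same scaling that was fitted at training time
    features = scaler.transform([np.hstack([mfcc, chroma, mel])])
    # pred is used as an integer index into label_names
    pred = model.predict(features)[0]
    prob = model.predict_proba(features)[0]
    # Confidence is the highest class probability, formatted as a percentage
    return label_names[pred], f"{max(prob)*100:.1f}%"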

label, confidence = predict("baby_cry.wav")
print(f"Prediction: {label} ({confidence})")
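
Because several classes sound alike, it can help to look at the runner-up classes rather than only the top prediction. The helper below is a minimal sketch: `top_k` is a hypothetical function, the probability vector is made up for illustration, and it assumes the columns returned by `predict_proba` follow the order of `label_names`.

```python
label_names = [
    'belly pain', 'burping', 'cold_hot', 'discomfort',
    'hungry', 'laugh', 'lonely', 'noise', 'scared', 'silence', 'tired'
]


def top_k(prob, labels, k=3):
    """Return the k most probable (label, probability) pairs, highest first."""
    ranked = sorted(zip(labels, prob), key=lambda pair: pair[1], reverse=True)
    return [(label, float(p)) for label, p in ranked[:k]]


# Made-up probabilities in label_names order (not real model output)
example_prob = [0.02, 0.10, 0.05, 0.30, 0.35, 0.01, 0.05, 0.02, 0.03, 0.02, 0.05]

for label, p in top_k(example_prob, label_names):
    print(f"{label}: {p:.0%}")
# hungry: 35%
# discomfort: 30%
# burping: 10%
```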

⚠️ Limitations

The hungry class has much lower accuracy (28.7%) than the 86.38% overall figure, because hungry cries are acoustically similar to discomfort, burping, and cold_hot
Best used as a support tool, not for medical diagnosis
Works best on clean audio with minimal background noise
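
One way to act on these limitations is to mark a prediction as uncertain when its confidence is low, or when the top two candidates are close and both fall in the group of acoustically similar classes named above. The sketch below is an assumption-laden illustration: `flag_uncertain` is a hypothetical helper, and the thresholds `min_confidence` and `min_margin` are arbitrary placeholders that would need tuning on validation data.

```python
# Classes the card describes as acoustically similar to one another
CONFUSABLE = {"hungry", "discomfort", "burping", "cold_hot"}


def flag_uncertain(probs, min_confidence=0.5, min_margin=0.15):
    """probs: dict mapping label -> probability. Returns (top_label, is_uncertain)."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    (top_label, top_p), (second_label, second_p) = ranked[0], ranked[1]
    # Case 1: the model is not confident in any single class
    low_confidence = top_p < min_confidence
    # Case 2: a near tie between two classes that are known to be confused
    close_call = (top_p - second_p) < min_margin and {top_label, second_label} <= CONFUSABLE
    return top_label, low_confidence or close_call


print(flag_uncertain({"hungry": 0.55, "discomfort": 0.45}))  # ('hungry', True)
print(flag_uncertain({"laugh": 0.90, "noise": 0.10}))        # ('laugh', False)
```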

👤 Author

Made by Nerdy37 as part of an end-to-end ML pipeline project.
