🦅 Global Bioacoustic Audio Classification Engine & Interactive Avian Jukebox


This open-source machine learning repository hosts a Decoupled Bioacoustic Artificial Intelligence Pipeline designed for automated wild avian species identification, semantic frequency clustering, and on-demand audio streaming.

By combining a self-supervised Vision Transformer encoder (ProtoCLR) with UMAP dimensionality reduction and HDBSCAN density-based clustering, this system maps wild audio recordings onto an interactive 2D coordinate map spanning 168 bird species and 149 automatically discovered eco-acoustic clusters.


πŸ” Core Machine Learning Architecture

[Live Microphone / Jukebox Audio Input]
                  │
                  ▼
   [DSP Noise Filters & Dynamic Energy VAD]
                  │
                  ▼
     [ProtoCLR Vision Transformer] -> Extracts 512-D Latent Embeddings
                  │
                  ▼
         [UMAP Decomposition]    -> Reduces Dimensionality to 2D Coordinates
                  │
                  ▼
     [HDBSCAN Density Profiler]  -> Maps Target Vector to Biological Class

πŸ› οΈ Integrated System Capabilities

1. πŸŽ™οΈ Real-Time DSP Microphone Classification Agent

  • Digital Signal Processing (DSP): Enables the browser's built-in audio processing constraints (autoGainControl, noiseSuppression, and echoCancellation) to reduce room echo and lift target signals above the ambient noise floor.
  • Energy-Based Voice Activity Detector (VAD): Records 6 seconds of streaming audio, tracks amplitude over a sliding window, and extracts the highest-energy contiguous 3-second slice, skipping any leading silence.
  • Strict Safety Distance Guard: Applies a Euclidean distance threshold (fail limit: 0.8) so that environmental artifacts falling outside it are reported as NO BIRD DETECTED instead of receiving a false-positive taxonomic assignment.
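
The window selection and the distance guard described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the repository's code: the function names, the one-sample sliding step, the `references` mapping of labels to 2D coordinates, and the nearest-reference rule are all assumptions made for the example. Only the 3-second window, the 0.8 limit, and the NO BIRD DETECTED label come from the description.

```python
import math


def best_window_start(samples, sample_rate, window_seconds=3.0):
    """Return the start index of the highest-energy contiguous window."""
    window = int(window_seconds * sample_rate)
    if window <= 0 or len(samples) < window:
        raise ValueError("recording is shorter than the requested window")
    # Energy of the first window, then slide forward one sample at a time,
    # adding the sample that enters and removing the one that leaves.
    energy = sum(s * s for s in samples[:window])
    best_energy, best_start = energy, 0
    for start in range(1, len(samples) - window + 1):
        leaving = samples[start - 1]
        entering = samples[start + window - 1]
        energy += entering * entering - leaving * leaving
        if energy > best_energy:
            best_energy, best_start = energy, start
    return best_start


def gate_prediction(point, references, fail_limit=0.8):
    """Label a 2D point by its nearest reference, or reject it if too far.

    `references` maps a label to an (x, y) coordinate. The mapping and the
    nearest-reference rule are hypothetical stand-ins for the real classifier.
    """
    label, distance = min(
        ((name, math.dist(point, xy)) for name, xy in references.items()),
        key=lambda pair: pair[1],
    )
    return label if distance <= fail_limit else "NO BIRD DETECTED"
```

For a 6-second capture, `samples[start:start + 3 * sample_rate]` with `start = best_window_start(samples, sample_rate)` would be the slice passed on to the encoder, and `gate_prediction` would run on the 2D coordinates produced afterwards.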

2. 🎡 168-Species Global Streaming Jukebox

  • On-Demand Archive Bypassing: Uses Python network streaming to read metadata indexes without downloading the multi-gigabyte source dataset archives.
  • Native HTML5 Media Injection: Looks up each recording's Xeno-Canto identifier in the CSV metadata at request time and serves an HTML5 audio player for it directly in the client UI.
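
As a rough illustration of the lookup-and-inject step, the sketch below reads a CSV, finds the recording identifier for a species, and wraps it in an HTML5 audio tag. The column names `species` and `xc_id`, the function name, and the `https://xeno-canto.org/<id>/download` URL pattern are assumptions for the example and are not taken from the repository.

```python
import csv
import html
import io


def build_audio_tag(csv_text, species):
    """Look up a species' recording ID and wrap it in an HTML5 <audio> tag."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["species"] == species:
            # Assumed URL pattern for streaming a Xeno-Canto recording.
            url = f"https://xeno-canto.org/{row['xc_id']}/download"
            return f'<audio controls src="{html.escape(url)}"></audio>'
    # No matching row: the caller decides how to report a missing species.
    return None
```

Because the browser fetches the audio from the URL in the tag, the server only needs the small metadata table in memory and never handles the recording itself.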

📊 Telemetry and Real-World Domain Validation

| Operational Testing Profile | Input Stream Channel | Target Mapping Accuracy | System Development Status |
|---|---|---|---|
| Direct Digital Vector Injection | Pure Data Shard Bitstream | ~100% Deterministic | Production-Ready / Fully Verified |
| Live Microphone Array Capture | Physical Ambient Speakers | Variable (Acoustic Domain Shift) | Experimental / In Active Optimization |

🧪 Overcoming Acoustic Domain Shift via Data Augmentation

To mitigate hardware-level acoustic coloration (phone and laptop speakers distort frequency bands, and wall reflections smear mel spectrograms), the cloud-streaming data pipeline applies inline augmentations that simulate physical field acoustics:

  • Additive White Noise: Adds Gaussian noise at a level of 0.008 to simulate environmental wind noise.
  • Echo Reverb: Mixes in a copy of the signal delayed by 60 milliseconds to mimic indoor wall reflections.
  • Biquad Low-Pass Muffling: Attenuates frequencies above 4500 Hz to simulate low-quality mobile microphone hardware.
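
The three augmentations can be sketched with the standard library alone. The following is a minimal illustration under stated assumptions: 0.008 is treated as the standard deviation of the noise, the echo is mixed in at a hypothetical gain of 0.5, and the low-pass uses the widely published Audio EQ Cookbook biquad coefficients with a Butterworth-style Q. None of these choices, nor the function names, are confirmed by the repository.

```python
import math
import random


def add_noise(samples, level=0.008, seed=None):
    """Add zero-mean Gaussian noise; `level` is assumed to be the std dev."""
    rng = random.Random(seed)
    return [s + rng.gauss(0.0, level) for s in samples]


def add_echo(samples, sample_rate, delay_ms=60.0, gain=0.5):
    """Mix in one delayed copy of the signal (gain is a hypothetical value)."""
    delay = int(sample_rate * delay_ms / 1000.0)
    return [
        s + (gain * samples[i - delay] if i >= delay else 0.0)
        for i, s in enumerate(samples)
    ]


def biquad_lowpass(samples, sample_rate, cutoff_hz=4500.0, q=math.sqrt(0.5)):
    """Second-order low-pass using Audio EQ Cookbook coefficients."""
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    # Coefficients normalized by a0 so the recurrence needs no division.
    b0 = (1.0 - cos_w0) / 2.0 / a0
    b1 = (1.0 - cos_w0) / a0
    b2 = b0
    a1 = -2.0 * cos_w0 / a0
    a2 = (1.0 - alpha) / a0
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in samples:
        # Direct Form I difference equation.
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out
```

Chaining the functions, for example `biquad_lowpass(add_echo(add_noise(clip), sr), sr)`, would give one corrupted copy of a clean clip. The order shown here is arbitrary.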

πŸ“ Documented Repository Assets

  • trained_cluster_brain.joblib: Joblib-serialized Python dictionary containing the pre-fit UMAP transformer and the HDBSCAN cluster boundaries.
  • acoustic_atlas_metadata.csv: Normalized table mapping embedding vector identifiers to scientific taxonomy classifications.

🤖 Crawler Semantics and Semantic Graph Index

  • Primary Search Intents: Python audio classification, bird sound identification AI, self-supervised bioacoustics, UMAP dimensional reduction, HDBSCAN audio clustering, PyTorch Mel Spectrogram processing.
  • Geographical Application: Scaled for global ecosystem deployments using Xeno-Canto repository data structures. Optimized for high-speed performance across edge networks.
Dataset used to train sukriramli/tiny-bird-diffusion