
Clip-Worthy Audio Detector

This model analyzes audio segments to determine if they contain "clip-worthy" or potentially viral content. The model is fine-tuned from the HuBERT base model on audio clips labeled as clip-worthy vs. non-clip-worthy.

Model Description

  • Model type: Audio classification model based on HuBERT
  • Task: Binary classification (clip-worthy vs. not clip-worthy)
  • Training data: Audio clips from livestreams labeled for virality potential
  • Input: 15-second audio clips (longer clips are automatically center-cropped to 15 seconds)
  • Output: Classification prediction with confidence scores
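
Center-cropping can be pictured as keeping the middle 15 seconds of the waveform. The following is a minimal sketch of that idea, not the model's actual preprocessing code: the function name `center_crop` is hypothetical, the input is assumed to be a mono sequence of samples at 16 kHz, and the handling of clips shorter than 15 seconds is an assumption because this card does not describe it.

```python
TARGET_SECONDS = 15      # clip length stated in this model card
SAMPLE_RATE = 16_000     # sample rate stated under Limitations


def center_crop(samples, sample_rate=SAMPLE_RATE, seconds=TARGET_SECONDS):
    """Return the middle `seconds` of `samples`.

    Assumption: input shorter than the target is returned unchanged,
    since the card does not say how short clips are handled.
    """
    target_len = int(sample_rate * seconds)
    if len(samples) <= target_len:
        return samples
    start = (len(samples) - target_len) // 2
    return samples[start:start + target_len]


# A 20-second clip at 16 kHz loses 2.5 seconds from each end.
clip = list(range(20 * SAMPLE_RATE))
cropped = center_crop(clip)
print(len(cropped) / SAMPLE_RATE)  # 15.0
print(cropped[0] / SAMPLE_RATE)    # 2.5
```

Because the crop keeps the middle of the clip, a highlight near the start or end of a long recording may be cut off; splitting long audio into 15-second windows before sending it is one way to avoid that.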

Usage

API Inference Endpoint

Send a POST request with your audio file as the request body. The example below sends the raw audio bytes:

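# Example with raw audio bytes
import os

import requests

API_URL = "https://api-inference.huggingface.co/models/YOUR_USERNAME/clip-worthy-detector"
# Read the access token from the environment (the variable name HF_API_TOKEN is an assumption)
API_TOKEN = os.environ["HF_API_TOKEN"]
headers = {"Authorization": f"Bearer {API_TOKEN}"}

def query(filename):
    # Send the raw bytes of the audio file as the request body
    with open(filename, "rb") as f:
        data = f.read()
    response = requests.post(API_URL, headers=headers, data=data, timeout=60)
    response.raise_for_status()  # raise on HTTP errors instead of parsing an error body
    return response.json()

output = query("audio_sample.wav")
print(output)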

Response Format

{
  "prediction": 1,
  "label": "clip-worthy",
  "confidence": 0.92,
  "probabilities": {
    "not-clip-worthy": 0.08,
    "clip-worthy": 0.92
  }
}
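
A response in the format above can be turned into a yes/no decision with a custom cutoff. This is a minimal sketch under stated assumptions: the helper name `interpret_response` and the `threshold` parameter are hypothetical and not part of the API, and the payload is assumed to have exactly the keys shown in the example response.

```python
def interpret_response(payload, threshold=0.5):
    """Return (is_clip_worthy, probability) from a response shaped like the example above.

    `threshold` is an illustrative parameter; raise it to flag fewer,
    higher-confidence clips.
    """
    if not isinstance(payload, dict) or "probabilities" not in payload:
        # Anything else (for example an error body) is treated as a failure.
        raise ValueError(f"unexpected response: {payload!r}")
    prob = float(payload["probabilities"]["clip-worthy"])
    return prob >= threshold, prob


example = {
    "prediction": 1,
    "label": "clip-worthy",
    "confidence": 0.92,
    "probabilities": {"not-clip-worthy": 0.08, "clip-worthy": 0.92},
}
print(interpret_response(example))                  # (True, 0.92)
print(interpret_response(example, threshold=0.95))  # (False, 0.92)
```

Reading the `clip-worthy` probability directly, rather than the `label` field, keeps the decision rule explicit when the default cutoff is not the one you want.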

Limitations

  • Optimized for audio segments of 15 seconds (longer audio will be center-cropped)
  • Expects a 16 kHz sample rate (audio at other rates is resampled automatically)
  • Performance varies based on audio quality and content type
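
To see ahead of time whether a file will be resampled or center-cropped, you can inspect it locally before sending it. The sketch below uses Python's standard `wave` module, so it only handles uncompressed PCM WAV files; the function name `inspect_wav` and the keys of the returned dictionary are hypothetical choices for illustration.

```python
import wave

EXPECTED_RATE = 16_000   # Hz, from the Limitations list
MAX_SECONDS = 15         # clip length the model is optimized for


def inspect_wav(source):
    """Report sample rate and duration of a PCM WAV file.

    `source` may be a file path or a binary file object.
    """
    with wave.open(source, "rb") as wav:
        rate = wav.getframerate()
        duration = wav.getnframes() / rate
    return {
        "sample_rate": rate,
        "duration_s": duration,
        "will_be_resampled": rate != EXPECTED_RATE,
        "will_be_cropped": duration > MAX_SECONDS,
    }


# Usage (assumes audio_sample.wav exists locally):
# info = inspect_wav("audio_sample.wav")
# print(info)
```

A file that reports `will_be_cropped` as true is a candidate for splitting into shorter segments, so that no part of the recording is silently discarded.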

Model size: 94.6M params (Safetensors, tensor type F32)