Edit model card

OpenAI Whisper Inference Endpoint example

Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multi-task model that can perform multilingual speech recognition as well as speech translation and language identification.

For more information about the model, license and limitations check the original repository at openai/whisper.

This repository implements a custom handler task for automatic-speech-recognition for 🤗 Inference Endpoints using OpenAIs new Whisper model. The code for the customized pipeline is in the pipeline.py.

There is also a notebook included, on how to create the handler.py


The endpoint expects a binary audio file. Below is a cURL example and a Python example using the requests library.


# load audio file
wget https://cdn-media.huggingface.co/speech_samples/sample1.flac

# run request
curl --request POST \
  --url https://{ENDPOINT}/ \
  --header 'Content-Type: audio/x-flac' \
  --header 'Authorization: Bearer {HF_TOKEN}' \
  --data-binary '@sample1.flac'


import json
from typing import List
import requests as r
import base64
import mimetypes


def predict(path_to_audio:str=None):
    # read audio file
    with open(path_to_audio, "rb") as i:
      b = i.read()
    # get mimetype
    content_type= mimetypes.guess_type(path_to_audio)[0]

    headers= {
        "Authorization": f"Bearer {HF_TOKEN}",
        "Content-Type": content_type
    response = r.post(ENDPOINT_URL, headers=headers, data=b)
    return response.json()

prediction = predict(path_to_audio="sample1.flac")


expected output

{"text": " going along slushy country roads and speaking to damp audiences in draughty school rooms day after day for a fortnight. He'll have to put in an appearance at some place of worship on Sunday morning, and he can come to us immediately afterwards."}
Downloads last month
Inference API
Inference API (serverless) has been turned off for this model.