In this section, we will take a closer look at the
Interface class, and understand the
main parameters used to create one.
You’ll notice that the
Interface class has 3 required parameters:
Interface(fn, inputs, outputs, ...)
These parameters are:
fn: the prediction function that is wrapped by the Gradio interface. This function can take one or more parameters and return one or more values.
inputs: the input component type(s). Gradio provides many pre-built components, such as "textbox" or "audio".
outputs: the output component type(s). Again, Gradio provides many pre-built components, e.g. "text" or "audio".
For a complete list of components, see the Gradio docs. Each pre-built component can be customized by instantiating the class corresponding to the component.
For example, as we saw in the previous section,
instead of passing in
"textbox" to the
inputs parameter, you can pass in a
Textbox(lines=7, label="Prompt") component to create a textbox with 7 lines and a label.
Let’s take a look at another example, this time with an Audio component.
As mentioned earlier, Gradio provides many different inputs and outputs.
So let’s build an
Interface that works with audio.
In this example, we’ll build an audio-to-audio function that takes an audio file and simply reverses it.
For the input, we will use the Audio component. When using the Audio component, you can specify whether you want the source of the audio to be a file that the user uploads or a microphone that the user records their voice with. In this case, let’s set it to a "microphone". Just for fun, we’ll add a label to our Audio that says "Speak here...".
In addition, we’d like to receive the audio as a numpy array so that we can easily "reverse" it. So we’ll set the "type" to be "numpy", which passes the input data as a tuple of (sample_rate, data) into our function.
We will also use the
Audio output component which can automatically
render a tuple with a sample rate and numpy array of data as a playable audio file.
In this case, we do not need to do any customization, so we will use the string "audio".
```python
import numpy as np
import gradio as gr


def reverse_audio(audio):
    sr, data = audio
    reversed_audio = (sr, np.flipud(data))
    return reversed_audio


mic = gr.Audio(source="microphone", type="numpy", label="Speak here...")
gr.Interface(reverse_audio, mic, "audio").launch()
```
The code above will produce an interface like the one below (if your browser doesn’t ask you for microphone permissions, open the demo in a separate tab.)
You should now be able to record your voice and hear yourself speaking in reverse - spooky 👻!
Let’s say we had a more complicated function, with multiple inputs and outputs. In the example below, we have a function that takes a dropdown index, a slider value, and a number, and returns an audio sample of a musical tone.
Take a look at how we pass a list of input and output components, and see if you can follow along with what’s happening.
The key here is that when you pass:
- a list of input components, each component corresponds to a parameter in order.
- a list of output components, each component corresponds to a returned value.
The code snippet below shows how three input components line up with the three arguments of the generate_tone() function:
```python
import numpy as np
import gradio as gr

notes = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def generate_tone(note, octave, duration):
    sr = 48000
    a4_freq, tones_from_a4 = 440, 12 * (octave - 4) + (note - 9)
    frequency = a4_freq * 2 ** (tones_from_a4 / 12)
    duration = int(duration)
    audio = np.linspace(0, duration, duration * sr)
    audio = (20000 * np.sin(audio * (2 * np.pi * frequency))).astype(np.int16)
    return (sr, audio)


gr.Interface(
    generate_tone,
    [
        gr.Dropdown(notes, type="index"),
        gr.Slider(minimum=4, maximum=6, step=1),
        gr.Textbox(type="number", value=1, label="Duration in seconds"),
    ],
    "audio",
).launch()
```
So far, we have used the
launch() method to launch the interface, but we
haven’t really discussed what it does.
By default, the
launch() method will launch the demo in a web server that
is running locally. If you are running your code in a Jupyter or Colab notebook, then
Gradio will embed the demo GUI in the notebook so you can easily use it.
You can customize the behavior of
launch() through different parameters:
- inline: whether to display the interface inline in Python notebooks.
- inbrowser: whether to automatically launch the interface in a new tab in the default browser.
- share: whether to create a publicly shareable link from your computer for the interface. Kind of like a Google Drive link!
We’ll cover the
share parameter in a lot more detail in the next section!
Let’s build an interface that allows you to demo a speech-recognition model. To make it interesting, we will accept either a mic input or an uploaded file.
As usual, we’ll load our speech recognition model using the
pipeline() function from 🤗 Transformers.
If you need a quick refresher, you can go back to that section in Chapter 1. Next, we’ll implement a
transcribe_audio() function that processes the audio and returns the transcription. Finally, we’ll wrap this function in an
Interface with the
Audio components for the inputs and just text for the output. Altogether, the code for this application is the following:
```python
from transformers import pipeline
import gradio as gr

model = pipeline("automatic-speech-recognition")


def transcribe_audio(mic=None, file=None):
    if mic is not None:
        audio = mic
    elif file is not None:
        audio = file
    else:
        return "You must either provide a mic recording or a file"
    transcription = model(audio)["text"]
    return transcription


gr.Interface(
    fn=transcribe_audio,
    inputs=[
        gr.Audio(source="microphone", type="filepath", optional=True),
        gr.Audio(source="upload", type="filepath", optional=True),
    ],
    outputs="text",
).launch()
```
If your browser doesn’t ask you for microphone permissions, open the demo in a separate tab.
That’s it! You can now use this interface to transcribe audio. Notice here that
by passing in the
optional parameter as
True, we allow the user to either
provide a microphone or an audio file (or neither, but that will return an error message).
Keep going to see how to share your interface with others!